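Notes: Mining of Massive Datasets
On a friend's recommendation, I have recently been reading Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman. The book is freely available on its official website.
This is my first attempt at organizing notes as IPython notebooks, and it works really well! I originally planned to work through as many of the exercises as possible, but that turned out to be quite time-consuming, so the later chapters are mostly notes, and I only attempt the exercises that interest me.
Contents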
1 Data Mining
2 Map-Reduce and the New Software Stack
3 Finding Similar Items
4 Mining Data Streams
5 Link Analysis
6 Frequent Itemsets
7 Clustering
8 Advertising on the Web
9 Recommendation Systems
10 Mining Social-Network Graphs
11 Dimensionality Reduction
12 Large-Scale Machine Learning
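Postscript:
It took me six months of on-and-off reading to finish the book. The ideas of hashing and approximation were a real eye-opener. My first pass was rather rushed; this book deserves to be reread and put into practice more.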