Notes: Mining of Massive Datasets

Oct 9, 2015

On a friend's recommendation, I have recently been reading Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman. The book is freely available on its official website.

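This is my first attempt at organizing notes as IPython notebooks, and it works great! I originally planned to work through as many of the exercises as possible, but that turned out to be quite time-consuming, so the later chapters are mostly organized notes, and I attempt only the exercises that interest me.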

Contents

1 Data Mining

2 Map-Reduce and the New Software Stack

3 Finding Similar Items

4 Mining Data Streams

5 Link Analysis

6 Frequent Itemsets

7 Clustering

8 Advertising on the Web

9 Recommendation Systems

11 Dimensionality Reduction

12 Large-Scale Machine Learning

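Afterword:
It took me six months, on and off, to finish the book. The ideas of hashing and approximation were a real eye-opener. My first read was rather rushed; the book deserves rereading and more hands-on practice.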