[book] Mining of Massive Datasets


The book has now been published by Cambridge University Press. A hardcopy can be obtained here. By agreement with the publisher, you can still download it free from this page. Cambridge Press does, however, retain copyright on the work, and we expect that you will acknowledge our authorship if you republish parts or all of it. We are sorry to have to mention this point, but we have evidence that other items we have published on the Web have been appropriated and republished under other names. It is easy to detect such misuse, by the way, as you will learn in Chapter 3.

— Anand Rajaraman (@anand_raj) and Jeff Ullman


Download the Complete Book (340 pages, approximately 2MB)

Download chapters of the book:

Preface and Table of Contents
Chapter 1 Data Mining
Chapter 2 Large-Scale File Systems and Map-Reduce
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems

Gradiance Support

If you are an instructor interested in using the Gradiance Automated Homework System with this book, start by creating an account for yourself at www.gradiance.com/services. Then, email your chosen login and a request to become an instructor for the MMDS book to support@gradiance.com. You will then be able to create a class using these materials. Manuals explaining the use of the system are at www.gradiance.com/info.html.

Students who want to use the Gradiance system for self-study can register at www.gradiance.com/services. Then, use the class token 1EDD8A1D to join the “omnibus class” for the MMDS book. See The Student Guide for more information.

Other Stuff

  • Slides and Course Material from old CS345A. As with the book, you are welcome to use these as you like, but please preserve our authorship. 
  • The Errata Sheet for the hardcopy version. We shall endeavor to keep the downloads up to date. Note that the pagination is different on the version we maintain, but you can check whether your download is up to date from the hardcopy errata. Please report errata to ullman a t gmail.com.