CS425 – Algorithms for Web-Scale Data

Spring 2019

 

Instructor: Mustafa Ozdal (EA420).
TA: Selcuk Gulcan (selcuk.gulcan _at_ bilkent.)

 

Textbook: A. Rajaraman and J. D. Ullman, Mining of Massive Datasets, Cambridge University Press, 2011. A free online version is available at http://www.mmds.org

 

Schedule: Tue: 13:40 - 15:30; Thu: 15:40 - 17:30
Office Hours: Tue: 15:40 - 16:30 (EA420)


Syllabus: syllabus.pdf


Course Project:

 

The project description can be found here.

 

Lectures

Note: Some of the lecture notes below contain slides from the course textbook, some of which have been modified for this class. The original slides from the textbook can be accessed here.

 

Lecture 1: PageRank Formulation and Algorithm (slides: ppt, pdf; reading material: Chapter 5)

Lecture 2: PageRank Extensions (slides: ppt, pdf; reading material: Chapter 5)

Lecture 3: Shingling, Min-Hashing, and LSH (slides: ppt, pdf; reading material: Chapter 3)

Lecture 4: LSH Applications (slides: ppt, pdf; reading material: Chapter 3)

Lecture 5: MapReduce Model and Examples (slides: ppt, pdf; reading material: Chapter 2)

Lecture 6: MapReduce Complexity Analysis and Improved Algorithms (slides: ppt, pdf; reading material: Chapter 2)

Lecture 7: Web Advertising (slides: ppt, pdf; reading material: Chapter 8)

Lecture 8: Recommendation Systems: Content-Based and Collaborative Filtering (slides: ppt, pdf; reading material: Chapter 9)

Lecture 9: Recommendation Systems: Latent Factor Models and Netflix Challenge (slides: ppt, pdf; reading material: Chapter 9)