CS425 – Algorithms for Web-Scale Data

Fall 2015

 

Instructor: Mustafa Ozdal (EA-420)
TA: Huseyin Gokhan Akcay (EA-427)
Tutor: Orcun Gumus

 

Textbook: A. Rajaraman and J. D. Ullman, Mining of Massive Datasets, Cambridge University Press, 2011. A free online version is available at: http://www.mmds.org

 

Schedule: Tue. 15:40-16:30, Fri. 13:40-15:30 (EA-Z03)

Spare Hour: Tue. 16:40-17:30 (EA-Z03)

Syllabus: syllabus.pdf


Announcements:

·       24/11/2015: You should have received the project presentation schedule by now. If not, contact the TA or tutor immediately.

·       26/10/2015: The midterm will cover Lectures 1-6 (Web Advertising will not be included). You may bring one single-sided A4 cheat sheet, which you must turn in with your exam.

·       5/10/2015: The midterm will be held on 13/11/2015 during lecture hours (13:30-15:30). The classrooms reserved for the midterm are EA-Z03, EB-102, and EB-103.

·       5/10/2015: You can find the project description document here. Project proposals must be sent to the TA and tutor by October 16. See the PDF file for details.

·       18/09/2015: No class on 22/09/2015. Instead, we will use the spare hour on 29/09/2015.

·       10/09/2015: Students are expected to check this page regularly for important announcements.

 

Course Project

 

The project description with the upcoming deadlines can be found here.

 

Lectures

Note: Some lecture notes provided below contain slides from the course textbook. Some of these slides have been modified for the purpose of this class. The original slides from the textbook can be accessed here.

 

Lecture 1: PageRank Formulation and Algorithm (slides: ppt, pdf; reading material: Chapter 5)

Lecture 2: PageRank Extensions (slides: ppt, pdf; reading material: Chapter 5)

Lecture 3: Shingling, Min-Hashing, and LSH (slides: ppt, pdf; reading material: Chapter 3)

Lecture 4: LSH Applications (slides: ppt, pdf; reading material: Chapter 3)

Lecture 5: MapReduce Model and Examples (slides: ppt, pdf; reading material: Chapter 2)

Lecture 6: MapReduce Complexity Analysis and Improved Algorithms (slides: ppt, pdf; reading material: Chapter 2)

Lecture 7: Web Advertising (slides: ppt, pdf; reading material: Chapter 8)

Lecture 8: Recommendation Systems: Content-Based and Collaborative Filtering (slides: ppt, pdf; reading material: Chapter 9)

Lecture 9: Recommendation Systems: Latent Factor Models and Netflix Challenge (slides: ppt, pdf; reading material: Chapter 9)