CS 550: Machine Learning
Spring '17

Course Description

Instructor: Cigdem Gunduz Demir
EA 423 (Engineering Building), x3443
gunduz at cs bilkent edu tr
Lectures: Tue 13:40-15:30, Thu 15:40-17:30, EB 102
Office hours: Tue 10:40-12:30 or by appointment
Website: http://www.cs.bilkent.edu.tr/~gunduz/teaching/cs550
References: R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley-Interscience, 2001.
E. Alpaydin, Introduction to Machine Learning, MIT Press, 2004.
T.M. Mitchell, Machine Learning, McGraw-Hill, 1997.
P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005.

This course has two parts. The first part includes an introduction to the basic machine learning concepts and algorithms, which will also provide the basis for the second part of the course. The second part covers selected recent topics in machine learning. In particular, the course will cover the following main topics:

Part 1:
  • Bayesian decision theory
  • Decision trees
  • Artificial neural networks
  • Unsupervised learning and clustering
  • Reinforcement learning
  • Genetic algorithms
Part 2:
  • Ensemble learning
  • Cost-sensitive learning
  • Active learning
  • Deep learning

Grading

Homework (30%)
Midterm (35%)
Project (25%)
Presentation (10%)
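
As a rough illustration of how these weights combine, the sketch below computes an overall score from the four components. It assumes each component is graded on a 0-100 scale; the function name and the sample scores are hypothetical and not part of the course policy.

```python
# Weights (in percent) taken from the grading breakdown above.
WEIGHTS = {"homework": 30, "midterm": 35, "project": 25, "presentation": 10}

def overall_score(scores):
    """Weighted average of component scores.

    Assumption: every component score is on a 0-100 scale.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

# Hypothetical scores, for illustration only.
example = {"homework": 80, "midterm": 70, "project": 90, "presentation": 100}
print(overall_score(example))
```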

Homework assignments and late policy

Homework assignments will be posted on this website. Assignments will have both programming and non-programming parts, and you are expected to work on them individually. For late assignments, 10 percent of the grade will be deducted for each day after the assignment's due date.
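
One possible reading of the late policy is sketched below. It makes two assumptions that the policy itself does not state: any started day after the deadline counts as a full late day, and the deduction is a percentage of the grade you earned, with a floor of zero. The function name and the dates are hypothetical.

```python
import math
from datetime import datetime

def late_adjusted_grade(raw_grade, due, submitted, rate=0.10):
    """Apply a per-day late deduction to a raw grade."""
    seconds_late = (submitted - due).total_seconds()
    # Assumption: any started day after the deadline counts as a full late day.
    days_late = max(0, math.ceil(seconds_late / 86400))
    # Assumption: the deduction is linear and the grade cannot drop below zero.
    factor = max(0.0, 1.0 - rate * days_late)
    return raw_grade * factor

# Hypothetical deadline and submission time, for illustration only.
due = datetime(2017, 3, 1, 23, 59)
submitted = datetime(2017, 3, 3, 10, 0)  # counted as two late days
print(late_adjusted_grade(90, due, submitted))
```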

Midterm

The midterm will be held on April 18, during class time. You may use one A4 cheat sheet for the midterm. The cheat sheet must be written in your own handwriting; photocopies are not allowed. I will collect the cheat sheets together with your exam papers.

Projects

The term project is in the form of a competition. You will be provided with an annotated dataset, including many different feature sets. You are asked to design a classification system on this dataset using the algorithms/approaches covered in this course. At the end of the semester, your classification system will be evaluated on another dataset, which will not be provided to you during the design phase. The classification accuracy obtained on this dataset will also affect your grade.

You will conduct this project in a group of two. You should send me an email if you prefer to form a group with a particular classmate. The deadline for sending this email is February 16. Otherwise, I will form the groups.

After the groups are formed, the first deadline is March 28. By this date, you are expected to have started analyzing the dataset and applying some learning algorithms/approaches to it. You need to write a short report (maximum of 3 pages) describing what algorithms/approaches you used, what analyses you did, what results you obtained, and how you interpreted these results. Then, you should submit the final code of your system and a final report (maximum of 6 pages) on May 9. Your report should include all details of your system design, the methods and the experimental results you used to decide on this design, and how you interpreted those results.

You are expected to write your reports neatly and properly. The format, structure, and writing style of your reports as well as the quality of the tables and figures in your reports will be a part of your grade.

After you submit your final report, I will schedule a time for each group to explain its design and experiments to me. You will also answer any questions I have about your project. Before coming to this meeting, you should run your code on the new data I will send you (without the actual labels) and email me the labels that your system predicts. I will compare the actual and predicted labels to see how your system works on the unseen data.
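
The comparison of actual and predicted labels can be pictured as a plain accuracy computation, sketched below. This is only an assumed form of the evaluation: the function name and the sample labels are hypothetical, and the actual evaluation procedure may differ.

```python
def classification_accuracy(actual, predicted):
    """Fraction of samples whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels, for illustration only.
actual_labels = ["a", "b", "a", "c"]
predicted_labels = ["a", "b", "c", "c"]
print(classification_accuracy(actual_labels, predicted_labels))
```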

In this project, you may use third-party code, for example, some from Weka. However, you may lose a significant number of points if your design relies mainly on such third-party code. In the demo, you should explain exactly which parts of the code you implemented yourself and which parts are third-party code.

Presentation

You should find a paper and present it in a group of two. The paper should be related to one of the following topics: deep learning or active learning. Additionally, it should have been published between 2014 and 2017 (or be in press) in one of the following journals: IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition, IEEE Transactions on Neural Networks, or Journal of Machine Learning Research. You may also select your paper from the proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR) or the proceedings of the Conference on Neural Information Processing Systems (NIPS) published between 2014 and 2017. Each group should give a 15-20 minute presentation.

You should send me an email if you have a preference for your group partner. The deadline for sending this email is also February 16. Otherwise, I will form the groups. After the groups are formed, each group should select the paper it wants to present and email it to me. The deadline for this is March 14. The quality of the paper you select will also affect your grade.

Each student should attend at least 50 percent of all presentations. If you miss more than that, you will lose 1 point (out of the overall 10 points) for each presentation you miss.

Academic integrity

This course follows the Bilkent University Code of Academic Integrity, as explained in the Student Disciplinary Rules and Regulation. Violations of the rules will not be tolerated.