CS 550: Machine Learning
Spring '18

Course Description

Instructor: Cigdem Gunduz Demir
EA 423 (Engineering Building), x3443
gunduz at cs bilkent edu tr
Lectures: Tue 15:40-17:30, Fri 13:40-15:30, EB 204
Office hours: Tue 12:40-14:30 or by appointment
Website: http://www.cs.bilkent.edu.tr/~gunduz/teaching/cs550
References: R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley-Interscience, 2001.
E. Alpaydin, Introduction to Machine Learning, MIT Press, 2004.
T.M. Mitchell, Machine Learning, McGraw-Hill, 1997.
P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005.

This course has two parts. The first part includes an introduction to the basic machine learning concepts and algorithms, which will also provide the basis for the second part of the course. The second part covers selected recent topics in machine learning. In particular, the course will cover the following main topics:

Part 1:
  • Introduction
    CS550_Introduction.pdf
    CS550_BayesianDecisionTheory.pdf
    CS550_AlgorithmIndependentIssues.pdf
    CS550_DimensionalityReduction.pdf
  • Decision trees
    CS550_DecisionTrees.pdf
  • Artificial neural networks
    CS550_NeuralNetworks.pdf
  • Unsupervised learning and clustering
    CS550_Clustering.pdf
  • Reinforcement learning
    CS550_ReinforcementLearning.pdf
  • Genetic algorithms
    CS550_GeneticAlgorithms.pdf

Part 2:
  • Ensemble learning
    CS550_EnsembleLearning.pdf
  • Cost-sensitive learning
    CS550_CostSensitiveLearning.pdf
  • Active learning
  • Deep learning
    CS550_DeepNeuralNetworks.pdf

    Grading

    Homework (20%)
    Final (35%)
    Survey (15%)
    Presentation (10%)
    Project (20%)
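
    As a rough illustration of how the weights above combine, the sketch below computes an overall score as a weighted sum. It assumes every component is scored on a 0-100 scale; the function name and the sample scores are hypothetical and not part of the course policy.

```python
# Weights taken from the grading breakdown above (they sum to 100%).
WEIGHTS = {
    "homework": 0.20,
    "final": 0.35,
    "survey": 0.15,
    "presentation": 0.10,
    "project": 0.20,
}


def course_score(scores):
    """Weighted sum of component scores (assumption: each is on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90, "final": 70, "survey": 80, "presentation": 85, "project": 75}
print(round(course_score(example), 2))
```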

    Homework assignments and late policy

    Homework assignments will be posted on this website. Assignments will have programming and non-programming parts, and you are expected to work on them individually. For late assignments, 10 percent of the grade will be deducted for each day after the due date.
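
    The late deduction can be sketched as simple arithmetic. The policy above does not say how partial days are counted or what the 10 percent is taken from, so the sketch below makes three assumptions: any started day counts as a full day, the deduction is a fraction of the earned score, and the result never drops below zero. The function name and parameters are hypothetical.

```python
import math


def late_adjusted_score(score, hours_late, rate=0.10):
    """Score after the late deduction.

    Assumptions (not stated in the policy): any started day counts as a
    full day, the deduction is a fraction of the earned score, and the
    result never drops below zero.
    """
    if hours_late <= 0:
        return float(score)
    days_late = math.ceil(hours_late / 24)
    return max(0.0, score * (1 - rate * days_late))


# Hypothetical example: a score of 80 submitted 30 hours late counts as 2 days late.
print(late_adjusted_score(80, 30))
```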

  • Homework 1, due by 15:40 on Tuesday, March 13th
       You can download the data sets from here: ann-train.data ann-test.data
  • Homework 2, due by 15:40 on Tuesday, April 17th
  • Homework 3, due by 17:30 on Monday, May 21st
       You can download the sample image from here

    Final

    The final will be held at the end of the semester; its date will be announced later. You may use one A4 cheat sheet in the final. The cheat sheet must be in your own handwriting; photocopies are not allowed. I will collect the cheat sheets together with your exam papers.

    Survey

    You will work in groups of two. (If there is an odd number of students, one group will have three members.) Each group will prepare a survey on a topic of its interest by reading at least 15-20 scientific papers and writing a short report (maximum of 3 pages, including citations).

    In your report, define the problem/topic, discuss the motivation behind the studies that address it (try to answer the question "why have all these studies worked on this problem, and is it really important?"), and then explain the studies. When explaining them, do NOT list the studies and do NOT explain them one by one. Instead, understand the contribution and methodology of each study, group the studies according to their contributions and methodologies, and then discuss the studies as groups (as in a good introduction section of a scientific paper). In your discussion, describe the common approach followed by each group and the variations among the studies within it, give the advantages and disadvantages of each group's approach, and discuss the similarities and differences between the approaches of different groups. The quality of the survey and of the selected papers will affect your grade (select good papers published in prestigious conferences and journals). Additionally, the format, structure, and writing style of your report (including proper citations) will be part of your grade.

    There is no restriction on the topic you select, as long as it is related to the course contents, but you need my approval for your selection. Since you will give a presentation at the end and we want minimal overlap between presentations, I will not allow two groups to select very similar topics; I will approve the selections on a first-come, first-served basis. Examples of topics include but are not limited to
  • Deep learning for medical image segmentation (or for something else)
  • Machine learning for telecommunication networks (or for something else)
  • Active learning for remote sensing (or for something else)
  • Reinforcement learning for computer games (or for something else)
  • Deep learning in robotics (or in something else)
  • Machine learning for computer security (or for something else)
  • Ensemble methods in text retrieval (or in something else)
  • Machine learning in finance (or in something else)
  • ...

    Presentation

    Each group will present its survey in class. Every group member should take part in the presentation. The presentation should be consistent with your report. You will have approximately 15-20 minutes for your presentation; I will let you know the exact duration after the add-drop period.

    The content of the presentation, its format and layout, and the way you present it will affect your grade. Prepare your slides neatly and properly. Do not copy and paste any text/equation/table from a paper (if necessary, type them yourself). If you need to use a figure (or an image) from a paper, you may take it, but give credit to the paper (so that we can understand how much effort you put into preparing your presentation).

    I will not take attendance in these presentations. However, I may ask some questions related to the presentations in the final exam.

    Project

    The term project is a continuation of your survey. You (as an individual student, not as a group) will choose one of the papers that your group selected for the survey and implement the algorithm proposed in that paper. You need my approval for your selection.

    After implementing the originally proposed algorithm, you need to propose and implement a slight modification of it. Thus, as a group of two, you will have four different algorithms on the same topic (two from the papers and two slight modifications). Then, as a group, run all four algorithms on the same dataset and compare their results. In your comparison, also use statistical tests.
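
    The course does not prescribe a particular statistical test. As one possible illustration, the sketch below runs an exact paired sign-flip permutation test on the scores of two algorithms evaluated on the same cross-validation folds. The function name and the fold accuracies are hypothetical, and the full enumeration of sign assignments is only practical for a small number of folds.

```python
from itertools import product
from statistics import mean


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    Returns the observed mean difference and the p-value: the fraction of
    sign assignments whose absolute mean difference is at least as large
    as the observed one.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis, each paired difference is equally likely
    # to have either sign, so enumerate all 2^n sign assignments.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = mean([s * d for s, d in zip(signs, diffs)])
        if abs(flipped) >= observed - 1e-12:
            extreme += 1
    return mean(diffs), extreme / total


# Hypothetical per-fold accuracies (in percent) of an original algorithm
# and its modification, measured on the same five folds.
original = [80, 81, 82, 83, 84]
modified = [90, 91, 92, 93, 94]
diff, p_value = paired_permutation_test(modified, original)
print(diff, p_value)
```

    With only five folds the smallest attainable two-sided p-value is 2/32, so a test of this kind needs more paired measurements (more folds or repeated runs) to reach conventional significance levels.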

    At the end, as a group, you will write a report (maximum of 6 pages). In your report, explain the originally proposed algorithms and your proposed modifications, give your experimental settings (datasets, parameter selection, etc.), present the results of each algorithm as well as the comparison results, and discuss all these results. Prepare your report neatly and properly. The content of your report as well as its format, structure, and writing style will affect your grade. As with the presentation, do not copy and paste any text/equation/table from a paper (if necessary, type them yourself). If you need to use a figure (or an image) from a paper, you may take it, but give credit to the paper.

    When implementing the algorithm of the original paper, do not use the code provided by its authors (if any). Your grade will be based on what you have implemented yourself. I expect you to write a reasonable amount of your own code (if you rely mostly on third-party code, you will lose a significant number of points).

    For the slight modification, what matters to me is how interesting your proposal is and whether you have implemented it. I will not grade your proposal on its performance, so feel free to propose adventurous modifications. Of course, I expect you to compare the algorithms properly, with proper experiments.

    About forming groups

    If you have a group preference, let me know as soon as possible (see the deadlines below); otherwise, I will form the groups myself. If your group-mate does not contribute to the survey/presentation/project, please let me know as soon as possible so that I can take the necessary action to keep you from being affected by this situation.

    If you are in a group of three, I expect you to review more papers for the survey and compare six algorithms (one original and one modification per student) for the project.

    Deadlines for the survey/presentation/project

    You will lose points if you miss these deadlines.

  • Feb 12: Your group preference, if any
  • Feb 26: Topic selection for the survey (as a group)
  • Mar 12: Paper selection for the project (as an individual)
  • Apr 10 (class time): Final report for the survey
  • Apr 17 - May 11: Presentations
  • Apr 24 (class time): Progress report for the project
  • May 28 (5 pm): Final report for the project

    Academic integrity

    This course follows the Bilkent University Code of Academic Integrity, as explained in the Student Disciplinary Rules and Regulation. Violations of the rules will not be tolerated.