CS 550: Machine Learning
Fall '18

Course Description

Instructor: Cigdem Gunduz Demir
EA 423 (Engineering Building), x3443
gunduz at cs bilkent edu tr
Lectures: Tue 10:40-12:30, Fri 8:40-10:30, EB 202
Office hours: Tue 12:40-14:30 or by appointment
Website: http://www.cs.bilkent.edu.tr/~gunduz/teaching/cs550
References: R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley-Interscience, 2001.
E. Alpaydin, Introduction to Machine Learning, MIT Press, 2004.
T.M. Mitchell, Machine Learning, McGraw-Hill, 1997.
P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005.

This course has two parts. The first part includes an introduction to the basic machine learning concepts and algorithms, which will also provide the basis for the second part of the course. The second part covers selected recent topics in machine learning. In particular, the course will cover the following main topics:

Part 1:
  • Introduction
    CS550_Introduction.pdf
    CS550_BayesianDecisionTheory.pdf
    CS550_AlgorithmIndependentIssues.pdf
    CS550_DimensionalityReduction.pdf
  • Decision trees
    CS550_DecisionTrees.pdf
  • Artificial neural networks
    CS550_NeuralNetworks.pdf
  • Unsupervised learning and clustering
    CS550_Clustering.pdf
  • Reinforcement learning
    CS550_ReinforcementLearning.pdf
  • Genetic algorithms
    CS550_GeneticAlgorithms.pdf

Part 2:
  • Ensemble learning
    CS550_EnsembleLearning.pdf
  • Cost-sensitive learning
    CS550_CostSensitiveLearning.pdf
  • Active learning
  • Deep learning
    CS550_DeepNeuralNetworks.pdf

    Grading

    Homework (25%)
    Midterm (35%)
    Survey (15%)
    Presentation (10%)
    Project (15%)
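
    As a quick illustration of how these weights combine, the sketch below computes a weighted course score. It assumes every component is scored on a 0-100 scale and that the final score is a plain weighted average; the function name and the example scores are hypothetical, and the actual letter-grade mapping is not specified here.

```python
# Component weights as listed in the grading breakdown (percent).
WEIGHTS = {
    "homework": 25,
    "midterm": 35,
    "survey": 15,
    "presentation": 10,
    "project": 15,
}


def course_score(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    # The listed weights should account for the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "homework": 80,
    "midterm": 70,
    "survey": 90,
    "presentation": 100,
    "project": 85,
}
print(course_score(example))
```

    Under these assumptions the midterm carries the largest single weight, so a 10-point change in the midterm score moves the course score by 3.5 points.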

    Homework assignments and late policy

    Homework assignments will be posted on this web site. Assignments will have both programming and non-programming parts, and you are expected to work on them individually. For late assignments, 10 percent of the grade will be deducted per day after the assignment's due date.
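
    The late policy can be sketched as a small calculation. This is only one reading of the rule: it assumes the 10 percent is taken from the earned grade, that a partial day counts as a full day, and that the result never drops below zero. None of these details is stated above, and the function name is hypothetical.

```python
import math
from datetime import datetime


def late_penalized_grade(grade, due, submitted, rate=0.10):
    """Deduct `rate` of the grade for each started day after the due date.

    Assumptions (not stated in the policy): the deduction applies to the
    earned grade, partial days count as full days, and the floor is 0.
    """
    late_seconds = (submitted - due).total_seconds()
    if late_seconds <= 0:
        return grade  # on time, no deduction
    days_late = math.ceil(late_seconds / 86400)
    return max(0.0, grade * (1 - rate * days_late))


# Homework 1 is due at 10:40 on Tuesday, November 13th.
due = datetime(2018, 11, 13, 10, 40)
# Hypothetical submission a little over one day late (counts as 2 days here).
submitted = datetime(2018, 11, 14, 18, 0)
print(late_penalized_grade(90, due, submitted))
```

    With this reading, an assignment submitted ten or more days late receives no credit.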

  • Homework 1, due by 10:40 on Tuesday, November 13th
       You can download the first dataset from here: ann-train.data ann-test.data
  • Homework 2, due by 17:30 on Wednesday, November 28th
       You can download the sample image from here
  • Homework 3, due by 10:40 on Tuesday, December 18th
       You can download the dataset from here: ann-train.data ann-test.data ann-thyroid.cost

    Midterm

    The midterm is on December 4, during class time. You may use one A4 cheat sheet for the midterm. The cheat sheet must be in your own handwriting; photocopies are not allowed. I will collect the cheat sheets together with your exam papers.

    Survey

    You will work in groups of two. (If there is an odd number of students, one group will have three members.) Each group will prepare a survey on a topic of their interest by reading at least 15-20 scientific papers and writing a short report (maximum of 3 pages including citations).

    In your report, define the problem/topic, discuss the motivation behind the studies working on it (try to answer the question "why have all these studies worked on this problem? is it really important?"), and then explain the studies. While explaining the studies, do NOT list them and do NOT explain them one by one. Instead, understand the contribution and methodology of each study, group the studies according to their contributions and methodologies, and then explain/discuss the studies as groups (like writing a good introduction section for a scientific paper). In your discussion, give the common approach followed by each group along with the variations that exist among the studies of that group, give the advantages and disadvantages of each group's approach, and discuss the similarities and differences between the approaches followed by different groups. The quality of the survey as well as that of the selected papers will affect your grade (select good papers published in prestigious conferences and journals). Additionally, the format, structure, and writing style of your report (including writing the citations properly) will be a part of your grade.

    Although there is no restriction on the topic that you select (as long as it is related to the course contents), you must obtain my consent for your topic selection. Since you will make a presentation at the end and since we want minimal overlap between these presentations, I will not allow two groups to select very similar topics; I will approve the selections on a first-come, first-served basis. Examples of topics include but are not limited to
  • Deep learning for medical image segmentation (or for something else)
  • Deep learning in robotics (or in something else)
  • Machine learning for telecommunication networks (or for something else)
  • Machine learning in finance (or in something else)
  • Machine learning for computer security (or for something else)
  • Active learning for remote sensing (or for something else)
  • Ensemble methods in text retrieval (or in something else)
  • Reinforcement learning for computer games (or for something else)
  • ...

    Presentation

    Each group will make a presentation on their survey in class. Every group member should take part in the presentation. The presentation should be consistent with your report. You will have approximately 20-25 minutes for your presentation; I will let you know the exact duration after the add-drop period.

    The presentation content, its format and layout, and the way that you present it will affect your grade. Prepare your slides neatly and properly. Do not copy and paste any text/equation/table from a paper (if necessary, type them yourself). If you need to use a figure (or an image) from a paper, you may do so, but give credit to that paper (so that we can understand how much effort you put into preparing your presentation).

    I will NOT take attendance in these presentations.

    Project

    You will also work in a group of two, with the same group-mate. You will have three options for the term project:

  • Choose a paper as a group. Then implement the algorithm proposed by this paper and also implement one of the comparison algorithms used by this paper. Do not use any code provided by the authors of the paper, even if it is available. Run these two algorithms on a dataset of your choice and compare their results, also using statistical tests. Additionally, follow a proper procedure for selecting the algorithms' parameters and conduct a parameter analysis. I expect you to select a recent paper that explains a not-so-straightforward algorithm.
  • Run a deep learning model on a dataset of your choice. Here you may use third-party code, but you CANNOT select any dataset that was used to pretrain any of the deep learning models (e.g., you cannot use the ImageNet dataset to conduct your experiments). In this option, you are expected to train the model on your dataset and obtain reasonable test set accuracies. Additionally, explore the effects of different parameters in a deep learning model. Then, select two different models (deep neural networks) that you have explored and compare their results, also using statistical tests. I expect you to select a not-so-easy dataset.
  • If you have a specific term project that you want to work on, please let me know. We will need to discuss the details.

    Here I expect you to select a paper (for the first option) and a dataset (for the second one) by yourselves. The quality/difficulty of your selection will affect your grade. Of course, if you want to consult me on your selection, I will always give you feedback.
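
    Both of the first two options ask for a comparison "also using statistical tests". As one possible illustration (the choice of test is left to you and is not prescribed here), the sketch below runs an exact paired sign-flip permutation test on per-fold scores of two algorithms evaluated on the same folds. The function name and the accuracy values are hypothetical. Note that scores from cross-validation folds are not fully independent, so treat the resulting p-value as approximate.

```python
from itertools import product


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    Returns the p-value under the null hypothesis that both algorithms
    perform equally, i.e. paired differences are symmetric around zero.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    # Enumerate every assignment of +/- signs to the differences (2**n cases).
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding in the sums.
        if stat >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical per-fold accuracies of two algorithms on the same 5 folds.
acc_a = [0.91, 0.89, 0.93, 0.90, 0.92]
acc_b = [0.88, 0.87, 0.90, 0.89, 0.90]
print(paired_permutation_test(acc_a, acc_b))
```

    The enumeration is exponential in the number of paired scores, so it is only practical for a small number of folds or runs; for larger samples, a randomly sampled subset of sign assignments or a parametric paired test is a common alternative.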

    At the end, as a group, you will write a report (maximum of 4 pages). Give the details of the methodology you followed and present your experimental results. The content of your report as well as its format, structure, and writing style will affect your grade. As with the presentation, do not copy and paste any text/equation/table from a paper (if necessary, type them yourself). If you need to use a figure (or an image) from a paper, you may do so, but give credit to that paper.

    About forming groups

    If you have a group preference, you should let me know as soon as possible (see the deadlines below). Otherwise, I will form the groups myself. If your group-mate is not contributing to the survey/presentation/project, please let me know as quickly as possible so that I can take the necessary actions to keep you from being affected by this undesired situation.

    If you are in a group of three, I expect you to put more effort into the survey and the project.

    Deadlines for the survey/presentation/project

    You will lose points if you miss these deadlines.

  • Oct 9: Your group preference, if any
  • Oct 23: Topic selection for the survey (as a group)
  • Nov 6: Term project selection (as a group)
  • Dec 4: Midterm
  • Dec 11: Final report for the survey
  • Dec 7 - Dec 28: Presentations
  • Jan 8: Final report for the project

    Academic integrity

    This course follows the Bilkent University Code of Academic Integrity, as explained in the Student Disciplinary Rules and Regulation. Violations of the rules will not be tolerated.