CS 550: Machine Learning
Spring '19

Course Description

Instructor: Cigdem Gunduz Demir
EA 423 (Engineering Building), x3443
gunduz at cs bilkent edu tr
Lectures: Tue 15:40-17:30, Fri 13:40-15:30, EB 203
Office hours: Tue 12:40-14:30 or by appointment
Website: http://www.cs.bilkent.edu.tr/~gunduz/teaching/cs550
References: R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley-Interscience, 2001.
E. Alpaydin, Introduction to Machine Learning, MIT Press, 2004.
T.M. Mitchell, Machine Learning, McGraw-Hill, 1997.
P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005.

This course has two parts. The first part includes an introduction to the basic machine learning concepts and algorithms, which will also provide the basis for the second part of the course. The second part covers selected recent topics in machine learning. In particular, the course will cover the following main topics:

Part 1:
  • Introduction
      CS550_Introduction.pdf
      CS550_BayesianDecisionTheory.pdf
      CS550_AlgorithmIndependentIssues.pdf
      CS550_DimensionalityReduction.pdf
  • Decision trees
      CS550_DecisionTrees.pdf
  • Artificial neural networks
      CS550_NeuralNetworks.pdf
  • Unsupervised learning and clustering
      CS550_Clustering.pdf
  • Reinforcement learning
      CS550_ReinforcementLearning.pdf
  • Genetic algorithms
      CS550_GeneticAlgorithms.pdf

Part 2:
  • Ensemble learning
      CS550_EnsembleLearning.pdf
  • Cost-sensitive learning
      CS550_CostSensitiveLearning.pdf
  • Active learning
  • Deep learning
      CS550_DeepNeuralNetworks.pdf

    Grading

    Homework (25%)
    Midterm (35%)
    Survey (10%)
    Presentation (15%)
    Project (15%)
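
    As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the overall score is a plain weighted average; neither assumption is stated on this page, and the function name and example scores are made up.

```python
# Weights copied from the grading list above; they sum to 100%.
WEIGHTS = {
    "homework": 0.25,
    "midterm": 0.35,
    "survey": 0.10,
    "presentation": 0.15,
    "project": 0.15,
}


def overall_score(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores, for illustration only.
example = {"homework": 80, "midterm": 70, "survey": 90,
           "presentation": 85, "project": 75}
print(round(overall_score(example), 2))  # 77.5
```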

    Due to YOK (Higher Education Council) regulations, I will take attendance and report it to the Department at the end of the semester.

    Homework assignments and late policy

    Homework assignments will be posted on this web site. Assignments will have programming and non-programming parts, and you are expected to work on them individually. For late assignments, 10 percent of the grade will be deducted for each day after the assignment's due date.
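
    One possible reading of the late policy can be sketched as follows. The details here are assumptions, not course rules: any started day counts as a full day, the 10 percent is taken from the grade you earned, and the result never goes below zero. The function name is hypothetical; ask the instructor if the exact rule matters for your submission.

```python
import math
from datetime import datetime


def late_adjusted_grade(grade, due, submitted, rate=0.10):
    """Deduct `rate` of the earned grade per day late (assumed interpretation).

    Assumptions: a started day counts as a full day, and the grade is
    floored at zero.
    """
    if submitted <= due:
        return grade
    days_late = math.ceil((submitted - due).total_seconds() / 86400)
    return max(0.0, grade * (1 - rate * days_late))


# Homework 1 deadline from the list below; the submission time is made up.
due = datetime(2019, 4, 2, 15, 40)
submitted = datetime(2019, 4, 4, 10, 0)  # under two days late, counted as 2
print(round(late_adjusted_grade(90, due, submitted), 2))
```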

  • Homework 1, due by 15:40 on Tuesday, April 2nd.
       You can download the datasets from here: train1 test1 train2 test2
  • Homework 2, due by 17:30 on Wednesday, April 24th
       You can download the sample image from here
  • Homework 3, due by 17:30 on Monday, May 27th
       You can download the dataset from here: ann-train.data ann-test.data ann-thyroid.cost
    Midterm

    The midterm will be held on April 16, during class time. You may use one A4 cheat sheet for the midterm. The cheat sheet must be in your own handwriting; photocopies are not allowed.

    Survey

    You will work individually. You will prepare a survey on a topic of your interest by reading at least 10-15 scientific papers and writing a short report (maximum of 3 pages including citations).

    In your report, define the problem/topic, discuss the motivation behind the studies working on it (try to answer the question "why have all these studies worked on this problem? is it really important?"), and then explain the studies. While explaining the studies, do NOT list them and do NOT explain them one by one. Instead, understand the contribution and methodology of each study, group the studies according to their contributions and methodologies, and then explain/discuss the studies as groups (like writing a good introduction section for a scientific paper). In your discussion, give the common approach followed by each group along with the variations that exist among the studies of that group, give the advantages and disadvantages of each group's approach, and discuss the similarities and differences between the approaches followed by different groups. The quality of the survey, as well as that of the selected papers, will affect your grade (select good papers published in prestigious conferences and journals). Additionally, the format, structure, and writing style of your report (including proper citations) will be part of your grade.

    Although there is no restriction on the topic you select (as long as it is related to the course content), you must get my approval for your topic selection. Since you will give a presentation at the end and we want minimal overlap between presentations, I will not allow two students to select very similar topics; I will approve the selections on a first-come, first-served basis. Examples of topics include, but are not limited to:
  • Deep learning for medical image segmentation (or for something else)
  • Deep learning in robotics (or in something else)
  • Machine learning for telecommunication networks (or for something else)
  • Machine learning in finance (or in something else)
  • Machine learning for computer security (or for something else)
  • Active learning for remote sensing (or for something else)
  • Ensemble methods in text retrieval (or in something else)
  • Reinforcement learning for computer games (or for something else)
  • ...

    Presentation

    You will give a presentation on your survey in class. The presentation should parallel your report. You will have 15 minutes to present your survey; we will have a discussion period of 5-10 minutes after the presentation.

    The presentation content, its format and layout, and the way you present it will affect your grade. Prepare your slides neatly and properly. Do not copy and paste any text/equation/table from a paper (if necessary, type them). If you need to use a figure (or an image) from a paper, you may do so, but give credit to that paper (so that we can understand how much effort you put into preparing your presentation).

    Project

    You will also work individually. You will have three options for the term project:

  • Choose one of the papers that you will select for your presentation. Then implement the algorithm proposed by this paper and also implement one of the comparison algorithms used by this paper. Do not use any code provided by the authors of the paper, even if it is available. Run these two algorithms on a dataset of your choice and compare their results, also using statistical tests. Additionally, follow a proper procedure for selecting the algorithms' parameters and conduct a parameter analysis. I expect you to select a recent paper that explains a not-so-straightforward algorithm.
  • Run a deep learning model on a dataset of your choice. Here you may use third-party code, but you CANNOT select any dataset that was used to pretrain any of the deep learning models (e.g., you cannot use the ImageNet dataset to conduct your experiments). In this option, you are expected to get the model trained on your dataset and obtain reasonable test set accuracies. Additionally, explore the effects of different parameters in a deep learning model. Then, select two of the models (deep neural networks) that you have explored and compare their results, also using statistical tests. I expect you to select a not-so-easy dataset.
  • If you have a specific term project that you want to work on, please let me know. We will need to discuss the details.

    I expect you to select the paper (for the first option) and the dataset (for the second one) by yourselves. The quality/difficulty of your selection will affect your grade. Of course, if you want to consult me on your selection, I will always give you feedback.
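
    The first two options ask you to compare two algorithms "also using statistical tests", without naming a particular test. As one possible choice, the sketch below runs an exact paired sign-flip permutation test on per-fold scores. The function name and the accuracy values are made up for illustration, and the test assumes both algorithms were evaluated on the same folds; other tests (e.g., a paired t-test or the Wilcoxon signed-rank test) may suit your experimental design better.

```python
from itertools import product


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired differences.

    Under the null hypothesis that the two algorithms perform equally,
    the sign of each paired difference is arbitrary, so every sign
    pattern is enumerated and the fraction of patterns whose absolute
    summed difference is at least the observed one is returned.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired (same length)")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = abs(sum(s * d for s, d in zip(signs, diffs)))
        if permuted >= observed - 1e-12:  # tolerance for float round-off
            extreme += 1
    return extreme / total


# Made-up per-fold accuracies of two algorithms on the same 8 folds.
acc_a = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.94, 0.90]
acc_b = [0.88, 0.87, 0.90, 0.91, 0.89, 0.86, 0.90, 0.89]
print(f"p-value: {paired_permutation_test(acc_a, acc_b):.4f}")
```

    Enumerating all sign patterns costs 2^n evaluations, so this exact version is only practical for a small number of paired scores; for larger n, a random sample of sign patterns is the usual substitute.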

    At the end, you will write a report (maximum of 4 pages). Give the details of the methodology you followed and present your experimental results. The content of your report as well as its format, structure, and writing style will affect your grade. As with the presentation, do not copy and paste any text/equation/table from a paper (if necessary, type them). If you need to use a figure (or an image) from a paper, you may do so, but give credit to that paper.

    Deadlines for the survey/presentation/project

    You will lose points if you miss these deadlines.

  • Mar 5: Topic selection for the survey
  • Mar 19: Term project selection
  • Apr 16: Midterm
  • Apr 26: Final report for the survey
  • Apr 30 - May 17: Presentations
  • May 21: Final report for the project

    Academic integrity

    This course follows the Bilkent University Code of Academic Integrity, as explained in the Student Disciplinary Rules and Regulation. Violations of the rules will not be tolerated.