CS 481: Bioinformatics Algorithms

...  more like:

(bio)informatics ALGORITHMS

Fall 2016

NOTE:  Some biology, molecular biology, genetics background would help, but not required. Basics regarding the topic will be covered in class.

Prerequisites: Elementary discrete mathematics, basic algorithms and data structures, and programming proficiency with, e.g., C/C++/Java will be expected. Knowledge of elementary combinatorics and probability, fundamental algorithms for sorting, searching, hashing and dynamic programming, elementary graph algorithms would be very helpful.

Textbook: An Introduction to Bioinformatics Algorithms (Computational Molecular Biology), Neil Jones and Pavel Pevzner, MIT Press, 2004

Grading: There will be one midterm exam (25%) and a final exam (35%). A further 20% of the final grade will be based on homework assignments, and 20% will be based on quizzes.

  • Planned Lecture Cancellations

  • The following lectures are canceled. If a canceled lecture is on a Monday, we will have 2 hours on the Thursday of the same week.

  • <none yet>

  • Course material (updated as of Fall 2016)

  • Note: Some slides are adapted from http://www.bioalgorithms.info

  • 0- Introduction
  • 1- Motif Finding
  • 2- Exact String Matching - Knuth-Morris-Pratt and Boyer-Moore
  • 3- Exact String Matching - Rabin-Karp and Finite Automata
  • 4- Pattern Matching - Hashing, keyword trees, and suffix trees
  • 5- Pattern Matching - Shift/And, Aho-Corasick, and suffix arrays
  • 6- Burrows-Wheeler Transformation and Ferragina-Manzini index
  • 7- Dynamic programming, Manhattan grid, edit distance, longest common subsequence
  • 8- Global alignment, affine gap penalties, local alignment, Needleman-Wunsch and Smith-Waterman
  • 9- Banded alignment, linear space alignment
  • 10- Block edit distance, Four Russians method
  • 11- Multiple sequence alignment
  • 12- Guide trees, evolutionary trees - Neighbor-Joining and UPGMA
  • 13- Parsinony - Sankoff and Fitch algorithms
  • 14- Clustering - hierarchical, k-means, CAST
  • 15- Genome rearrangements - sorting by reversals, pancake flipping
  • 16- Sequence similarity search - BLAST, PatternHunter
  • 17- RNA folding
  • 18- Introduction to DNA sequencing
  • 19- High throughput sequencing, compression, read mapping
  • 20- Introduction to genome assembly

  • Archived slides

  • Lecture slides
    Recommended reading
    A brief introduction to molecular biology concepts and
    algorithmic problems in computational biology.
    Week 1, Lectures 1-2
        (topic removed from the course as of Fall 2016 - recommended reading is still recommended)

    A brief introduction to computational complexity and
    algorithm design techniques
    Week 1, Lecture 3
    EBI's Introduction to Biology
    JGI's Introduction to Genomics
    Chapter 3 of the textbook

    Chapter 2 of the text book

    DNA mapping & motif search
    Week 2, Lectures 1-2

    Exact sequence search algorithms
    Week 2, Lecture 3
    Chapter 4 of the text book
    Rabin-Karp algorithm, finite automata, pattern matching, suffix trees
    Week 3, Lectures 1-2

    Shift-and algorithm, Aho-Corasick algorithm, suffix arrays
    Week 3, Lecture 3
    Chapter 9 of the text book
    Elements of dynamic programming, Manhattan tourist problem, introduction
    to sequence alignment
    Week 4, Lectures 1-2

    Global alignment, local alignment, affine gap penalties
    Week 4, Lecture 3
    Chapter 6 of the text book
    Approximate string matching, divide and conquer algorithms, Four-Russians trick
    Week 5, Lectures 1-2
    Week 5, Lecture 3: Quiz 2
    Chapters 7 and 9 of the text book
    Multiple sequence alignment
    Week 6, Lectures 1-2
    Week 6, Lecture 3 canceled due to holidays
    Chapter 6 of the text book
    Gene finding  (topic removed from the course as of Fall 2013)
    Week 7, Lectures 1-2

    Heuristic sequence search
    Week 7, Lecture 3
    Chapter 6 of the text book
    Spaced seeds for heuristic spaced seeds
    Week 8, Lectures 1-2

    Chapters 9 and 11 of the text book

    (topic removed from the course as of Fall 2016)
     Hidden Markov Models, Viterbi training, Baum-Welch algorithm
    Week 9, Lecture 3

    Chapter 11 of the text book
    Phylogenetic tree construction
    Week 10, Lectures 1-2
    Week 10, Lecture 3
    Chapter 10 of the text book
    Clustering algorithms
    Week 11, Lectures 1-2

    CAST algorithm, introduction to genome rearrangements
    Week 11, Lecture 3
    Chapter 10 of the text book
    Genome rearrangements, sorting by reversal
    Week 12, Lectures 1-2
    Chapter 5 of the text book
    RNA folding
    Week 13, Lectures 1-2
    Week 13, Lecture 3 is canceled.