Bilkent University
Department of Computer Engineering


Computational Methods for Detecting Mapping-Hidden Variants in NGS Data and Applications on Complex Diseases.


Emre Karakoç, Ph.D.
University of Washington

Detecting genomic variants is becoming essential for understanding the etiology of complex diseases. Although single nucleotide polymorphisms (SNPs) and large copy number variations (CNVs) have received considerable attention, small insertions and deletions (INDELs) remain largely under-discovered, and the methods for discovering them are lagging behind. In this talk I will present a computational method to detect structural variation and INDELs ranging in size from 1 base pair to 1 Mbp in exome sequence data sets as well as whole-genome sequencing data sets. Our method is based on a split-read approach, and it can identify the size, content, and location of variants with high specificity and sensitivity. Our algorithm discovers genomic variation, including copy-number-polymorphic processed pseudogenes, that is missed by other methods.

We applied our method to detect de novo disruptive variants in 209 families with Autism Spectrum Disorder. Based on human protein-protein interaction networks, we also developed methods to rank genes by their topological similarity to previously identified autism candidate genes. We found that 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected β-catenin/chromatin remodelling protein network that ranks significantly for autism candidate genes.

Bio: Emre Karakoç is a Postdoctoral Fellow in the Department of Genome Sciences at the University of Washington. He graduated from the Bilkent University Department of Computer Engineering in 2002 and obtained his Ph.D. in Computer Science from Simon Fraser University in 2007. He worked at the University of Waterloo before joining the University of Washington. His current research interest is the development of algorithms to analyze a broad range of biological data, including protein and RNA structures, protein-protein interactions, and high-throughput DNA sequencing.


DATE: 25 March, 2013, Monday @ 13:40