Bilkent University
Department of Computer Engineering


Diverse SNP Selection for Epistasis Test Prioritization


Gizem Çaylak
MS Student
Computer Engineering Department
Bilkent University

Genome-wide association studies explain only a fraction of the underlying heritability of genetic diseases. Epistatic interactions between two or more loci help close this gap, and identifying those interactions offers a promising route to a better understanding of complex traits. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the search prohibitive both computationally and statistically. This is true even if only pairs of loci are considered. Epistasis prioritization algorithms have proven useful for reducing the computational burden and limiting the number of tests to perform. While current methods aim to avoid linkage disequilibrium and to cover the case cohort, none aims to diversify the topological layout of the selected SNPs, which can help detect complementary variants. We propose an epistasis test prioritization algorithm that optimizes a submodular set function to select a diverse set of SNPs that span the underlying genome in order to (i) avoid linkage disequilibrium and (ii) pair SNPs that relate to complementary functions. We compare our algorithm with the state of the art on Wellcome Trust Case Control Consortium datasets and show that it drastically reduces the number of tests needed to discover epistatic pairs. Moreover, the significant pairs it finds are more statistically significant.


DATE: 22 April 2019, Monday, CS590 presentations begin at 15:40