Department of Computer Engineering
CS 690 SEMINAR
Enabling Fast and Accurate Read Alignment
Computer Engineering Department
Motivation: Genome sequencing helps reveal the evolution of species and the genomic variants that cause diseases. However, high-throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments, called short reads, that incur a significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between the read and candidate locations in that reference genome. The similarity measurement, called alignment, is the bottleneck for two reasons: (1) Read alignment is formulated as an approximate string-matching problem, which is solved using quadratic-time dynamic programming algorithms. (2) In practice, the majority of candidate locations do not align with a given read due to high dissimilarity. Calculating the alignment of such candidate locations occupies most of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect these candidate locations and eliminate them from consideration.
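To make the quadratic cost concrete, here is a minimal Python sketch of the textbook edit-distance dynamic program that underlies approximate string matching. It is an illustration only: the function name edit_distance and the unit costs for substitution, insertion, and deletion are assumptions, and real read mappers typically use banded or otherwise optimized variants.

```python
def edit_distance(read: str, ref: str) -> int:
    """Levenshtein distance via the classic dynamic program.

    Fills a (len(read) + 1) x (len(ref) + 1) table row by row, so the
    running time is proportional to len(read) * len(ref).
    Unit costs are an assumption made for this sketch.
    """
    # prev[j] = distance between the empty prefix of read and ref[:j]
    prev = list(range(len(ref) + 1))
    for i, read_base in enumerate(read, start=1):
        curr = [i]  # distance between read[:i] and the empty prefix of ref
        for j, ref_base in enumerate(ref, start=1):
            substitution = prev[j - 1] + (0 if read_base == ref_base else 1)
            deletion = prev[j] + 1       # drop read_base
            insertion = curr[j - 1] + 1  # insert ref_base
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

Running this for every candidate location of every read is what makes alignment dominate execution time: a 100-base read against a 100-base reference segment already requires on the order of 10,000 cell updates.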
Results: We propose GateKeeper, a new architecture that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. The main idea of GateKeeper is to filter out the incorrect mappings in a streaming fashion using Field-Programmable Gate Arrays (FPGAs). By excluding the incorrect mappings at an early stage, we reduce the number of candidate verifications in the rest of the execution, and thus accelerate read mapping. GateKeeper is the first design to accelerate pre-alignment using new hardware technologies. On a single FPGA chip, it provides up to a 215-fold speedup over the state-of-the-art pre-alignment techniques, Shifted Hamming Distance (SHD) and Adjacency Filter. GateKeeper can provide more than two orders of magnitude speedup over other FPGA-based accelerations of mapping tools such as BWA and BFAST.
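The general idea of a pre-alignment filter can be sketched in software as follows. This is not GateKeeper's hardware design, nor an implementation of SHD or Adjacency Filter; it is a simplified, hypothetical filter (the name passes_filter and its logic are assumptions) that shows how a cheap linear-time check can reject clearly dissimilar candidates before the quadratic alignment step. It relies on one observation: if two sequences are within e edits, every read base that is matched in the alignment pairs with an equal reference base at most e positions away, so at most e read bases can lack such a partner.

```python
def passes_filter(read: str, ref: str, e: int) -> bool:
    """Cheap pre-alignment check (illustrative sketch only).

    Returns False only when read and ref cannot be within e edits;
    a True result still needs verification by a full alignment.
    """
    # More than e insertions/deletions would be needed to fix the lengths.
    if abs(len(read) - len(ref)) > e:
        return False
    unmatched = 0
    for i, base in enumerate(read):
        # Window of reference positions reachable with at most e shifts.
        lo = max(0, i - e)
        hi = min(len(ref), i + e + 1)
        if base not in ref[lo:hi]:
            unmatched += 1
            if unmatched > e:
                return False  # too many bases with no nearby match
    return True
```

A filter of this kind may let some incorrect candidates through, but it never discards a candidate that is within the edit threshold, so the accuracy of the subsequent alignment step is unaffected.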
DATE: 10 October, 2016, Monday @ 16:00