Bilkent University
Department of Computer Engineering
S E M I N A R

 

Parallel Counting of k-mers in DNA Sequences

 

Erkan Okuyan
Ph.D. Student
Computer Engineering Department
Bilkent University

Counting k-mers (DNA segments that are k bases long) is a preprocessing step for many algorithms in bioinformatics. Since current sequencing technologies produce very large datasets, the high memory usage of counters must be mitigated and counting software must complete quickly. Parallelism is therefore needed for k-mer counting software to achieve these goals. In this work, we propose three parallel k-mer counting strategies for distributed-memory clusters that solve the counting problem optimally with respect to memory usage, speed, and the cost of the architecture. Experiments carried out so far support the idea that parallelism can be very beneficial for k-mer counting.

 

DATE: 05 November, 2012, Monday @ 16:50
PLACE: EA-409