Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

Sequence-to-Graph Alignment on a Processing-In-Memory System

 

Ömer Yavuz Öztürk
Master's Student
(Supervisor: Assoc. Prof. Can Alkan)
Computer Engineering Department
Bilkent University

Abstract: Genome graphs represent the genetic information and variation of a population rather than of a single individual. The sequence-to-graph (S2G) alignment problem can be defined as finding the best match between a query sequence and a genome graph. S2G algorithms are expected to suffer from a memory bottleneck due to the irregularity of the graph representation, and benchmarks confirm that some applications are more than 50% memory-bound. The S2G problem can therefore benefit from processing-in-memory (PIM) technologies. PIM is an emerging non-von Neumann architecture that allows computing near main memory, without using the memory bus to transfer data to the CPU and back. One currently available PIM technology, developed by UPMEM, is an architecture consisting of thousands of DRAM Processing Units (DPUs) that allow much faster and more energy-efficient memory access. Our contribution includes implementing a lossless partition of the genome graph across DPUs, efficiently directing queries to the relevant DPUs according to their seed locations, calculating the alignment score for each seed of each query, and then gathering and finalizing the results on the host CPU. The accuracy and runtime of our implementation, for graphs of varying sizes and complexities, will then be compared to those of state-of-the-art tools.

Outline:
1. Pangenomics & Motivation
2. Sequence-to-Graph Alignment
3. Analysis of State-of-the-Art Tools
4. Why Processing-In-Memory?
5. Solution Design & Workflow
6. What is next?

 

DATE: March 25, Monday @ 13:50
PLACE: EA 502