Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

Whole Genome Alignment via Alternating Lyndon Factorization Tree Traversal

 

Mahmud Sami Aydın

Master's Student
(Supervisor: Assoc. Prof. Can Alkan)
Computer Engineering Department
Bilkent University

Abstract: In recent years, different genome assembly technologies have emerged. Homology mapping between reference genomes is the task of detecting the locations of approximate substring matches between those genomes. Because the data are so large, string comparison between reference genomes is a challenging problem, and so is homology mapping. Lyndon factorization is a parsing method that splits a string into substrings, each of which is lexicographically smaller than all of its own proper suffixes, and which appear in monotonically non-increasing lexicographical order. We propose a tree structure, similar to a min-max heap and based on alternating Lyndon factorization, to facilitate approximate string comparison between two assemblies in order to construct a homology map much faster than naive string comparison. The algorithm is expected to be easily parallelizable and more scalable than existing approaches.
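
As a rough illustration of the standard Lyndon factorization mentioned in the abstract, the following is a minimal Python sketch of Duval's linear-time algorithm. It is not the alternating variant or the tree structure proposed in the talk; the function name `lyndon_factorization` and the sample inputs are assumptions chosen only for this example.

```python
def lyndon_factorization(s):
    """Split s into Lyndon words w1 >= w2 >= ... >= wk (Duval's algorithm).

    Illustrative sketch only: this is the classical factorization, not the
    alternating variant or the tree traversal described in the abstract.
    """
    factors = []
    i, n = 0, len(s)
    while i < n:
        # j scans ahead; k trails behind, comparing s[k] against s[j].
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i       # current candidate is still one Lyndon word
            else:
                k += 1      # candidate is a repetition of a shorter word
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    print(lyndon_factorization("banana"))
    print(lyndon_factorization("GATTACA"))
```

Each factor is smaller than all of its own proper suffixes, and the factors appear in non-increasing lexicographical order, which matches the property described in the abstract.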

 

DATE: 5 December, Monday @ 15:50 Place: EA 502