Bilkent University
Department of Computer Engineering


Clustering Protein-Protein Interactions Based on Conserved Domain Similarities


Aslı Ayaz

Master's Thesis Presentation

 Supervisor: Assist. Prof. Dr. Uğur Doğrusöz


Protein interactions govern most cellular processes, including signal transduction, transcriptional regulation and metabolism. Saccharomyces cerevisiae is estimated to have 16,000 protein interactions. Apparently only a small number of these interactions were formed ab initio (invention); the rest were formed through gene duplication and exon shuffling (birth). Domains form the functional units of a protein and are responsible for most interaction births, since they can be recombined and rearranged much more easily than new interactions can be invented. Therefore, groups of functionally similar, homologous interactions that evolved through births are expected to share a certain domain signature. Several high-throughput techniques can detect interacting protein pairs, resulting in a rapidly growing corpus of protein interactions. Although there have been several efforts to computationally integrate this data with the literature and with other high-throughput data such as gene expression, the annotation of this corpus is inadequate for deriving interaction mechanisms and functions. Finding interaction homologies would allow us to annotate an unknown interaction based on already annotated interactions, or to predict new ones. In this study we propose a probabilistic model for assigning interactions to homologous groups according to their conserved domain similarities. Based on this model, we have developed and implemented an Expectation-Maximization algorithm for finding the most likely grouping. We tested our algorithm on synthetic and real data, and our initial results are very promising. Finally, we propose several directions for improving this work.
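
The abstract does not spell out the probabilistic model, so the following is only a minimal sketch of the general idea, not the method of the thesis. It assumes that each interaction is encoded as a binary vector marking which conserved domains occur in the interacting pair, and that each homologous group has its own per-domain occurrence probabilities (a Bernoulli mixture). The function name em_bernoulli_mixture, the farthest-first initialisation, the smoothing constant alpha and the toy data are all hypothetical choices made for illustration.

```python
import math


def em_bernoulli_mixture(X, k, n_iter=50, alpha=1.0):
    """Fit a k-component Bernoulli mixture to binary rows X with EM.

    X      : list of equal-length 0/1 lists (one row per interaction)
    k      : number of groups (assumed known here; must not exceed len(X))
    n_iter : number of EM iterations (must be at least 1)
    alpha  : additive smoothing that keeps all probabilities away from 0 and 1
    Returns (labels, pi, theta).
    """
    n, d = len(X), len(X[0])

    # Farthest-first initialisation (an assumption of this sketch):
    # start from row 0, then repeatedly add the row that is farthest,
    # in Hamming distance, from the rows already chosen.
    centers = [0]
    while len(centers) < k:
        def dist_to_centers(i):
            return min(sum(a != b for a, b in zip(X[i], X[c])) for c in centers)
        centers.append(max(range(n), key=dist_to_centers))
    theta = [[0.75 if X[c][j] else 0.25 for j in range(d)] for c in centers]
    pi = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior probability of each group for each interaction,
        # computed in log space and normalised with the log-sum-exp trick.
        resp = []
        for x in X:
            log_p = [
                math.log(pi[g]) + sum(
                    math.log(theta[g][j]) if x[j] else math.log(1.0 - theta[g][j])
                    for j in range(d)
                )
                for g in range(k)
            ]
            m = max(log_p)
            w = [math.exp(lp - m) for lp in log_p]
            s = sum(w)
            resp.append([wi / s for wi in w])

        # M-step: re-estimate group weights and per-domain probabilities
        # from the (smoothed) expected counts.
        for g in range(k):
            n_g = sum(r[g] for r in resp)
            pi[g] = (n_g + alpha) / (n + k * alpha)
            theta[g] = [
                (sum(r[g] * x[j] for r, x in zip(resp, X)) + alpha)
                / (n_g + 2.0 * alpha)
                for j in range(d)
            ]

    # Hard assignment: the most probable group for each interaction.
    labels = [max(range(k), key=lambda g: r[g]) for r in resp]
    return labels, pi, theta


# Toy data (invented for illustration): columns are five hypothetical domains.
interactions = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
labels, pi, theta = em_bernoulli_mixture(interactions, k=2)
print(labels)
```

On this toy input the first three interactions share one domain signature and the last three share another, so the two sets end up in different groups. A real application would also have to choose the number of groups and a domain encoding, neither of which the abstract specifies.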


DATE: August 19, 2004, Tuesday @ 13:30