Bilkent University
Department of Computer Engineering


  Recent Advances in Life Sciences and Challenges for Computer Scientists


Tolga Can
Department of Computer Science
University of California, Santa Barbara

We are witnessing a deluge of new biological data from different high-throughput pipelines (genomics, proteomics, microarrays, bio-images). New computational methods are needed to reach a complete biological understanding of these data in a scalable manner. It is now recognized that a single data source is seldom sufficient; therefore, methods that can integrate data from multiple heterogeneous sources should be developed. Working with biological data is challenging because the data are often very noisy, incomplete, heterogeneous, and biased. In this talk, I will present some of the methods we have developed to address these problems. First, I will describe two methods, CTSS and ProGreSS, that we have developed for efficiently querying protein sequences and structures. Then, I will present a decision-tree-based ensemble classifier that automatically classifies protein structures with high accuracy. I will then give another solution to the same problem, one that utilizes the global view of similarity relations between proteins. I will conclude with a discussion of how similar techniques can be used to solve different problems in systems biology, such as predicting protein-protein interactions and predicting new members of a core complex.


DATE: November 28, 2005, Monday @ 13:40