Bilkent University
Department of Computer Engineering
M.S. THESIS PRESENTATION
A Framework for Collaborative Integration and Effective Querying of Biological Pathways in a Graph Database
Noor Muhammad
Master's Student
(Supervisor: Prof. Dr. Uğur Doğrusöz)
Computer Engineering Department
Bilkent University
Abstract: Biological pathways are used to represent molecular interactions and cellular processes. They serve as an important medium for communicating and organizing biological knowledge in a structured manner. These pathway models are typically curated independently by researchers and focus on specific processes and disease contexts. As biological knowledge continues to expand, researchers require methods to combine and analyze multiple pathway models effectively in order to obtain broader, system-level insights. However, integrating pathway data from different sources remains challenging due to differences in semantics, structural organization, and level of detail across independently developed models. While standard pathway representation formats such as SBGN Process Description (SBGN-PD) provide a common visual and semantic language, these formats are designed for modeling and visualizing individual pathways. They offer limited inherent support for incremental integration, and performing expressive queries across integrated pathway models remains difficult. This thesis presents a graph-based unified pathway model designed to support the incremental integration of biological pathway data and to enable graph traversal-based queries. The proposed model focuses on preserving the essential semantics of SBGN-PD pathways while allowing pathway data to be incorporated piece by piece into a unified graph. A key aspect of the model is its support for matching incoming pathway entities (nodes and edges) with the existing graph content, where the matching behavior is customizable through user-defined thresholds. Building on this model, the system supports traversal-based queries such as neighborhood exploration, common stream detection, and paths between entities. The model and its database design are realized using the Neo4j graph database and are integrated into the Newt pathway editor, providing an end-to-end system for pathway integration, querying, and visualization.
Keywords: Bioinformatics, software tools, graph databases, SBGN, SBML, pathway integration, Neo4j, visualization.
DATE: February 20, Friday @ 11:30
PLACE: EA 409