Bilkent University
Department of Computer Engineering


Scalable Disk Layout Techniques for Graphs


Abdurrahman Yaar

In recent years the world has witnessed an enormous growth in social networks and the volume of data generated by them. As a consequence, processing this massive amounts of data has become a major problem. An important portion of this data is represented as graphs. In recent years, several graph processing and management systems have been developed to handle large-scale graphs. The primary goal of these systems is to run graph algorithms in an efficient and scalable manner. Unlike relational data, graphs are semi-structured in nature. Thus, storing and accessing graph data using secondary storage requires new solutions that can provide locality of access for graph processing workloads. In this work, we propose a novel scalable disk layout technique for graphs, which aims at reducing the I/O cost of disk-based graph processing algorithms. To achieve this goal, we designed a scalable MapReduce style method called ICBP, which can divide the graph into a series of disk blocks that contain sub-graphs with high locality, as well as order these blocks on disk to reduce non-local accesses. In this presentation, we describe the ICBP method, including the challenges that arose in applying ICBP in practice, the solutions applied, and an empirical evaluation.


DATE: 15 December, 2014, Monday @ 16:15