Bilkent University
Department of Computer Engineering
S E M I N A R

 

Scalable Disk Layout Techniques for Graphs

 

Abdurrahman Yaar

In recent years the world has witnessed an enormous growth in social networks and the volume of data generated by them. As a consequence, processing this massive amounts of data has become a major problem. An important portion of this data is represented as graphs. In recent years, several graph processing and management systems have been developed to handle large-scale graphs. The primary goal of these systems is to run graph algorithms in an efficient and scalable manner. Unlike relational data, graphs are semi-structured in nature. Thus, storing and accessing graph data using secondary storage requires new solutions that can provide locality of access for graph processing workloads. In this work, we propose a novel scalable disk layout technique for graphs, which aims at reducing the I/O cost of disk-based graph processing algorithms. To achieve this goal, we designed a scalable MapReduce style method called ICBP, which can divide the graph into a series of disk blocks that contain sub-graphs with high locality, as well as order these blocks on disk to reduce non-local accesses. In this presentation, we describe the ICBP method, including the challenges that arose in applying ICBP in practice, the solutions applied, and an empirical evaluation.

 

DATE: 15 December, 2014, Monday @ 16:15
PLACE: EA-409