We live in an increasingly connected world, where the data generated by people, software systems, and the physical world is more accessible than ever and has grown in volume, variety, and velocity. In many application domains, processing this data as it is generated and deriving richer information and value from it is of great importance. For instance, in communication networks, packets carry massive amounts of data, often arriving at wire speed. In social media, tweets, forum entries, and status updates are generated continuously and at large scale. In summary, a large number of online, high-volume data streams are emerging, and processing them in near real time can give businesses a competitive advantage and improve services.
The Bil-DIDS lab conducts research in the general area of data-intensive distributed systems, with a particular focus on fast data and big data.
The lab's research areas include distributed stream processing systems, MapReduce systems, distributed key-value stores, large-scale graph processing, and hardware acceleration for data management. Our research addresses issues such as parallelization, fault tolerance, profiling, optimization, and debugging in data-intensive systems. The lab's interests also extend to scalable data mining, especially in the context of social graphs.