I conduct research in the general area of data-intensive distributed systems, with a particular focus on fast data and big data. My research interests include distributed stream processing systems, MapReduce systems, distributed key-value stores, large-scale graph processing, and hardware acceleration for data management. My research addresses issues such as parallelization, fault tolerance, profiling and optimization, and debugging in data-intensive systems. Recently, my research interests have also extended to scalable data mining, especially in the context of social graphs.
I manage the Bil-DIDS research group.
I am also affiliated with the Bilvea research lab.
In the past, I was heavily involved in the System S project and served as the Chief Architect for the IBM InfoSphere Streams product. I am a co-inventor of the SPADE and SPL stream processing languages.
My dissertation research focused on developing architectures and techniques to address scalability problems in large-scale distributed data-intensive systems and applications, and support for distributed information monitoring services.