In this project, we investigate the automatic parallelization of stream processing applications. The goal is to take a sequential version of a stream program and produce a functionally equivalent version that is distributed and parallel. We then provide runtime mechanisms to fine-tune the parallelization.
In this project, we aim to locate pipeline parallelization opportunities in a data flow graph and to use runtime profiling and adaptation to exploit these opportunities for better throughput. An important challenge is finding a good configuration among a combinatorially large number of choices.
In this project, we aim to locate data parallelization opportunities in a data flow graph and to apply fission to exploit these opportunities. An important aspect of this work is ensuring safety in the presence of selective and stateful operators, which require special runtime mechanisms.
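As a rough illustration of why selective and stateful operators need extra care under fission, the following minimal sketch simulates one possible mechanism: a splitter that routes tuples by key (so each key's state lives in exactly one replica) and a merger that restores the original order using sequence numbers. The names `run_with_fission` and `make_every_second_counter` are hypothetical, and this is an assumed textbook-style scheme, not necessarily the mechanism used in the project.

```python
def run_with_fission(tuples, key_of, make_operator, num_replicas):
    """Simulate fission of a keyed operator into num_replicas copies.

    Assumption: the operator's state is partitioned by key, so routing
    by key keeps each key's state inside a single replica.
    """
    # Splitter: tag each tuple with a sequence number and route by key.
    queues = [[] for _ in range(num_replicas)]
    for seq, tup in enumerate(tuples):
        queues[hash(key_of(tup)) % num_replicas].append((seq, tup))

    # Replicas: each processes its own queue with its own private state.
    tagged_outputs = []
    for queue in queues:
        operator = make_operator()
        for seq, tup in queue:
            out = operator(tup)
            if out is not None:  # a selective operator may drop tuples
                tagged_outputs.append((seq, out))

    # Merger: sequence numbers restore the sequential output order.
    tagged_outputs.sort(key=lambda pair: pair[0])
    return [out for _, out in tagged_outputs]


def make_every_second_counter():
    """Hypothetical stateful, selective operator: counts tuples per key
    and emits only on every second occurrence of a key."""
    counts = {}

    def operator(tup):
        key, value = tup
        counts[key] = counts.get(key, 0) + 1
        if counts[key] % 2 == 0:
            return (key, counts[key], value)
        return None

    return operator


data = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6), ("a", 7)]
sequential = run_with_fission(data, lambda t: t[0], make_every_second_counter, 1)
parallel = run_with_fission(data, lambda t: t[0], make_every_second_counter, 3)
```

With one replica the function behaves like the sequential program, so comparing `sequential` and `parallel` checks that the parallel version is functionally equivalent for this input.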
In this project, we extend our work on auto data-parallelization to enable runtime adaptation to changes in workload and resource availability. One particular challenge is designing an effective control algorithm. Another is managing partial state migration in the presence of stateful operators.
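To make the idea of partial state migration concrete, the sketch below assumes integer keys and a simple modulo assignment of keys to replicas, and counts how many per-key state entries must move when the number of replicas changes. The functions `owner` and `migrate` are hypothetical names introduced for illustration; the project's actual assignment and migration scheme may differ.

```python
def owner(key, num_replicas):
    """Assumed assignment: integer key modulo the number of replicas."""
    return key % num_replicas


def migrate(states, new_count):
    """Repartition per-key state across new_count replicas.

    states is a list with one {key: state} dict per current replica.
    Returns the new list of dicts and the number of entries that had
    to move to a different replica.
    """
    new_states = [dict() for _ in range(new_count)]
    moved = 0
    for old_index, state in enumerate(states):
        for key, value in state.items():
            new_index = owner(key, new_count)
            new_states[new_index][key] = value
            if new_index != old_index:
                moved += 1  # only these entries need to be transferred
    return new_states, moved


# Two replicas holding state for keys 0..5, scaled up to three replicas.
before = [{0: "s0", 2: "s2", 4: "s4"}, {1: "s1", 3: "s3", 5: "s5"}]
after, moved = migrate(before, 3)
```

Under this naive modulo assignment, four of the six entries move when going from two to three replicas, which hints at why the choice of partitioning function matters for migration cost.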
In this project, we look at the challenging problem of managing multiple parallel segments in a distributed system to optimize the throughput of auto data-parallelized applications.