Faculty Candidate Seminar
Building the Next Generation of Data-Intensive Systems: From Complex Event Processing to Large-Scale Analytics
Databases have been a successful abstraction for accessing and managing data in traditional workloads. However, the rapid growth of data and the demand for more complex analytics have significantly hindered the scalability and applicability of these systems beyond classic business data processing scenarios. In my talk, I will explain how my research addresses these two challenges. First, I will introduce a system that I have built for supporting complex event processing over both stored and streaming data. This system extends existing database query languages with minimal but powerful constructs that enable a wide range of advanced applications, such as high-frequency trading, click-stream analysis, and the analysis of function-call traces. The system uses the recently proposed Visibly Pushdown Automata as its underlying model, and I will present several optimization techniques for implementing these languages efficiently, leading to throughput several orders of magnitude higher than that of its predecessors. In the second part of my talk, I will turn to the scalability challenges and briefly introduce BlinkDB, a parallel query engine that enables interactive, ad-hoc queries over massive volumes of data in a MapReduce cluster. I will demonstrate how BlinkDB employs sophisticated optimization and sampling strategies to achieve sub-second latency on tens of terabytes to petabytes of data.
Barzan Mozafari is currently a Postdoctoral Associate at the Massachusetts Institute of Technology. He earned his PhD in Computer Science from the University of California at Los Angeles. He is passionate about building practical, large-scale data-intensive systems, with a particular interest in database-as-a-service, distributed systems, and the integration of machine learning and crowdsourcing into database systems. He has received several fellowships and awards, including the SIGMOD 2012 Best Paper Award for his work on high-performance complex event processing.