Communication, Co-design and the Making of Modern Database Storage Engines
Recent and ongoing trends in the hardware world are driving database storage engines back to the drawing board on multiple fronts. Main-memory and NVM-optimized systems promise blistering transaction rates by simplifying the software architecture; partitioned database architectures sidestep many of the challenges posed by massively multi-core systems; and accelerators such as GPUs, FPGAs, and even custom processor extensions are being explored in hopes of balancing power with performance. The database engine must adapt or risk becoming an obstacle to performance rather than an asset. This talk will examine the challenge of creating a database system that helps users maximize the use of their highly parallel hardware and other modern goodies. We will begin by examining the database log, a central figure in most database engines, and its communication with the rest of the system. We will see that the underlying hardware can be both a hindrance and a help when it comes to high-performance logging. Then, faced with a scheduling dilemma, we will briefly step into the world of operating systems to tame the OS scheduler. Finally, we will explore in some detail how co-design techniques, long used in hardware and embedded systems design, can be leveraged to great effect inside a database engine as well. The talk will conclude with a sketch of a future "bionic" database system that relies on carefully staged communication, together with hardware and software co-design techniques, to produce maximum parallelism and performance while retaining a robust feature set.
Ryan Johnson is an assistant professor of Computer Science at the University of Toronto, specializing in optimizing the interface between database engines and the hardware they utilize. He also sees operating systems, embedded systems, compilers, and hardware design as fair game in his quest to create better database engines. His research drove the development of Shore-MT, a highly parallel database storage manager that is used by both academic and industry researchers to inspire and prototype their own research. His work has also influenced the design of both commercial and open source database engines. Prof. Johnson has received faculty awards from both IBM and HP Labs, and his PhD work received the 2011 ACM SIGMOD Jim Gray Doctoral Dissertation Award.