Faculty Candidate Seminar
Overcoming the Deep Learning Power Wall with Principled Unsafe Optimization
The computing industry has a power problem: the days of ideal power-process scaling are over, and chips now contain more devices than can be powered simultaneously, limiting performance. Continuing to scale performance under these power constraints requires creative solutions. Specialized hardware accelerators are one viable answer. While accelerators promise orders of magnitude more performance per watt, several challenges have limited their wide-scale adoption and fueled skepticism.
Deep learning has emerged as a proving ground for hardware acceleration: its compute patterns are extremely regular and its use is widespread, so if accelerators cannot work here, there is little hope elsewhere. One way to motivate the need for accelerators is to demonstrate the efficiency benefits they provide in the era of power-limited computing. In this talk I will use deep learning to study the efficiency gap between standard ASIC design practices and full-stack co-design, with the goal of enabling these powerful models to be used with few restrictions. To push the efficiency limits of deep learning inference, this talk will introduce principled unsafe optimizations. A principled unsafe optimization changes how a program executes without impacting accuracy: by breaking the contract between the algorithm, architecture, and circuits, efficiency can be greatly improved. To conclude, I will present future research directions centered on hardware specialization: accelerator-centric architectures and privacy-preserving cloud computing.
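The abstract does not spell out a specific optimization, but a classic example of trading bit-level exactness for efficiency while relying on DNN robustness is reduced-precision weight quantization: the quantized network computes a numerically different result, yet the error is bounded and small enough that model accuracy is typically preserved. The following is a minimal illustrative sketch (not the speaker's actual method), assuming uniform symmetric 8-bit quantization; all names are made up for this example.

```python
import numpy as np

def quantize(w, bits=8):
    """Uniform symmetric quantization of a weight tensor to `bits` bits."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)  # step size per level
    q = np.round(w / scale).astype(np.int8)            # integer codes
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)  # toy layer weights
x = rng.standard_normal(8).astype(np.float32)       # toy input

q, scale = quantize(w, bits=8)
y_full = w @ x                                  # full-precision output
y_quant = (q.astype(np.float32) * scale) @ x    # output from dequantized weights

# Per-weight quantization error is bounded by half a quantization step,
# so the layer output is perturbed only slightly.
assert np.max(np.abs(w - q * scale)) <= scale / 2 + 1e-6
```

In hardware terms, storing and multiplying 8-bit integers instead of 32-bit floats cuts memory traffic and datapath energy substantially, which is the kind of efficiency gain the talk attributes to exploiting the algorithm's robustness.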
Brandon Reagen is a computer architect focused on specialized hardware (i.e., accelerators) and low-power design, with applications in deep learning. He received his PhD from Harvard in May 2018. Over the course of his PhD, Brandon made several research contributions toward lowering the barrier to using accelerators as general architectural constructs, including benchmarking, simulation infrastructure, and SoC design. Drawing on his knowledge of accelerator design, he led the way in highly efficient, accurate deep learning accelerator design with his work on principled unsafe optimizations. In his thesis, he found that for DNN inference, intricate full-stack co-design between the robust nature of the algorithm and the circuits it executes on can yield nearly an order of magnitude more power efficiency than standard ASIC design practices. His work has been published in venues spanning architecture, ML, CAD, and circuits. Brandon is now a Research Scientist at Facebook on the AI Infrastructure team.