Faculty Candidate Seminar

Pareto-efficient AI systems: Expanding the quality and efficiency frontier of AI

Simran Arora
Ph.D. Candidate
Stanford University
WHERE:
3725 Beyster Building

Zoom link for remote attendees

Meeting ID: 915 9204 0151
Passcode: 123123

Abstract: Foundation models are helping us build useful systems in hours that five years ago would have taken months. This exciting progress stems from a specific AI recipe: training massive models on massive amounts of data center compute. However, with the growing demand for AI, how can we maximize the capabilities we can achieve under any compute constraint? There are three key challenges: understanding (1) how our workload constraints (e.g., throughput, latency, privacy) impact the AI algorithms we should use, (2) how hardware constraints shape AI and vice versa, and (3) the fundamental scaling laws that govern how efficiently different algorithms learn useful capabilities. In this talk, I’ll discuss how I’ve addressed these challenges in my research. I’ll focus on my work to expand the Pareto frontier between language model quality and throughput, where we’ve developed new methods to empirically and theoretically explain the fundamental tradeoffs between model quality and throughput in language modeling, built a new programming library called ThunderKittens to make it easier to develop fast hardware programs for new AI algorithms, and released state-of-the-art efficient language models at the 8B–405B parameter scale on an academic budget.

Bio: Simran Arora is a PhD student at Stanford University advised by Chris Ré. Her research blends machine learning and systems toward expanding the Pareto frontier between AI quality and efficiency. Her machine learning research has appeared as Oral and Spotlight presentations at NeurIPS, ICML, and ICLR, including an Outstanding Paper award at NeurIPS and a Best Paper award at the ICML ES-FoMo workshop. Her systems work has appeared at VLDB, SIGMOD, CIDR, and CHI, and her systems artifacts are widely used in open source and industry. In 2023, Simran created and taught the CS229s Systems for Machine Learning course at Stanford. She has also been supported by an SGF Sequoia Fellowship.

Student Host

Jiachen Liu

Faculty Host

Mosharaf Chowdhury