Faculty Candidate Seminar

Fire! Sensing Smoke Across the US using Multiple Linear Regression

Sam Lau, Ph.D. Candidate, University of California, San Diego
3725 Beyster Building



Zoom link for remote participants, passcode: 918770


Data plays a crucial role in important decisions, from urban planners deciding where to build new roads to scientists understanding the spread of a disease. This reality motivates people to learn how to draw conclusions from data using programs. To address this need, I develop curricula and design tools for learning programming and data science. In the first part of this talk, I’ll demonstrate my pedagogical approach through a case study about multiple linear regression, a widely used machine learning technique. This case study draws from research that applies multiple linear regression in a real-world system (airnow.gov) and has helped millions of people in the US. In the second part of this talk, I’ll present my research on designing interactive tools for teaching. Specifically, I’ll present Pandas Tutor, a system that automatically draws diagrams showing how code transforms data tables. During its first year of deployment, Pandas Tutor served over 40,000 users across 166 countries. Together, this work points towards a future where anyone can learn programming and data science through real-world case studies and tools for visualizing code.
Sam Lau is a PhD candidate in Cognitive Science at the University of California, San Diego. His research uses methods from human-computer interaction to design interactive program visualization tools for teaching programming and data science. He received a BS and MS from UC Berkeley, where he helped launch two of Berkeley’s core data science courses. Sam is also an author of Learning Data Science, a textbook that will be published in 2023.


Cindy Estell

Faculty Host

Drew DeOrio