Holistic System Design for Deterministic Replay
Deterministic replay systems record and reproduce the execution of a hardware or software system. The ability to reproduce an execution can be used to improve systems along many dimensions, including reliability, security, and debuggability. While it is well known how to replay uniprocessor systems, it is much harder to provide deterministic replay of shared-memory multithreaded programs on multiprocessors, because shared-memory accesses add a high-frequency source of non-determinism.
We introduce a new insight for deterministic replay: for many uses of replay, it is sufficient to guarantee only that the replayed execution produces the same system output and final state as the recorded execution. It is therefore possible to support replay without logging precise shared-memory dependencies. We call this relaxed but sufficient guarantee "external determinism" and leverage it to build efficient multiprocessor replay systems.
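As a rough illustration of external determinism, the toy sketch below compares two executions only by their output and final state, not by the order of their shared-memory accesses. It is not taken from Respec, Chimera, or Rosa; the names `Execution`, `run_counter`, `run_last_writer`, and `externally_equivalent` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """Hypothetical record of one run of a toy two-thread program."""
    schedule: list      # order in which thread ids took their steps
    output: list        # externally visible output
    final_state: dict   # shared memory at the end of the run


def run_counter(schedule):
    """Toy program: each step adds the running thread's value to a shared counter."""
    increments = {"A": 1, "B": 2}
    state = {"counter": 0}
    for tid in schedule:
        state["counter"] += increments[tid]
    return Execution(list(schedule), [f"total={state['counter']}"], state)


def run_last_writer(schedule):
    """Toy program: each step overwrites a shared cell, so the order is visible."""
    state = {"last": None}
    for tid in schedule:
        state["last"] = tid
    return Execution(list(schedule), [f"last={state['last']}"], state)


def externally_equivalent(recorded, replayed):
    """Compare only output and final state; the interleaving itself is ignored."""
    return (recorded.output == replayed.output
            and recorded.final_state == replayed.final_state)


# Different interleavings, same output and final state: the replay is acceptable.
recorded = run_counter(["A", "B", "A", "B"])
replayed = run_counter(["B", "B", "A", "A"])
assert recorded.schedule != replayed.schedule
assert externally_equivalent(recorded, replayed)

# Here the interleaving changes the externally visible result, so a replay
# that used this schedule would not satisfy the relaxed guarantee.
assert not externally_equivalent(run_last_writer(["A", "B"]),
                                 run_last_writer(["B", "A"]))
```

In the first case, a replay system would not need to have logged which thread touched the counter first. In the second, the divergence is detectable from the output and final state alone.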
In this thesis, we propose three efficient multiprocessor replay systems: Respec, Chimera, and Rosa. Respec enables software-only deterministic replay at low overhead with operating system support. Chimera leverages static data-race analysis to build an efficient software-only replay solution. Rosa provides an ultra-low-overhead replay solution with minimal hardware extensions.