Improving Software Configuration Troubleshooting with Causality Analysis
Complex software systems are difficult to configure and manage. When problems inevitably arise, operators spend a considerable amount of time troubleshooting them by diagnosing the root causes and correcting them. Misconfigurations are problems in which the application code is correct, but the software has been installed, configured, or updated incorrectly so that it does not behave as desired. Such misconfigured software might crash, produce erroneous output, or simply perform poorly. This dissertation focuses on developing methods and tools that automate the troubleshooting process and thereby reduce both the time to recovery (TTR) and the manual effort required of users.
The core idea of this thesis is to automate misconfiguration diagnosis by using causality analysis to determine specific inputs to an application that cause that application to produce an undesired output. This thesis shows that we can leverage these causal relationships to determine the root cause of misconfigurations. Further, we demonstrate that it is feasible to automatically infer such relations by analyzing the execution of the application and the interactions between the application and the operating system. Based on the idea of causality analysis, we designed and developed three misconfiguration diagnosis tools: SigConf, ConfAid, and X-ray. SigConf uses coarse-grained causality analysis to compare the state of a sick computer against a bug database. ConfAid and X-ray use fine-grained causality tracking to identify misconfigurations that originate from configuration files. ConfAid focuses on problems that lead to incorrect output, whereas X-ray focuses on misconfigurations that cause performance anomalies.
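To make the idea of causality analysis concrete, the following is a minimal Python sketch of tracking which configuration options influence an output. It is a toy illustration only, not how SigConf, ConfAid, or X-ray are implemented. The names `Tainted`, `load_config`, `combine`, `toy_server`, and `diagnose`, as well as the option names `workers`, `per_worker_limit`, and `log_level`, are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """A value paired with the configuration options that influenced it."""
    value: object
    sources: frozenset = field(default_factory=frozenset)


def load_config(options):
    # Tag every configuration value with its own option name.
    return {name: Tainted(value, frozenset({name}))
            for name, value in options.items()}


def combine(func, *operands):
    # Data-flow propagation: the result inherits the sources of all operands.
    sources = frozenset().union(*(op.sources for op in operands))
    return Tainted(func(*(op.value for op in operands)), sources)


def toy_server(config, pending_requests):
    # Hypothetical application logic standing in for a real program.
    capacity = combine(lambda w, l: w * l,
                       config["workers"], config["per_worker_limit"])
    overloaded = combine(lambda c: c < pending_requests, capacity)
    # Control-flow propagation: the output depends on the branch condition,
    # so it inherits the sources of that condition.
    status = "503 overloaded" if overloaded.value else "200 ok"
    return Tainted(status, overloaded.sources)


def diagnose(output):
    # Candidate root causes: options that causally influenced the output.
    return sorted(output.sources)


config = load_config({"workers": 2, "per_worker_limit": 10, "log_level": "debug"})
result = toy_server(config, pending_requests=50)
print(result.value, diagnose(result))
```

In this sketch the undesired output is traced back to `workers` and `per_worker_limit`, while `log_level` is excluded because it never influenced the result. A real tool would have to observe such dependencies in the execution of an unmodified application rather than rely on explicitly wrapped values.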