SPLASH 2026
Sat 3 - Fri 9 October 2026 Oakland, California, United States
co-located with SPLASH/ISSTA 2026

Regression testing is an essential part of software development for ensuring high-quality software, but the presence of flaky tests makes testing outcomes unreliable. It is therefore important to proactively detect flaky tests, so that developers are aware of them early on and can react appropriately when they fail. While there has been prior work on detecting flaky tests, existing techniques typically focus on specific types of flakiness. We present ChaosAPI, a framework that supports detecting a variety of different types of flaky tests. Our insight is that the most effective way to detect flaky tests is to target the nondeterministic components that cause the flaky behavior in the first place. In particular, we target specific APIs within the Java Standard Library that all Java code relies on and that are known to exhibit nondeterministic behavior, such as those related to system time, concurrency, and environmental factors. During test execution, ChaosAPI modifies the behavior of these API calls, perturbing their inputs and return values in a systematic manner while remaining compliant with the API specification. ChaosAPI detects a flaky test by observing whether a test that previously passed now fails when run under these perturbations. Our evaluation on a prior dataset of known flaky tests, as well as on the test suites of other popular open-source projects, demonstrates that ChaosAPI not only detects more flaky tests than simple rerunning across a wide range of projects but also detects them more efficiently, making ChaosAPI a practical addition to the toolbox of flaky test detection techniques.
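The time-perturbation idea can be illustrated with a minimal sketch. This is our own hypothetical shim, not ChaosAPI's actual mechanism (the abstract describes intercepting Java Standard Library calls; the names `perturb` and `ChaosClockSketch` below are invented for illustration): a wrapped millisecond clock advances by an extra fixed step on each read, staying monotonically nondecreasing, which exposes a test that implicitly assumes fast execution.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class ChaosClockSketch {
    // Wrap a millisecond clock so that each successive read advances by an
    // extra `stepMillis`. Values remain monotonically nondecreasing, so the
    // perturbation is still a plausible wall-clock reading, but elapsed-time
    // measurements taken through it are systematically exaggerated.
    static LongSupplier perturb(LongSupplier realClock, long stepMillis) {
        AtomicLong extra = new AtomicLong(0);
        return () -> realClock.getAsLong() + extra.getAndAdd(stepMillis);
    }

    public static void main(String[] args) {
        LongSupplier clock = perturb(System::currentTimeMillis, 250);
        long start = clock.getAsLong();
        // ... a fast unit of work under test would run here ...
        long elapsed = clock.getAsLong() - start;
        // A test asserting `elapsed < 100` passes on a normal fast run but
        // fails under perturbation, exposing its hidden timing assumption.
        System.out.println(elapsed < 100 ? "PASS" : "FAIL: flaky timing assumption");
    }
}
```

A real interceptor would redirect `System.currentTimeMillis` itself (e.g., via bytecode instrumentation) rather than requiring tests to call a wrapper, but the detection signal is the same: a previously passing test that fails under a spec-compliant perturbation is reported as flaky.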