A flapping test is a test that sometimes passes and sometimes fails even though the application code being tested hasn’t changed. Flapping tests can be hard to reproduce, diagnose, and fix. Here are some of the common causes I know of for flapping tests. If you know the common causes of flapping tests it can go a long way toward diagnosis. Once diagnosis is out of the way, the battle is half over.
Let’s say you have a Capybara test that clicks a page element that fires off an AJAX request. The AJAX request completes, making some other page element clickable. Sometimes the AJAX request beats the test runner, meaning the test works fine. Sometimes the test runner beats the AJAX request, meaning the test breaks.
“Leaking” tests (non-deterministic test suite)
Sometimes a test will fail to clean up after itself and “leak” state into other tests. To take a contrived but obvious example, let’s say test A checks that the “list customers” page shows exactly 5 customers and test B verifies that a customer can be created successfully. If test B doesn’t clean up after itself, then running test B before test A will cause test A to fail because there are now 6 customers instead of 5. This exact issue is almost never an issue in Rails (because each test runs inside a transaction) but database interaction isn’t the only possible form of leaky tests.
When I’m fixing this sort of issue I like to a) determine an order of running the tests that always causes my flapping test to fail and then b) figure out what’s happening in the earlier test that makes the later test fail. In RSpec I like to take note of the seed number, then run
rspec --seed <seed number> which will re-run my tests in the same order.
Reliance on third-party code
Sometimes tests unexpectedly fail because some behavior of third-party code has changed. I actually sometimes consciously choose to live with this category of flapping test. If an API I rely on goes down and my tests fail, the tests arguably should fail because my application is genuinely broken. But if my tests are flapping in a way that’s not legitimate, I’d probably change my test to use something like VCR instead of hitting the real API.