
Common causes of flickering/flapping/flaky tests

A flapping test is a test that sometimes passes and sometimes fails even though the application code being tested hasn’t changed. Flapping tests can be hard to reproduce, diagnose, and fix. Knowing the common causes of flapping tests goes a long way toward diagnosis, and once a flapping test is diagnosed, the battle is half over. Here are some of the common causes I know of.

Race conditions

Let’s say you have a Capybara test that clicks a page element that fires off an AJAX request. The AJAX request completes, making some other page element clickable. Sometimes the AJAX request beats the test runner, meaning the test works fine. Sometimes the test runner beats the AJAX request, meaning the test breaks.

There’s a common race condition I’ve experienced in Capybara. A page will load, then Capybara will visit a subsequent page to verify some result. Unfortunately, the visit happens before the prior page gets a chance to fully do its thing, resulting in a failing test. An easy way to fix this is to add an expect(page).to have_content('some content') just before the visit. This way Capybara will wait for the first page to fully load before it tries to visit the second page.
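As a concrete sketch of this fix, here’s what the pattern looks like in a Capybara test. The paths and strings below are invented for illustration; the important part is the have_content expectation between the two steps, which makes Capybara retry (up to Capybara.default_max_wait_time) until the first page has finished before moving on:

```ruby
# All paths and content strings here are hypothetical.
visit new_order_path
click_button 'Place order'

# Without this expectation, the visit below can race the page above.
# have_content retries until the content appears or the wait time runs out.
expect(page).to have_content('Order placed')

visit orders_path
expect(page).to have_content('1 order')
```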

“Leaking” tests (non-deterministic test suite)

Sometimes a test will fail to clean up after itself and “leak” state into other tests. To take a contrived but obvious example, let’s say test A checks that the “list customers” page shows exactly 5 customers and test B verifies that a customer can be created successfully. If test B doesn’t clean up after itself, then running test B before test A will cause test A to fail because there are now 6 customers instead of 5. This exact scenario is almost never a problem in Rails (because each test runs inside a database transaction that gets rolled back), but database state isn’t the only kind of state a test can leak.
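To see the mechanics of a leak outside of any database, here’s a contrived plain-Ruby sketch (all names invented) where one “test” mutates class-level state and never restores it, so another “test” passes or fails depending on run order:

```ruby
# Contrived illustration of a non-database leak. Test B changes a
# class-level setting and never resets it, so test A's result depends
# on whether B ran first.
class AppConfig
  @@page_size = 5

  def self.page_size
    @@page_size
  end

  def self.page_size=(value)
    @@page_size = value
  end
end

def test_b_runs
  AppConfig.page_size = 10 # "leaks": never restored afterward
end

def test_a_passes?
  AppConfig.page_size == 5
end

results = []
results << test_a_passes? # A before B: passes
test_b_runs
results << test_a_passes? # A after B: fails
```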

When I’m fixing this sort of issue I like to a) determine an order of running the tests that always causes my flapping test to fail and then b) figure out what’s happening in the earlier test that makes the later test fail. In RSpec I like to take note of the seed number, then run rspec --seed <seed number> which will re-run my tests in the same order.
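The reason the seed reproduces the failure is that RSpec’s “random” ordering is a seeded shuffle, and a seeded shuffle is deterministic. A quick plain-Ruby illustration of that property:

```ruby
# Two shuffles with the same seed always produce the same order.
# RSpec does essentially this with the seed it prints after each run.
tests = %w[test_a test_b test_c test_d]

first_run  = tests.shuffle(random: Random.new(1234))
second_run = tests.shuffle(random: Random.new(1234))
```

This is why rspec --seed 1234 re-runs the suite in exactly the order that failed.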

Reliance on third-party code

Sometimes tests unexpectedly fail because some behavior of third-party code has changed. I actually sometimes consciously choose to live with this category of flapping test. If an API I rely on goes down and my tests fail, the tests arguably should fail because my application is genuinely broken. But if my tests are flapping in a way that’s not legitimate, I’d probably change my test to use something like VCR instead of hitting the real API.
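For flavor, here’s the core idea behind tools like VCR, hand-rolled in plain Ruby. This is not VCR’s real API, and all the names here are invented; it’s just a sketch of replaying a canned response instead of hitting the network:

```ruby
require 'json'

# The object under test depends on an injected client rather than
# making HTTP calls directly.
class WeatherReport
  def initialize(client)
    @client = client
  end

  def summary(city)
    data = JSON.parse(@client.get(city))
    "#{city}: #{data['temp']} degrees"
  end
end

# In production the client would hit a real API. In the test we
# substitute a stub that returns a recorded payload, the way a VCR
# cassette would.
class CannedClient
  def get(_city)
    '{"temp": 72}'
  end
end

report = WeatherReport.new(CannedClient.new)
```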

A Rails model test “hello world”

The following is an excerpt from my book, Rails Testing for Beginners.

What we’re going to do

What we’re going to do in this post is:

  1. Initialize a new Rails application
  2. Install RSpec using the rspec-rails gem
  3. Generate a User model
  4. Write a single test for that User model

The test we’ll write will be a trivial one. The user will have both a first and last name. On the User model we’ll define a full_name method that concatenates first and last name. Our test will verify that this method works properly. Let’s now begin the first step, initializing the Rails application.

Initializing the Rails application

Let’s initialize the application using the -T flag meaning “no test framework”. (We want RSpec in this case, not the Rails default of MiniTest.)

$ rails new model_test_hello_world -T
$ cd model_test_hello_world

Now we can install RSpec.

Installing RSpec

To install RSpec, all we have to do is add rspec-rails to our Gemfile and then do a bundle install.

# Gemfile

group :development, :test do
  gem 'rspec-rails'
end
$ bundle install

We’ve installed the RSpec gem but we still have to run rails g rspec:install which will create a couple configuration files for us.

$ rails g rspec:install

Now we can move on to creating the User model.

Creating the User model

Let’s create a User model with just two attributes: first_name and last_name.

$ rails g scaffold user first_name:string last_name:string
$ rails db:migrate

Now we can write our test.

The User test

Writing the test

Here’s the test code:

# spec/models/user_spec.rb

require 'rails_helper'

RSpec.describe User, type: :model do
  describe '#full_name' do
    it 'concatenates first and last name' do
      user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
      expect(user.full_name).to eq('Abraham Lincoln')
    end
  end
end

Let’s take a closer look at this test code.

Examining the test

Let’s focus for a moment on this part of the test code:

it 'concatenates first and last name' do
  user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
  expect(user.full_name).to eq('Abraham Lincoln')
end

This code is saying:

  1. Instantiate a new User object with first name Abraham and last name Lincoln.
  2. Verify that when we call this User object’s full_name method, it returns the full name, Abraham Lincoln.

What are the its and describes all about? Basically, these “it blocks” and “describe blocks” are there to allow us to put arbitrary labels on chunks of code to make the test more human-readable. (This is not precisely true but it’s true enough for our purposes for now.) As an illustration, I’ll drastically change the structure of the test:

# spec/models/user_spec.rb

require 'rails_helper'

RSpec.describe User, type: :model do
  it 'does stuff' do
    user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
    expect(user.full_name).to eq('Abraham Lincoln')
  end
end

This test will pass just fine. Every it block needs to be nested inside a describe block, but RSpec doesn’t care how deeply nested our test cases are or whether the descriptions we put on our tests (e.g. the vague and mysterious it 'does stuff') make any sense.

Running the test

If we run this test it will fail. It fails because the full_name method is not yet defined.

F

Failures:

  1) User#full_name concatenates first and last name
     Failure/Error: expect(user.full_name).to eq('Abraham Lincoln')
     
     NoMethodError:
       undefined method `full_name' for #<User:0x00007ff73d419e20>
     # ./spec/models/user_spec.rb:7:in `block (3 levels) in <top (required)>'

Finished in 0.00695 seconds (files took 0.77932 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/models/user_spec.rb:5 # User#full_name concatenates first and last name

Let’s now write the full_name method.

# app/models/user.rb

class User < ApplicationRecord
  def full_name
    "#{first_name} #{last_name}"
  end
end

Now the test passes.

.                                              

Finished in 0.008 seconds (files took 0.80982 seconds to load)                                
1 example, 0 failures

Taming legacy Ruby code using the “Sprout Method” technique (example 2)

Note: this post is titled “example 2” because some time ago I wrote another Sprout Method example, but after I wrote that post I decided I could come up with a better example.

The Sprout Method technique

If a project was developed without testing in mind, the code will often involve a lot of tight coupling (objects that depend closely on other objects) which makes test setup difficult.

The solution to this problem is simple in concept – just change the tightly coupled objects to loosely coupled ones – but the execution of this solution is often not simple or easy.

The challenge with legacy projects is that you often don’t want to touch any of this mysterious code before it has some test coverage, but it’s impossible to add tests without touching the code first, so it’s a chicken-egg problem. (Credit goes to Michael Feathers’ Working Effectively with Legacy Code for pointing out this chicken-egg problem.)

So how can this chicken-egg problem be overcome? One technique that can be applied is the Sprout Method technique, described both in WEWLC as well as Martin Fowler’s Refactoring.

Here’s an example of some obscure code that uses tight coupling:

require 'open-uri'
file = open('http://www.gutenberg.org/files/11/11-0.txt')
text = file.read
text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last
whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

If you looked at this code, I would forgive you for not understanding it at a glance.

One thing that makes this code problematic to test is that it’s tightly coupled to two dependencies: 1) the content of the file at http://www.gutenberg.org/files/11/11-0.txt and 2) stdout (the puts on the second-to-last line).

If I want to separate part of this code from its dependencies, maybe I could grab this chunk of the code (which is the majority of the code, but doesn’t include the network dependency):

text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last

I can now take these lines and put them in their own new method:

require 'open-uri'

def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

file = open('http://www.gutenberg.org/files/11/11-0.txt')
whash, lword = count_words(file.read)

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

I might still not understand the code but at least now I can start to put tests on it. Whereas before the program would only operate on the contents of http://www.gutenberg.org/files/11/11-0.txt and output the results to the screen, now I can feed in whatever content I like and have the results available in the return value of the method. (If you’d like to see another example of the Sprout Method technique, I wrote one here.)
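Now that count_words takes its text as an argument and returns its results, a first characterization test can be as simple as feeding in a tiny known input and asserting on the output. A rough sketch (the method body is copied verbatim from the sprouted version above; the toy input and expected counts are mine):

```ruby
# count_words copied verbatim from the sprouted version above.
def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

# Feed in a tiny, known input instead of the Project Gutenberg file.
whash, lword = count_words('a bb ccc a'.dup)
```

From here, each assertion on whash and lword pins down a piece of the current behavior before any further refactoring.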

Code smell: long, procedural background jobs (and how to fix it)

I’ve worked on a handful of Rails applications that have pretty decent test coverage in terms of model specs and feature specs but are lacking in tests for background jobs. It seems that developers often find background jobs more difficult to test than Active Record models—otherwise the test coverage would presumably be better in that area. I’ve found background jobs of 100+ lines in length with no tests, even though the same application’s Active Record models had pretty good test coverage.

Apparently background job code is somehow viewed as “separate” from the rest of the application code: for some reason it doesn’t get tests, and for some reason it doesn’t get the object-oriented structure the rest of the application has.

Whatever the root cause of this issue, the fix is fortunately often pretty simple. We can apply the extract class refactoring technique to turn the long, procedural background job into one or more neatly abstracted classes.

Just as I would when embarking on the risky task of refactoring any piece of untested legacy code, I would use characterization testing to help me make sure I’m preserving the original behavior of the background job code as I move it into its own class.
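As a sketch of what the extract class move might look like (the class and job names here are invented for illustration), the job shrinks to a thin shell and the behavior moves into a plain Ruby object that can be instantiated and tested directly:

```ruby
# The extracted class: plain Ruby, trivially testable in isolation.
class OverdueInvoiceNotifier
  def initialize(invoices)
    @invoices = invoices
  end

  def messages
    @invoices.select { |i| i[:overdue] }
             .map { |i| "Invoice #{i[:number]} is overdue" }
  end
end

# The job becomes a thin shell. (In a real app this would inherit from
# ApplicationJob; that detail is omitted so the sketch stands alone.)
class OverdueInvoiceJob
  def perform(invoices)
    OverdueInvoiceNotifier.new(invoices).messages.each { |m| deliver(m) }
  end

  def deliver(message)
    # send an email, post to Slack, etc.
  end
end

invoices = [{ number: 42, overdue: true }, { number: 43, overdue: false }]
notifier = OverdueInvoiceNotifier.new(invoices)
```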

A Rails testing “hello world” using RSpec and Capybara

The following is an excerpt from my book, The Complete Guide to Rails Testing.

What we’re going to do

One of the biggest challenges in getting started with testing is just that—the simple act of getting started. What I’d like is for you, dear reader, to get a small but meaningful “win” under your belt as early as possible. If you’ve never written a single test in a Rails application before, then by the end of this chapter, you will have written your first test.

Here’s what we’re going to do:

    • Initialize a plain-as-possible Rails application
    • Create exactly one static page that says “Hello, world!”
    • Write a single test (using RSpec and Capybara) that verifies that our static page in fact says “Hello, world!”

The goal here is just to walk through the motions of writing a test. The test itself is rather pointless. I would probably never write a test in real life that just verifies the content of a static page. But the goal is not to write a realistic test but to provide you with a mental “Lego brick” that you can combine with other mental Lego bricks later. The ins and outs of writing tests get complicated quickly enough that I think it’s valuable to start with an example that’s almost absurdly simple.

The tools we’ll be using

For this exercise we’ll be using RSpec, anecdotally the most popular of the Rails testing frameworks, and Capybara, a library that enables browser interaction (clicking buttons, filling out forms, etc.) using Ruby. You may find some of the RSpec or Capybara syntax confusing or indecipherable. You also may well not understand where the RSpec stops and the Capybara starts. For the purposes of this “hello world” exercise, that’s okay. Our goal right now is not deep understanding but to begin putting one foot in front of the other.

Initializing the application

Run the “rails new” command

First we’ll initialize this Rails app using the good old rails new command.

$ rails new hello_world -T

You may be familiar with the -T flag. This flag means “no tests”. If we had done rails new hello_world without the -T flag, we would have gotten a Rails application with a test directory containing MiniTest tests. I want to use RSpec, so I want us to start with no MiniTest.

Let’s also create our application’s database at this point since we’ll have to do that eventually anyway.

$ cd hello_world
$ rails db:create

Update our Gemfile

This is the step where RSpec and Capybara will start to come into the picture. Each library will be included in our application in the form of a gem. We’ll also include the webdrivers gem which is necessary in order for Capybara to interact with the browser.

Let’s add the following to our Gemfile under the :development, :test group. I’ve added a comment next to each gem or group of gems describing its role. Even with the comments, it may not be abundantly clear at this moment what each gem is for. At the end of the exercise we’ll take a step back and talk about which library enabled which step of what we just did.

group :development, :test do
  # The RSpec testing framework
  gem 'rspec-rails'

  # Capybara, the library that allows us to interact with the browser using Ruby
  gem 'capybara'

  # The following gem aids with the nuts and bolts
  # of interacting with the browser.
  gem 'webdrivers'
end

Don’t forget to bundle install.

Install RSpec

Although we’ve already installed the RSpec gem, we haven’t installed RSpec into our application. Just like the Devise gem, for example, which requires us not only to add devise to our Gemfile but to also run rails g devise:install, RSpec installation is a two-step process. After we run this command we’ll have a spec directory in our application containing a couple config files.

$ rails g rspec:install

Now that we’ve gotten the “plumbing” work out of the way, let’s write some actual application code.

Creating our static page

Let’s generate a controller, HelloWorldController, with just a single action, index.

$ rails g controller hello_world index

Let’s modify the action’s template so it just says “Hello, world!”.

<!-- app/views/hello_world/index.html.erb -->

Hello, world!

Just as a sanity check, let’s start the Rails server and open up our new page in the browser to make sure it works as expected.

$ rails server
$ open http://localhost:3000/hello_world/index

We have our test infrastructure. We have the feature we’re going to test. Now let’s write the test itself.

Writing our test

Write the actual test

Let’s create a file called spec/hello_world_spec.rb with the following content:

require 'rails_helper'

RSpec.describe 'Hello world', type: :system do
  describe 'index page' do
    it 'shows the right content' do
      visit hello_world_index_path
      expect(page).to have_content('Hello, world!')
    end
  end
end

If you focus on the “middle” two lines, the ones starting with visit and expect, you can see that this test takes the form of two steps: visit the hello world index path and verify that the page there says “Hello, world!” If you’d like to understand the test in more detail, below is an annotated version.

# This pulls in the config from spec/rails_helper.rb
# that's needed in order for the test to run.
require 'rails_helper'

# RSpec.describe, or just describe, is how all RSpec tests start.
# The 'Hello world' part is an arbitrary string and could have been anything.
# In this case we have something extra in our describe block, type: :system.
# The type: :system setting does have a functional purpose. It's what
# triggers RSpec to bring Capybara into the picture when we run the test.
RSpec.describe 'Hello world', type: :system do

  describe 'index page' do
    it 'shows the right content' do

      # This is where Capybara starts to come into the picture. "visit" is a
      # Capybara method. hello_world_index_path is just a Rails routing
      # helper method and has nothing to do with RSpec or Capybara.
      visit hello_world_index_path

      # The following is a mix of RSpec syntax and Capybara syntax. "expect"
      # and "to" are RSpec, "page" and "have_content" are Capybara. Newcomers
      # to RSpec and Capybara's English-sentence-like constructions often
      # have difficulty remembering when two words are separated by a dot or
      # an underscore or parenthesis, myself included. Don't worry, you'll
      # get familiar over time.
      expect(page).to have_content('Hello, world!')
    end
  end
end

Now that we’ve written and broken down our test, let’s run it!

Run the test

This test can be run by simply typing rspec followed by the path to the test file.

$ rspec spec/hello_world_spec.rb

The test should pass. There’s a problem with what we just did, though. How can we be sure that the test is testing what we think it’s testing? It’s possible that the test is passing because our code works. It’s also possible that the test is passing because we made a mistake in writing our test. This concern may seem remote but “false positives” happen a lot more than you might think. The only way to be sure our test is working is to see our test fail first.

Make the test fail

Why does seeing a test fail prior to passing give us confidence that the test works? What we’re doing is exercising two scenarios. In the first scenario, the application code is broken. In the second scenario, the application code is working correctly. If our test passes under that first scenario, the scenario where we know the application code is broken, then we know our test isn’t doing its job properly. A test should always fail when its feature is broken and always pass when its feature works. Let’s now break our “feature” by changing the text on our page to something wrong.

Jello, world!

When we run our test now, we should see it fail with an error message like expected to find text "Hello, world!" in "Jello, world!".

$ rspec spec/hello_world_spec.rb

Watch the test run in the browser

Add the following to the bottom of spec/rails_helper.rb:

Capybara.default_driver = :selenium_chrome

Then run the test again:

$ rspec spec/hello_world_spec.rb

This time, Capybara should pop open a browser where you can see our test run. When I’m actively working with a test and I want to see exactly what it’s doing, I often find the running of the test to be too fast for me, so I’ll temporarily add a sleep wherever I want my test to slow down. In this case I would put the sleep right after visiting hello_world_index_path.

require 'rails_helper'

RSpec.describe 'Hello world', type: :system do
  describe 'index page' do
    it 'shows the right content' do
      visit hello_world_index_path
      sleep(5)
      expect(page).to have_content('Hello, world!')
    end
  end
end

Review

There we have it: the simplest possible Rails + RSpec + Capybara test. Unfortunately “the simplest possible Rails + RSpec + Capybara test” is still not particularly simple in absolute terms, but it’s pretty simple compared to everything that goes on in a real production application.

Who’s to blame for bad code? Coders.

When time pressure from managers drives coders to write poor-quality code, some people think the responsible party is the managers. I say it’s the coders. Here’s why I think so.

The ubiquity of time pressure in software development work

I’ve worked on a lot of development projects with a lot of development teams. In the vast majority of cases, if not all cases, the development team is perceived as being “behind”. Sometimes the development team’s progress is slower than the timeline the team itself set. Sometimes development progress is seen as behind relative to some arbitrary, unrealistic deadline set by someone outside the development team. Either way, no one ever comes by and says to the development team, “Wow, you guys are way ahead of where we expected!”

Often leadership will beg, plead, cajole, flatter, guilt-trip or bribe in order to increase the chances that their timeline will be met. They’ll ask the developers things like, “How can we ‘get creative’ to meet this timeline?” They’ll offer to have lunch, dinner, soda and alcohol brought in for a stretch of time so the developers can focus on nothing but the coding task at hand. Some leaders, lacking a sense of short-term/long-term thinking, will ask the developers to cut corners, to consciously incur technical debt in order to work faster. They might say things like “users don’t care about code” or “perfect is the enemy of the good”.

Caving to the pressure

Unfortunately, in my experience, most developers, most of the time, cave to the pressure to cut corners and deliver poor quality work. If you think I’m mistaken, I would ask you to show me a high-quality production codebase. There are very few in existence.

I myself am by no means innocent. I’ve caved to the pressure and delivered poor-quality work. But that doesn’t mean I think it’s excusable to do poor work or even (as some developers seem to believe) heroic.

Why bad code is the fault of coders, not managers

Why are coders, not managers, the ones responsible for bad code? Because coders are the ones who wrote the code. It was their fingers on the keyboard. The managers weren’t holding a gun to the developers’ heads and demanding they do a bad job.

“But,” a developer might say, “I had to cut corners in order to meet the deadline, and if I didn’t meet the deadline, I might get fired.” I don’t believe this reasoning stands up very well to scrutiny. I’d like to examine this invalid excuse for doing poor work as well as some others.

Invalid excuses for shipping poor work

Fallacy: cutting corners is faster

Why exactly is “bad code” bad? Bad code is bad precisely because it’s slow and expensive to work with. Saying, “I know this isn’t the right way to do it but it was faster” is a self-contradiction. The exact reason “the right way” is the right way is because it’s faster. Faster in the long run, of course. Sloppy, rushed work can be quicker in the very short term, but any speed gain is usually negated the next time someone has to work with that area of code, which could be as soon as later that same day.

Fallacy: “they didn’t give me enough time to do a good job”

If I’m not given enough time to do a good job, the answer is not to do a bad job. The answer is that that thing can’t be done in that amount of time.

If my boss comes to me and says, “Can we do X in 6 weeks” and my response is “It would take 12 weeks to do X, but if we cut a lot of corners we could do it in 6 weeks” then all my boss hears is “WE CAN DO IT IN 6 WEEKS!”

Given the option between hitting a deadline with “bad code” and missing a deadline, managers will roughly always choose hitting the deadline with bad code. In this way managers are like children who ask over and over to eat candy for dinner. They can’t be trusted to take the pros and cons and make the right decision.

Fallacy: cutting corners is a business decision

It’s my boss’s role to tell me what to do and when to do it. It’s my job to, for the most part, comply. As a programmer I typically won’t have visibility into business decisions the way my boss does. As a programmer I will often not have the right skillset or level of information necessary to have a valid opinion on most business matters.

This idea that my boss can tell me what to do only applies down to a certain level of granularity though. I should let my boss tell me what to do but I shouldn’t let my boss tell me how to do it. I care about the professional reputation I have with my colleagues and with myself. I think there’s a certain minimum level of quality below which professionals should not be willing to go, even though it’s always an option to go lower.

Fallacy: “we can’t afford perfection”

I wish I had a dollar for every time I heard somebody say something like, “Guys, it doesn’t need to be perfect, it just needs to be good enough.”

The fallacy here is that the choice is between “perfect” and “good”. That’s a mistaken way of looking at it. The standard mode of operation during a rush period for developers is to pile sloppy Band-Aid fix on top of sloppy Band-Aid fix as if they desire nothing more strongly in life than to paint themselves into a corner from which they will never be able to escape. The alternative to this horror show isn’t “perfection”. The alternative is just what I might call a “minimally good job”.

Fallacy: “if I don’t comply with instructions to cut corners, I’ll get fired”

This might be true a very very very small percentage of the time. Most of the time it’s not, though. It’s pretty hard to get fired from a job. Firing someone is painful and expensive. Usually someone has to rack up many offenses before they finally get booted.

Fallacy: “we’ll clean it up after the deadline”

Fixing sloppy work after the deadline “when we have more time” or “when things slow down” never happens. Deadlines, such as a go-live date, often translate to increased pressure and time scarcity because now there’s more traffic and visibility on the system. Deadlines are usually followed by more deadlines. You never get “more time”. Pressure rarely goes down over time. It usually goes up.

Fallacy: “I would have done a better job, but it wasn’t up to me”

Nope. I alone am responsible for the work I do. If someone pressured me to do the wrong thing and in response I chose to do the wrong thing, then I’m responsible for choosing to cave to the pressure.

Fallacy: “I haven’t been empowered to make things better”

It’s rare that a superior will come along and tell me, “Jason, you explicitly have permission to make things better around here.” The reality is that it’s up to me to observe what needs to be done better and help make it happen. I’m a believer in the saying that “it’s easier to get forgiveness than permission”. I tend to do what my professional judgment tells me needs to be done without necessarily asking my boss if it’s okay. Does this get me in trouble sometimes? Yes. I look at those hand-slaps as the cost of self-empowerment.

Plus, all employees have power over their superiors in that they have the power to leave. I don’t mean this in a “let me have my way or I quit!” way. I mean that if I work somewhere that continually tries to force me to do a bad job, I don’t have to just accept it as my fate in life. I can go work somewhere else. That doesn’t fix anything for the organization I’m leaving (bad organizations and bad people can’t be fixed anyway) but it does fix the problem for me.

My admonition to my colleagues: don’t make excuses for poor work

I think developers make too many excuses and apologies for bad code. (And as a reminder, “bad code” means code that’s risky, time-consuming and expensive to work with.)

It’s not realistic to do a great job all the time, but I think we can at least do a little better. Right now I think we as a profession err too much on the side of justifying poor-quality work. I think we could stand to err more on the side of taking responsibility.

Sometimes it’s better for tests to hit the real API

A mysteriously broken test suite

The other day I ran my test suite and discovered that nine of my tests were suddenly and mysteriously broken. These tests were all related to the same feature: a physician search field that hits something called the NPPES NPI Registry, a public API.

My first thought was: is something wrong with the API? I visited the NPI Registry site and, sure enough, I got a 500 error when I clicked the link to their API URL. The API appeared to be broken. This was a Sunday so I didn’t necessarily expect the NPI API maintainers to fix it speedily. I decided to just ignore these nine specific tests for the time being.

By mid-Monday, to my surprise, my tests were still broken and the NPI API was still giving a 500 error. I would have expected them to have fixed it by then.

The surprising root cause of the bug

On Tuesday the API was still broken. On Wednesday the API was still broken. At this point I thought okay, maybe it’s not the case that the NPI API has just been down for four straight days with the NPI API maintainers not doing anything about it. Maybe the problem lies on my end. I decided to do some investigation.

Sure enough, the problem was on my end. Kind of. Evidently the NPI API maintainers had made a small tweak to how the API works. In addition to the fields I had already been sending like first_name and last_name, the NPI API now required an additional field, address_purpose. Without address_purpose, the NPI API would return a 500 error. I added address_purpose to my request and all my tests started passing again. And, of course, my application’s feature started working again.

The lesson: my tests were right to hit the real API

When I first discovered that nine of my tests were failing due to a broken external API, my first thought was, “Man, maybe I should mock out those API calls so my tests don’t fail when the API breaks.” But then I thought about it a little harder. The fact that the third-party API was broken meant my application was broken. The tests were failing because they ought to have been failing.

It’s been my observation that developers new to testing often develop the belief that “third-party API calls should be mocked out”. I don’t think this is always true. I think the decision whether to mock third-party APIs should be made on a case-by-case basis depending on certain factors. These factors include, but aren’t limited to:

  • Would allowing my tests to hit the real API mess up production data?
  • Is the API rate-limited?
  • Is the API prone to sluggishness or outages?

(Thanks to Kostis Kapelonis, who I interviewed on The Ruby Testing Podcast, for helping me think about API interaction this way.)

If hitting the real API poses no danger to production data, I would probably be more inclined to let my tests hit the real API than to mock out the responses. (Or, to be more precise, I would want my test suite to include some integration tests that hit the real API, even if I might have some other API-related tests that mock out API responses for economy of speed.) My application is only really working if the API is working too.

Refactorings should be atomic

It’s often seen as wise not to try to make huge, sweeping improvements to a codebase all in one go. I couldn’t agree more. Such changes carry way too much risk.

An alternative way of approaching codebase improvements is to apply the Boy Scout rule, “always leave the campground a little cleaner than you found it”. I think this rule is a pretty good one and appropriate under some circumstances, although I think some caution is in order as to how exactly to interpret the rule. The main danger of this rule is that if interpreted the wrong way, it discourages atomicity.

Tiny atomic refactorings

I have a mantra I repeat to myself regularly: “never underestimate your own ability to screw stuff up”. Seemingly safe, innocuous, inconsequential one-line changes have an amazing ability to introduce surprising bugs.

For this reason, I virtually never mix refactorings or cleanups in with my feature work or bugfixes, even very tiny ones, unless the thing I’m cleaning up is directly in the path of the work I need to do to complete my feature or bugfix. Let’s say for example that I’m adding a new capability to the sales report. I don’t want to toss in a small cleanup and accidentally break the password reset feature.

If I discover a tiny improvement I want to make, e.g. replacing a single instance of appointments.map { |a| a.user } to appointments.map(&:user), I’ll make a separate commit just with that tiny improvement instead of letting it become a “rider” on an unrelated commit. This is basically just an application of the idea of atomic commits.
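Incidentally, the two forms are behaviorally identical; `&:user` relies on Symbol#to_proc to call `.user` on each element. A quick self-contained check, using stand-in structs for illustration:

```ruby
# User and Appointment here are stand-ins for the real models.
User = Struct.new(:name)
Appointment = Struct.new(:user)

appointments = [
  Appointment.new(User.new("Ann")),
  Appointment.new(User.new("Bo"))
]

long_form  = appointments.map { |a| a.user }
short_form = appointments.map(&:user)

long_form == short_form # => true
```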

Bigger atomic refactorings

Sometimes a refactoring isn’t all that small. For example, sometimes you want to take a handful of classes and move them into a new namespace to give some structure to the files in your application.

In these cases I’ll try to create one single commit that effects the entire refactoring. This way, if I get 75% done with the refactoring and realize my whole premise is a blunder, I can just git checkout . && git clean -df and cleanly go back to the way things were before I started my refactoring work.

When to perform refactorings

In my experience there are two good times to perform a refactoring. If there’s an area of the code that I need to do some work in but it’s messy, I’ll perform a refactoring (or multiple refactorings) in that area of code as a separate piece of work before I start making my change in that area. I think of this as “cleaning the kitchen before I make dinner”. It’s faster and more enjoyable to make dinner in a clean kitchen than in a messy one.

One good time to perform a refactoring is before a change. The other is after a change. If I make a change to an area of code and I find that afterward that area is unacceptably messy, I’ll perform a refactoring, again as a separate piece of work, to clean it up. By the way, it’s possible to end up with a messy area of a codebase by simply writing poor-quality code, but that’s not the only way to introduce bad code. The accumulation of technical debt is natural and unavoidable; in fact, the only way not to accumulate technical debt is to periodically perform refactorings.

The one time that I don’t think is a good time to perform refactorings is in the middle of a change. If the refactoring introduces a bug but the refactoring change is mixed with a change that you want to keep, it can be hard to back out the refactoring change without also affecting the “good” change. It’s also harder to review a pull request that contains both a change and a refactoring than it is to review a pull request that only contains a change. Keeping refactorings in separate pull requests does mean that the development team (and perhaps technical leadership) will have to be on board with the idea that you’ll regularly have PRs that contain nothing but a refactoring.

By the way, credit goes to Ben Orenstein (who I interviewed on my podcast) for teaching me that just before a change and just after a change are good times to do refactorings.

When refactorings can’t be atomic

Sometimes a refactoring is too big to be atomic. Sometimes I embark upon a “refactoring project” that will touch many areas of the codebase over the course of multiple weeks or months. In these cases I deliberately spread the refactoring project into small pieces over time as a way of mitigating risk. I’ve found though that most big refactorings, while not atomic as a whole, can be split into a number of smaller refactorings that are themselves individually atomic.

How I write characterization tests

What characterization tests are and what problem they solve

Let’s say I come across a piece of code that I want to refactor but unfortunately a) I don’t understand what it does and b) it’s not covered by tests. Refactoring untested code I don’t understand is risky business. But how can I write tests for the code if I don’t even know what it’s supposed to do?

Luckily there’s a technique called Characterization Testing, which I first discovered in Michael Feathers’ book Working Effectively with Legacy Code, and which makes solving this challenge pretty straightforward. I’m going to show you how I do it.

What I’m going to show you

Below is a chunk of code I found in an old project of mine. The two methods below, services_with_info and products_with_info, have a couple problems. One is that they’re very duplicative of each other. Another is that the Appointment class, which clocks in at 708 lines of code (most of which I’ve omitted out of mercy for the reader), is doing way too much stuff, and these two methods aren’t helping. These two methods probably belong in some other abstraction so Appointment can focus on being an Appointment and not have to worry about serializing its appointment_services and appointment_products.

class Appointment < ActiveRecord::Base
  def services_with_info
    appointment_services.collect { |item|
      item.serializable_hash.merge(
        "price" => number_with_precision(item.price, :precision => 2),
        "label" => item.service.name,
        "item_id" => item.service.id,
        "type" => "service"
      )
    }
  end

  def products_with_info
    appointment_products.collect { |item|
      item.serializable_hash.merge(
        "price" => number_with_precision(item.price, :precision => 2),
        "label" => item.product.name,
        "item_id" => item.product.id,
        "type" => "product"
      )
    }
  end
end

Refactoring the services_with_info method

Of these two methods I’m going to focus first on services_with_info. My goal is to take the body of the method, ask myself “What currently-nonexistent abstraction does this group of lines represent?”, create that abstraction, and move the body of the method there.

I might come up with an abstraction that I’m happy with on the first try, I might not. The first abstraction idea I’ll try is a new abstraction called AppointmentServiceCollection. Below is the class. For the most part all I’ve done is copy and paste the contents of services_with_info into this new class.

class AppointmentServiceCollection
  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.to_h { |item|
      item.serializable_hash.merge(
        "price" => number_with_precision(item.price, :precision => 2),
        "label" => item.service.name,
        "item_id" => item.service.id,
        "type" => "service"
      )
    }
  end
end

First goal: simply instantiate the object inside a test

When I start writing my test for this new class I’m going to put in as little effort as possible. My first question to myself will be: can I even instantiate an AppointmentServiceCollection?

require 'spec_helper'

describe AppointmentServiceCollection do
  describe '#to_h' do
    it 'works' do
      appointment_service = create(:appointment_service)
      collection = AppointmentServiceCollection.new([appointment_service])
    end
  end
end

Second goal: see what value the method-under-test returns

When I run the test it passes. So far so good. My next step will be to add a very silly assertion.

require 'spec_helper'

describe AppointmentServiceCollection do
  describe '#to_h' do
    it 'works' do
      appointment_service = create(:appointment_service)
      collection = AppointmentServiceCollection.new([appointment_service])
      expect(collection.to_h).to eq('asdf')
    end
  end
end

I of course know that collection.to_h won’t equal asdf, but I don’t know what it will equal, so any value I check for will be equally wrong. No point in exerting any mental effort guessing what the output will be. The test result will tell me soon enough anyway.

When I run my test now, here’s what I get:

F

Failures:

  1) AppointmentServiceCollection#to_h works
     Failure/Error: expect(collection.to_h).to eq('asdf')
     TypeError:
       wrong element type AppointmentService at 0 (expected array)
     # ./app/models/appointment_service_collection.rb:7:in `to_h'
     # ./app/models/appointment_service_collection.rb:7:in `to_h'
     # ./spec/models/appointment_service_collection_spec.rb:8:in `block (3 levels) in <top (required)>'

Finished in 0.50943 seconds (files took 3.44 seconds to load)
1 example, 1 failure

That’s unexpected. I could scratch my head and puzzle over why this is happening, but instead I’ll just comment out code until the error goes away.

class AppointmentServiceCollection
  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.to_h { |item|
      #item.serializable_hash.merge(
        #"price" => number_with_precision(item.price, :precision => 2),
        #"label" => item.service.name,
        #"item_id" => item.service.id,
        #"type" => "service"
      #)
    }
  end
end

When I run the test again I get the same error again. I’ll comment out more code.

class AppointmentServiceCollection
  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    #@appointment_services.to_h { |item|
      #item.serializable_hash.merge(
        #"price" => number_with_precision(item.price, :precision => 2),
        #"label" => item.service.name,
        #"item_id" => item.service.id,
        #"type" => "service"
      #)
    #}
  end
end

Now, not too surprisingly, I get a different error instead:

F

Failures:

  1) AppointmentServiceCollection#to_h works
     Failure/Error: expect(collection.to_h).to eq('asdf')
       
       expected: "asdf"
            got: nil
       
       (compared using ==)
     # ./spec/models/appointment_service_collection_spec.rb:8:in `block (3 levels) in <top (required)>'

Finished in 0.49608 seconds (files took 3.41 seconds to load)
1 example, 1 failure

This makes a lightbulb go on for me. Clearly the problem lies in the @appointment_services.to_h { |item| line. The problem must be the to_h. What if I try map instead?

class AppointmentServiceCollection
  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.map { |item|
      #item.serializable_hash.merge(
        #"price" => number_with_precision(item.price, :precision => 2),
        #"label" => item.service.name,
        #"item_id" => item.service.id,
        #"type" => "service"
      #)
    }
  end
end

That works. Instead of the “wrong element type” error I was getting before, I’m now getting a different error that’s more in line with my expectations:

F

Failures:

  1) AppointmentServiceCollection#to_h works
     Failure/Error: expect(collection.to_h).to eq('asdf')
       
       expected: "asdf"
            got: [nil]
       
       (compared using ==)
       
       Diff:
       @@ -1,2 +1,2 @@
       -"asdf"
       +[nil]
       
     # ./spec/models/appointment_service_collection_spec.rb:8:in `block (3 levels) in <top (required)>'

Finished in 0.48778 seconds (files took 3.5 seconds to load)
1 example, 1 failure

Now let me uncomment the full body of the to_h method and see what happens.

class AppointmentServiceCollection
  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.map { |item|
      item.serializable_hash.merge(
        "price" => number_with_precision(item.price, :precision => 2),
        "label" => item.service.name,
        "item_id" => item.service.id,
        "type" => "service"
      )
    }
  end
end

Now I get a different error. It doesn’t like my use of number_with_precision.

F

Failures:

  1) AppointmentServiceCollection#to_h works
     Failure/Error: expect(collection.to_h).to eq('asdf')
     NoMethodError:
       undefined method `number_with_precision' for #<AppointmentServiceCollection:0x007ff60274e640>
     # ./app/models/appointment_service_collection.rb:9:in `block in to_h'
     # ./app/models/appointment_service_collection.rb:7:in `map'
     # ./app/models/appointment_service_collection.rb:7:in `to_h'
     # ./spec/models/appointment_service_collection_spec.rb:8:in `block (3 levels) in <top (required)>'

Finished in 0.52793 seconds (files took 3.32 seconds to load)
1 example, 1 failure

Luckily I happen to know exactly how to fix this. In order to use the number_with_precision helper in a model (as opposed to in a view), you need to include the ActionView::Helpers::NumberHelper module.

class AppointmentServiceCollection
  include ActionView::Helpers::NumberHelper

  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.map { |item|
      item.serializable_hash.merge(
        "price" => number_with_precision(item.price, :precision => 2),
        "label" => item.service.name,
        "item_id" => item.service.id,
        "type" => "service"
      )
    }
  end
end

The return values, revealed

With this error fixed, I now get my most interesting test result yet:

F                                              

Failures:                                      

  1) AppointmentServiceCollection#to_h works   
     Failure/Error: expect(collection.to_h).to eq('asdf')                                     
                                               
       expected: "asdf"                        
            got: [{"id"=>1, "appointment_id"=>1, "service_id"=>1, "created_at"=>Sun, 24 Feb 2019 16:45:16 EST -05:00, "updated_at"=>Sun, 24 Feb 2019 16:45:16 EST -05:00, "length"=>nil, "stylist_id"=>2, "price"=>"0.00", "label"=>"ve0xttqqfl", "item_id"=>1, "type"=>"service"}]        
                                               
       (compared using ==)                     
                                               
       Diff:                                   
       @@ -1,2 +1,12 @@                        
       -"asdf"                                 
       +[{"id"=>1,                             
       +  "appointment_id"=>1,                 
       +  "service_id"=>1,                     
       +  "created_at"=>Sun, 24 Feb 2019 16:45:16 EST -05:00,                                 
       +  "updated_at"=>Sun, 24 Feb 2019 16:45:16 EST -05:00,                                 
       +  "length"=>nil,                       
       +  "stylist_id"=>2,                     
       +  "price"=>"0.00",                     
       +  "label"=>"ve0xttqqfl",               
       +  "item_id"=>1,                        
       +  "type"=>"service"}]                  
                                               
     # ./spec/models/appointment_service_collection_spec.rb:8:in `block (3 levels) in <top (required)>'                                                                                      

Finished in 0.72959 seconds (files took 3.24 seconds to load)                                 
1 example, 1 failure

The reason I say this result is interesting is because instead of an error I get the actual value that the method is returning. The secrets of the universe are being revealed to me.

More realistic assertions

At this point I’m able to come up with some more realistic values to put in my test. I know that my test will fail because my values are just arbitrary and made up, but I feel confident that my values are pretty close.

require 'spec_helper'

describe AppointmentServiceCollection do
  describe '#to_h' do
    it 'works' do
      appointment_service = create(:appointment_service)
      collection = AppointmentServiceCollection.new([appointment_service])

      item = collection.to_h.first
      expect(item['price']).to eq('30.00')
      expect(item['label']).to eq('Mens Haircut')
      expect(item['item_id']).to eq(appointment_service.service.id)
      expect(item['type']).to eq('service')
    end
  end
end

I run my test just for fun and it of course fails. My next step is to make the test setup match my assertion values.

require 'spec_helper'

describe AppointmentServiceCollection do
  describe '#to_h' do
    it 'works' do
      appointment_service = create(
        :appointment_service,
        price: 30,
        service: create(:service, name: 'Mens Haircut')
      )

      collection = AppointmentServiceCollection.new([appointment_service])

      item = collection.to_h.first
      expect(item['price']).to eq('30.00')
      expect(item['label']).to eq('Mens Haircut')
      expect(item['item_id']).to eq(appointment_service.service.id)
      expect(item['type']).to eq('service')
    end
  end
end

What I do after I get the test passing

The test now passes. So far I’ve touched my application code as little as possible because I wanted to maximize the chances of preserving the original functionality. Now that I have a test in place, my test can do the job of preserving the original functionality, and I’m free to clean up and refactor the code.

class AppointmentServiceCollection
  include ActionView::Helpers::NumberHelper
  ITEM_TYPE = 'service'

  def initialize(appointment_services)
    @appointment_services = appointment_services
  end

  def to_h
    @appointment_services.map do |appointment_service|
      appointment_service.serializable_hash.merge(extra_attributes(appointment_service))
    end
  end

  private

  def extra_attributes(appointment_service)
    {
      'price'   => number_with_precision(appointment_service.price, precision: 2),
      'label'   => appointment_service.service.name,
      'item_id' => appointment_service.service.id,
      'type'    => ITEM_TYPE
    }
  end
end

I’ll also want to clean up my test code.

require 'spec_helper'

describe AppointmentServiceCollection do
  describe '#to_h' do
    let(:appointment_service) do
      create(
        :appointment_service,
        price: 30,
        service: create(:service, name: 'Mens Haircut')
      )
    end

    let(:item) { AppointmentServiceCollection.new([appointment_service]).to_h.first }

    it 'adds the right attributes' do
      expect(item['price']).to eq('30.00')
      expect(item['label']).to eq('Mens Haircut')
      expect(item['item_id']).to eq(appointment_service.service.id)
      expect(item['type']).to eq('service')
    end
  end
end

Lastly, I’ll replace the original services_with_info method body with my new abstraction. Here’s the original code.

def services_with_info
  appointment_services.collect { |item|
    item.serializable_hash.merge(
      "price" => number_with_precision(item.price, :precision => 2),
      "label" => item.service.name,
      "item_id" => item.service.id,
      "type" => "service"
    )
  }
end

Here’s the new version which uses AppointmentServiceCollection.

def services_with_info
  AppointmentServiceCollection.new(appointment_services).to_h
end

What we’ve accomplished

This is only a drop in the bucket but it helps. “A journey of a thousand miles begins with a single step,” as they say. This area of the code is now slightly more understandable. Next I could give the products_with_info method the same treatment. Perhaps I could extract the duplication between the two methods into a superclass. If I apply this tactic to the Appointment class over and over, I could probably whittle it down from its current 708 lines to a more reasonable size of perhaps a few dozen lines.
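As a sketch of what that superclass extraction might look like (simplified and hypothetical; `Item` here is a stand-in for an ActiveRecord model that responds to `serializable_hash`, and I've dropped the price formatting for brevity):

```ruby
# The shared shape lives in a base class; subclasses supply only
# what differs between services and products.
class ItemCollection
  def initialize(items)
    @items = items
  end

  def to_h
    @items.map { |item| item.serializable_hash.merge(extra_attributes(item)) }
  end
end

class ServiceCollection < ItemCollection
  private

  def extra_attributes(item)
    { "label" => item.name, "type" => "service" }
  end
end

# Stand-in for an ActiveRecord model, for illustration only.
Item = Struct.new(:id, :name) do
  def serializable_hash
    { "id" => id }
  end
end

result = ServiceCollection.new([Item.new(1, "Mens Haircut")]).to_h
# => [{"id"=>1, "label"=>"Mens Haircut", "type"=>"service"}]
```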

How to clear up obscure Rails tests using Page Objects

The challenge of keeping test code clean

The hardest part of a programmer’s job isn’t usually figuring out super hard technical problems. The biggest challenge for most developers, in my experience, is to write code that can stand up over time without collapsing under the weight of its own complexity.

Just as it’s challenging to keep a clean and understandable codebase, it’s also challenging to keep a clean and understandable test suite, and for the same exact reasons.

If I look at a test file for the first time and I’m immediately able to grasp what the test is doing and why it’s doing it, then I have a clear test. The test has a high signal-to-noise ratio. That’s good.

The opposite scenario is when the test is full of noise. Perhaps the test contains so many low-level details that the high-level meaning of the test is obscured. The term for this condition, this “test smell”, is Obscure Test. That’s bad. If the test code is obscure when it could have been clear, that creates extra head-scratching time.

One tool that can be used to help make Obscure Tests more understandable is the concept of a Page Object.

What a Page Object is and when it’s helpful

As a preface: Page Objects are only relevant when dealing with integration tests—that is, tests that interact with the browser. Page Objects don’t come into the picture for model tests or anything else that doesn’t interact with a browser.

A Page Object is an abstraction of a component of a web page. For example, I might create a Page Object that represents the sign-in form for a web application. (In fact, virtually all my Page Objects represent forms, since that’s where I find them to be most helpful.)

The idea is that instead of using low-level, “computerey” commands to interact with a part of a page, a Page Object allows me to use higher-level language that’s more easily understandable by humans. Just as Capybara is an abstraction on top of Selenium that’s clearer for a programmer to read and write than using Selenium directly, Page Objects can be an abstraction layer on top of Capybara commands that’s clearer for a programmer to read and write than using Capybara directly (at least that’s arguably the case, and only in certain situations, and when done intelligently).

Last note on what a Page Object is and isn’t: I personally started with the misconception that a Page Object is an abstraction of a page. According to Martin Fowler, that’s not the idea. A Page Object is an abstraction of a certain part of a page. There’s of course no law saying you couldn’t create an abstraction that represents a whole page instead of a component on a page, but having done it both ways, I’ve found it more useful to create Page Objects as abstractions of components rather than of entire pages. I find that my Page Objects come together more neatly with the rest of my test code when that’s the case.

A Page Object example

I’m going to show you an example of a Page Object by showing you the following:

  1. A somewhat obscure test
  2. A second, cleaner version of the same test, using a Page Object
  3. The Page Object code that enables the second version of the test

Here’s the obscure test example. What exactly it does is not particularly important. Just observe how hard the test code is to understand.

The obscure test

RSpec.describe "Grocery list", type: :system do
  let!(:banana) { create(:grocery_item) }

  describe "deleting a grocery item" do
    it "removes the item" do
      within ".grocery-item-#{banana.id}" do
        click_on "Delete"
      end

      expect(page).not_to have_content(banana.name)
    end
  end
end

This test is okay but it’s a little bit noisy. We shouldn’t have to care about the DOM details of how the Delete button for this grocery item is located.

Let’s see if we can improve this code with the help of a Page Object.

RSpec.describe "Grocery list", type: :system do
  let!(:banana) { create(:grocery_item) }
  let!(:grocery_list) { PageObjects::GroceryList.new }

  describe "deleting a grocery item" do
    it "removes the item" do
      grocery_list.click_delete(banana)
      expect(grocery_list).not_to have_item(banana)
    end
  end
end

Here’s the Page Object code that made this cleanup possible:

module PageObjects
  class GroceryList
    include Capybara::DSL

    def click_delete(item)
      within ".grocery-item-#{item.id}" do
        click_on "Delete"
      end
    end

    def has_item?(item)
      page.has_content?(item.name)
    end
  end
end
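One detail worth noting: `expect(grocery_list).not_to have_item(banana)` works because RSpec dynamically converts a `have_something` matcher into a call to the subject's `has_something?` predicate method. Here's a stripped-down illustration of that convention, not RSpec's actual implementation:

```ruby
# Sketch of RSpec's dynamic predicate matchers: have_item(x) becomes
# a call to has_item?(x) on the subject.
def satisfies_have_matcher?(subject, matcher_name, *args)
  predicate = matcher_name.to_s.sub(/\Ahave_/, "has_") + "?"
  subject.public_send(predicate, *args)
end

# A simplified stand-in for the Page Object above.
class GroceryList
  def initialize(items)
    @items = items
  end

  def has_item?(item)
    @items.include?(item)
  end
end

list = GroceryList.new(["banana"])
satisfies_have_matcher?(list, :have_item, "banana") # => true
```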

I find that Page Objects only start to become valuable beyond a certain threshold of complexity. I tend to see how far I can get with a test before I consider bringing a Page Object into the picture. It might be several weeks or months between the time I first write a test and the time I modify it to use a Page Object. But in the cases where a Page Object is useful, I find them to be a huge help in adding clarity to my tests.