Category Archives: Programming

RSpec mocks and stubs in plain English

One of the most common questions I see from beginners to Rails testing is what mocks and stubs are and when to use them. If you’re confused about mocks and stubs, you’re not alone. In my experience very few people understand them. In this post I’ll attempt to help clarify the matter, particularly in the context of Rails/RSpec.

I want to preface my explanation with a disclaimer. This post is partly derived from personal experience but it’s mostly derived from me just reading a bunch of posts online and reading a few books to try to understand mocks and stubs better. It’s entirely possible that what follows is not 100% accurate. But I’m risking the inaccuracy because I think the internet needs a clear, simple explanation of mocks and stubs and so far I haven’t been able to find one.

Here’s what we’re going to cover in this post:

The problem with testing terminology

The field of automated testing has about a million different terms and there’s not a complete consensus on what they all mean. For example, are end-to-end tests and acceptance tests the same thing? Some people would say yes, some would say no, and there’s no central authority to say who’s right and who’s wrong. That’s a problem with testing terminology in general.

A problem with mocks and stubs in particular is that programmers are often sloppy with the language. People say mock when they mean stub and vice versa.

The result of these two issues is that many explanations of mocks and stubs are very very very confusing.

Test doubles

I had a lightbulb moment when I read in Gerard Meszaros’ xUnit Test Patterns that mocks and stubs are each special types of test doubles.

To me this was a valuable piece of truth. Mocks and stubs are both types of test doubles. I can understand that. That’s a piece of knowledge that’s not likely to be invalidated by something else I’ll read later. My understanding has permanently advanced a little bit.

What’s a test double? The book xUnit Test Patterns (which I understand actually coined the term “test double”) likens a test double to a Hollywood stunt double. From the book’s online explanation of test doubles:

When the movie industry wants to film something that is potentially risky or dangerous for the leading actor to carry out, they hire a “stunt double” to take the place of the actor in the scene. The stunt double is a highly trained individual who is capable of meeting the specific requirements of the scene. They may not be able to act, but they know how to fall from great heights, crash a car, or whatever the scene calls for. How closely the stunt double needs to resemble the actor depends on the nature of the scene. Usually, things can be arranged such that someone who vaguely resembles the actor in stature can take their place.

The example I use later in this post is the example of a payment gateway. When we’re testing some code that interacts with a payment gateway, we of course don’t want our test code to actually hit e.g. the production Stripe API and charge people’s credit cards. We want to use a test double in place of the real payment gateway. The production Stripe API is like Tom Cruise in Mission Impossible, the payment gateway test double is of course like the stunt double.

Stub explanation

The best explanation of mocks and stubs I’ve been able to find online is this post by a guy named Michal Lipski.

I took his explanation, mixed it in my brain with other stuff I’ve read, and came up with this explanation:

A Test Stub is a fake object that’s used in place of a real object for the purpose of getting the program to behave the way we need it to in order to test it. A big part of a Test Stub’s job is to return pre-specified hard-coded values in response to method calls.

You can visit the xUnit Patterns Test Stub page for a more detailed and precise explanation. My explanation might not be 100% on the mark. I’m going for clarity over complete precision.

When would you want to use a test stub? We’ll see in my example code shortly.

Mock explanation

A Mock Object is a fake object that’s used in place of a real object for the purpose of listening to the methods called on the mock object. The main job of a Mock Object is to ensure that the right methods get called on it.

Again, for a more precise (but harder to understand) explanation, you can check out the xUnit Patterns Mock Object page.

We’ll also see a mock object use case in my example code.

The difference between mocks and stubs

As I understand it, and to paint with a very broad brush, Test Stubs help with inputs and Mock Objects help with outputs. A Test Stub is a fake thing you stick in there to trick your program into working properly under test. A Mock Object is a fake thing you stick in there to spy on your program in the cases where you’re not able to test something directly.

Again, I’m going for conciseness and clarity over 100% accuracy here. Now let’s take a look at a concrete example.

Example application code

Below is an example Ruby program I wrote. I tried to write it to meet the following conditions:

  • It’s as small and simple as possible
  • It would actually benefit from the use of one mock and one stub

My program involves three classes:

  • Payment: meant to simulate an ActiveRecord model
  • PaymentGateway: simulates a third-party payment gateway (e.g. Stripe)
  • Logger: logs payments

The code snippet below includes the program’s three classes as well as a single test case for the Payment class.

class Payment
  attr_accessor :total_cents

  def initialize(payment_gateway, logger)
    @payment_gateway = payment_gateway
    @logger = logger
  end

  def save
    response = @payment_gateway.charge(total_cents)
    @logger.record_payment(response[:payment_id])
  end
end

class PaymentGateway
  def charge(total_cents)
    puts "THIS HITS THE PRODUCTION API AND ALTERS PRODUCTION DATA. THAT'S BAD!"

    { payment_id: rand(1000) }
  end
end

class Logger
  def record_payment(payment_id)
    puts "Payment id: #{payment_id}"
  end
end

describe Payment do
  it 'records the payment' do
    payment_gateway = PaymentGateway.new
    logger = Logger.new

    payment = Payment.new(payment_gateway, logger)
    payment.total_cents = 1800
    payment.save
  end
end

Our test has two problems.

One is that it hits the real production API and alters production data. (My code doesn’t really do this; you can pretend that my puts statement hits a real payment gateway and charges somebody’s credit card.)

The other problem is that the test doesn’t verify that the payment gets logged. We could comment out the @logger.record_payment(response[:payment_id]) line and the test would still pass.

This is what we see when we run the test:

THIS HITS THE PRODUCTION API AND ALTERS PRODUCTION DATA. THAT'S BAD!
Payment id: 302
.

Finished in 0.00255 seconds (files took 0.09594 seconds to load)
1 example, 0 failures

Let’s first address the problem of altering production data. We can do this by replacing PaymentGateway with a special kind of test double, a stub.

Stub example

Below I’ve replaced payment_gateway = PaymentGateway.new with payment_gateway = double(). I’m also telling my new Test Double object (that is, my Test Stub) that it should expect to receive a charge method call, and when it does, return a payment id of 1234.

class Payment
  attr_accessor :total_cents

  def initialize(payment_gateway, logger)
    @payment_gateway = payment_gateway
    @logger = logger
  end

  def save
    response = @payment_gateway.charge(total_cents)
    @logger.record_payment(response[:payment_id])
  end
end

class PaymentGateway
  def charge(total_cents)
    puts "THIS HITS THE PRODUCTION API AND ALTERS PRODUCTION DATA. THAT'S BAD!"

    { payment_id: rand(1000) }
  end
end

class Logger
  def record_payment(payment_id)
    puts "Payment id: #{payment_id}"
  end
end

describe Payment do
  it 'records the payment' do
    payment_gateway = double()
    allow(payment_gateway).to receive(:charge).and_return(payment_id: 1234)

    logger = Logger.new

    payment = Payment.new(payment_gateway, logger)
    payment.total_cents = 1800
    payment.save
  end
end

If we run this test now, we can see that we no longer get the jarring “THIS HITS THE PRODUCTION API” message. This is because our test no longer calls the charge method on an instance of PaymentGateway, it calls the charge method on a test double.

Our payment_gateway variable is no longer actually an instance of PaymentGateway, it’s an instance of RSpec::Mocks::Double.

Payment id: 1234
.

Finished in 0.00877 seconds (files took 0.09877 seconds to load)
1 example, 0 failures

That takes care of the hitting-production problem. What about verifying the logging of the payment? This is a job for a different kind of test double, a mock object (or just mock).

Mock example

Now let’s replace Logger.new with logger = double(). Notice how RSpec doesn’t make a distinction between mocks and stubs. They’re all just Test Doubles. If we want to use a Test Double as a mock or as a stub, RSpec leaves that up to us and doesn’t care.

We’re also telling our new Mock Object that it needs (not just can, but has to, and it will raise an exception if not) receive a record_payment method call with the value 1234.

class Payment
  attr_accessor :total_cents

  def initialize(payment_gateway, logger)
    @payment_gateway = payment_gateway
    @logger = logger
  end

  def save
    response = @payment_gateway.charge(total_cents)
    @logger.record_payment(response[:payment_id])
  end
end

class PaymentGateway
  def charge(total_cents)
    puts "THIS HITS THE PRODUCTION API AND ALTERS PRODUCTION DATA. THAT'S BAD!"

    { payment_id: rand(1000) }
  end
end

class Logger
  def record_payment(payment_id)
    puts "Payment id: #{payment_id}"
  end
end

describe Payment do
  it 'records the payment' do
    payment_gateway = double()
    allow(payment_gateway).to receive(:charge).and_return(payment_id: 1234)

    logger = double()
    expect(logger).to receive(:record_payment).with(1234)

    payment = Payment.new(payment_gateway, logger)
    payment.total_cents = 1800
    payment.save
  end
end

Now that we have the line that says expect(logger).to receive(:record_payment).with(1234), our test is asserting that the payment gets logged. We can verify this by commenting out the @logger.record_payment(response[:payment_id]) and running our test again. We get the following error:

Failures:

  1) Payment records the payment
     Failure/Error: expect(logger).to receive(:record_payment).with(1234)
     
       (Double (anonymous)).record_payment(1234)
           expected: 1 time with arguments: (1234)
           received: 0 times

Why I don’t often use mocks or stubs in Rails

Having said all this, I personally hardly ever use mocks or stubs in Rails. I can count on one hand all the times I’ve used mocks or stubs over my eight years so far of doing Rails.

The main reason is that I just don’t often have the problems that test doubles solve. In Rails we don’t really do true unit tests. For better or worse, I’ve never encountered a Rails developer, myself included, who truly wants to test an object completely in isolation from all other objects. Instead we write model tests that hit the database and have full access to every single other object in the application. Whether this is good or bad can be debated but one thing seems clear to me: this style of testing eliminates the need for test doubles the vast majority of the time.

Another reason is that I find that tests written using test doubles are often basically just a restatement of the implementation of whatever’s being tested. The code says @logger.record_payment and the test says expect(logger).to receive(:record_payment). Okay, what did we really accomplish? We’re not testing the result, we’re testing the implementation. That’s probably better than nothing but if possible I’d rather just test the result, and quite often there’s nothing stopping me from testing the result instead of the implementation.

Lastly, I personally haven’t found myself working on a lot of projects that use some external resource like a payment gateway API that would make test doubles useful. If I were to work on such a project I imagine I certainly would make use of test doubles, I just haven’t worked on that type of project very much.

Refactoring to POROs (using tests)

For a good chunk of my Rails career I was under the impression that in Rails projects, we have one model per database table. If all I have in my application are three tables, `users`, `products`, and `orders`, then I’ll have exactly three corresponding ActiveRecord classes, `User`, `Product`, and `Order`.

Then I read a post by Steve Klabnik (which I can’t find now) pointing out that the classes in `app/models` don’t have to inherit from ActiveRecord. It was a real aha moment for me and a turning point in the way I write code.

These ActiveRecord-less classes are often called Plain old Ruby Objects or POROs.

Why POROs are useful

Small methods are easier to understand than large methods. Small classes are easier to understand than large classes. Code that’s easy to understand tends to me less expensive to maintain and more enjoyable to maintain.

Rails projects that only use the classes that inherit from ActiveRecord tend to eventually end up with ActiveRecord classes that are way too big and do way too much unrelated stuff. If a project contains a `Product` class, the `Product` class becomes sort of a dumping ground for anything that could be tangentially related to a product. Inside of the huge `Product` class there might be three or four abstractions “hiding” in the code, yearning to be separated into their own neat and tidy classes where they don’t have to live scrambled together with unrelated code.

I’ve also found that when I have a method that’s bigger than I’d like but just seems irreducible, I can pull out that method’s code into a new PORO and suddenly I see how my ugly method can be morphed into a neat and tidy class.

All this is rather abstract so let’s take a look at a concrete example.

An example of refactoring to a PORO

Here’s a piece of code written by a relatively unskilled programmer—me from eight years ago.

Don’t try to understand this method right now. First just observe its size and shape.

class Appointment < ActiveRecord::Base
  def self.save_from_params(params)
    a = params[:id] ? self.find(params[:id]) : self.new

    a.attributes = {
      time_block_type: TimeBlockType.find_by_code(params[:appointment][:time_block_type_code]),
      notes:           params[:appointment][:notes],
      stylist_id:      params[:appointment][:stylist_id],
      is_cancelled:    params[:appointment][:is_cancelled],
      tip:             params[:appointment][:tip],
    }

    if a.time_block_type_code == "APPOINTMENT"
      a.client = self.build_client(params, a.stylist.salon)
    else
      a.client = Client.no_client
      a.length = Appointment.determine_length(
        start_time: params[:appointment][:start_time_time],
        end_time:   params[:appointment][:end_time],
      )
    end

    a.set_time(params[:appointment][:start_time_time], params[:appointment][:start_time_ymd])
    a.set_repeat_logic(params)
    a.save
    a.set_payments_from_json_string(params[:serialized_payments])
    a.set_services_and_products_from_json_string(params[:serialized_products_and_services])
    a.record_transactions

    if !a.new_record?
      a.reload.generate_recurrence_hash_if_needed
      a.reload.save_future
    end
    a
  end
end 

At the time I wrote this method I knew that it was much longer than I’d like but I could figure out how to make it smaller. Now that I’m older and wiser I know that it can be refactored into a new class. I’m going to make use of the factory pattern and call this new class `AppointmentFactory`.

I need to be very careful when creating this new class to ensure that it works precisely the same as the existing method. I’m going to use tests to help reduce the likelihood that I screw anything up. I’m also going to go in very small steps.

Writing the first line of my PORO

The first small step I’m going to take is to move the first line of `save_from_params` into a new `AppointmentFactory` class.

class AppointmentFactory
  def self.create(params)
    params[:id] ? Appointment.find(params[:id]) : Appointment.new
  end
end

The original method will be changed, very slightly, to this:

class Appointment < ActiveRecord::Base
  def self.save_from_params(params)
    a = AppointmentFactory.create(params)

    a.attributes = {
      time_block_type: TimeBlockType.find_by_code(params[:appointment][:time_block_type_code]),
      notes:           params[:appointment][:notes],
      stylist_id:      params[:appointment][:stylist_id],
      is_cancelled:    params[:appointment][:is_cancelled],
      tip:             params[:appointment][:tip],
    }

    if a.time_block_type_code == "APPOINTMENT"
      a.client = self.build_client(params, a.stylist.salon)
    else
      a.client = Client.no_client
      a.length = Appointment.determine_length(
        start_time: params[:appointment][:start_time_time],
        end_time:   params[:appointment][:end_time],
      )
    end

    a.set_time(params[:appointment][:start_time_time], params[:appointment][:start_time_ymd])
    a.set_repeat_logic(params)
    a.save
    a.set_payments_from_json_string(params[:serialized_payments])
    a.set_services_and_products_from_json_string(params[:serialized_products_and_services])
    a.record_transactions

    if !a.new_record?
      a.reload.generate_recurrence_hash_if_needed
      a.reload.save_future
    end
    a
  end
end 

Writing tests for my PORO

Now I’m going to write the first tests for my new `AppointmentFactory` class. I can see based on the code that I’ll have to test two scenarios: 1) when an id is present and 2) when an id is not present.

I’ll start with just the shell of the test.

require 'spec_helper' # Why not request 'rails_helper'? Because this is an old project

RSpec.describe AppointmentFactory do
  describe '#create' do
    context 'when id is present' do
      it 'returns the appointment with that id' do
      end
    end

    context 'when id is not present' do
      it 'returns a new appointment instance' do
      end
    end
  end
end

Why do I write just the shell of the test first? The reason is to reduce the amount of stuff I have to juggle in my head. If I get the test cases I want to write out of my head and onto the screen, I’m free to stop thinking about that part of my task.

The first test case I’ll fill in is the scenario where I’m passing the id of an existing appointment.

Test for existing appointment

require 'spec_helper'

RSpec.describe AppointmentFactory do
  describe '#create' do
    context 'when id is present' do
      let(:existing_appointment) { create(:appointment) }
      let(:appointment) { AppointmentFactory.create(id: existing_appointment.id) }

      it 'returns the appointment with that id' do
        expect(appointment.id).to eq(existing_appointment.id)
      end
    end

    context 'when id is not present' do
      it 'returns a new appointment instance' do
      end
    end
  end
end

This test passes as expected. I’m skeptical, though. I never trust a test that I only see pass and never see fail. How can I be sure that the test is testing what I think it’s testing? How can I be sure I didn’t just make a mistake in the way I wrote the test?

In order to see the test fail, I’ll comment out the correct line and add something that won’t satisfy the test.

class AppointmentFactory
  def self.create(params)
    #params[:id] ? Appointment.find(params[:id]) : Appointment.new
    Appointment.new
  end
end

As I expect, the test now fails.

Failures:                              

  1) AppointmentFactory#create when id is present returns the appointment with that id                                                                       
     Failure/Error: expect(appointment.id).to eq(existing_appointment.id)     
                                       
       expected: 1                     
            got: nil                   
                                       
       (compared using ==)             
     # ./spec/models/appointment_factory_spec.rb:10:in `block (4 levels) in <top (required)>'

Now that I’m comfortable with that test case I’ll move onto the case where I’m not passing the id of an existing appointment.

Test for new appointment instance

require 'spec_helper'

RSpec.describe AppointmentFactory do
  describe '#create' do
    context 'when id is present' do
      let(:existing_appointment) { create(:appointment) }
      let(:appointment) { AppointmentFactory.create(id: existing_appointment.id) }

      it 'returns the appointment with that id' do
        expect(appointment.id).to eq(existing_appointment.id)
      end
    end

    context 'when id is not present' do
      let(:appointment) { AppointmentFactory.create }

      it 'returns a new appointment instance' do
        expect(appointment.id).to be_nil
      end
    end
  end
end

This actually doesn’t pass.

Failures:

  1) AppointmentFactory#create when id is not present returns a new appointment instance
     Failure/Error: let(:appointment) { AppointmentFactory.create }
     ArgumentError:
       wrong number of arguments (0 for 1)
     # ./app/models/appointment_factory.rb:2:in `create'
     # ./spec/models/appointment_factory_spec.rb:15:in `block (4 levels) in <top (required)>'
     # ./spec/models/appointment_factory_spec.rb:18:in `block (4 levels) in <top (required)>'

My create method is expecting an argument but I’m not passing it one. Rather than change my code to pass an empty hash, I’ll change the signature of the create method to default to an empty hash if no argument is provided.

class AppointmentFactory
  def self.create(params = {})
    params[:id] ? Appointment.find(params[:id]) : Appointment.new
  end
end

Fixing a misnomer

My test now passes. And now that I step back and assess what I’ve done, I realize that create is probably not the most sensical name for this method. Maybe something like find_or_create would be more fitting.

class AppointmentFactory
  def self.find_or_create(attributes = {})
    attributes[:id] ? Appointment.find(attributes[:id]) : Appointment.new
  end
end

I’ll of course have to change the naming in the test file as well.

require 'spec_helper'

RSpec.describe AppointmentFactory do
  describe '#find_or_create' do
    context 'when id is present' do
      let(:existing_appointment) { create(:appointment) }
      let(:appointment) { AppointmentFactory.find_or_create(id: existing_appointment.id) }

      it 'returns the appointment with that id' do
        expect(appointment.id).to eq(existing_appointment.id)
      end
    end

    context 'when id is not present' do
      let(:appointment) { AppointmentFactory.find_or_create }

      it 'returns a new appointment instance' do
        expect(appointment.id).to be_nil
      end
    end
  end
end

The very slightly refactored version of my original method

As a reminder, my original save_from_params method looks like this:

class Appointment < ActiveRecord::Base
  def self.save_from_params(params)
    a = AppointmentFactory.find_or_create(params)

    a.attributes = {
      time_block_type: TimeBlockType.find_by_code(params[:appointment][:time_block_type_code]),
      notes:           params[:appointment][:notes],
      stylist_id:      params[:appointment][:stylist_id],
      is_cancelled:    params[:appointment][:is_cancelled],
      tip:             params[:appointment][:tip],
    }

    if a.time_block_type_code == "APPOINTMENT"
      a.client = self.build_client(params, a.stylist.salon)
    else
      a.client = Client.no_client
      a.length = Appointment.determine_length(
        start_time: params[:appointment][:start_time_time],
        end_time:   params[:appointment][:end_time],
      )
    end

    a.set_time(params[:appointment][:start_time_time], params[:appointment][:start_time_ymd])
    a.set_repeat_logic(params)
    a.save
    a.set_payments_from_json_string(params[:serialized_payments])
    a.set_services_and_products_from_json_string(params[:serialized_products_and_services])
    a.record_transactions

    if !a.new_record?
      a.reload.generate_recurrence_hash_if_needed
      a.reload.save_future
    end
    a
  end
end

Tackling the second chunk of code, attribute setting

I want to focus now on moving the chunk starting with a.attributes = out of save_from_params and into AppointmentFactory. I’ll start with a test for just the first of those attributes, time_block_type.

describe 'attributes' do
  let!(:time_block_type) { create(:time_block_type, code: 'APPOINTMENT') }

  let(:appointment) do
    AppointmentFactory.find_or_create(
      appointment: { time_block_type_code: 'APPOINTMENT' }
    )
  end

  it 'sets a time block type' do
    expect(appointment.time_block_type).to eq(time_block_type)
  end
end

I run the test and watch it fail, then add the code to make the test pass.

class AppointmentFactory
  def self.find_or_create(attributes = {})
    appointment = attributes[:id] ? Appointment.find(attributes[:id]) : Appointment.new

    appointment.attributes = {
      time_block_type: TimeBlockType.find_by_code(attributes[:appointment][:time_block_type_code])
    }

    appointment
  end
end

Unfortunately it seems that one of my “old” tests is now failing.

Failures:
                      
  1) AppointmentFactory#find_or_create when id is present returns the appointment with that id
     Failure/Error: let(:appointment) { AppointmentFactory.find_or_create(id: existing_appointment.id) }
     NoMethodError:       
       undefined method `[]' for nil:NilClass
     # ./app/models/appointment_factory.rb:6:in `find_or_create'
     # ./spec/models/appointment_factory_spec.rb:7:in `block (4 levels) in <top (required)>'
     # ./spec/models/appointment_factory_spec.rb:10:in `block (4 levels) in <top (required)>'

Why is this test failing? The problem (which I discovered after much head-scratching) is that if I only pass in something for attributes[:id] and nothing for attributes[:appointment], then the line that tries to access attributes[:appointment][:time_block_type_code] says “You’re trying invoke the method [] on attributes[:appointment], but attributes[:appointment] is nil, and NilClass doesn’t have a method called [].”

Fair enough. The simplest thing I can think of to get the test passing is just to add a guard clause that returns unless attributes[:appointment] actually has something in it.

class AppointmentFactory
  def self.find_or_create(attributes = {})
    appointment = attributes[:id] ? Appointment.find(attributes[:id]) : Appointment.new
    return appointment unless attributes[:appointment].present?

    appointment.attributes = {
      time_block_type: TimeBlockType.find_by_code(attributes[:appointment][:time_block_type_code])
    }

    appointment
  end
end

Testing the rest of the attributes

With that out of the way I’m ready to turn my attention to the rest of the attributes. I’m not feeling a particular appetite to write an individual test for each attribute, and I don’t feel like separate tests would be much more valuable or clear than one single test case with several assertions. (In fact, I think one single test case with several assertions would actually be more clear than if they were separate.)

describe 'attributes' do
  let!(:time_block_type) { create(:time_block_type, code: 'APPOINTMENT') }
  let!(:stylist) { create(:stylist) }

  let(:appointment) do
    AppointmentFactory.find_or_create(
      appointment: {
        time_block_type_code: 'APPOINTMENT',
        notes: 'always late',
        stylist_id: stylist.id,
        is_cancelled: 'true',
        tip: '100'
      }
    )
  end

  it 'sets the proper attributes' do
    expect(appointment.time_block_type).to eq(time_block_type)
    expect(appointment.notes).to eq('always late')
    expect(appointment.stylist).to eq(stylist)
    expect(appointment.is_cancelled).to be true
    expect(appointment.tip).to eq(100)
  end
end

After running that test and watching it fail I’m ready to pull in the code that sets these attributes.

class AppointmentFactory
  def self.find_or_create(attributes = {})
    appointment = attributes[:id] ? Appointment.find(attributes[:id]) : Appointment.new
    return appointment unless attributes[:appointment].present?

    appointment.attributes = {
      time_block_type: TimeBlockType.find_by_code(attributes[:appointment][:time_block_type_code]),
      notes:           attributes[:appointment][:notes],
      stylist_id:      attributes[:appointment][:stylist_id],
      is_cancelled:    attributes[:appointment][:is_cancelled],
      tip:             attributes[:appointment][:tip]
    }

    appointment
  end
end

Now my test passes.

Refactoring my crappy AppointmentFactory code

So far I’ve just been focusing on getting my tests to pass as quickly and easily as possible with little to no regard to code quality. (I picked up this habit from Kent Beck’s Test Driven Development: By Example.)

Now that I have a non-trivial amount of code in my class, I feel ready to do some refactoring for understandability. I can feel confident that correct functionality will be preserved throughout my refactoring thanks to the safety net of my little test suite.

class AppointmentFactory
  def self.find_or_create(attributes = {})
    Appointment.find_or_initialize_by(id: attributes[:id]).tap do |appointment|
      appointment.attributes = filtered_attributes(attributes[:appointment])
    end
  end

  def self.filtered_attributes(appointment_attributes)
    return {} unless appointment_attributes.present?

    {
      time_block_type: TimeBlockType.find_by_code(appointment_attributes[:time_block_type_code]),
      notes:           appointment_attributes[:notes],
      stylist_id:      appointment_attributes[:stylist_id],
      is_cancelled:    appointment_attributes[:is_cancelled],
      tip:             appointment_attributes[:tip]
    }
  end
end

The slightly more refactored version of my original method

This is what the save_from_params method now looks like. Having shaved off 6 lines by moving some code into a PORO, this method is now 24 lines instead of 32. The job is of course far from done but it’s a good start.

class Appointment < ActiveRecord::Base
  def self.save_from_params(params)
    a = AppointmentFactory.find_or_create(params)

    if a.time_block_type_code == "APPOINTMENT"
      a.client = self.build_client(params, a.stylist.salon)
    else
      a.client = Client.no_client
      a.length = Appointment.determine_length(
        start_time: params[:appointment][:start_time_time],
        end_time:   params[:appointment][:end_time],
      )
    end

    a.set_time(params[:appointment][:start_time_time], params[:appointment][:start_time_ymd])
    a.set_repeat_logic(params)
    a.save
    a.set_payments_from_json_string(params[:serialized_payments])
    a.set_services_and_products_from_json_string(params[:serialized_products_and_services])
    a.record_transactions

    if !a.new_record?
      a.reload.generate_recurrence_hash_if_needed
      a.reload.save_future
    end
    a
  end
end

I could of course continue the illustration until the save_from_params method is down to a one-liner but I think you can see where I’m going, and hopefully this much is enough to convey my methodology.

My “refactor to PORO” methodology, boiled down

Here are the rough steps I’ve used in this post to improve the understandability of my code and put it under test coverage. I use this methodology all the time in production projects, whether it be a greenfield project or legacy project.

  1. Try to find a “missing abstraction” in the method whose bulk you want to reduce, and create a new PORO expressing that abstraction
  2. Identify a line (or group of lines) from the original function you want to move into its own class
  3. Write a failing test for that functionality
  4. Cut and paste the original line into the new PORO to make the test pass (making small adjustments if necessary)
  5. Repeat from step 2 until the PORO code gets too big and messy
  6. Refactor the PORO code until you’re satisfied with its cleanliness
  7. Repeat from step 2
  8. Rejoice

The difference between integration tests and controller tests in Rails

I recently came across a Reddit post asking about the difference between integration tests and controller tests. Some of the comments were interesting:

“I always write controller tests. I only write integration tests if there’s some JavaScript interacting with the controller or if the page is really fucking valuable.”

“In this case I would say it depends if you are building a full rails app (front-end and back-end) or only and API (back-end). For the first I would say an integration test should go through the front-end (which eventually calls the controllers). If you are doing an API, integration and controller tests would be the same.”

“Controller tests attempt to test the controller in isolation, and integration tests mimic a browser clicking through the app, i.e. they touch the entire stack. Honestly I only write integration tests. Controller tests are basically exactly the same thing but worse, and they are harder to write. “

“As far as I understand for the direction of Rails, controller tests will be going away and you’ll need to use integration tests.”

“From my experience integration tests are always harder to maintain and are much more fragile. Not trying to neglect it’s value but it comes with a price. I wonder how will people going to test api only rails apps if controller tests are gone?”

Which of these things are accurate and which are BS? I’ll do my best here to clarify. I’ll also add my own explanation of the difference between integration tests and controller tests.

My explanation of integration tests vs. controller tests

Before I even share my explanation I need to provide some context. There are two realities that make questions like “What’s the difference between integration tests and controller tests?” hard to answer.

Problem: terminology

First, there’s no consensus in the testing world on terminology. What one person calls an integration test might be what another person would call an end-to-end test or an acceptance test. No one can say whether any particular definition of a term is right or wrong because there’s no agreed-upon standard.

Problem: framework differences

Second, in what context are we talking about integration tests vs. controller tests? RSpec? MiniTest? Something else? The concepts of integration tests and controller tests map slightly differently onto each framework. Here’s how I’d put it:

General Testing Concept Relevant MiniTest Concept Relevant RSpec Concept
Integration Test Integration Test Feature Spec
End-to-End Test System Test Feature Spec
Controller Test Functional Test Request Spec (and, previously, Controller Spec)

Yikes. In order to talk about the difference between integration tests and controller tests I needed to involve no fewer than eight different terms: integration test, end-to-end test, system test, feature spec, controller test, functional test, request spec and controller spec.

I could share a treatment of integration and controller tests that’s very precise and technically correct but first let me try, for clarity’s sake, to share a useful approximation to the truth.

My approximately-true explanation

If a Rails application can be thought of a stack listed top-to-bottom as views-controllers-models, controller tests exercise controllers and everything “lower”. So controller tests test the “controllers-models” part of the “views-controllers-models” stack.

Integration tests test the views and everything “lower”. So in the views-controllers-models stack, integration tests test all three layers.

The messier truth

The messier truth is that since RSpec is (as far as I can tell) substantially more popular than MiniTest for commercial Rails applications, then you, dear reader, are more likely to be an RSpec user than a MiniTest user.

So instead of “integration tests”, the term that’s applicable to you is feature specs. Luckily the definition is pretty much the same, though. Feature specs exercise the whole application stack (views-controllers-models).

When we start to talk about controller tests (or controller specs in RSpec terminology), things get a little confusing. In addition to the concept of controller specs, there now exists the concept of request specs. What’s the difference between the two? As far as I can tell, the main difference is that request specs run in a closer simulation to a real application environment than controller specs. Therefore, the RSpec core team recommends using request specs over controller specs. So the main thing to be aware of is that when someone says “controller tests”, the RSpec concept to map that to is “request specs”.

To sum up: in RSpec, it’s not integration tests vs. controller tests, it’s feature specs vs. request specs. (For the record, I’m not a big fan of RSpec’s somewhat arcane terminology.)

When to use integration tests vs. controller tests

Now that we’ve covered what these two things are, how do we know when to use them?

When I use controller tests/request specs

I’ll start with controller tests (or again in RSpec terminology, request specs). As I discussed in detail in a different article, I only use controller/request specs in two specific scenarios: 1) when I’m maintaining a legacy project and 2) when I’m maintaining an API-only application.

Why are these the only two scenarios in which I use request specs? Because when I’m doing greenfield development, I try very hard to make my controllers do almost nothing by themselves. I push all possible behavior into models. Legacy projects, however, often possess bloated controllers containing lots of code. I find tests useful in teasing apart those bloated controllers so I can refactor the code to mostly use models instead. The reason I use request specs when developing API-only applications is because integration tests/feature specs just aren’t possible. There’s no browser with which to interact.

When I use integration tests/feature specs

Practically every feature I write gets an integration test. For example, when I’m building CRUD functionality for a resource (let’s say a resource called `Customer`), I’ll write a feature spec for attempting to create a new customer record (using both valid and invalid inputs, checking for either success or failure), a feature spec for attempting to update a customer record, and perhaps a feature spec for deleting a customer record. Since feature specs exercise the whole application stack including controllers, I pretty much always find redundant the idea of writing both a feature spec and a request spec for a particular feature.

Addressing the comments

Finally I want to address some of the comments I saw on the Reddit post I referenced at the beginning of this article because I don’t think all of them are accurate.

Page value

“I always write controller tests. I only write integration tests if there’s some JavaScript interacting with the controller or if the page is really fucking valuable.”

I personally don’t buy this idea. To me, one of the main benefits of having integration test/feature spec coverage is that I can automatically check the whole application for regressions every time I make a new commit. I hate it when a client of mine has to point out an error to me, even if it’s something trivial. I’d much rather have my code get tested by automated tests than by my client. (And yes, I get that tests can’t prove the absence of bugs and that my test suite won’t always catch all regressions, but I think something is a lot better than nothing.)

API-only applications

“In this case I would say it depends if you are building a full rails app (front-end and back-end) or only and API (back-end). For the first I would say an integration test should go through the front-end (which eventually calls the controllers). If you are doing an API, integration and controller tests would be the same.”

Let me address the last sentence first. Yes, if you’re writing an API-only application, the max you can test is “from the controller down”, so the idea of adding an integration test that tests an additional layer doesn’t apply.

Now let me address the first part of the comment. The idea behind the comment makes sense to me. I think what the comment author is saying is that if the application is a “traditional” Rails application, then an integration test would hit the front-end, the user-facing part of the application, and exercise all parts of the code from that starting point.

“The same thing but worse”

“Controller tests attempt to test the controller in isolation, and integration tests mimic a browser clicking through the app, i.e. they touch the entire stack. Honestly I only write integration tests. Controller tests are basically exactly the same thing but worse, and they are harder to write”

I believe I basically agree with this comment but I’d like to get a little more specific. I wouldn’t exactly agree that “controller tests are basically the exact same thing” as integration tests. As I said above, integration tests/feature specs test one additional layer of the application beyond controllers. There’s a lot in that extra layer. It makes a big difference.

As for the second part, “Controller tests are basically the same thing but worse,” it would be helpful to say why they’re “worse”. I would repeat what I said earlier in that controller tests/requests specs for me are usually redundant to any integration tests/feature specs I might have. The exceptions, again, are legacy projects and API-only applications.

Controller tests going away?

“As far as I understand for the direction of Rails, controller tests will be going away and you’ll need to use integration tests.”

Unless I’m mistaken I believe this comment is a little confused. As I mentioned earlier in this article, it’s true that the RSpec core team no longer recommends using controller specs. The recommended replacement though isn’t integration tests but request specs. However, I personally tend to favor integration tests/feature specs over controller specs/request specs anyway.

I’m not able to find any evidence that MiniTest controller tests are going away.

Integration tests fragile and harder to maintain?

From my experience integration tests are always harder to maintain and are much more fragile. Not trying to neglect it’s value but it comes with a price. I wonder how will people going to test api only rails apps if controller tests are gone?

I wouldn’t necessarily disagree. Out of the test types I use, I find integration tests/feature specs to be the most expensive to maintain. I would, however, suggest that the additional value integration tests/feature specs provide over controller tests/request specs more than makes up for their extra cost.

In Which I Dissect and Improve a Rails Test File I Found on Reddit

I recently found a Reddit post which shared the code of a test and basically asked, “Am I doing this right?”

I’m going to dissect this test and make suggestions for improvement. Here’s the original code as shared in the post.

The original code

require 'test_helper'
class UserTest < ActiveSupport::TestCase

  let(:user) { FactoryBot.build :user}

  it "must be valid" do
    value(user).must_be :valid?
  end

  it "can be saved" do
    assert_difference "User.count", 1 do
      user.save
    end
    assert user.persisted?
  end

  describe "won't be valid" do
    it "with an duplicated email" do
      user.save
      user2 = FactoryBot.build :user
      user2.wont_be :valid?
      user2.errors.count.must_equal 1
      user2.errors[:email].must_equal ["has already been taken"]
    end

    it "without an email" do
      user.email = nil
      value(user).wont_be :valid?
      user.errors[:email].first.must_equal "can't be blank"
    end

    it "with an falsy email" do
      user.email = "thisis@falsemail"
      value(user).wont_be :valid?
      user.errors[:email].first.must_equal "is invalid"
    end

    it "with an email without a mx-record" do
      user.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
      value(user).wont_be :valid?
      user.errors[:email].first.must_equal "is invalid"
    end

    it "with an email that is on our blacklist" do
      user.email = "test@trashmail.com"
      value(user).wont_be :valid?
      user.errors[:email].first.must_equal "is a blacklisted email provider"
    end
  end
end

Initial overall thoughts

Overall I think the OP did a pretty good job. He or she got the spirit right. The changes I would make are fairly minor.

Before I get into the actual improvements I want to say that I personally am not familiar with the style of using matchers like `must_be` and I actually wasn’t able to turn up any discussion of `must_be` using a quick Google search. So as I add improvements to the test code, I’m also going to convert the syntax to the syntax I’m personally used to using.

Validity of fresh object

Let’s first examine the first test, the test for the validity of a `User` instance:

it "must be valid" do
  value(user).must_be :valid?
end

It’s not a bad idea to assert that a model object must be valid when it’s in a fresh, untouched state. This test could be improved, however, by making it clearer that when we say “it must be valid”, we mean we’re talking about a freshly instantiated object.

Here’s how I might write this test instead:

context "fresh instance" do
  it "is valid" do
    expect(user).to be_valid
  end
end

We can actually make the test even more concise by taking advantage of `subject`:

subject { FactoryBot.build(:user) }

context "fresh instance" do
  it { should be_valid }
end

Persistence

Let’s look at the next test case, the test that asserts that a user can be persisted.

it "can be saved" do
  assert_difference "User.count", 1 do
    user.save
  end
  assert user.persisted?
end

I don’t see any value in this test. It’s true that an object needs to be able to be successfully saved, but we’ll be doing plenty of saving in the other tests anyway. Plus it’s typically pretty safe to take for granted that saving an ActiveRecord object is going to work. Rather than improving this test, I would just delete it.

Email validity

Lastly let’s take a look at the tests for email validity.

describe "won't be valid" do
  it "with an duplicated email" do
    user.save
    user2 = FactoryBot.build :user
    user2.wont_be :valid?
    user2.errors.count.must_equal 1
    user2.errors[:email].must_equal ["has already been taken"]
  end

  it "without an email" do
    user.email = nil
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "can't be blank"
  end

  it "with an falsy email" do
    user.email = "thisis@falsemail"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end

  it "with an email without a mx-record" do
    user.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end

  it "with an email that is on our blacklist" do
    user.email = "test@trashmail.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is a blacklisted email provider"
  end
end

The overall idea is sound although the structure is actually backwards from how I would do it. Rather than saying “here are all the ways the object could be invalid”, I would list the various conditions and then, inside of each condition, assert the validity state. So here’s how I might structure these tests:

context "with an duplicated email" do
  it "is not valid" do
    user.save
    user2 = FactoryBot.build :user
    user2.wont_be :valid?
    user2.errors.count.must_equal 1
    user2.errors[:email].must_equal ["has already been taken"]
  end
end

context "without an email" do
  it "is not valid" do
    user.email = nil
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "can't be blank"
  end
end

context "with an falsy email" do
  it "is not valid" do
    user.email = "thisis@falsemail"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end
end

context "with an email without a mx-record" do
  it "is not valid" do
    user.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end
end

context "with an email that is on our blacklist" do
  it "is not valid" do
    user.email = "test@trashmail.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is a blacklisted email provider"
  end
end

I’ll also try to improve each of these test cases individually.

Email uniqueness

This test can be written more concisely. Rather than this:

context "with an duplicated email" do
  it "is not valid" do
    user.save
    user2 = FactoryBot.build :user
    user2.wont_be :valid?
    user2.errors.count.must_equal 1
    user2.errors[:email].must_equal ["has already been taken"]
  end
end

We can do this:

context "when email is not unique" do
  let!(:other_user) { FactoryBot.create(:user, email: subject.email) }

  it "is not valid" do
    expect(subject.errors[:email]).to eq(["has already been taken"])
  end
end

Email presence, email format validity, MX record validity, and blacklist check

These four test cases are all very similar so I’ll deal with them as a group. Here are the original version:

context "without an email" do
  it "is not valid" do
    user.email = nil
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "can't be blank"
  end
end

context "with an falsy email" do
  it "is not valid" do
    user.email = "thisis@falsemail"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end
end

context "with an email without a mx-record" do
  it "is not valid" do
    user.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is invalid"
  end
end

context "with an email that is on our blacklist" do
  it "is not valid" do
    user.email = "test@trashmail.com"
    value(user).wont_be :valid?
    user.errors[:email].first.must_equal "is a blacklisted email provider"
  end
end

For the most part these tests are fine. There’s only one significant thing I would do differently. I find it redundant to both assert that there’s an error and assert that the object is invalid. If the object has an error I think it’s safe to assume that it’s also invalid. Here are my versions:

context "without an email" do
  it "is not valid" do
    subject.email = nil
    subject.save
    expect(subject.errors[:email]).to include("can't be blank")
  end
end

context "with an invalid email" do
  it "is not valid" do
    subject.email = "thisis@falsemail"
    subject.save
    expect(subject.errors[:email]).to include("is invalid")
  end
end

context "with an email without a mx-record" do
  it "is not valid" do
    subject.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
    subject.save
    expect(subject.errors[:email]).to include("is invalid")
  end
end

context "with an email that is on our blacklist" do
  it "is not valid" do
    subject.email = "test@trashmail.com"
    subject.save
    expect(subject.errors[:email]).to include("is a blacklisted email provider")
  end
end

The entire improved test file

Lastly, here’s the full file after I’ve made my improvements:

require 'rails_helper'

RSpec.describe User, type: :model do
  subject { FactoryBot.build(:user) }

  context "fresh instance" do
    it { should be_valid }
  end

  context "when email is not unique" do
    let!(:other_user) { FactoryBot.create(:user, email: subject.email) }

    it "is not valid" do
      expect(subject.errors[:email]).to eq(["has already been taken"])
    end
  end

  context "without an email" do
    it "is not valid" do
      subject.email = nil
      subject.save
      expect(subject.errors[:email]).to include("can't be blank")
    end
  end

  context "with an invalid email" do
    it "is not valid" do
      subject.email = "thisis@falsemail"
      subject.save
      expect(subject.errors[:email]).to include("is invalid")
    end
  end

  context "with an email without a mx-record" do
    it "is not valid" do
      subject.email = "hi@thisdomainwillneverexistnorhaveamxrecord.com"
      subject.save
      expect(subject.errors[:email]).to include("is invalid")
    end
  end

  context "with an email that is on our blacklist" do
    it "is not valid" do
      subject.email = "test@trashmail.com"
      subject.save
      expect(subject.errors[:email]).to include("is a blacklisted email provider")
    end
  end
end

My Vim setup for Rails

Vim is my favorite editor because using Vim I can edit circles around anyone using Atom, Visual Studio, Sublime Text or any other non-keyboard-based editor. If “computers are a bicycle for the mind”, then using Vim is like shifting that bicycle into high gear. I like to shift that gear even higher using a few certain plugins:

vim-rspec lets me run my tests with a keystroke.

vim-rails takes advantage of the common structure of all Rails applications to let me navigate any Rails application super quickly.

ctrlp.vim allows me to quickly search for and open files (not earth-shattering or unique but of course very useful).

Unlike real debt, technical debt is unavoidable

I’ve always been a big fan of the “technical debt” analogy. The parallels to real debt are amusingly spot-on. For example, technical debt, like financial debt, incurs interest. If an organization accumulates too much technical debt, the organization ends up in a situation where productivity is impossible—technical bankruptcy.

I’ve recently noticed some ways in which the technical debt analogy isn’t perfect, though. I also have another analogy that I think is more fitting in certain ways.

You can’t just “not use the credit card”

In real life it’s possible not to go into debt. Just don’t use a credit card, get a mortgage, buy a car on a loan, etc.

In software it’s not possible to live “technical-debt-free”. Technical debt accumulates automatically. I’ll explain what I mean.

I recently worked on an application where a certain resource had queuing ability. You could mark items as queued, view a list of queued vs. not queued items, and download a CSV of what was queued. I like to think that the code I wrote for this feature was clean and DRY and reasonably free of technical debt.

Some weeks later I was asked to add a new resource to the application which had roughly similar (but not identical) queuing capabilities. I was able to reuse some of my original queuing code to keep things as DRY as possible, but most of the queuing code was so deeply baked into the original feature that it couldn’t be extracted and made generic without some serious thought and work. The only viable way to build the new queuing-related feature was to 1) consciously introduce duplication and then 2) refactor.

It’s arguable that I didn’t absolutely have to consciously introduce duplication, that I could have removed duplication as I went. I don’t typically find this sort of approach very possible though. It’s usually only after I write a bad implementation of something that I’m able to step back and see how it can be made good.

I’ll provide another quick example. Let’s say I’m working on an application that has four different CRUD interfaces. The CRUD interfaces are pretty similar, but not so similar that they could be considered duplicative of each other. I wouldn’t really consider myself to have technical debt at that point. But what about when I grow to 30 similar CRUD interfaces? At that point any duplication among the CRUD code becomes acutely painful. At that point I certainly have technical debt.

Alternate analogy: the sawblade

I still like the technical debt analogy but I’d like to introduce a new one. It’s not my idea. I got the idea from Stephen Covey’s 7 Habits of Highly Effective People.

There’s a story in the book of a man frantically sawing at a tree. He has been sawing for hours. Another man approaches him and says, “Why don’t you take ten minutes and sharpen the saw?” The first man says, “Are you kidding me?! I’m too busy sawing!”

To me, paying down technical debt is like sharpening the saw. You’re hitting pause on production work and making an investment today so you can work faster tomorrow.

The way in which I think the sawblade analogy is more fitting than technical debt is that a saw gets duller just from normal use, just like a codebase gets worse over time unless the maintainers regularly take a step back and make improvements. The sharper the blade is kept, the faster the team will be able to work.

Testing private methods

A question that has befuddled many developers, including myself for a long time, is: how do I test private methods? Should I test private methods?

My opinion is yes, but not directly. I test the behavior of a class’s private methods indirectly through the class’s public methods. In other words, I test my private methods the exact same way my private methods are used by the rest of the application.

The reasoning with my testing approach has to do with the reason why private methods exist. In my mind the value of private methods is that since a private method is hidden from the outside world I can feel free to refactor the private methods at any time and in any way, knowing that I’m not going to mess up any of the code that uses my class.

Here’s a concrete example of a class makes what I think is appropriate use of private methods:

class TypeaheadTag < ActionView::Helpers::Tags::Base
  def render
    hidden_field + text_field_tag
  end

  private

  def hidden_field
    @template_object.hidden_field(
      @object_name,
      "#{@method_name}_id",
      class: "#{@method_name}_id"
    )
  end

  def text_field_tag
    @template_object.text_field_tag(
      @method_name,
      value,
      class: "#{@options[:class]} #{@method_name}_typeahead"
    )
  end
end

I wrote this class to DRY up Twitter Bootstrap typeahead components, which appeared in many places in an application I was developing. The existence of this class (along with other supporting code) allows me to spit out a typeahead field as succinctly as this:

<%= form.typeahead :person, class: 'form-control' %>

The exact manner in which this typeahead class conjures up its necessary components – a hidden field and a text field – is its own private business. No external party needs to know or care how these things happen. The class should be free to refactor these methods, split these two methods into several methods if desired, or combine them all into one, all without altering its public interface one bit.

This principle is why I think it’s useful for tests not to have any knowledge of a class’s private methods. Again, it’s not that the behavior inside the private methods doesn’t get tested – it does. It just gets tested via the class’s public interface.

How to run system specs headlessly or not headlessly at will

I used to prefer seeing my system specs run in the browser but lately I’ve been preferring to run them headlessly. It’s a little faster that way and I find it a little less disruptive.

Sometimes I still find myself wanting to see a certain test in the browser though. Seeing the test run in the browser can make diagnosis of any problems much easier.

The dream

What would be ideal is if I could do something like the following. To run a spec headlessly, I could do this:

$ rspec spec/system/create_customer_spec.rb

To see the same spec run in the browser, I could do this:

$ SHOW_BROWSER=true rspec spec/system/create_customer_spec.rb

Let’s take a look at how to turn that dream into reality.

The implementation

The desired functionality can be implemented by adding the following to spec/rails_helper.rb:

# spec/rails_helper.rb

RSpec.configure do |config|
  config.before(:each, type: :system) do
    driven_by ENV['SHOW_BROWSER'] ? :selenium_chrome : :selenium_chrome_headless
  end
end

Now, for all system specs, Capybara will use the :selenium_chrome_headless driver normally and the :selenium_chrome driver when SHOW_BROWSER=true is specified.

Thanks to Reddit user jrochkind for helping me improve this solution.

Test smell: Obscure Test

You’ve probably heard of the idea of a “code smell” – a hint that something in the code is not quite right and ought to be changed.

Just as there are code smells, there are “test smells”. The book xUnit Test Patterns describes a number of them.

One of the smells described in the book is Obscure Test. An Obscure Test is a test that has a lot of noise in it, noise that’s making it hard to discern what the test is actually doing.

Here’s an example of an Obscure Test I wrote myself:

context 'the element does not exist' do
  before do
    contents = %(
      <?xml version="1.0" encoding="UTF-8"?>
      <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
        <channel>
          <item></item>
        </channel>
      </rss>
    )

    xml_doc = Nokogiri::XML(contents)
    episode_element = xml_doc.xpath('//item').first
    @rss_feed_episode = RSSFeedEpisode.new(episode_element)
  end

  it 'returns an empty string' do
    expect(@rss_feed_episode.content('title')).to eq('')
  end
end

There’s a lot of noise in the contents variable (e.g. <?xml version="1.0" encoding="UTF-8"?>). All that stuff is irrelevant to what the test is actually supposed to be testing. All this test should really care about is that we have an empty set of tags.

Here’s a refactored version of the same test:

context 'the element does not exist' do
  let(:rss_feed_episode) do 
    RSSFeedEpisodeTestFactory.create("<item></item>")
  end

  it 'returns an empty string' do
    expect(rss_feed_episode.content('title')).to eq('')
  end
end

Hopefully this is much more clear. The gory details of how to bring an RSS feed episode into existence are abstracted away into RSSFeedEpisodeTestFactory, a new class I created. Here’s what that class looks like:

class RSSFeedEpisodeTestFactory
  def self.create(inner_contents)
    @inner_contents = inner_contents
    rss_feed_episode
  end

  def self.rss_feed_episode
    RSSFeedEpisode.new(xml_doc.xpath('//item').first)
  end

  def self.xml_doc
    Nokogiri::XML(contents)
  end

  def self.contents
    %(
      <?xml version="1.0" encoding="UTF-8"?>
      <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
        <channel>#{@inner_contents}</channel>
      </rss>
    )
  end
end

Now I can use this factory class wherever I like. It not only helps me keep my tests more understandable but also helps cut down on duplication.

In the video below you can watch me refactor the “obscure” version into the more readable version as part of one of my free live Rails testing workshops.

Why the Boy Scout Rule is insufficient

Many programmers I’ve talked with are a fan of the Boy Scout Rule: leave the campsite a little cleaner than you found it.

For the most part I think this is a good guideline. I follow it myself. There are certain ways in which it’s insufficient, though. There’s also a way in which the Boy Scout Rule is actually harmful. I’ll discuss each individually.

When the Boy Scout Rule is not enough

Let’s imagine that I get a new job and start orienting myself to my new employer’s codebase. I find that the back end has good test coverage but test coverage on the JavaScript side is nonexistent. I talk with my new teammates and they agree that this is a big weak area of the application and we should add some JavaScript tests. This is a big can of worms, though, and no one has thought too deeply about it yet.

Now let’s say that the next day I have to touch some of this untested JavaScript code in the course of building a new feature. This is the kind of case where applying the Boy Scout Rule wouldn’t be a good idea. It would certainly be possible for me to apply the Boy Scout rule by adding some tests to this area. But in order to do so I’d have to pick a testing framework to use and maybe even think of a whole testing strategy (tools, structure, etc.). I’m not going to just unilaterally make these decisions on behalf of the whole team. I’d also potentially have to configure my new JavaScript test suite to get run when the existing tests get run. Otherwise we’d just have this weird lonely test site off to the side that has to be run separately and manually.

In the analogy of “leave the campsite a little cleaner than you found it”, this case is a little more like a national park containing dozens of individual campsites. Maybe I’ve discovered that none of the campsites have a proper firepit. I can’t just build a firepit on each campsite whenever I happen to do some other work on it (if I ever happen to do some other work on it at all). This is the kind of change that needs to be discussed among everyone and then applied as a distinct project as opposed to haphazardly over a period of who-knows-how-long.

What I want to leave you with is this: many cleanup tasks are campsite-level tasks. These can be done on one’s own without team involvement. Other cleanup tasks are park-level tasks, and are not in fact tasks but projects. These projects need to be done with the involvement of all the appropriate team members.

When the Boy Scout Rule is harmful

If in the course of building a feature or fixing a bug I make little improvements as I go, I’ve gone from doing just one thing—completing an issue in the backlog—to doing two things—completing an issue in the backlog and improving a piece of the code.

A lot of the time this is fine. Sometimes it’s problematic though. If I mix writing a feature with refactoring the code, my pull request will be harder to parse. It will be harder for any pull request reviewer to tell what exactly changed between the old code and the new code. Some of what I’ve done is refactoring, changing the code without changing the behavior. Other parts of what I’ve done change both the code and the behavior. How can someone tell which is which? It’s not quick and easy to tell which is which.

If a feature or bugfix is mixed with code improvements, my change is no longer atomic. Sometimes when I’m writing a feature or performing a piece of refactoring, I realize that I’m headed down a bad path and I want to revert what I’ve done. Or a few days after I introduce a change, it’s revealed that the change contains a bug, and the fastest and least risky fix is just to revert the entire change. If a feature/bugfix and piece of refactoring are mixed into one piece of work, it’s impossible to roll one back without rolling back the other.

If I discover that the area of code in which I need to change is hard to understand, my go-to move will often be to refactor the code as a separate, atomic piece of work, before I make my change. First I make the code easier to change by refactoring, then I make the change (to paraphrase Kent Beck). Preferably the refactoring step happens as a completely separate PR from the step that introduces the behavior change.

To me, “leave the campsite cleaner than you found it” makes it sound like I’m supposed to just kind of casually clean up a few things in the course of doing your regular work. This is often fine, but I’ve found that being more thoughtful and methodical about the cleanup work can bring substantially better rewards.