Category Archives: Programming

Why the Boy Scout Rule is insufficient

Many programmers I’ve talked with are fans of the Boy Scout Rule: leave the campsite a little cleaner than you found it.

For the most part I think this is a good guideline. I follow it myself. There are certain ways in which it’s insufficient, though. There’s also a way in which the Boy Scout Rule is actually harmful. I’ll discuss each individually.

When the Boy Scout Rule is not enough

Let’s imagine that I get a new job and start orienting myself to my new employer’s codebase. I find that the back end has good test coverage but test coverage on the JavaScript side is nonexistent. I talk with my new teammates and they agree that this is a big weak area of the application and we should add some JavaScript tests. This is a big can of worms, though, and no one has thought too deeply about it yet.

Now let’s say that the next day I have to touch some of this untested JavaScript code in the course of building a new feature. This is the kind of case where applying the Boy Scout Rule wouldn’t be a good idea. It would certainly be possible for me to apply the Boy Scout Rule by adding some tests to this area. But in order to do so I’d have to pick a testing framework to use and maybe even think through a whole testing strategy (tools, structure, etc.). I’m not going to just unilaterally make these decisions on behalf of the whole team. I’d also potentially have to configure my new JavaScript test suite to get run when the existing tests get run. Otherwise we’d just have this weird lonely test suite off to the side that has to be run separately and manually.

In the analogy of “leave the campsite a little cleaner than you found it”, this case is a little more like a national park containing dozens of individual campsites. Maybe I’ve discovered that none of the campsites have a proper firepit. I can’t just build a firepit on each campsite whenever I happen to do some other work on it (if I ever happen to do some other work on it at all). This is the kind of change that needs to be discussed among everyone and then applied as a distinct project as opposed to haphazardly over a period of who-knows-how-long.

What I want to leave you with is this: many cleanup tasks are campsite-level tasks. These can be done on one’s own without team involvement. Other cleanup tasks are park-level tasks, and are not in fact tasks but projects. These projects need to be done with the involvement of all the appropriate team members.

When the Boy Scout Rule is harmful

If in the course of building a feature or fixing a bug I make little improvements as I go, I’ve gone from doing just one thing—completing an issue in the backlog—to doing two things—completing an issue in the backlog and improving a piece of the code.

A lot of the time this is fine. Sometimes it’s problematic, though. If I mix writing a feature with refactoring the code, my pull request will be harder to parse. It will be harder for any pull request reviewer to tell what exactly changed between the old code and the new code. Some of what I’ve done is refactoring, changing the code without changing the behavior. Other parts of what I’ve done change both the code and the behavior. It’s not quick or easy for a reviewer to tell which is which.

If a feature or bugfix is mixed with code improvements, my change is no longer atomic. Sometimes when I’m writing a feature or performing a piece of refactoring, I realize that I’m headed down a bad path and I want to revert what I’ve done. Or a few days after I introduce a change, it’s revealed that the change contains a bug, and the fastest and least risky fix is just to revert the entire change. If a feature/bugfix and piece of refactoring are mixed into one piece of work, it’s impossible to roll one back without rolling back the other.

If I discover that the area of code I need to change is hard to understand, my go-to move will often be to refactor the code as a separate, atomic piece of work before I make my change. First I make the code easier to change by refactoring, then I make the change (to paraphrase Kent Beck). Preferably the refactoring step happens as a completely separate PR from the step that introduces the behavior change.

To me, “leave the campsite cleaner than you found it” makes it sound like I’m supposed to just kind of casually clean up a few things in the course of doing my regular work. This is often fine, but I’ve found that being more thoughtful and methodical about the cleanup work can bring substantially better rewards.

Using tests as a tool to wrangle legacy projects

Legacy projects

In my career I’ve worked on my fair share of legacy projects. In fact I’d say that almost all the 50+ production projects I’ve worked on in my career have been legacy projects to some degree.

My experience is undoubtedly not a unique one. I would bet that most of most developers’ work has been on legacy projects. Due to Sturgeon’s law, I think it’s probably safe to assume that most code in the world is legacy code.

The challenges of working on a legacy project

Maintaining a legacy project is often a bummer. Many of the following things tend to be true:

  • Changes take much longer than they should, perhaps by a factor of 10 or more.
  • Deployments are scary. They’re preceded by stress and followed by firefighting.
  • Due to the fragility of the system and frequent appearance of bugs and outages, trust in the development team is low.
  • The development team is always “behind” and under pressure to cut corners to meet deadlines.
  • Stakeholders are mostly not thrilled.
  • Due to these bad conditions, developer turnover is high, meaning the team has to spend time training new developers, leaving less time for development.

How do you reverse these challenges? The various problems associated with legacy projects have different causes, and they’re not all solved by the same solutions. No single solution is a panacea, although some go further than others.

One thing I’ve found to have a pretty good ROI is automated testing.

The ways automated tests can help legacy projects

Changes take less time

Why do changes take longer in legacy projects than in cleanly-coded ones?

The first reason is that in order to fix a bug or add a feature, you often have to have an understanding of the area of code you’re changing before you can make the change. If you don’t understand the existing code, you might not even know where to try to slot in your new code.

The second reason is that while any change in any codebase represents a certain amount of risk, the risk in legacy projects is amplified. A legacy project’s features are often a delicate Jenga tower, ready to come crashing down at the slightest touch.

Tests can address both of these problems.

Tests can aid a developer in understanding a piece of code in multiple ways. Earliest on, before any changes take place, characterization tests can reveal the previously mysterious behavior of a piece of code.

The idea with a characterization test is that you write a test for a certain method that you know will fail (e.g. `expect(customer.name).to eq('asdf')`), run the test, then change the test to expect the real value you saw (e.g. `expect(customer.name).to eq('Gern Blanston')`). Through this “reverse TDD” method, a picture of the code’s behavior eventually becomes clear.
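Here’s a minimal sketch of what that workflow might look like in RSpec. The Customer model and the values are just the hypothetical examples from above; the point is the sequence, not the specifics.

RSpec.describe Customer do
  describe '#name' do
    it 'returns whatever it currently returns' do
      customer = Customer.first # or however the object under test gets built

      # Step 1: assert a value we know is wrong and run the test...
      # expect(customer.name).to eq('asdf')

      # Step 2: ...then copy the actual value from the failure message
      # into the expectation.
      expect(customer.name).to eq('Gern Blanston')
    end
  end
end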

Later on, once a robust suite of tests has been built up around an area of functionality, tests can aid in the understandability of the code by enabling refactoring.

If my code is well-covered by tests, I can feel free to give descriptive names to obscurely-named variables, break large methods into smaller ones, break large classes into smaller ones, and whatever other changes I want. The knowledge that my tests protect against regressions can allow me to make changes with a sufficient level of confidence that my changes probably aren’t breaking anything.

Of course, the scope of my refactorings should be proportionate to my level of confidence that my changes are safe. Grand, sweeping refactorings will pretty much always come back to bite me.

Deployments become less scary

Legacy projects are often figuratively and literally more expensive to deploy than well-done projects.

The story often goes like this: management observes that each time a deployment happens, bugs are introduced. Bugs frustrate users, erode users’ trust, incur support cost, and of course incur additional development cost.

Because the organization wants to minimize these problematic and expensive bugs, an extensive manual pre-deployment QA process is put in place, which of course costs money. Often there will be a “feature freeze” leading up to the deployment during the QA period, which of course incurs the opportunity cost of discontinuous developer productivity.

And because releases are so risky and expensive, management wants to do the risky and expensive thing less frequently. But this unfortunately has the opposite of the intended effect.

Because each deployment is bigger, there’s more stuff to go wrong, a greater “risk surface area”. It also takes more time and effort to find the root cause of any issue the deployment introduced because a) there’s more stuff to sort through when searching for the root cause and b) the root cause was potentially introduced a long time ago and so may well not be fresh in the developers’ minds.

Tests can help make deployments less risky in two ways. For one, tests tend to:

  • Help protect against regressions
  • Help protect against new bugs
  • Help improve code quality by enabling refactoring

All these things make the application less fragile, decreasing the likelihood that any particular deployment will introduce bugs.

Second, the presence of automated tests enables the practice of continuous deployment.

If an application can be tested to a reasonable degree of certainty by running an automated test suite that takes just a few minutes to run, then each developer can potentially deploy many times per day. Spread across the team, this might mean dozens or hundreds of deployments per day.

Contrast this to the alternative. If there is no automated test suite and each deployment requires the QA team to go through a manual QA process first, then that QA process is a bottleneck that prevents deployments from happening very frequently. Even if the QA team were technically capable of running through the whole manual test process daily to allow daily deployments, they probably wouldn’t want to spend all their time doing nothing but pre-deployment testing. And that would still only get you one deployment per day instead of the dozens or hundreds that continuous deployment would allow.

The obstacles to putting tests on a legacy project

Let’s say you’ve read the above and you say, “Great. I’m sold on adding tests to my testless legacy project. How do I start?”

Unfortunately it’s not usually simple or easy to add tests to a currently-untested legacy project. There are two main obstacles.

Missing test infrastructure

Writing tests requires a test infrastructure. In order to write a test, I need to be able to get the part of the application I’m testing (often called the “system under test” or SUT) into a state that allows me to exercise the code and make the assertions I need to make in order to tell whether the code is working.

Getting a test framework and related testing tools installed and configured can be non-trivial. I don’t know of any way to make this easier; I just think it’s a good thing to be aware of.

I’ve worked in teams where the boss says, “Let’s make sure we write tests for all new features.” This seems like a sensible rule on the surface but unfortunately there’s no way this rule can be followed if there’s not already a testing infrastructure and testing strategy in place.

The development team and leadership have to agree on what testing tools they’re going to use, which itself can be a very non-trivial step. All the right people have to have buy-in on the testing tooling and testing strategy or else the team will experience a lot of drag as they try to add tests to the application. You have to pull the car onto the road before you hit the gas. If you hit the gas while you’re still in the garage, you’re just going to crash into the wall. So make sure to think about the big picture and get the right team members on board before you try to add tests to your legacy application.

Dependencies and tight coupling

Sometimes it’s easy to get the SUT into the necessary state. More often in legacy projects, it’s very difficult. Sometimes it’s practically impossible. The reason for the difficulty is often tight coupling between dependencies.

Here’s what I mean by tight coupling. Let’s say we want to test class A which depends on classes B and C, which in turn depend on classes D, E, F and G. This means we have to instantiate seven objects (instances of A, B, C, D, E, F and G) just to test class A. If the dependency chain is long enough, the test may be so painful to write that the developer will decide in frustration to toss his laptop off a bridge and begin a new life in subsistence farming rather than write the test.
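To make that concrete, here’s a hypothetical illustration (none of these classes come from a real project) of what that kind of dependency chain looks like and why testing A drags six other classes along with it:

# Hypothetical tightly coupled classes. A builds B and C itself, and B and C
# build D, E, F and G, so there's no way to exercise A in a test without
# implicitly instantiating all seven objects.
class D; end
class E; end
class F; end
class G; end

class B
  def initialize
    @d = D.new
    @e = E.new
  end
end

class C
  def initialize
    @f = F.new
    @g = G.new
  end
end

class A
  def initialize
    @b = B.new # A decides for itself what its dependencies are...
    @c = C.new # ...so a test can't substitute anything simpler.
  end
end

A looser version would let the caller pass b and c into A instead, which is the dependency injection idea that comes up later in this post.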

How to overcome the obstacles to adding tests to legacy code

Missing test infrastructure

If a project has no tests at all, you have two jobs before you:

  1. Write some tests
  2. Set up the testing infrastructure to make it possible to write tests

Both of these things are hard. I would want to “play on easy mode” to the extent possible when I’m first starting to put test coverage on the project.

So rather than trying to identify the most mission-critical parts of the application and putting tests on those, I would ask myself, “What would be easiest to test?” and write some tests for that area of the code.

For example, maybe the application I’m working on has a checkout page and a contact page. The checkout page is obviously more mission-critical to the business than the contact page. If the checkout page breaks there are of course immediate financial and other business repercussions. If the contact page breaks the consequences might be negligible. So a set of tests covering the checkout page would clearly be more valuable than a set of tests covering the contact page.

However, the business value of the set of tests I’m considering writing isn’t the only factor to take into consideration. If a set of tests for the checkout page would be highly valuable but the setup and infrastructure work necessary to get those tests into place is sufficiently time-consuming that the team is too daunted by the work to ever even get started, then that area of the code is probably not the best candidate for the application’s first tests.

By adding some trivial tests to a trivial piece of functionality, a beachhead can be established that makes the addition of later tests that much easier. It may be that in order to add a test for the contact page, it takes 15 minutes of work to write the test and four hours of work to put the testing infrastructure in place. Now, when I finally do turn my attention to writing tests for the checkout page, much of the plumbing work has already been done, and now my job is that much easier.
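As a sketch of just how trivial that beachhead test can be (assuming RSpec and Capybara, and a made-up /contact route), it might be nothing more than this:

# spec/features/contact_page_spec.rb
# A deliberately trivial first feature spec. Its main value isn't the behavior
# it verifies but the testing infrastructure it forces into place.
require 'rails_helper'

RSpec.describe 'Contact page', type: :feature do
  it 'shows the contact heading' do
    visit '/contact' # hypothetical path
    expect(page).to have_content('Contact')
  end
end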

Dependencies and tight coupling

If a project was developed without testing in mind, the code will often involve a lot of tight coupling (objects that depend closely on other objects) which makes test setup difficult.

The solution to this problem is simple in concept – just change the tightly coupled objects to loosely coupled ones – but the execution of this solution is often not simple or easy.

The challenge with legacy projects is that you often don’t want to touch any of this mysterious code before it has some test coverage, but it’s impossible to add tests without touching the code first, so it’s a chicken-egg problem. (Credit goes to Michael Feathers’ Working Effectively with Legacy Code for pointing out this chicken-egg problem.)

So how can this chicken-egg problem be overcome? One technique that can be applied is the Sprout Method technique, described both in WEWLC as well as Martin Fowler’s Refactoring.

Here’s an example of some obscure code that uses tight coupling:

require 'open-uri'
file = open('http://www.gutenberg.org/files/11/11-0.txt')
text = file.read
text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last
whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

If you looked at this code, I would forgive you for not understanding it at a glance.

One thing that makes this code problematic to test is that it’s tightly coupled to two dependencies: 1) the content of the file at http://www.gutenberg.org/files/11/11-0.txt and 2) stdout (the puts on the second-to-last line).

If I want to separate part of this code from its dependencies, maybe I could grab this chunk of the code:

text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last

I can now take these lines and put them in their own new method:

require 'open-uri'

def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

file = open('http://www.gutenberg.org/files/11/11-0.txt')
whash, lword = count_words(file.read)

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

I might still not understand the code but at least now I can start to put tests on it. Whereas before the program would only operate on the contents of http://www.gutenberg.org/files/11/11-0.txt and output the results to the screen, now I can feed in whatever content I like and have the results available in the return value of the method. (If you’d like to see another example of the Sprout Method technique, I wrote a post about it here.)

In addition to the Sprout Method technique I’ve also brought another concept into the picture which I’ll describe now.

Dependency injection

Dependency injection (DI) is a fancy term for a simple concept: instead of an object creating or fetching its own dependencies, they’re passed in from the outside (for example as constructor arguments that get stored in instance variables, or as arguments to a method).

I applied DI in my above example when I defined a count_words method that takes a text argument. Instead of the method being responsible for knowing about the content it parses, the method will now happily parse whatever string we give it, not caring if it comes from open('http://www.gutenberg.org/files/11/11-0.txt') or a hard-coded string like 'please please please let me get what i want'.

This gives us a new capability: now, instead of being confined to testing the content of Alice’s Adventures in Wonderland, we can write a test that feeds in the extremely plain piece of content 'please please please let me get what i want' and assert that the resulting count of 'please' is 3. We can of course feed in other pieces of content, like 'hello-hello hello, hello' to ensure the proper behavior under other conditions (in this case, the condition when “weird” characters are present).
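Here’s a rough sketch of such a test, assuming RSpec and assuming count_words has been moved somewhere it can be loaded without triggering the network fetch:

RSpec.describe 'count_words' do
  it 'counts occurrences of each word' do
    whash, _longest_word = count_words('please please please let me get what i want')
    expect(whash['please']).to eq(3)
  end

  it 'strips punctuation and treats hyphens as word separators' do
    whash, _longest_word = count_words('hello-hello hello, hello')
    expect(whash['hello']).to eq(4)
  end
end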

If you’d like to see an object-oriented example of applying dependency injection to our original piece of legacy code, here’s how I might do that:

require 'open-uri'

class Document
  def initialize(text)
    @text = text
  end

  def count_words
    @text.gsub!(/-/, ' ')
    words = @text.split
    cwords = []
    words.each do |w|
      w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
      cwords << w.downcase
    end
    words = cwords
    words.sort!
    whash = {}
    words.each do |w|
      if whash[w].nil?
        whash[w] = 0
      end
      whash[w] += 1
    end
    whash = whash.sort_by { |k, v| v }.to_h
    swords = words.sort_by do |el|
      el.length
    end
    lword = swords.last
    [whash, lword]
  end
end

file = open('http://www.gutenberg.org/files/11/11-0.txt')
document = Document.new(file.read)
whash, lword = document.count_words

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

Conclusion

Hopefully this post has armed you with a few useful techniques to help you get your legacy project under control and start turning things around.

“We don’t have time to write tests”

I wish I had a dollar for every time I heard some variation of “we would have written tests for this but we didn’t have time.” Let’s unpack the meaning of this statement.

The hidden (fallacious) beliefs behind “no time for tests”

If I say I didn’t write tests due to lack of time, I must view tests as something “extra”. Tests are an extra thing that takes up time but isn’t strictly necessary. The real work of course is the feature development itself, writing application code. If I write tests for my code I can finish in 10 hours but if I skip the tests I can finish in 6 hours. So by excluding tests, I’m making the responsible choice for the business and focusing only on what’s essential.

If I’m thinking this way, I’m fooling myself.

Tests save time in the short run

In my experience most developers believe that testing saves time in the long run but not in the short run. Tests are an investment you make: you slow things down today so that in six months things can be faster.

I don’t believe that this is an accurate view of reality. It’s true it often takes longer to write a feature plus tests than it would take to write the same feature without tests. For example, if I build a very simple form that saves a database record and that’s all there is to it, then skipping the test would save time (and writing the test would arguably not add huge value). But the second things get somewhat nuanced, tests start being a big time saver.

The reason is that everything needs to get tested; it’s just a question of whether that testing happens via automated tests or manual tests. Let’s say I’m working on an e-commerce application that has a coupon feature at checkout. Certain coupons can be stacked and certain ones can’t. Certain products are eligible for discounts and certain ones aren’t. There are a lot of possible paths through this feature. All the outcomes need to be tested and, every time the code gets changed, everything needs to be re-tested to check for regressions. Doing this manually is a huge pain in the ass. It’s easier, faster, more reliable and more enjoyable to write automated tests that check all this stuff for us. In this case we don’t have to wait six months before the tests start making our work faster. We experience the benefits of testing immediately.
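For illustration, here’s roughly what pinning down a couple of those coupon permutations might look like. The Order and Coupon models and their APIs here are invented; the point is that each rule gets checked automatically on every run instead of manually before every release.

RSpec.describe Order do
  describe '#apply_coupon' do
    it 'allows two stackable coupons on the same order' do
      order = Order.new
      order.apply_coupon(Coupon.new(code: 'SAVE10', stackable: true))
      order.apply_coupon(Coupon.new(code: 'SAVE5', stackable: true))

      expect(order.coupons.size).to eq(2)
    end

    it 'rejects a second coupon when the first one is not stackable' do
      order = Order.new
      order.apply_coupon(Coupon.new(code: 'HALFOFF', stackable: false))

      # CouponNotStackable is a hypothetical error class.
      expect {
        order.apply_coupon(Coupon.new(code: 'SAVE5', stackable: true))
      }.to raise_error(Order::CouponNotStackable)
    end
  end
end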

How to convince yourself or your manager to let you write tests

When the pressure of a deadline is bearing down upon you and you feel like all the stakeholders are breathing down your neck, it can often be very difficult to resist the temptation to skip testing.

Sometimes the pressure to skip tests comes from leadership/management. But in my experience the choice to skip tests usually belongs to the developers.

When you find yourself in those situations, here’s my advice. First ask, “Why do I feel tempted not to write tests?” For me personally, I feel tempted not to write tests when doing so would require me to interrupt my feeling of being “in the zone” so I can stop and puzzle over how to go about writing tests for the code I’m writing. I don’t want to stop the momentum; I want to move on to the next to-do on my list and demonstrate progress to my stakeholders. Under that kind of pressure, stopping to write a test can even feel irresponsible. So in these situations I remind myself not to listen to my feelings but instead listen to my brain. I let my brain remind me that it’s okay to take the time to do my job the most effective way I know how to do it. I give myself permission to do a good job. That might sound silly but for me it actually works.

If the instruction to skip testing comes from management I think there are three options.

One option is to try to educate management that writing tests actually saves time, not just in the long term but in the short term. Unfortunately, experience tells me that this approach is usually not successful. If management commands the development team not to write tests, the root cause is usually an unfixable “we’re too busy sawing to sharpen the saw” mindset that can’t be reversed with any amount of reasoning. Stupid managers can’t be trained to be smart. But if management is in fact smart but uneducated on testing, there’s hope. Managers really do want their developers to follow good practices to ensure sustainable productivity. If your manager is smart, he or she will probably be on your side on the testing issue, provided the manager really understands what it’s all about.

Another option is just to say fuck it and write tests anyway, against orders. I personally have never been big on the idea of doing my work in a dumb way just because my boss told me to. Sometimes I get in trouble for it, sometimes I don’t. One time I got fired for my uncooperativeness. Usually I don’t get in trouble though and everything works out fine. I will say, by the way, that I don’t think I’ve ever been ordered not to write tests. I have been ordered to do other dumb things though, like write code in isolation for three months before deploying to production. This leads me to option three.

If you’re continually ordered by management to skip tests, a very real option is just to leave and get a different job.

But no matter what your scenario, don’t fall into the fallacious belief that skipping tests saves time. It usually doesn’t.

Rails scaffolding and TDD are incompatible (but that’s okay)

Testing + TDD = a serious learning curve

Learning Rails testing is pretty hard. There are a lot of principles and tools to learn. Getting comfortable with testing in Rails (or any framework) often takes developers years.

Compounding the difficulty is TDD. If I’m just starting out with testing, should I learn TDD? Writing tests at all is hard enough. How am I supposed to write a test first?

And if the “testing + TDD” combo doesn’t generate enough cognitive turmoil, “testing + TDD + scaffolding” makes the scenario even murkier.

Scaffolding and TDD

In my experience most Rails developers take advantage of scaffolding to generate CRUD interfaces really quickly. Scaffolding is of course awesome and a big part of what makes Rails Rails.

Where things can get confusing is when you mix the ideas of “I want to use scaffolding to build an application quickly” and “I want to TDD my Rails app”. You cannot do both at the same time.

It seems obvious to me now but it took me a long time to realize it. In TDD, I write the tests before writing the code. Scaffolding generates code but not tests*, which is the exact opposite order. So I can’t apply TDD to scaffold-generated code. Test-after is my only option.

*It’s true that an application can be configured to automatically generate the shells of tests whenever a scaffold is generated, but they’re just shells of tests and they don’t actually test anything.

What to do about this TDD-scaffolding incompatibility

So, scaffolding and TDD are incompatible. What do we do about this?

My answer is nothing. The fact that scaffold-generated code can’t be TDD’d isn’t a bad thing, it’s just a fact. I’m making a conscious trade-off: in exchange for getting a bunch of CRUD code for free, I’m trading away the benefits of TDD in that particular area. I think on balance I come out ahead.

This fact does mean that I have to exercise a little more discipline when I’m working with scaffold-generated code. TDD automatically results in good code coverage because, if I follow the methodology to the letter (which BTW I don’t always), I never write a line of code without writing a test first. The rhythm of the red-green-refactor loop means that I don’t really have to apply much discipline. I just put one foot in front of the other according to the rules of TDD and when the dust settles I have good test coverage.

But when doing test-after, I need to resist the temptation to say “Hey, this code works. Do I really need to write a test? I’ll just move onto the next thing.”

How I write tests for scaffold-generated code

After I generate a scaffold I’ll usually follow something like the following process.

  1. Write model specs for attribute validation
  2. Write a feature spec for the “happy path” of creating the resource
  3. Write a feature spec for the “happy path” of updating the resource
  4. Look for anything that would be tedious to regression-test manually and write a feature spec for that
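Here’s a rough sketch of what steps 1 and 2 might look like for a hypothetical scaffold-generated Customer resource (assuming RSpec and Capybara; the attribute names and button text are whatever the scaffold generated):

# spec/models/customer_spec.rb
require 'rails_helper'

RSpec.describe Customer, type: :model do
  it 'is invalid without a name' do
    expect(Customer.new(name: '')).not_to be_valid
  end
end

# spec/features/customers_spec.rb
require 'rails_helper'

RSpec.describe 'Customers', type: :feature do
  it 'lets me create a customer (happy path)' do
    visit new_customer_path
    fill_in 'Name', with: 'Gern Blanston'
    click_on 'Create Customer'

    expect(page).to have_content('Gern Blanston')
  end
end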

That’s about all I worry about for scaffold-generated code. If during the course of development a stupid bug pops up that a test would easily have prevented, I’ll backfill that code path with a test. And if I need to extend the functionality provided by the scaffold (in other words, if I need to write new code manually), I’ll TDD that, because at that point there’s nothing stopping me from doing so.

When I use controller/request specs and when I don’t

Every so often I come across the question of when to use controller specs or request specs.

Before I can address this I need to get some terminology out of the way. In RSpec there are controller specs and request specs, which are similar but subtly different. Let’s talk about what the difference is.

Controller specs vs. request specs

As of (apparently) July 2016, the Rails team and the RSpec core team recommend using request specs instead of controller specs. Here’s why, in their words:

For new Rails apps: we don’t recommend adding the rails-controller-testing gem to your application. The official recommendation of the Rails team and the RSpec core team is to write request specs instead. Request specs allow you to focus on a single controller action, but unlike controller tests involve the router, the middleware stack, and both rack requests and responses. This adds realism to the test that you are writing, and helps avoid many of the issues that are common in controller specs. In Rails 5, request specs are significantly faster than either request or controller specs were in rails 4, thanks to the work by Eileen Uchitelle of the Rails Committer Team.

If I had to extract a TL;DR from this I’d put it this way: “The RSpec team recommends that we use request specs instead of controller specs. The reason is that request specs provide a little bit more of a realistic test bed than controller specs”.

So henceforth in this post I won’t be mentioning controller specs, only request specs.

I only use request specs in three scenarios

My default policy is not to use request specs at all. I find that fine-grained model specs and coarse-grained feature specs give me a sufficient level of coverage. Plus the fact that I put very little code in my controllers means that controller-focused tests would add relatively little value.

There are three specific scenarios though where I find request specs helpful.

Scenario 1: legacy projects

If my controllers are skinny to the point of being skeletal then I don’t feel a strong need to test them directly (although I certainly still test the controller behavior indirectly with feature specs).

If on the other hand my controllers are morbidly obese, as they often are in Rails legacy projects, then I find request specs to be a very handy tool.

When I encounter a bloated controller method my first desire is just to scoop out the body of that method and move it to either a new or existing model object. Unfortunately that’s not always a safe option due to the code in that method being tightly coupled to the particulars of that controller. Adding to the risk is that I often don’t understand exactly what the controller method is supposed to do.

So in these scenarios I’ll write a characterization test in the form of a few request specs that cover that controller method’s behavior.
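Such a characterization request spec might look something like the following. The OrdersController behavior here is invented for illustration; the real expectations would be whatever the bloated controller method actually does today.

# spec/requests/orders_spec.rb
require 'rails_helper'

RSpec.describe 'Orders', type: :request do
  describe 'POST /orders' do
    it 'creates an order and redirects to it' do
      expect {
        post orders_path, params: { order: { quantity: 2 } }
      }.to change(Order, :count).by(1)

      # As with any characterization test, this expectation gets adjusted
      # until it matches what the controller currently does.
      expect(response).to redirect_to(order_path(Order.last))
    end
  end
end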

These request specs will bring me two benefits. First, the process of interrogating the code via tests will often help me gain a decent level of understanding of what the code is doing. Second, the tests themselves will provide a safety net that mitigates the risk of any change.

In this case my request specs hopefully have a limited lifespan. Hopefully my request specs will assist me in moving the controller code into models. Then I can reincarnate the request specs as model specs.

Scenario 2: APIs

I said earlier that I tend not to use request specs because the combination of model specs and feature specs usually gives me the level of confidence I desire.

Since APIs don’t have GUIs that can be exercised, feature specs are out of the picture. So in these cases I instead use request specs.

Hopefully my API code is designed such that most of the heavy lifting is done by the models and tested by model specs. I don’t want to have to test a whole bunch of code path permutations via request specs. If there’s complexity to be tested, I want it to be tested via a model spec. To me a request spec on an API serves the same general purpose as a feature spec does on a complete web application: to answer the question, “Do all the layers of the application work together?”

When I’m coding a new “traditional” Rails app I actually disable request specs along with a handful of other spec types.

Scenario 3: it’s too awkward or costly to write a feature spec

There are some scenarios where it would be too awkward, impractical or slow to write a full feature spec for the controller behavior I want to test.

For example, let’s say I want the delete action of a controller to allow deletion of the resource under certain conditions but not under certain other conditions. It might be too slow or cumbersome to write a full feature spec for this, so in this case I might just write a request spec.
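A sketch of what that might look like, with a made-up rule that only non-finalized invoices may be deleted:

# spec/requests/invoices_spec.rb
require 'rails_helper'

RSpec.describe 'Invoices', type: :request do
  describe 'DELETE /invoices/:id' do
    it 'deletes an invoice that has not been finalized' do
      invoice = Invoice.create!(finalized: false)

      delete invoice_path(invoice)

      expect(Invoice.exists?(invoice.id)).to be(false)
    end

    it 'refuses to delete a finalized invoice' do
      invoice = Invoice.create!(finalized: true)

      delete invoice_path(invoice)

      expect(Invoice.exists?(invoice.id)).to be(true)
    end
  end
end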

“It’s ugly but it works”

“It’s ugly but it works” is a statement I’ve heard kind of a lot in programming. On the surface it sounds like it represents some sort of reasonable pragmatism. “I got this code working and I’m not going to waste a bunch of time gold-plating it”.

I really dislike this saying, though. There are two reasons why.

“Ugly” is just a euphemism for “hard to understand”

When a programmer refers to a piece of their own code as “ugly”, what they usually actually mean is that it’s hard to understand. “It’s ugly but it works” is a much more defensible-sounding statement than “It’s hard to understand but it works”.

“It works” is only half the job of code

In Refactoring, Martin Fowler wrote, “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”

The reason that it’s important for humans to be able to understand code is of course that it’s humans who are responsible for maintaining the code. If a piece of code works but is impossible to understand, then it’s going to be a lot more time-consuming and expensive to maintain.

“But what about deadlines?”

Sometimes time pressure drives programmers to cut corners and write worse code than they could. I’ve certainly done it.

I would argue, though, that writing bad code usually isn’t even advantageous for the purpose of meeting deadlines. In fact, I tend to find that I can write code faster if I not only focus on code quality but also clean up some of the existing code before adding new code. Consciously incurring technical debt in order to meet a deadline usually just makes the next deadline even harder to meet.

Avoiding Network Calls In Rails Tests Without Using Mocks, Stubs or VCR

An important principle of testing is that tests should be deterministic. The passing or failing of a test shouldn’t depend on the date it was run, whether any certain other tests ran before it, or whether some condition outside the application changed.

If a test hits an external URL then it’s susceptible to being non-deterministic. I’ll give you an example from something I was working on today.

Today I wrote a test that said, “When I save a new podcast to the database, all the episodes listed in that podcast’s XML feed should get saved to the database.” I pointed my code at the XML feed URL for the Tim Ferriss Show and expected that when I saved the Tim Ferriss Show, 349 episodes would go into the database.

You can probably sense the problem with this test. What happens when I run this test next week when Tim Ferriss has 350 episodes? The test will fail even though the code will still work.

The solution to this problem was to change the test so it doesn’t hit the URL. But I didn’t use VCR or mocks or stubs. I used dependency injection.

Basically I changed the level of responsibility of the method that parses the XML feed. The original “agreement” was “I’ll give you an XML feed URL and you grab its contents from the internet and parse the file and save the episodes to the database”. I changed that agreement to “I’ll give you the contents of an XML file (which could have come from the internet or from the filesystem) and you parse it and save the episodes to the database”.

Here’s what my parsing method looks like:

def save_and_parse!(xml_feed_contents)
  save!

  PodcastRSSFile.new(
    show: self,
    contents: xml_feed_contents
  ).consume!
end

This method (which you can see in context on GitHub here) doesn’t know or care where the XML feed contents came from.

This means that in production I can pass XML feed contents that came from the internet, and in test I can pass XML feed contents that came from the filesystem.
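Here’s a sketch of what the test side of that looks like. The Show and Episode model names, the fixture path and the episode count are placeholders; the real test uses whatever sample feed file is checked into the repo.

RSpec.describe Show do
  describe '#save_and_parse!' do
    it 'saves one episode per item in the feed' do
      # Read a snapshot of the feed from the filesystem instead of the
      # internet, so the result can't change when the real feed changes.
      xml = File.read(Rails.root.join('spec/fixtures/files/sample_feed.xml'))

      show = Show.new(name: 'Sample Podcast')

      expect { show.save_and_parse!(xml) }
        .to change(Episode, :count).by(3) # however many items the fixture has
    end
  end
end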

Does this mean that the actual act of downloading the XML feed content from the internet is untested? Yes. That’s a downside, but I think it’s a small one and I prefer that downside over the downside of having non-deterministic tests.

By the way, I recorded myself writing this test as part of one of my free Rails testing workshops. You can see the recording on YouTube here.

Common legacy project challenges and how to address them

What is a legacy project?

The terms “legacy project” and “legacy code” mean different things to different developers. I think one thing that most developers could agree on is that a legacy project is a project that’s difficult or painful to maintain.

Spend enough years in the software industry and you might find yourself forming the opinion that most projects are legacy projects. Some projects, of course, are “more legacy” than others.

What follows is a list of what I’ve found to be some of the most common characteristics of legacy projects. Under each item I’ve included the antidote to the issue.

Poor processes and/or incompetent team members

Out of all the challenges of a legacy project I think this one has the biggest impact and is the most challenging to fix. To be totally frank, it’s often a problem that’s impossible to fix, and I think it’s a big part of why so many software projects fail and so much software out there is half-broken.

The problem is this: projects end up with bad code because the code was brought into existence by bad processes and/or incompetent developers. The code doesn’t exist separately from the people and processes. The quality of the code is a direct function of the people and processes that created it.

Let’s imagine a team whose leader believes code should only be released “when it provides value to the user”, and so deployments only happen once every three months. When the deployments do happen, they’re risky, shuttle-launch type affairs with a lot of stress leading up to them and a lot of firefighting during and after. And let’s say due to time pressure the dev team believes they “don’t have time to write tests”. Instead of writing small and crisply-defined user stories, the team works on big, nebulous stories that drag on and on because no one ever nailed down where the finish line is.

If you asked me to make a prediction about this team I would guess there’s a good chance they’re going to write some pretty bad code (that is, code that’s risky and time-consuming to maintain). If I were going to make a bet about the team’s meeting a deadline, I would bet heavily against them based on the way they work.

The antidote

So, what can be done about people and process problems? What unfortunately compounds this problem is that you can’t just show the team the answers to their problems and expect things to start turning around. Sometimes dev teams and managers follow bad practices out of simple inexperience. In those cases it’s relatively easy to educate them into better behavior.

Just as often, the problem is impossible to solve. I consider there to be three types of employees:

  1. People who are already good
  2. People who aren’t good yet but can be later
  3. People who aren’t good and never will be

If a team is made up of people of types 1 and 2 then there’s hope. These people can be educated into smarter processes that produce more easily maintainable code.

The people of type 3 just need to go. Unfortunately, in legacy projects, type 3 people are often the people in charge and so there’s no one to fire them. I had a boss once who forced me to write code for three months before deploying it. This inevitably worked out badly. The proposal of continuous deployment was met with a hard no (as were roughly all the other proposals made to that particular manager). I once worked for an organization that drove a ~$300,000 project into the ground by failing to put the mysterious work first. Then they put me on another project which they began the exact same way. I warned them that they were headed for the same fate a second time but they ignored me. Most of this particular organization’s projects failed, and for similar reasons. If an organization’s leadership refuses to learn from conventional wisdom or even from their own failures, then the word for this is stupidity and it cannot be fixed. The only option for a developer in this situation is to quit and get a job somewhere better.

But what about cases where both leadership and the development team have a willingness to learn what it would take to make things better?

The antidotes below cover some of the most common practices I see teams failing to follow, practices that could have a large impact if a team were to start following them.

Now that I’ve addressed the somewhat meta (but super important) matter of people and processes, let’s talk about some of the code-related issues.

Few tests or no tests

In his book Working Effectively with Legacy Code, Michael Feathers defines legacy code as simply “code without tests”. I find this definition interesting for its conciseness.

Think of the benefits that testing affords that you’re missing in a test-free legacy project. You can’t make changes without being reasonably confident you’re not introducing regressions. You can’t refactor safely. You don’t have an executable specification of what the software is supposed to do, so your only recourse is to squint at the code (which is often hard or impossible to understand) to try to figure out what it does.

The antidote

If a project doesn’t have tests then the antidote is of course to write tests. It’s unfortunately not that simple, though.

I’ve worked on teams where leadership imposes a rule: “All new code must have tests.” In my experience this rule doesn’t work very well. I’ll explain why.

Not all code is equally testable. Dependencies make code hard to test. If a project was written without testing in mind there are probably a lot of entangled dependencies blocking developers from writing tests on that code. There’s a chicken-egg problem: the code needs to be changed in order to break the dependencies and make it testable, but the developers can’t safely change the code until tests are in place. This is why adding tests to legacy projects is hard.

The solution in this situation is really just to buy Working Effectively with Legacy Code by Michael Feathers. The techniques described in the book like Sprout Class, Sprout Method, characterization testing and others are super useful tools in putting tests on legacy code.

If a project has no tests at all it can be especially challenging. You might wonder what to test first. Your instincts might tell you to try to test the most important code first, or to start writing tests with whatever your next feature is. My inclination though would be to start with whatever’s easiest. If it’s easiest to write a test verifying that the sign-in page says “Sign in”, then I might write that test first, even though it might feel stupid and meaningless. Then I might test something slightly more difficult to test, and so on, until I’m able to test pretty much anything in the application that I want to.

Mysteriously-named variables, methods, etc.

I define bad code as code that’s risky and/or expensive to change. “Expensive” is basically interchangeable with “time-consuming”.

If the variables, methods and classes in a project are named in a way that’s misleading or difficult to understand, then the project will be more time-consuming than necessary to maintain. As I explain in Variable Name Anti-Patterns, it’s better to save time thinking than to save time typing. Typing is cheap. Thinking is expensive.

The antidote

Luckily it’s usually pretty straightforward to change the names of things, although sometimes you’ll find entities so deeply baked into the code and the database that it’s prohibitively expensive to go back and change them at this point.

But if you find a bad variable name, how do you decide what to rename it to? My rule for naming things is: call things what they are. This might sound super obvious but based on how often I encounter entities that are named for something other than what they are, it’s a rule that I think is worth explicitly stating.

Sometimes a class is poorly named just because the author didn’t bother to put any thought into the matter. But other times there’s a misfit between the name of the class and the abstraction that ought to exist. So sometimes, instead of or in addition to renaming a class, I’ll invent a new abstraction and move some of that class’s code into the new abstraction.

You’ll find more advice on naming in Variable Name Anti-Patterns.

Original maintainers who are no longer available

One particularly painful aspect of legacy projects is that the people who wrote the original code are often no longer around. So when you want to ask, “What is THIS here for?” or “What the fuck were you thinking, DALE?!?!”, you can’t, because Dale writes shitty code somewhere else now.

The antidote

Unfortunately there really is nothing you can do about this one. But you can prevent the same thing from happening to someone else. You can start to write documentation for the project.

In my mind there are at least two kinds of documentation for a project: technical documentation and domain documentation. Both can be super helpful.

The thing I like about writing domain documentation is that it doesn’t go out of date as quickly as documentation about the particulars of the project. Some domains can be learned in books, Wikipedia, etc. but other domains require knowledge that mostly only exists in people’s heads. Every organization also has “institutional knowledge” or “tribal knowledge” which is stored in the heads of the employees but isn’t written down anywhere. It helps to write these things down.

When it comes to technical documentation I don’t find much value in documenting specific methods, classes, etc., although that is sometimes helpful. I prefer to start with things like how to get the project set up locally, how the production environment is structured, and stuff like that. The process of writing this documentation can often help the team (and individual team members) understand things that they wouldn’t have otherwise investigated.

Unknown proper behavior

Does the code work? That question is sometimes impossible to answer because you don’t even know what the code is supposed to do.

One of the quirks of legacy project maintenance is that the important thing isn’t to ensure correct behavior, it’s to preserve current behavior. I’ve heard of cases where a legacy project contains a bug, but users actually depend on that buggy behavior, so fixing the bug would itself be a bug.

The antidote

One way to help ensure that current behavior is being preserved is to use characterization testing. I think of characterization testing like TDD in reverse. Here’s how I write characterization tests.

Let’s say I’m writing a characterization test for a method. First I’ll comment out the entire body of the method. Then I’ll write a test. I might not know what to assert in the test, so I’ll assert that the method will return “asdf”. I of course know that the method won’t return “asdf”, but I can run the test to see what the method really does return, then change my assertion to match that.

Once I have a test case I’ll uncomment whatever’s needed to make that test pass. In this sense I’m following the golden rule of TDD: don’t write any new code without having a test that covers it. In this case the code has already been written but that’s just kind of a technicality. I’m starting with no code (it’s all commented out) and then I’m “writing” the code by uncommenting it.

The result of my “reverse TDD” process and regular TDD is the same: I end up with more or less full test coverage. Once I have that I’m free to refactor with confidence that I’m preserving current behavior.

Bugs

Bugs are of course a very common characteristic of legacy projects. I don’t think I need to spend much room discussing bugs.

The antidote

The antidote to having bugs is of course to fix them. That sounds simple, but it’s much easier said than done.

I’ve worked on teams where we have nasty bugs in the application but leadership prioritizes new development over bugfixes. This is a failure in short-term/long-term thinking. Prioritizing new development over bugfixes might provide more speed in the very short term but in the long term it slows things way down. In order for things to turn around, leadership has to have the discipline to temporarily slow down development in order to get the application into a more stable state. If leadership is too stubborn to do this then there’s probably no hope for the project.

Automated tests can of course help with bugfixes, particularly regressions. It can be very frustrating for everyone involved when the developers fix a bug only to have the same exact bug reappear later. If tests are written along with each bugfix (which is again easier said than done) then the team can be reasonably confident that each bugfix will be permanent.

Factories and fixtures in Rails

One necessity of automated testing is having some data to test with. There are three ways I know of to generate test data in Rails:

  • Manually
  • Fixtures
  • Factories

All three can be viable solutions depending on the situation. Let’s first explore manual creation.

Manual data creation

Manual data creation can be convenient enough if you only have a few attributes on a model and no dependencies. For example, let’s say I have a PaymentMethod model with just one attribute, name. If I want to test both valid and invalid states, that’s easy:

valid_payment_method = PaymentMethod.new(name: 'Visa')
invalid_payment_method = PaymentMethod.new(name: '')

But now let’s say we have the idea of an Order which is made up of multiple LineItems and Payments, each of which has a PaymentMethod.

order = Order.create!(
  line_items: [
    LineItem.create!(name: 'Electric dog polisher', price_cents: 40000)
  ],
  payments: [
    Payment.create!(
      amount_cents: 40000,
      payment_method: PaymentMethod.create!(name: 'Visa')
    )
  ]
)

That’s annoying. We had to expend mental energy to arbitrarily come up with details (“Electric dog polisher” with a price of $400.00, paid by Visa) that aren’t even necessarily relevant.

What if all we wanted to test was that when the payment total equals the line item total, order.balance_due returns zero?

This is where factories come in handy.

Factories

Factories are actually not a test-specific concept. “Factory method” is a design pattern that you can find in the “Gang of Four” design patterns book. The pattern just happens to come in handy for the purpose of testing.

The idea with a factory is basically that you have a method/function that generates new class instances for you.

In the Gang of Four book they use an example where a factory called Creator will return either an instance of MyProduct or YourProduct depending on certain conditions. So rather than having to say “if this case, then instantiate a MyProduct, otherwise, instantiate a YourProduct” all over the place, you can just say Creator.create(relevant_data) and get back the appropriate class instance.
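A bare-bones Ruby rendition of that idea might look like this (the class names follow the book’s example; the condition is made up):

class MyProduct; end
class YourProduct; end

class Creator
  # The factory method: callers ask for "a product" and the factory decides
  # which concrete class to instantiate based on the data it's given.
  def self.create(relevant_data)
    if relevant_data[:mine]
      MyProduct.new
    else
      YourProduct.new
    end
  end
end

Creator.create(mine: true)  # => an instance of MyProduct
Creator.create(mine: false) # => an instance of YourProduct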

You might be able to imagine how such a factory would be useful. In the case of testing, the kind of factories we want will be a little bit different. We’re not concerned with abstracting away whether class A or class B gets instantiated. We want to abstract away the generation of irrelevant details and tedious-to-set-up model associations.

Here’s an example of how the setup for an Order instance might look if we used a factory, specifically Factory Bot. (By the way, Factory Bot is the most popular Rails testing factory but there’s nothing particularly special about the way it works. The ideas in Factory Bot could have been implemented any number of ways.)

order = FactoryBot.create(
  :order,
  line_items: [FactoryBot.create(:line_item, price_cents: 40000)],
  payments: [FactoryBot.create(:payment, amount_cents: 40000)]
)

In this case we’re specifying only the details that are relevant to the test. We don’t care about the line item name or the payment method. As long as we have a payment total that matches the line item total, that’s all we care about.
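With that setup in place, the test we actually wanted to write becomes short and to the point (balance_due and the factory definitions themselves are assumed to exist):

RSpec.describe Order do
  it 'has a balance due of zero when payments cover the line item total' do
    order = FactoryBot.create(
      :order,
      line_items: [FactoryBot.create(:line_item, price_cents: 40000)],
      payments: [FactoryBot.create(:payment, amount_cents: 40000)]
    )

    expect(order.balance_due).to eq(0)
  end
end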

How would we achieve this same thing using fixtures?

Fixtures

First off let me say that I’m much less experienced with fixtures than I am with factories. If you want to learn more about fixtures I suggest you read something from someone who knows what they’re talking about.

Having said that, if we wanted to set up the same test data that we did above using factories, here’s how I believe we’d do it with fixtures. Typically fixtures are expressed as YAML files.

# orders.yml
payments_equal_line_item_total:
  # no attributes needed

# line_items.yml
electric_dog_polisher:
  order: payments_equal_line_item_total
  name: 'Electric dog polisher'
  price_cents: 40000

# payment_methods.yml
visa:
  name: 'Visa'

# payments.yml
first:
  order: payments_equal_line_item_total
  payment_method: visa
  amount_cents: 40000

Once the fixture data is established, instantiating an object that uses the data is as simple as referring to the key for that piece of data:

order = orders(:payments_equal_line_item_total)

So, what’s best? Factories, fixtures, or manual creation?

Which is best?

I think I’ve demonstrated that manually generating test data can quickly become prohibitively tedious. (You can often mitigate this problem, though, by using loose coupling and dependency injection in your project.) But just because manual data generation doesn’t work out great in all scenarios doesn’t mean it’s never useful. I still use manual data generation when I only need to spin up something small. It has a benefit of clarity and low overhead.

Like I said above, I’ve definitely used factories more than fixtures. My main reason is that when I use factories, the place where I specify the test data and the place where I use the test data are close to each other. With fixtures, on the other hand, the setup is hidden. If I want to see exactly what test data is being generated, I have to go into the YAML files and check. I find this too tedious.

Another issue I have with fixtures in practice is that, in my experience, teams will set up a “world of data” and then use that big and complicated world as the basis for all the tests. I prefer to start each test with a clean slate, to the extent possible, and generate for each test the bare minimum data that’s needed for that test. I find that this makes the tests easier to understand. I want to be clear that I think that’s a usage problem, though, not an inherent flaw in the concept of fixtures.

There’s also no reason why you couldn’t use both factories and fixtures in a project. I’m not sure what the use case might be but I could imagine a case where fixtures provide some minimal baseline of fixed data (the payment types of Visa, MasterCard, cash, etc. come to mind as something that would be often-needed and never changed) and then factories take the baton from there and generate dynamic data on a per-test basis.

My go-to method for generating test data is factories, and that’s what I generally recommend to others. But if the use case were right I wouldn’t hesitate to use fixtures instead of or in addition to factories.

Why “users don’t care about code” is a bad saying

I’ve heard it said a lot of times that “users don’t care about code”. Along with things like “perfect is the enemy of the good”, this saying falls under the category of technically true, but unhelpful.

In order to understand why this saying is harmful, let’s dissect it. When people say “users don’t care about code”, what’s the meaning behind it? Here’s what I think they’re trying to communicate:

Jason, don’t waste time perfecting the code, which users won’t ever see. Users don’t care how clean or modular or whatever your code is. Users only care that the product does what they need it to do. So please, be reasonable. Focus on the part of the work that really matters.

On the surface this appeal seems totally reasonable. Stop polishing that code for the 15th time, Jason! We need to move, we need to ship!

But like I’ve said elsewhere, the general danger in coding is not that developers spend way too much time gold-plating the code. A far bigger problem in the industry is that developers rush their work and do a shit job that paints them tighter and tighter into a corner until they literally aren’t able to ship features anymore.

I’ve seen teams get close to this point. In 2016-2017 I worked with a team that theoretically released to production once a month. However, their codebase was so fragile that every time new code was written, the new code broke existing features. Plus the new code itself usually didn’t work right. QA would find bugs and the development team would have to revisit their features. So the release would get delayed. And delayed. And delayed. Frustration would mount. Finally the release would go out, two weeks late, and now the dev team would be under increasing pressure to deliver the next chunk of work in less time. So what would the developers do? Same thing everyone always does, which is cut corners to get the job done on schedule. This would of course further exacerbate the problem and the next release would take even longer.

What caused these delayed releases and fragile features? Bad code. Did the users care about the delayed releases and fragile features? Yes. These issues had an acute negative impact on users and on people throughout the organization I was working for.

So yes, it’s true that users don’t literally care about actual code. But when your code is so shitty that your product is full of bugs and you can’t get a release out to save your life, users care about that.

What this has to do with testing

If a developer or manager believes that users don’t care about code, he or she probably also believes that users don’t care about test coverage.

And in the same sense that it’s technically true that users don’t care about code, it is technically true that users don’t care about tests.

When a dev team is under pressure to release a feature, what’s usually the first thing to go? Tests, of course. I can’t count how many times I’ve heard somebody say something like, “We would have written tests but we wanted to get something out quickly.” It’s commonly believed that skipping tests saves time but except for the most trivial features it’s usually a lot more time-consuming to test features manually. Like every other form of cutting corners, skipping tests creates technical debt and makes our work harder.

Cutting corners and consciously incurring technical debt doesn’t just make our work harder at some abstract distant time in the future. Technical debt costs technical interest. Those interest payments typically start being due immediately. And the more technical debt we have, the slower and less reliably we’re able to deliver value to users.

So please don’t tell developers that users don’t care about code as a way to encourage “speed”. Poorly-written code does affect users.