Common causes of flickering/flapping/flaky tests

A flapping test is a test that sometimes passes and sometimes fails even though the application code being tested hasn’t changed. Flapping tests can be hard to reproduce, diagnose, and fix. Here are some of the common causes I know of for flapping tests. Knowing these common causes can go a long way toward diagnosis, and once diagnosis is out of the way, the battle is half over.

Race conditions

Let’s say you have a Capybara test that clicks a page element that fires off an AJAX request. The AJAX request completes, making some other page element clickable. Sometimes the AJAX request beats the test runner, meaning the test works fine. Sometimes the test runner beats the AJAX request, meaning the test breaks.
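To make that concrete, here’s a minimal sketch of the pattern (the button label and message are hypothetical):

```ruby
# Racy version: page.text is read immediately, with no retry, so the
# assertion sometimes runs before the AJAX response has updated the page.
click_button "Create Customer"
expect(page.text).to include("Customer created")

# More reliable version: have_content retries until the text appears
# (up to Capybara.default_max_wait_time), giving the AJAX request
# time to finish instead of racing it.
click_button "Create Customer"
expect(page).to have_content("Customer created")
```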

“Leaking” tests (non-deterministic test suite)

Sometimes a test will fail to clean up after itself and “leak” state into other tests. To take a contrived but obvious example, let’s say test A checks that the “list customers” page shows exactly 5 customers and test B verifies that a customer can be created successfully. If test B doesn’t clean up after itself, then running test B before test A will cause test A to fail because there are now 6 customers instead of 5. This exact problem almost never arises in Rails (because each test runs inside a transaction), but database state isn’t the only kind of state a test can leak.

When I’m fixing this sort of issue I like to a) determine an order of running the tests that always causes my flapping test to fail and then b) figure out what’s happening in the earlier test that makes the later test fail. In RSpec I like to take note of the seed number, then run rspec --seed <seed number> which will re-run my tests in the same order.
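For example (the seed number is just a placeholder):

```
# Re-run the examples in the same order as the failing run:
rspec --seed 12345

# RSpec can also search for a minimal set of examples that still
# reproduces an order-dependent failure:
rspec --seed 12345 --bisect
```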

Reliance on third-party code

Sometimes tests unexpectedly fail because some behavior of third-party code has changed. I actually sometimes consciously choose to live with this category of flapping test. If an API I rely on goes down and my tests fail, the tests arguably should fail because my application is genuinely broken. But if my tests are flapping in a way that’s not legitimate, I’d probably change my test to use something like VCR instead of hitting the real API.
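Here’s a minimal sketch of what that might look like with VCR (the client class, cassette name, and expectation are hypothetical):

```ruby
require "vcr"

VCR.configure do |config|
  config.cassette_library_dir = "spec/cassettes"
  config.hook_into :webmock
end

RSpec.describe "weather lookup" do
  it "fetches a forecast" do
    # The first run hits the real API and records the HTTP interaction
    # to a cassette file; subsequent runs replay the recording, so the
    # test no longer flaps when the API is slow or down.
    VCR.use_cassette("weather_forecast") do
      forecast = WeatherClient.forecast("Grand Rapids") # hypothetical client
      expect(forecast.temperature).not_to be_nil
    end
  end
end
```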

A Rails model test “hello world”

The following is an excerpt from my book, Rails Testing for Beginners.

What we’re going to do

What we’re going to do in this post is:

  1. Initialize a new Rails application
  2. Install RSpec using the rspec-rails gem
  3. Generate a User model
  4. Write a single test for that User model

The test we’ll write will be a trivial one. The user will have both a first and last name. On the User model we’ll define a full_name method that concatenates first and last name. Our test will verify that this method works properly. Let’s now begin the first step, initializing the Rails application.

Initializing the Rails application

Let’s initialize the application using the -T flag meaning “no test framework”. (We want RSpec in this case, not the Rails default of MiniTest.)
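The application name here is just an example:

```
rails new full_name_example -T
```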

Now we can install RSpec.

Installing RSpec

To install RSpec, all we have to do is add rspec-rails to our Gemfile and then do a bundle install.
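In the Gemfile, that looks something like this:

```ruby
# Gemfile
group :development, :test do
  gem 'rspec-rails'
end
```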

We’ve installed the RSpec gem but we still have to run rails g rspec:install which will create a couple configuration files for us.

Now we can move on to creating the User model.

Creating the User model

Let’s create a User model with just two attributes: first_name and last_name.
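Using the model generator, then migrating:

```
rails g model user first_name:string last_name:string
rails db:migrate
```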

Now we can write our test.

The User test

Writing the test

Here’s the test code:
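(What follows is a reconstruction along the lines the post describes; the exact code in the book may differ slightly.)

```ruby
# spec/models/user_spec.rb
require 'rails_helper'

RSpec.describe User do
  describe '#full_name' do
    it 'returns the first and last name, separated by a space' do
      user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
      expect(user.full_name).to eq('Abraham Lincoln')
    end
  end
end
```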

Let’s take a closer look at this test code.

Examining the test

Let’s focus for a moment on this part of the test code:
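```ruby
user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
expect(user.full_name).to eq('Abraham Lincoln')
```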

This code is saying:

  1. Instantiate a new User object with first name Abraham and last name Lincoln.
  2. Verify that when we call this User object’s full_name method, it returns the full name, Abraham Lincoln.

What are the its and describes all about? Basically, these “it blocks” and “describe blocks” are there to allow us to put arbitrary labels on chunks of code to make the test more human-readable. (This is not precisely true but it’s true enough for our purposes for now.) As an illustration, I’ll drastically change the structure of the test:
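```ruby
# Deeper nesting and vaguer labels, but the test still passes.
RSpec.describe User do
  describe 'stuff' do
    describe 'more stuff' do
      it 'does stuff' do
        user = User.new(first_name: 'Abraham', last_name: 'Lincoln')
        expect(user.full_name).to eq('Abraham Lincoln')
      end
    end
  end
end
```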

This test will pass just fine. Every it block needs to be nested inside a describe block, but RSpec doesn’t care how deeply nested our test cases are or whether the descriptions we put on our tests (e.g. the vague and mysterious it 'does stuff') make any sense.

Running the test

If we run this test it will fail. It fails because the full_name method is not yet defined.

Let’s now write the full_name method.
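A minimal implementation looks like this:

```ruby
# app/models/user.rb
class User < ApplicationRecord
  def full_name
    "#{first_name} #{last_name}"
  end
end
```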

Now the test passes.

Reader Q&A: Kaemon’s question about test-first vs. test-after

Recently reader Kaemon L wrote me with the following question:

“As a beginner, is it better to write tests before you code to make it pass? or is it better to code first, write tests for the code to pass, and then add more tests as you come across bugs? In my experience so far learning RSpec, I’ve found it easier to code first and then write tests afterwards. Only because when I would try to write tests first I wasn’t exactly sure what needed to be tested, or how I was planning to write the code.”

This is a great question. In addressing this question I find it useful to realize that when you’re learning testing you’re actually embarking on two parallel endeavors:

1. Writing tests (outcome)
2. Learning how to write tests (education)

I think it’s useful to make the distinction between these two parts of the work. If you make the realization that achieving the desired outcome is only half your job, and the other half is learning, then it frees you up to do things “incorrectly” for the sake of moving forward.

With that out of the way, what’s actually better? Writing tests first or after?

I’m not sure that it makes sense to frame this question in terms of “better” or “worse”. When I think of test-driven development, I don’t think of it as “better” than test-after in all situations, I think of TDD as having certain advantages in certain scenarios.

What are the advantages of test-driven development?

TDD can separate the what from the how. If I write the test first, I can momentarily focus on what I want to accomplish and relieve my mind of the chore of thinking of how. Then, once I switch from writing the test to writing the implementation, I can free my mind of thinking about everything the feature needs to do and just focus on making the feature work.

TDD increases the chances that every single thing I’ve written is covered by a test. The “golden rule” of TDD (which I don’t always follow) is said to be “never write any new code without a failing test first”. If I follow that, I’m virtually guaranteed 100% test coverage.

TDD forces me to write easily-testable code. If I write the test first and the code after, I’m forced to write code that can be tested. There’s no other way. If I write the code first and try to test it afterward, I might find myself in a pickle. As a happy side benefit, code that’s easily testable happens to also usually be easy to understand and to work with.

TDD forces me to have a tight feedback loop. I write a test, I write some code. I write another test, I write some more code. When I write my tests after, I’m not forced to have such a fast feedback loop. There’s nothing stopping me from coding for hours before I stop myself and write a test.

If I choose to write my tests after my application code instead of before, I’m giving up the above benefits. But that doesn’t mean that test-after is automatically an inferior workflow in all situations.

Let’s go back to the two parallel endeavors listed above: writing tests and learning how to write tests. If I’m trying to write tests as I’m writing features and I just can’t figure out how to write the test first, then I have the following options:
1. Try to plow through and somehow write the test first anyway
2. Give up and don’t write any tests
3. Write the tests after

If #1 is too hard and I’m just hopelessly stuck, then #3 is a much better option than #2. Especially if I make a mental shift and switch from saying “I’m trying to write a test” to saying “I’m trying to learn how to write tests”. If all I’m trying to do is learn how to write tests, then anything goes. There’s literally nothing at all I could do wrong as part of my learning process, because the learning process is a separate job from producing results.

Lastly, what if I get to the stage in my career where I’m fully comfortable with testing? Is TDD better than test-after? I would personally consider myself fully comfortable with testing at this stage in my career (although of course no one is ever “done” learning). I deliberately do not practice TDD 100% of the time. Sometimes I just find it too hard to write the test first. In these cases sometimes I’ll do a “spike” where I write some throwaway code just to get a feel for what the path forward might look like. Then I’ll discard my throwaway code afterward and start over now that I’m smarter. Other times I’ll just begin with the implementation and keep a list of notes like “write test for case X, write test for case Y”.

To sum it all up: I’m not of the opinion that TDD is a universally superior workflow to non-TDD. I don’t think it’s important to hold oneself to TDD practices when learning testing. But once a person does reach a point of being comfortable with testing, TDD is an extremely valuable methodology to follow.

Reader Q&A: Tommy’s question about testing legacy code

Code with Jason subscriber Tommy C. recently wrote in with the following question:

Jason,

So I have found that one of the hurdles to testing beginners face is that the code they are trying to test is not always very testable. This is either because they themselves have written it that way or because they have inherited it. So, this presents a sort of catch 22. You have code with no tests that is hard to test. You can’t refactor the code because there are no tests in place to ensure you have not changed the behavior of the code.

I noticed that you have said that you don’t bother to test controllers or use request specs. I agree that in your situation, since you write really thin controllers that is a good call. However, in my situation, I have inherited controllers that are doing some work that I would like to test. I would like to move that logic out eventually, but right now all I can get away with is adding some test/specs.

These are some of the things that make testing hard for me. When I’m working on a greenfield side project all is good, but you don’t always have clean, testable code to work with.

Thanks,
–Tommy

Chicken/egg problem

Tommy brings up a classic legacy project chicken/egg problem: you can’t add tests until you change the way the code is structured, but you’re afraid to change the structure of the code before you have tests. It’s a seemingly intractable problem but, luckily, there’s a path forward.

The answer is to make use of the Extract Method and Extract Class refactoring techniques, also known as Sprout Method and Sprout Class. The idea is that if you come across a method that’s too long to easily write a test for, you just grab a chunk of lines from the method and move it – completely unmodified – into its own method or class, and then write tests around that new method or class. These techniques are a way to be reasonably sure (not absolutely guaranteed, but reasonably sure) that your small change has not altered the behavior of the application.

I learned about Sprout Method/Sprout Class from Michael Feathers and Martin Fowler. I also wrote a post about using Sprout Method in Ruby here.

Request specs

To address controller tests/request specs: Tommy might be referring to a post I wrote where I said most of the time I don’t use controller specs/request specs. (I also wrote about the same thing in my book, Rails Testing for Beginners.) There are two scenarios where I do use request specs, though: API-only projects and legacy projects that have a bunch of logic in the controllers. I think Tommy is doing the exact right thing by putting tests around the controller code and gradually moving the code out of the controllers over time.

If you, like Tommy, are trying to put tests on a legacy project and finding it difficult, don’t despair. It’s just a genuinely hard thing. That’s why people have written entire books about it!

Do you have a question about testing Rails legacy code, or about anything else to do with testing? Just email me at jason@codewithjason.com or tweet me at @jasonswett. If I’m able to, I’ll write an answer to your question just like I did with Tommy’s.

Taming legacy Ruby code using the “Sprout Method” technique (example 2)

Note: this post is titled “example 2” because some time ago I wrote another Sprout Method example, but after I wrote that post I decided I could come up with a better example.

The Sprout Method technique

If a project was developed without testing in mind, the code will often involve a lot of tight coupling (objects that depend closely on other objects) which makes test setup difficult.

The solution to this problem is simple in concept – just change the tightly coupled objects to loosely coupled ones – but the execution of this solution is often not simple or easy.

The challenge with legacy projects is that you often don’t want to touch any of this mysterious code before it has some test coverage, but it’s impossible to add tests without touching the code first, so it’s a chicken-egg problem. (Credit goes to Michael Feathers’ Working Effectively with Legacy Code for pointing out this chicken-egg problem.)

So how can this chicken-egg problem be overcome? One technique that can be applied is the Sprout Method technique, described both in WEWLC as well as Martin Fowler’s Refactoring.

Here’s an example of some obscure code that uses tight coupling:
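Something like the following script (a reconstruction in the spirit of the original example; the exact code differed). It prints the ten most frequent words in Alice’s Adventures in Wonderland, and it’s welded to both the Gutenberg URL and stdout:

```ruby
require "open-uri"

text = URI.open("http://www.gutenberg.org/files/11/11-0.txt").read

counts = Hash.new(0)
text.downcase.scan(/[a-z']+/).each { |word| counts[word] += 1 }
top_words = counts.sort_by { |_word, count| -count }.first(10)

top_words.each do |word, count|
  puts "#{word}: #{count}"
end
```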

If you looked at this code, I would forgive you for not understanding it at a glance.

One thing that makes this code problematic to test is that it’s tightly coupled to two dependencies: 1) the content of the file at http://www.gutenberg.org/files/11/11-0.txt and 2) stdout (the puts on the second-to-last line).

If I want to separate part of this code from its dependencies, maybe I could grab this chunk of the code:
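```ruby
# (Continuing the reconstruction above.)
counts = Hash.new(0)
text.downcase.scan(/[a-z']+/).each { |word| counts[word] += 1 }
top_words = counts.sort_by { |_word, count| -count }.first(10)
```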

I can now take these lines and put them in their own new method:
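```ruby
require "open-uri"

# The grabbed lines move here unmodified; the only addition is the
# final line, which returns the result instead of leaving it in a
# local variable.
def top_words(text)
  counts = Hash.new(0)
  text.downcase.scan(/[a-z']+/).each { |word| counts[word] += 1 }
  top_words = counts.sort_by { |_word, count| -count }.first(10)
  top_words
end

text = URI.open("http://www.gutenberg.org/files/11/11-0.txt").read

top_words(text).each do |word, count|
  puts "#{word}: #{count}"
end
```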

I might still not understand the code but at least now I can start to put tests on it. Whereas before the program would only operate on the contents of http://www.gutenberg.org/files/11/11-0.txt and output the contents to the screen, now I can feed in whatever content I like and have the results available in the return value of the method. (If you’d like to see another example of the Sprout Method technique, I wrote another example here.)

Code smell: long, procedural background jobs (and how to fix it)

I’ve worked on a handful of Rails applications that have pretty decent test coverage in terms of model specs and feature specs but are lacking in tests for background jobs. It seems that developers often find background jobs more difficult to test than Active Record models—otherwise the test coverage would presumably be better in that area. I’ve found background jobs of 100+ lines in length with no tests, even though the same application’s Active Record models had pretty good test coverage.

I’m not sure why this is. My suspicion is that background job code is somehow viewed as “separate” from the application code: the thinking seems to be that, for some reason, background job code doesn’t need tests and wouldn’t benefit from an object-oriented structure the way the rest of the application does.

Whatever the root cause of this issue, the fix is fortunately often pretty simple. We can apply the extract class refactoring technique to turn the long, procedural background job into one or more neatly abstracted classes.

Just as I would when embarking on the risky task of refactoring any piece of untested legacy code, I would use characterization testing to help me make sure I’m preserving the original behavior of the background job code as I move it into its own class.
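Here’s a minimal sketch of the shape of that refactoring (the job and class names are hypothetical):

```ruby
# Before: a long, procedural job with no tests.
class NightlyBillingJob < ApplicationJob
  def perform
    # ...100+ lines of billing logic...
  end
end

# After: the logic lives, unmodified, in a plain Ruby class that can
# be instantiated and tested directly...
class NightlyBilling
  def run
    # ...the same 100+ lines, moved here...
  end
end

# ...and the job shrinks to a thin wrapper around it.
class NightlyBillingJob < ApplicationJob
  def perform
    NightlyBilling.new.run
  end
end
```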

A Rails testing “hello world” using RSpec and Capybara

The following is an excerpt from my book, Rails Testing for Beginners.

What we’re going to do

One of the biggest challenges in getting started with testing is just that—the simple act of getting started. What I’d like is for you, dear reader, to get a small but meaningful “win” under your belt as early as possible. If you’ve never written a single test in a Rails application before, then by the end of this chapter, you will have written your first test.

Here’s what we’re going to do:

    • Initialize a plain-as-possible Rails application
    • Create exactly one static page that says “Hello, world!”
    • Write a single test (using RSpec and Capybara) that verifies that our static page in fact says “Hello, world!”

The goal here is just to walk through the motions of writing a test. The test itself is rather pointless. I would probably never write a test in real life that just verifies the content of a static page. But the goal is not to write a realistic test but to provide you with a mental “Lego brick” that you can combine with other mental Lego bricks later. The ins and outs of writing tests get complicated quickly enough that I think it’s valuable to start with an example that’s almost absurdly simple.

The tools we’ll be using

For this exercise we’ll be using RSpec, anecdotally the most popular of the Rails testing frameworks, and Capybara, a library that enables browser interaction (clicking buttons, filling out forms, etc.) using Ruby. You may find some of the RSpec or Capybara syntax confusing or indecipherable. You also may well not understand where the RSpec stops and the Capybara starts. For the purposes of this “hello world” exercise, that’s okay. Our goal right now is not deep understanding but to begin putting one foot in front of the other.

Initializing the application

Run the “rails new” command

First we’ll initialize this Rails app using the good old rails new command.
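```
rails new hello_world -T
```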

You may be familiar with the -T flag. This flag means “no tests”. If we had done rails new hello_world without the -T flag, we would have gotten a Rails application with a test directory containing MiniTest tests. I want to use RSpec, so I want us to start with no MiniTest.

Let’s also create our application’s database at this point since we’ll have to do that eventually anyway.
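```
rails db:create
```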

Update our Gemfile

This is the step where RSpec and Capybara will start to come into the picture. Each library will be included in our application in the form of a gem. We’ll also include two other gems, selenium-webdriver and chromedriver-helper, which are necessary in order for Capybara to interact with the browser.

Let’s add the following to our Gemfile under the :development, :test group. I’ve added a comment next to each gem or group of gems describing its role. Even with the comments, it may not be abundantly clear at this moment what each gem is for. At the end of the exercise we’ll take a step back and talk about which library enabled which step of what we just did.
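Something like this (the comments are paraphrased):

```ruby
group :development, :test do
  # Our testing framework
  gem 'rspec-rails'

  # Lets us interact with the browser from our tests using Ruby
  gem 'capybara'

  # Needed for Capybara to drive a real Chrome browser
  gem 'selenium-webdriver'
  gem 'chromedriver-helper'
end
```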

Don’t forget to bundle install.

Install RSpec

Although we’ve already installed the RSpec gem, we haven’t installed RSpec into our application. Just as the Devise gem, for example, requires us not only to add devise to our Gemfile but also to run rails g devise:install, RSpec installation is a two-step process. After we run this command we’ll have a spec directory in our application containing a couple of config files.
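```
rails g rspec:install
```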

Now that we’ve gotten the “plumbing” work out of the way, let’s write some actual application code.

Creating our static page

Let’s generate a controller, HelloController, with just a single action, index.
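```
rails g controller hello index
```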

Let’s modify the action’s template so it just says “Hello, world!”.
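```
<!-- app/views/hello/index.html.erb -->
Hello, world!
```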

Just as a sanity check, let’s start the Rails server and open up our new page in the browser to make sure it works as expected.
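```
rails server
# then visit http://localhost:3000/hello/index in the browser
```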

We have our test infrastructure. We have the feature we’re going to test. Now let’s write the test itself.

Writing our test

Write the actual test

Let’s create a file called spec/hello_world_spec.rb with the following content:
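(A reconstruction; the exact code in the book may differ slightly.)

```ruby
require 'rails_helper'

RSpec.describe 'hello world', type: :feature do
  it 'says hello world' do
    visit hello_index_path
    expect(page).to have_content('Hello, world!')
  end
end
```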

If you focus on the “middle” two lines, the ones starting with visit and expect, you can see that this test takes the form of two steps: visit the hello world index path and verify that the page there says “Hello, world!” If you’d like to understand the test in more detail, below is an annotated version.
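```ruby
# Pull in the Rails-specific RSpec configuration from
# spec/rails_helper.rb.
require 'rails_helper'

# 'hello world' is an arbitrary, human-readable label for this group
# of tests; type: :feature makes the Capybara helpers available.
RSpec.describe 'hello world', type: :feature do
  # Another arbitrary label, this time for the individual test case.
  it 'says hello world' do
    # Use Capybara to visit the hello index page (GET /hello/index).
    visit hello_index_path

    # Verify that the page contains the text "Hello, world!".
    expect(page).to have_content('Hello, world!')
  end
end
```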

Now that we’ve written and broken down our test, let’s run it!

Run the test

This test can be run by simply typing rspec followed by the path to the test file.
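```
rspec spec/hello_world_spec.rb
```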

The test should pass. There’s a problem with what we just did, though. How can we be sure that the test is testing what we think it’s testing? It’s possible that the test is passing because our code works. It’s also possible that the test is passing because we made a mistake in writing our test. This concern may seem remote but “false positives” happen a lot more than you might think. The only way to be sure our test is working is to see our test fail first.

Make the test fail

Why does seeing a test fail prior to passing give us confidence that the test works? What we’re doing is exercising two scenarios. In the first scenario, the application code is broken. In the second scenario, the application code is working correctly. If our test passes under that first scenario, the scenario where we know the application code is broken, then we know our test isn’t doing its job properly. A test should always fail when its feature is broken and always pass when its feature works. Let’s now break our “feature” by changing the text on our page to something wrong.
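For example:

```
<!-- app/views/hello/index.html.erb -->
Jello, world!
```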

When we run our test now we should see it fail with an error message like expected to find text "Hello, world!" in "Jello, world!".

Watch the test run in the browser

Add the following to the bottom of spec/rails_helper.rb:
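One configuration that accomplishes this (the book’s exact code may differ):

```ruby
# Drive tests with a real Chrome browser (via Selenium) instead of
# the default headless rack_test driver.
Capybara.default_driver = :selenium_chrome
```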

Then run the test again:
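```
rspec spec/hello_world_spec.rb
```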

This time, Capybara should pop open a browser where you can see our test run. When I’m actively working with a test and I want to see exactly what it’s doing, I often find the running of the test to be too fast for me, so I’ll temporarily add a sleep wherever I want my test to slow down. In this case I would put the sleep right after visiting hello_index_path.
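Like this (the exact duration is arbitrary):

```ruby
visit hello_index_path
sleep 3 # temporary, just to slow the test down enough to watch it
expect(page).to have_content('Hello, world!')
```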

Review

There we have it: the simplest possible Rails + RSpec + Capybara test. Unfortunately “the simplest possible Rails + RSpec + Capybara test” is still not particularly simple in absolute terms, but it’s pretty simple compared to everything that goes on in a real production application.

Who’s to blame for bad code? Coders.

When time pressure from managers drives coders to write poor-quality code, some people think the responsible party is the managers. I say it’s the coders. Here’s why I think so.

The ubiquity of time pressure in software development work

I’ve worked on a lot of development projects with a lot of development teams. In the vast majority of cases, if not all cases, the development team is perceived as being “behind”. Sometimes the development team’s progress is slower than the pace the development team themselves predicted. Sometimes development progress is seen as behind relative to some arbitrary, unrealistic deadline set by someone outside the development team. Either way, no one ever comes by and says to the development team, “Wow, you guys are way ahead of where we expected!”

Often leadership will beg, plead, cajole, flatter, guilt-trip or bribe in order to increase the chances that their timeline will be met. They’ll ask the developers things like, “How can we ‘get creative’ to meet this timeline?” They’ll offer to have lunch, dinner, soda and alcohol brought in for a stretch of time so the developers can focus on nothing but the coding task at hand. Some leaders, lacking a sense of short-term/long-term thinking, will ask the developers to cut corners, to consciously incur technical debt in order to work faster. They might say things like “users don’t care about code” or “perfect is the enemy of the good”.

Caving to the pressure

Unfortunately, in my experience, most developers, most of the time, cave to the pressure to cut corners and deliver poor-quality work. If you think I’m mistaken, I would ask you to show me a high-quality production codebase. There are very few in existence.

I myself am by no means innocent. I’ve caved to the pressure and delivered poor-quality work. But that doesn’t mean I think it’s excusable to do poor work or even (as some developers seem to believe) heroic.

Why bad code is the fault of coders, not managers

Why are coders, not managers, the ones responsible for bad code? Because coders are the ones who wrote the code. It was their fingers on the keyboard. The managers weren’t holding a gun to the developers’ heads and demanding they do a bad job.

“But,” a developer might say, “I had to cut corners in order to meet the deadline, and if I didn’t meet the deadline, I might get fired.” I don’t believe this reasoning stands up very well to scrutiny. I’d like to examine this invalid excuse for doing poor work as well as some others.

Invalid excuses for shipping poor work

Fallacy: cutting corners is faster

Why exactly is “bad code” bad? Bad code is bad precisely because it’s slow and expensive to work with. Saying, “I know this isn’t the right way to do it but it was faster” is a self-contradiction. The exact reason “the right way” is the right way is because it’s faster. Faster in the long run, of course. Sloppy, rushed work can be faster in the very short term, but any speed gain is usually negated the next time someone has to work with that area of code, which could be as soon as later that same day.

Fallacy: “they didn’t give me enough time to do a good job”

If I’m not given enough time to do a good job, the answer is not to do a bad job. The answer is that that thing can’t be done in that amount of time.

If my boss comes to me and says, “Can we do X in 6 weeks?” and my response is “It would take 12 weeks to do X, but if we cut a lot of corners we could do it in 6 weeks” then all my boss hears is “WE CAN DO IT IN 6 WEEKS!”

Given the option between hitting a deadline with “bad code” and missing a deadline, managers will roughly always choose hitting the deadline with bad code. In this way managers are like children who ask over and over to eat candy for dinner. They can’t be trusted to take the pros and cons and make the right decision.

Fallacy: cutting corners is a business decision

It’s my boss’s role to tell me what to do and when to do it. It’s my job to, for the most part, comply. As a programmer I typically won’t have visibility into business decisions the way my boss does. As a programmer I will often not have the right skillset or level of information necessary to have a valid opinion on most business matters.

This idea that my boss can tell me what to do only applies down to a certain level of granularity though. I should let my boss tell me what to do but I shouldn’t let my boss tell me how to do it. I care about the professional reputation I have with my colleagues and with myself. I think there’s a certain minimum level of quality below which professionals should not be willing to go, even though it’s always an option to go lower.

Fallacy: “we can’t afford perfection”

I wish I had a dollar for every time I heard somebody say something like, “Guys, it doesn’t need to be perfect, it just needs to be good enough.”

The fallacy here is that the choice is between “perfect” and “good”. That’s a mistaken way of looking at it. The standard mode of operation during a rush period for developers is to pile sloppy Band-Aid fix on top of sloppy Band-Aid fix as if they desire nothing more strongly in life than to paint themselves into a corner from which they will never be able to escape. The alternative to this horror show isn’t “perfection”. The alternative is just what I might call a “minimally good job”.

Fallacy: “if I don’t comply with instructions to cut corners, I’ll get fired”

This might be true a very very very small percentage of the time. Most of the time it’s not, though. It’s pretty hard to get fired from a job. Firing someone is painful and expensive. Usually someone has to rack up many offenses before they finally get booted.

Fallacy: “we’ll clean it up after the deadline”

I find it funny that some people can make this claim with a straight face. Fixing sloppy work after the deadline “when we have more time” or “when things slow down” never happens. Deadlines, such as a go-live date, often translate to increased pressure and time scarcity because now there’s more traffic and visibility on the system. Deadlines are usually followed by more deadlines. You never get “more time”. Pressure rarely goes down over time. It usually goes up.

Fallacy: “I would have done a better job, but it wasn’t up to me”

Nope. I alone am responsible for the work I do. If someone pressured me to do the wrong thing and in response I chose to do the wrong thing, then I’m responsible for choosing to cave to the pressure.

Fallacy: “I haven’t been empowered to make things better”

It’s rare that a superior will come along and tell me, “Jason, you explicitly have permission to make things better around here.” The reality is that it’s up to me to observe what needs to be done better and help make it happen. I’m a believer in the saying that “it’s easier to get forgiveness than permission”. I tend to do what my professional judgment tells me needs to be done without necessarily asking my boss if it’s okay. Does this get me in trouble sometimes? Yes. I look at those hand-slaps as the cost of self-empowerment.

Plus, all employees have power over their superiors in that they have the power to leave. I don’t mean this in a “let me have my way or I quit!” way. I mean that if I work somewhere that continually tries to force me to do a bad job, I don’t have to just accept it as my fate in life. I can go work somewhere else. That doesn’t fix anything for the organization I’m leaving (bad organizations and bad people can’t be fixed anyway) but it does fix the problem for me.

My plea to my colleagues: don’t make excuses for poor work

I think developers make too many excuses and apologies for bad code. (And as a reminder, “bad code” means code that’s risky, time-consuming and expensive to work with.)

It’s not realistic to do a great job all the time, but I think we can at least do a little better. Right now I think we as a profession err too much on the side of justifying poor-quality work. I think we could stand to err more on the side of taking responsibility.

Sometimes it’s better for tests to hit the real API

A mysteriously broken test suite

The other day I ran my test suite and discovered that nine of my tests were suddenly and mysteriously broken. These tests were all related to the same feature: a physician search field that hits something called the NPPES NPI Registry, a public API.

My first thought was: is something wrong with the API? I visited the NPI Registry site and, sure enough, I got a 500 error when I clicked the link to their API URL. The API appeared to be broken. This was a Sunday so I didn’t necessarily expect the NPI API maintainers to fix it speedily. I decided to just ignore these nine specific tests for the time being.

By mid-Monday, to my surprise, my tests were still broken and the NPI API was still giving a 500 error. I would have expected them to have fixed it by then.

The surprising root cause of the bug

On Tuesday the API was still broken. On Wednesday the API was still broken. At this point I thought okay, maybe it’s not the case that the NPI API has just been down for four straight days with the NPI API maintainers not doing anything about it. Maybe the problem lies on my end. I decided to do some investigation.

Sure enough, the problem was on my end. Kind of. Evidently the NPI API maintainers had made a small tweak to how the API works. In addition to the fields I had already been sending like first_name and last_name, the NPI API now required an additional field, address_purpose. Without address_purpose, the NPI API would return a 500 error. I added address_purpose to my request and all my tests started passing again. And, of course, my application’s feature started working again.
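As a sketch, the fix amounted to adding one field to the request. (The URL and the other parameter values below are illustrative; only first_name, last_name, and address_purpose come from the story.)

```ruby
require "net/http"
require "json"

uri = URI("https://npiregistry.cms.hhs.gov/api/")
uri.query = URI.encode_www_form(
  version: "2.1",        # illustrative
  first_name: "Jane",    # illustrative
  last_name: "Smith",    # illustrative
  address_purpose: ""    # the newly required field; omitting it returned a 500
)

response = Net::HTTP.get_response(uri)
results = JSON.parse(response.body)
```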

The lesson: my tests were right to hit the real API

When I first discovered that nine of my tests were failing due to a broken external API, my first thought was, “Man, maybe I should mock out those API calls so my tests don’t fail when the API breaks.” But then I thought about it a little harder. The fact that the third-party API was broken meant my application was broken. The tests were failing because they ought to have been failing.

It’s been my observation that developers new to testing often develop the belief that “third-party API calls should be mocked out”. I don’t think this is always true. I think the decision whether to mock third-party APIs should be made on a case-by-case basis depending on certain factors. These factors include, but aren’t limited to:

  • Would allowing my tests to hit the real API mess up production data?
  • Is the API rate-limited?
  • Is the API prone to sluggishness or outages?

(Thanks to Kostis Kapelonis, who I interviewed on The Ruby Testing Podcast, for helping me think about API interaction this way.)

If hitting the real API poses no danger to production data, I would probably be more inclined to let my tests hit the real API than to mock out the responses. (Or, to be more precise, I would want my test suite to include some integration tests that hit the real API, even if I might have some other API-related tests that mock out API responses for economy of speed.) My application is only really working if the API is working too.