Refactorings should be atomic

by Jason Swett

It’s often seen as wise not to try to make huge, sweeping improvements to a codebase all in one go. I couldn’t agree more. Such changes carry way too much risk.

An alternative way of approaching codebase improvements is to apply the Boy Scout rule, “always leave the campground a little cleaner than you found it”. I think this rule is a pretty good one and appropriate under some circumstances, although some caution is in order as to how exactly to interpret it. The main danger is that, interpreted the wrong way, the rule discourages atomicity.

Tiny atomic refactorings

I have a mantra I repeat to myself regularly: “never underestimate your own ability to screw stuff up”. Seemingly safe, innocuous, inconsequential one-line changes have an amazing ability to introduce surprising bugs.

For this reason, I virtually never mix refactorings or cleanups in with my feature work or bugfixes, even very tiny ones, unless the thing I’m cleaning up is directly in the path of the work I need to do to complete my feature or bugfix. Let’s say for example that I’m adding a new capability to the sales report. I don’t want to toss in a small cleanup and accidentally break the password reset feature.

If I discover a tiny improvement I want to make, e.g. replacing a single instance of { |a| a.user } with a shorter equivalent, I’ll make a separate commit containing just that tiny improvement instead of letting it become a “rider” on an unrelated commit. This is basically just an application of the idea of atomic commits.
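
As a minimal sketch of the kind of one-line cleanup meant here: the example below assumes the block is being passed to map and that the replacement is Ruby’s &:user shorthand, neither of which is stated above. User, Account and accounts are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-ins for illustration; these names are assumptions.
User = Struct.new(:name)
Account = Struct.new(:user)

accounts = [Account.new(User.new("Ada")), Account.new(User.new("Grace"))]

# Before: an explicit block that only calls one method on each element.
users_before = accounts.map { |a| a.user }

# After: the Symbol#to_proc shorthand, which behaves the same in this case.
users_after = accounts.map(&:user)
```

Even a change this small gets its own commit, so that if it turns out not to be equivalent somewhere, it can be reverted without touching any feature work.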

Bigger atomic refactorings

Sometimes a refactoring isn’t all that small. For example, sometimes you want to take a handful of classes and move them into a new namespace to give some structure to the files in your application.

In these cases I’ll try to create one single commit that effects the entire refactoring. This way, if I get 75% done with the refactoring and realize my whole premise is a blunder, I can just run git checkout . && git clean -df and cleanly go back to the way things were before I started my refactoring work.
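
To make the namespace example concrete, here is a rough sketch of what such a move might look like. The Reporting module and the two report classes are hypothetical names chosen for illustration, not taken from a real codebase.

```ruby
# Hypothetical example; Reporting, SalesReport and InventoryReport are
# illustrative names only.

# Before the refactoring, both classes would live at the top level:
#
#   class SalesReport; end
#   class InventoryReport; end

# After the refactoring, they share a Reporting namespace.
module Reporting
  class SalesReport
    def title
      "Sales"
    end
  end

  class InventoryReport
    def title
      "Inventory"
    end
  end
end

# Every call site has to change along with the definitions, e.g.
# SalesReport.new becomes Reporting::SalesReport.new.
report_titles = [Reporting::SalesReport.new, Reporting::InventoryReport.new].map(&:title)
```

Because the definitions and every call site have to change together, a half-finished version of this move leaves the application broken, which is a good reason to land the whole thing in one commit or not at all.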

When to perform refactorings

In my experience there are two good times to perform a refactoring. If there’s an area of the code that I need to do some work in but it’s messy, I’ll perform a refactoring (or multiple refactorings) in that area of code as a separate piece of work before I start making my change in that area. I think of this as “cleaning the kitchen before I make dinner”. It’s faster and more enjoyable to make dinner in a clean kitchen than in a messy one.

One good time to perform a refactoring is before a change. The other is after a change. If I make a change to an area of code and find afterward that the area is unacceptably messy, I’ll perform a refactoring (again, as a separate piece of work) to clean it up. By the way, it’s possible to end up with a messy area of a codebase by simply writing poor-quality code, but that’s not the only way bad code gets introduced. The accumulation of technical debt is natural and unavoidable, and the only way to keep it from piling up is to periodically perform refactorings.

The one time I don’t think is good for refactoring is in the middle of a change. If the refactoring introduces a bug but is mixed in with a change that you want to keep, it can be hard to back out the refactoring without also affecting the “good” change. It’s also harder to review a pull request that contains both a change and a refactoring than it is to review a pull request that contains only a change. Keeping refactorings in separate pull requests means that the development team (and perhaps technical leadership) will have to be on board with the idea that you’ll regularly have PRs that contain nothing but a refactoring.

By the way, credit goes to Ben Orenstein (who I interviewed on my podcast) for teaching me that just before a change and just after a change are good times to do refactorings.

When refactorings can’t be atomic

Sometimes a refactoring is too big to be atomic. Sometimes I embark upon a “refactoring project” that will touch many areas of the codebase over the course of multiple weeks or months. In these cases I deliberately split the refactoring project into small pieces spread over time as a way of mitigating risk. I’ve found, though, that most big refactorings, while not atomic as a whole, can be split into a number of smaller refactorings that are themselves individually atomic.
