
Using tests as a tool to wrangle legacy projects

Legacy projects

In my career I’ve worked on my fair share of legacy projects. In fact, I’d say that almost all of the 50+ production projects I’ve worked on have been legacy projects to some degree.

My experience is undoubtedly not a unique one. I would bet that the bulk of most developers’ work has been on legacy projects. Due to Sturgeon’s law, I think it’s probably safe to assume that most code in the world is legacy code.

The challenges of working on a legacy project

Maintaining a legacy project is often a bummer. Many of the following things tend to be true.

  • Changes take much longer than they should, perhaps by a factor of 10 or more.
  • Deployments are scary. They’re preceded by stress and followed by firefighting.
  • Due to the fragility of the system and frequent appearance of bugs and outages, trust in the development team is low.
  • The development team is always “behind” and under pressure to cut corners to meet deadlines.
  • Stakeholders are mostly not thrilled.
  • Due to these bad conditions, developer turnover is high, meaning the team has to spend time training new developers, leaving less time for development.

How do you reverse these challenges? The various problems associated with legacy projects have different causes which are not all solved by the same solutions. No single solution is a panacea, although some solutions go further than others.

One thing I’ve found to have a pretty good ROI is automated testing.

The ways automated tests can help legacy projects

Changes take less time

Why do changes take longer in legacy projects than in cleanly-coded ones?

The first reason is that in order to fix a bug or add a feature, you often have to have an understanding of the area of code you’re changing before you can make the change. If you don’t understand the existing code, you might not even know where to try to slot in your new code.

The second reason is that while any change in any codebase represents a certain amount of risk, the risk in legacy projects is amplified. A legacy project’s features are often a delicate Jenga tower, ready to come crashing down at the slightest touch.

Tests can address both of these problems.

Tests can aid a developer in understanding a piece of code in multiple ways. Early on, before any changes take place, characterization tests can reveal the previously mysterious behavior of a piece of code.

The idea with a characterization test is that you write a test for a certain method that you know will fail (e.g. `expect(customer.name).to eq('asdf')`), run the test, then change the test to expect the real value you saw (e.g. `expect(customer.name).to eq('Gern Blanston')`). Through this “reverse TDD” method, a picture of the code’s behavior eventually becomes clear.
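
Here’s a minimal sketch of what such a characterization test might look like in RSpec (the Customer class and its data are hypothetical stand-ins for whatever legacy code you’re characterizing):

# Characterization test sketch. Customer is a hypothetical legacy class.
RSpec.describe Customer do
  it 'returns whatever #name actually returns' do
    customer = Customer.find(1)

    # Step 1: assert a value we know is wrong and run the test.
    # expect(customer.name).to eq('asdf')

    # Step 2: the failure output reveals the real value, so pin it down.
    expect(customer.name).to eq('Gern Blanston')
  end
end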

Later on, once a robust suite of tests has been built up around an area of functionality, tests can aid in the understandability of the code by enabling refactoring.

If my code is well-covered by tests, I can feel free to give descriptive names to obscurely-named variables, break large methods into smaller ones, break large classes into smaller ones, and whatever other changes I want. The knowledge that my tests protect against regressions can allow me to make changes with a sufficient level of confidence that my changes probably aren’t breaking anything.

Of course, the scope of my refactorings should be proportionate to my level of confidence that my changes are safe. Grand, sweeping refactorings will pretty much always come back to bite me.

Deployments become less scary

Legacy projects are often figuratively and literally more expensive to deploy than well-done projects.

The story often goes like this: management observes that each time a deployment happens, bugs are introduced. Bugs frustrate users, erode users’ trust, incur support cost, and of course incur additional development cost.

Because the organization wants to minimize these problematic and expensive bugs, an extensive manual pre-deployment QA process is put in place, which of course costs money. Often there will be a “feature freeze” leading up to the deployment during the QA period, which of course incurs the opportunity cost of discontinuous developer productivity.

And because releases are so risky and expensive, management wants to do the risky and expensive thing less frequently. But this unfortunately has the opposite of the intended effect.

Because each deployment is bigger, there’s more stuff to go wrong, a greater “risk surface area”. It also takes more time and effort to find the root cause of any issue the deployment introduced because a) there’s more stuff to sort through when searching for the root cause and b) the root cause was potentially introduced a long time ago and so may well not be fresh in the developers’ minds.

Tests can help make deployments less risky in two ways. For one, tests tend to:

  • Help protect against regressions
  • Help protect against new bugs
  • Help improve code quality by enabling refactoring

All these things make the application less fragile, decreasing the likelihood that any particular deployment will introduce bugs.

Second, the presence of automated tests enables the practice of continuous deployment.

If an application can be tested to a reasonable degree of certainty by running an automated test suite that takes just a few minutes to run, then each developer can potentially deploy many times per day. Spread across the team, this might mean dozens or hundreds of deployments per day.

Contrast this to the alternative. If there is no automated test suite and each deployment requires the QA team to go through a manual QA process first, then that QA process is a bottleneck that prevents deployments from happening very frequently. Even if the QA team were technically capable of running through the whole manual test process daily to allow daily deployments, they would probably not want to spend all their time doing pre-deployment testing. And even that would only get you one deployment per day instead of the dozens or hundreds that continuous deployment would allow.

The obstacles to putting tests on a legacy project

Let’s say you’ve read the above and you say, “Great. I’m sold on adding tests to my testless legacy project. How do I start?”

Unfortunately it’s not usually simple or easy to add tests to a currently-untested legacy project. There are two main obstacles.

Missing test infrastructure

Writing tests requires a test infrastructure. In order to write a test, I need to be able to get the part of the application I’m testing (often called the “system under test” or SUT) into a state that allows me to exercise the code and make the assertions that tell me whether the code is working.

Getting a test framework and related testing tools installed and configured can be non-trivial. I don’t know of any way to make this easier; I just think it’s a good thing to be aware of.

I’ve worked in teams where the boss says, “Let’s make sure we write tests for all new features.” This seems like a sensible rule on the surface but unfortunately there’s no way this rule can be followed if there’s not already a testing infrastructure and testing strategy in place.

The development team and leadership have to agree on what testing tools they’re going to use, which itself can be a very non-trivial step. All the right people have to have buy-in on the testing tooling and testing strategy or else the team will experience a lot of drag as they try to add tests to the application. You have to pull the car onto the road before you hit the gas. If you hit the gas while you’re still in the garage, you’re just going to crash into the wall. So make sure to think about the big picture and get the right team members on board before you try to add tests to your legacy application.

Dependencies and tight coupling

Sometimes it’s easy to get the SUT into the necessary state. More often in legacy projects, it’s very difficult. Sometimes it’s practically impossible. The reason for the difficulty is often tight coupling between dependencies.

Here’s what I mean by tight coupling. Let’s say we want to test class A which depends on classes B and C, which in turn depend on classes D, E, F and G. This means we have to instantiate seven objects (instances of A, B, C, D, E, F and G) just to test class A. If the dependency chain is long enough, the test may be so painful to write that the developer will decide in frustration to toss his laptop off a bridge and begin a new life in subsistence farming rather than write the test.
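
Here’s a hypothetical sketch of that chain. Because A hard-codes the construction of its collaborators, there’s no way to build an A without building all seven objects:

class D; end
class E; end
class F; end
class G; end

class B
  def initialize
    @d = D.new # B drags in D and E...
    @e = E.new
  end
end

class C
  def initialize
    @f = F.new # ...and C drags in F and G.
    @g = G.new
  end
end

class A
  def initialize
    @b = B.new # so A can't be constructed without the whole graph
    @c = C.new
  end
end

A.new # instantiates seven objects just to get one A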

How to overcome the obstacles to adding tests to legacy code

Missing test infrastructure

If a project has no tests at all, you have two jobs before you:

  1. Write some tests
  2. Set up the testing infrastructure to make it possible to write tests

Both these things are hard. I would want to “play on easy mode” to the extent possible when I’m first starting to put test coverage on the project.

So rather than trying to identify the most mission-critical parts of the application and putting tests on those, I would ask myself, “What would be easiest to test?” and write some tests for that area of the code.

For example, maybe the application I’m working on has a checkout page and a contact page. The checkout page is obviously more mission-critical to the business than the contact page. If the checkout page breaks there are of course immediate financial and other business repercussions. If the contact page breaks the consequences might be negligible. So a set of tests covering the checkout page would clearly be more valuable than a set of tests covering the contact page.

However, the business value of the set of tests I’m considering writing isn’t the only factor to take into consideration. If a set of tests for the checkout page would be highly valuable but the setup and infrastructure work necessary to get those tests into place is sufficiently time-consuming that the team is too daunted by the work to ever even get started, then that area of the code is probably not the best candidate for the application’s first tests.

By adding some trivial tests to a trivial piece of functionality, a beachhead can be established that makes the addition of later tests that much easier. It may be that in order to add a test for the contact page, it takes 15 minutes of work to write the test and four hours of work to put the testing infrastructure in place. Now, when I finally do turn my attention to writing tests for the checkout page, much of the plumbing work has already been done, and now my job is that much easier.
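
For example, a first “beachhead” test for the contact page might be as trivial as this sketch (it assumes a Rails app with RSpec and Capybara and a /contact route; those details are my assumptions, not particulars from any real project):

require 'rails_helper'

# A deliberately trivial first test. Assumes Rails + RSpec + Capybara
# and a contact page served at /contact (hypothetical).
RSpec.describe 'Contact page', type: :system do
  it 'loads' do
    visit '/contact'
    expect(page).to have_content('Contact')
  end
end

The test itself is almost worthless. The plumbing it forces you to put in place is the real payoff.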

Dependencies and tight coupling

If a project was developed without testing in mind, the code will often involve a lot of tight coupling (objects that depend closely on other objects) which makes test setup difficult.

The solution to this problem is simple in concept – just change the tightly coupled objects to loosely coupled ones – but the execution of this solution is often not simple or easy.

The challenge with legacy projects is that you often don’t want to touch any of this mysterious code before it has some test coverage, but it’s impossible to add tests without touching the code first, so it’s a chicken-egg problem. (Credit goes to Michael Feathers’ Working Effectively with Legacy Code for pointing out this chicken-egg problem.)

So how can this chicken-egg problem be overcome? One technique that can be applied is the Sprout Method technique, described both in WEWLC as well as Martin Fowler’s Refactoring.

Here’s an example of some obscure code that uses tight coupling:

require 'open-uri'
file = open('http://www.gutenberg.org/files/11/11-0.txt')
text = file.read
text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last
whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

If you looked at this code, I would forgive you for not understanding it at a glance.

One thing that makes this code problematic to test is that it’s tightly coupled to two dependencies: 1) the content of the file at http://www.gutenberg.org/files/11/11-0.txt and 2) stdout (the puts on the second-to-last line).

If I want to separate part of this code from its dependencies, maybe I could grab this chunk of the code:

text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last

I can now take these lines and put them in their own new method:

require 'open-uri'

def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

file = open('http://www.gutenberg.org/files/11/11-0.txt')
whash, lword = count_words(file.read)

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

I might still not understand the code but at least now I can start to put tests on it. Whereas before the program would only operate on the contents of http://www.gutenberg.org/files/11/11-0.txt and output the results to the screen, now I can feed in whatever content I like and have the results available in the return value of the method. (If you’d like to see another example of the Sprout Method technique, I wrote a post about it here.)

In addition to the Sprout Method technique I’ve also brought another concept into the picture which I’ll describe now.

Dependency injection

Dependency injection (DI) is a fancy term for a simple concept: instead of letting an object construct its own dependencies, you pass them in from the outside (as constructor arguments that get stored in instance variables, or as method arguments).

I applied DI in my above example when I defined a count_words method that takes a text argument. Instead of the method being responsible for knowing about the content it parses, the method will now happily parse whatever string we give it, not caring if it comes from open('http://www.gutenberg.org/files/11/11-0.txt') or a hard-coded string like 'please please please let me get what i want'.

This gives us a new capability: now, instead of being confined to testing the content of Alice’s Adventures in Wonderland, we can write a test that feeds in the extremely plain piece of content 'please please please let me get what i want' and assert that the resulting count of 'please' is 3. We can of course feed in other pieces of content, like 'hello-hello hello, hello' to ensure the proper behavior under other conditions (in this case, the condition when “weird” characters are present).
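
Here’s a sketch of those tests (RSpec syntax; the require_relative path is a hypothetical location for the extracted count_words method):

require_relative 'word_count'

RSpec.describe 'count_words' do
  it 'counts repeated words' do
    whash, _longest_word = count_words('please please please let me get what i want')
    expect(whash['please']).to eq(3)
  end

  it 'handles hyphens and punctuation' do
    whash, _longest_word = count_words('hello-hello hello, hello')
    expect(whash['hello']).to eq(4)
  end
end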

If you’d like to see an object-oriented example of applying dependency injection to our original piece of legacy code, here’s how I might do that:

require 'open-uri'

class Document
  def initialize(text)
    @text = text
  end

  def count_words
    @text.gsub!(/-/, ' ')
    words = @text.split
    cwords = []
    words.each do |w|
      w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
      cwords << w.downcase
    end
    words = cwords
    words.sort!
    whash = {}
    words.each do |w|
      if whash[w].nil?
        whash[w] = 0
      end
      whash[w] += 1
    end
    whash = whash.sort_by { |k, v| v }.to_h
    swords = words.sort_by do |el|
      el.length
    end
    lword = swords.last
    [whash, lword]
  end
end

file = open('http://www.gutenberg.org/files/11/11-0.txt')
document = Document.new(file.read)
whash, lword = document.count_words

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end
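
The payoff is the same as in the method version: a test can now construct a Document from any string, with no network access involved. A sketch (RSpec assumed):

RSpec.describe Document do
  it 'counts word frequencies from any injected string' do
    whash, _longest_word = Document.new('please please please let me get what i want').count_words
    expect(whash['please']).to eq(3)
  end
end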

Conclusion

Hopefully this post has armed you with a few useful techniques to help you get your legacy project under control and start turning things around.

Common legacy project challenges and how to address them

What is a legacy project?

The terms “legacy project” and “legacy code” mean different things to different developers. I think one thing that most developers could agree on is that a legacy project is a project that’s difficult or painful to maintain.

Spend enough years in the software industry and you might find yourself forming the opinion that most projects are legacy projects. Some projects, of course, are “more legacy” than others.

What follows is a list of what I’ve found to be some of the most common characteristics of legacy projects. Under each item I’ve included an antidote to the issue.

Poor processes and/or incompetent team members

Out of all the challenges of a legacy project I think this one has the biggest impact and is the most challenging to fix. To be totally frank, it’s often a problem that’s impossible to fix, and I think it’s a big part of why so many software projects fail and so much software out there is half-broken.

The problem is this: projects end up with bad code because the code was brought into existence by bad processes and/or incompetent developers. The code doesn’t exist separately from the people and processes. The quality of the code is a direct function of the people and processes that created it.

Let’s imagine a team whose leader believes code should only be released “when it provides value to the user”, and so deployments only happen once every three months. When the deployments do happen, they’re risky, shuttle-launch type affairs with a lot of stress leading up to them and a lot of firefighting during and after. And let’s say due to time pressure the dev team believes they “don’t have time to write tests”. Instead of writing small and crisply-defined user stories, the team works on big, nebulous stories that drag on and on because no one ever nailed down where the finish line is.

If you asked me to make a prediction about this team I would guess there’s a good chance they’re going to write some pretty bad code (that is, code that’s risky and time-consuming to maintain). If I were going to make a bet about the team’s meeting a deadline, I would bet heavily against them based on the way they work.

The antidote

So, what can be done about people and process problems? Unfortunately, you can’t just show the team the answers to their problems and expect things to start turning around. Sometimes dev teams and managers follow bad practices out of simple inexperience. In those cases it’s relatively easy to educate them into better behavior.

Just as often, the problem is impossible to solve. I consider there to be three types of employees:

  1. People who are already good
  2. People who aren’t good yet but can be later
  3. People who aren’t good and never will be

If a team is made up of people of types 1 and 2 then there’s hope. These people can be educated into smarter processes that produce more easily maintainable code.

The people of type 3 just need to go. Unfortunately, in legacy projects, type 3 people are often the ones in charge, and so there’s no one to fire them. I had a boss once who forced me to write code for three months before deploying it. This inevitably worked out badly. The proposal of continuous deployment was met with a hard no (as were roughly all the other proposals made to that particular manager).

I once worked for an organization that drove a ~$300,000 project into the ground by failing to tackle the mysterious, risky work first. Then they put me on another project which they began the exact same way. I warned them that they were headed for the same fate a second time, but they ignored me. Most of this particular organization’s projects failed, and for similar reasons. If an organization’s leadership refuses to learn from conventional wisdom or even from their own failures, then the word for that is stupidity, and stupidity cannot be fixed. The only option for a developer in this situation is to quit and get a job somewhere better.

But what about cases where both leadership and the development team have a willingness to learn what it would take to make things better?

Now that I’ve addressed the somewhat meta (but super important) matter of people and processes, let’s talk about some of the code-related issues. What follows are some of the most common practices I see teams failing to follow, each of which could have a large impact if the team were to start following it.

Few tests or no tests

In his book Working Effectively with Legacy Code, Michael Feathers defines legacy code as simply “code without tests”. I find this definition interesting for its conciseness.

Think of the benefits that testing affords that you’re missing in a test-free legacy project. You can’t make changes without being reasonably confident you’re not introducing regressions. You can’t refactor safely. You don’t have an executable specification of what the software is supposed to do, so your only recourse is to squint at the code (which is often hard or impossible to understand) to try to figure out what it does.

The antidote

If a project doesn’t have tests then the antidote is of course to write tests. It’s unfortunately not that simple, though.

I’ve worked on teams where leadership imposes a rule: “All new code must have tests.” In my experience this rule doesn’t work very well. I’ll explain why.

Not all code is equally testable. Dependencies make code hard to test. If a project was written without testing in mind there are probably a lot of entangled dependencies blocking developers from writing tests on that code. There’s a chicken-egg problem: the code needs to be changed in order to break the dependencies and make it testable, but the developers can’t safely change the code until tests are in place. This is why adding tests to legacy projects is hard.

The solution in this situation is really just to buy Working Effectively with Legacy Code by Michael Feathers. The techniques described in the book like Sprout Class, Sprout Method, characterization testing and others are super useful tools in putting tests on legacy code.

If a project has no tests at all it can be especially challenging. You might wonder what to test first. Your instincts might tell you to try to test the most important code first, or to start writing tests with whatever your next feature is. My inclination though would be to start with whatever’s easiest. If it’s easiest to write a test verifying that the sign-in page says “Sign in”, then I might write that test first, even though it might feel stupid and meaningless. Then I might test something slightly more difficult to test, and so on, until I’m able to test pretty much anything in the application that I want to.

Mysteriously-named variables, methods, etc.

I define bad code as code that’s risky and/or expensive to change. “Expensive” is basically interchangeable with “time-consuming”.

If the variables, methods and classes in a project are named in a way that’s misleading or difficult to understand, then the project will be more time-consuming than necessary to maintain. As I explain in Variable Name Anti-Patterns, it’s better to save time thinking than to save time typing. Typing is cheap. Thinking is expensive.

The antidote

Luckily it’s usually pretty straightforward to change the names of things, although sometimes you’ll find entities so deeply baked into the code and the database that it’s prohibitively expensive to go back and change them.

But if you find a bad variable name, how do you decide what to rename it to? My rule for naming things is: call things what they are. This might sound super obvious but based on how often I encounter entities that are named for something other than what they are, it’s a rule that I think is worth explicitly stating.
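
As a tiny hypothetical illustration of “call things what they are”:

require 'time'

created_at = Time.parse('2020-01-01 00:00:00')

# Before: "t" tells the reader nothing about what the value is.
t = Time.now - created_at

# After: the name says exactly what the value is.
seconds_since_signup = Time.now - created_at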

Sometimes a class is poorly named just because the author didn’t bother to put any thought into the matter. But other times there’s a misfit between the name of the class and the abstraction that ought to exist. So sometimes, instead of or in addition to renaming a class, I’ll invent a new abstraction and move some of that class’s code into the new abstraction.

You’ll find more advice on naming in Variable Name Anti-Patterns.

Original maintainers who are no longer available

One particularly painful aspect of legacy projects is that the people who wrote the original code are often no longer around. So when you want to ask, “What is THIS here for?” or “What the fuck were you thinking, DALE?!?!”, you can’t, because Dale writes shitty code somewhere else now.

The antidote

Unfortunately there really is nothing you can do about this one. But you can prevent the same thing from happening to someone else. You can start to write documentation for the project.

In my mind there are at least two kinds of documentation for a project: technical documentation and domain documentation. Both can be super helpful.

The thing I like about writing domain documentation is that it doesn’t go out of date as quickly as documentation about the particulars of the project. Some domains can be learned in books, Wikipedia, etc. but other domains require knowledge that mostly only exists in people’s heads. Every organization also has “institutional knowledge” or “tribal knowledge” which is stored in the heads of the employees but isn’t written down anywhere. It helps to write these things down.

When it comes to technical documentation I don’t find much value in documenting specific methods, classes, etc., although that is sometimes helpful. I prefer to start with things like how to get the project set up locally, how the production environment is structured, and stuff like that. The process of writing this documentation can often help the team (and individual team members) understand things that they wouldn’t have otherwise investigated.

Unknown proper behavior

Does the code work? That question is sometimes impossible to answer because you don’t even know what the code is supposed to do.

One of the quirks of legacy project maintenance is that the important thing isn’t to ensure correct behavior, it’s to preserve current behavior. I’ve heard of cases where a legacy project contains a bug, but users actually depend on that buggy behavior, so fixing the bug would itself be a bug.

The antidote

One way to help ensure that current behavior is being preserved is to use characterization testing. I think of characterization testing like TDD in reverse. Here’s how I write characterization tests.

Let’s say I’m writing a characterization test for a method. First I’ll comment out the entire body of the method. Then I’ll write a test. I might not know what to assert in the test, so I’ll assert that the method will return “asdf”. I of course know that the method won’t return “asdf”, but I can run the test to see what the method really does return, then change my assertion to match that.

Once I have a test case I’ll uncomment whatever’s needed to make that test pass. In this sense I’m following the golden rule of TDD: don’t write any new code without having a test that covers it. In this case the code has already been written but that’s just kind of a technicality. I’m starting with no code (it’s all commented out) and then I’m “writing” the code by uncommenting it.
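
Here’s a sketch of that flow with a hypothetical legacy method (the method, the multiplier, and the 107.0 value are all made up for the example):

# Step 1: comment out the entire method body.
class Invoice
  def total
    # @line_items.sum(&:price) * 1.07
  end
end

# Step 2: write a test asserting a value I know is wrong:
#   expect(invoice.total).to eq('asdf')
# Step 3: uncomment whatever's needed, run the test, and replace 'asdf'
# with the real value shown in the failure output, e.g.:
#   expect(invoice.total).to eq(107.0)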

The result of my “reverse TDD” process and regular TDD is the same: I end up with more or less full test coverage. Once I have that I’m free to refactor with confidence that I’m preserving current behavior.

Bugs

Bugs are of course a very common characteristic of legacy projects. I don’t think I need to spend much room discussing bugs.

The antidote

The antidote to having bugs is of course to fix them. That sounds simple, but it’s much easier said than done.

I’ve worked on teams where we have nasty bugs in the application but leadership prioritizes new development over bugfixes. This is a failure in short-term/long-term thinking. Prioritizing new development over bugfixes might provide more speed in the very short term but in the long term it slows things way down. In order for things to turn around, leadership has to have the discipline to temporarily slow down development in order to get the application into a more stable state. If leadership is too stubborn to do this then there’s probably no hope for the project.

Automated tests can of course help with bugfixes, particularly regressions. It can be very frustrating for everyone involved when the developers fix a bug only to have the same exact bug reappear later. If tests are written along with each bugfix (which is again easier said than done) then the team can be reasonably confident that each bugfix will be permanent.
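
To make that concrete, here’s a sketch of a bugfix accompanied by a regression test. The code under test and the bug are both invented for the example:

require 'rspec/autorun'

# Minimal hypothetical code under test.
Item = Struct.new(:price, :discount, keyword_init: true)

class OrderTotal
  def initialize(items)
    @items = items
  end

  def total
    @items.sum { |item| item.price - item.discount }
  end
end

RSpec.describe OrderTotal do
  # Regression test for a hypothetical past bug: totals came out wrong
  # when every item in the order was discounted. If the bug ever
  # reappears, this test fails and the fix stays permanent.
  it 'sums orders in which every item is discounted' do
    order = OrderTotal.new([Item.new(price: 10, discount: 2),
                            Item.new(price: 5, discount: 1)])
    expect(order.total).to eq(12)
  end
end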