What is a legacy project?
The terms “legacy project” and “legacy code” mean different things to different developers. I think one thing that most developers could agree on is that a legacy project is a project that’s difficult or painful to maintain.
Spend enough years in the software industry and you might find yourself forming the opinion that most projects are legacy projects. Some projects, of course, are “more legacy” than others.
What follows is a list of what I’ve found to be some of the most common characteristics of legacy projects. Under each item I’ve included the antidote to the issue.
Here are some common characteristics of legacy projects:
Poor processes and/or incompetent team members
Out of all the challenges of a legacy project I think this one has the biggest impact and is the most challenging to fix. To be totally frank, it’s often a problem that’s impossible to fix, and I think it’s a big part of why so many software projects fail and so much software out there is half-broken.
The problem is this: projects end up with bad code because the code was brought into existence by bad processes and/or incompetent developers. The code doesn’t exist separately from the people and processes. The quality of the code is a direct function of the people and processes that created it.
Let’s imagine a team whose leader believes code should only be released “when it provides value to the user”, and so deployments only happen once every three months. When the deployments do happen, they’re risky, shuttle-launch type affairs with a lot of stress leading up to them and a lot of firefighting during and after. And let’s say due to time pressure the dev team believes they “don’t have time to write tests”. Instead of writing small and crisply-defined user stories, the team works on big, nebulous stories that drag on and on because no one ever nailed down where the finish line is.
If you asked me to make a prediction about this team I would guess there’s a good chance they’re going to write some pretty bad code (that is, code that’s risky and time-consuming to maintain). If I were going to make a bet about the team’s meeting a deadline, I would bet heavily against them based on the way they work.
So, what can be done about people and process problems? What unfortunately compounds this problem is that you can’t just show the team the answers to their problems and expect things to start turning around. Sometimes dev teams and managers follow bad practices out of simple inexperience. In those cases it’s relatively easy to educate them into better behavior.
Just as often, the problem is impossible to solve. I consider there to be three types of employees:
- People who are already good
- People who aren’t good yet but can be later
- People who aren’t good and never will be
If a team is made up of people of types 1 and 2 then there’s hope. These people can be educated into smarter processes that produce more easily maintainable code.
The people of type 3 just need to go. Unfortunately, in legacy projects, type 3 people are often the people in charge and so there’s no one to fire them.
I had a boss once who forced me to write code for three months before deploying it. This inevitably worked out badly. The proposal of continuous deployment was met with a hard no (as were roughly all the other proposals made to that particular manager).
I once worked for an organization that drove a ~$300,000 project into the ground by failing to put the mysterious work first. Then they put me on another project which they began the exact same way. I warned them that they were headed for the same fate a second time but they ignored me. Most of this particular organization’s projects failed, and for similar reasons.
If an organization’s leadership refuses to learn from conventional wisdom or even from their own failures, then the word for this is stupidity and it cannot be fixed. The only option for a developer in this situation is to quit and get a job somewhere better.
But what about cases where both leadership and the development team have a willingness to learn what it would take to make things better?
Here are some of the most common practices I see teams failing to follow that would have a potentially large impact if the team were to start following them.
- Automated testing
- Agile development
- Continuous integration
- Continuous delivery/deployment
- Pair programming
- PR reviews
- Atomic commits
- Small, crisply-defined stories
Now that I’ve addressed the somewhat meta (but super important) matter of people and processes, let’s talk about some of the code-related issues.
Few tests or no tests
In his book Working Effectively with Legacy Code, Michael Feathers defines legacy code as simply “code without tests”. I find this definition interesting for its conciseness.
Think of the benefits that testing affords that you’re missing in a test-free legacy project. You can’t make changes and be reasonably confident you’re not introducing regressions. You can’t refactor safely. You don’t have an executable specification of what the software is supposed to do, so your only recourse is to squint at the code (which is often hard or impossible to understand) to try to figure out what it does.
If a project doesn’t have tests then the antidote is of course to write tests. It’s unfortunately not that simple, though.
I’ve worked on teams where leadership imposes a rule: “All new code must have tests.” In my experience this rule doesn’t work very well. I’ll explain why.
Not all code is equally testable. Dependencies make code hard to test. If a project was written without testing in mind there are probably a lot of entangled dependencies blocking developers from writing tests on that code. There’s a chicken-egg problem: the code needs to be changed in order to break the dependencies and make it testable, but the developers can’t safely change the code until tests are in place. This is why adding tests to legacy projects is hard.
The solution in this situation is really just to buy Working Effectively with Legacy Code by Michael Feathers. The techniques described in the book like Sprout Class, Sprout Method, characterization testing and others are super useful tools in putting tests on legacy code.
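To give a flavor of one of these techniques, here is a minimal Python sketch of Sprout Method. The class and method names (`TransactionGate`, `post_entries`, `unique_entries`) are hypothetical and chosen only for illustration. The idea is that instead of weaving new logic into an untested method, you write the new logic as its own small method, test that method in isolation, and add a single call to it from the legacy code.

```python
class TransactionGate:
    """Stand-in for a legacy class that has no tests (hypothetical)."""

    def __init__(self):
        self.posted = []

    def unique_entries(self, entries):
        # The "sprout": the new behavior lives in its own small method,
        # so it can be tested without exercising the legacy code path.
        return [entry for entry in entries if entry not in self.posted]

    def post_entries(self, entries):
        # Legacy method. The only change made here is the call to the sprout.
        entries = self.unique_entries(entries)
        for entry in entries:
            self.posted.append(entry)  # imagine tangled legacy logic here


def test_unique_entries_skips_already_posted_entries():
    gate = TransactionGate()
    gate.posted = ["rent", "payroll"]
    assert gate.unique_entries(["rent", "coffee"]) == ["coffee"]
```

The legacy method itself is still untested, but the new behavior is covered, and the change to the old code is small enough to review by eye.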
If a project has no tests at all it can be especially challenging. You might wonder what to test first. Your instincts might tell you to try to test the most important code first, or to start writing tests with whatever your next feature is. My inclination though would be to start with whatever’s easiest. If it’s easiest to write a test verifying that the sign-in page says “Sign in”, then I might write that test first, even though it might feel stupid and meaningless. Then I might test something slightly more difficult to test, and so on, until I’m able to test pretty much anything in the application that I want to.
Mysteriously-named variables, methods, etc.
I define bad code as code that’s risky and/or expensive to change. “Expensive” is basically interchangeable with “time-consuming”.
If the variables, methods and classes in a project are named in a way that’s misleading or difficult to understand, then the project will be more time-consuming than necessary to maintain. As I explain in Variable Name Anti-Patterns, it’s better to save time thinking than to save time typing. Typing is cheap. Thinking is expensive.
Luckily it’s usually pretty straightforward to change the names of things, although sometimes you’ll find entities so deeply baked into the code and the database that it’s prohibitively expensive to go back and change them at this point.
But if you find a bad variable name, how do you decide what to rename it to? My rule for naming things is: call things what they are. This might sound super obvious but based on how often I encounter entities that are named for something other than what they are, it’s a rule that I think is worth explicitly stating.
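As a small illustration of the rule, here is a before-and-after sketch in Python. The function and its names are made up for the example. Both versions behave identically; only the second tells the reader what the values are.

```python
# Before: the names say nothing about what the values are.
def calc(d, n):
    return d / n


# After: same behavior, but each thing is called what it is.
def average_order_value(total_revenue, order_count):
    return total_revenue / order_count
```

The second version takes longer to type, but a reader no longer has to trace callers to find out what `d` and `n` are.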
Sometimes a class is poorly named just because the author didn’t bother to put any thought into the matter. But other times there’s a misfit between the name of the class and the abstraction that ought to exist. So sometimes instead of or in addition to renaming a class, I’ll invent a new abstraction and move some of that class’s code into the new abstraction.
You’ll find more advice on naming in Variable Name Anti-Patterns.
Original maintainers who are no longer available
One particularly painful aspect of legacy projects is that the people who wrote the original code are often no longer around. So when you want to ask, “What is THIS here for?” or “What the fuck were you thinking, DALE?!?!” you can’t, because Dale writes shitty code somewhere else now.
Unfortunately there really is nothing you can do about this one. But you can prevent the same thing from happening to someone else. You can start to write documentation for the project.
In my mind there are at least two kinds of documentation for a project: technical documentation and domain documentation. Both can be super helpful.
The thing I like about writing domain documentation is that it doesn’t go out of date as quickly as documentation about the particulars of the project. Some domains can be learned in books, Wikipedia, etc. but other domains require knowledge that mostly only exists in people’s heads. Every organization also has “institutional knowledge” or “tribal knowledge” which is stored in the heads of the employees but isn’t written down anywhere. It helps to write these things down.
When it comes to technical documentation I don’t find much value in documenting specific methods, classes, etc. although that is sometimes helpful. I prefer to start with things like how to get the project set up locally, how the production environment is structured, and stuff like that. The process of writing this documentation can often help the team (and individual team members) understand things that they wouldn’t have otherwise investigated.
Unknown proper behavior
Does the code work? That question is sometimes impossible to answer because you don’t even know what the code is supposed to do.
One of the quirks of legacy project maintenance is that the important thing isn’t to ensure correct behavior, it’s to preserve current behavior. I’ve heard of cases where a legacy project contains a bug, but users actually depend on that buggy behavior, so fixing the bug would itself be a bug.
One way to help ensure that current behavior is being preserved is to use characterization testing. I think of characterization testing like TDD in reverse. Here’s how I write characterization tests.
Let’s say I’m writing a characterization test for a method. First I’ll comment out the entire body of the method. Then I’ll write a test. I might not know what to assert in the test, so I’ll assert that the method will return “asdf”. I of course know that the method won’t return “asdf”, but I can run the test to see what the method really does return, then change my assertion to match that.
Once I have a test case I’ll uncomment whatever’s needed to make that test pass. In this sense I’m following the golden rule of TDD: don’t write any new code without having a test that covers it. In this case the code has already been written but that’s just kind of a technicality. I’m starting with no code (it’s all commented out) and then I’m “writing” the code by uncommenting it.
The result of my “reverse TDD” process and regular TDD is the same: I end up with more or less full test coverage. Once I have that I’m free to refactor with confidence that I’m preserving current behavior.
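Here is a minimal sketch of what the end result might look like in Python. The function `shipping_label` is hypothetical and stands in for a legacy method whose intended behavior nobody remembers. The tests don’t claim the behavior is correct; they only pin down what the code currently does.

```python
def shipping_label(name, country):
    """Hypothetical legacy function with no known specification."""
    label = name.strip().upper()
    if country != "US":
        label = label + " (INTL)"
    return label


def test_shipping_label_for_domestic_order():
    # First pass: assert something deliberately wrong (e.g. == "asdf"),
    # run the test, and read the real value out of the failure message.
    # Second pass: pin the observed value, whether or not it seems "right".
    assert shipping_label("  ada lovelace ", "US") == "ADA LOVELACE"


def test_shipping_label_for_international_order():
    # One test per branch, each uncovered the same way as above.
    assert shipping_label("ada lovelace", "UK") == "ADA LOVELACE (INTL)"
```

The test functions use plain `assert` statements, so they can be run with a test runner such as pytest or simply called directly.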
Bugs
Bugs are of course a very common characteristic of legacy projects. I don’t think I need to spend much room discussing bugs.
The antidote to having bugs is of course to fix them. This sounds simple but it’s much easier said than done.
I’ve worked on teams where we have nasty bugs in the application but leadership prioritizes new development over bugfixes. This is a failure in short-term/long-term thinking. Prioritizing new development over bugfixes might provide more speed in the very short term but in the long term it slows things way down. In order for things to turn around, leadership has to have the discipline to temporarily slow down development in order to get the application into a more stable state. If leadership is too stubborn to do this then there’s probably no hope for the project.
Automated tests can of course help with bugfixes, particularly regressions. It can be very frustrating for everyone involved when the developers fix a bug only to have the same exact bug reappear later. If tests are written along with each bugfix (which is again easier said than done) then the team can be reasonably confident that each bugfix will be permanent.
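As a rough sketch of what a test written alongside a bugfix can look like, assume a hypothetical `parse_quantity` function that used to raise `ValueError` whenever a user typed a quantity such as "1,000". The fix and the test that guards it go in together.

```python
def parse_quantity(raw):
    """Hypothetical function that used to fail on input like '1,000'."""
    # The bugfix: drop thousands separators before converting.
    return int(raw.replace(",", ""))


def test_parse_quantity_accepts_thousands_separator():
    # Regression test for the bug above. If someone later removes the
    # fix, this test fails instead of a user rediscovering the bug.
    assert parse_quantity("1,000") == 1000


def test_parse_quantity_still_accepts_plain_numbers():
    # Guard the behavior that already worked before the fix.
    assert parse_quantity("42") == 42
```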
Probably there is some obvious answer that I can’t think of, but what is the point of commenting out the method to do characterization tests?
You can simply add the test, run it, and change it to assert whatever the output is right away.
I want to make sure the test fails so that I can make sure the test is actually testing what I think it’s testing.