Cohesion

by Jason Swett

Every codebase is a story. Well-designed programs tell a coherent, easy-to-understand story. Other programs are poorly designed and tell a confusing, hard-to-understand story. And it’s often the case that a program wasn’t designed at all, and so no attempt was made to tell a coherent story. But there’s some sort of story in the code no matter what.

If a codebase is like a story, a file in a codebase is like a chapter in a book. A well-written chapter will clearly let the reader know what the most important points are and will feature those important points most prominently. A chapter is most understandable when it principally sticks to just one topic.

The telling of the story may unavoidably require the conveyance of incidental details. When this happens, those incidental details will be put in their proper place and not mixed confusingly with essential points. If a detail would pose too much of a distraction or an interruption, it gets moved to a footnote or appendix or parenthetical clause.

A piece of code is cohesive if a) everything in it shares one single idea and b) it doesn’t mix incidental details with essential points.

Now let’s talk about ways that cohesion tends to get lost as well as ways to maintain cohesion.

How cohesion gets lost

Fresh new projects are usually pretty easy to work with. This is because a) when you don’t have very much code, it’s easier to keep your code organized, and b) when the total amount of code is small, you can afford to be fairly disorganized without hurting overall understandability too much.

Things get tougher as the project grows. Entropy (the tendency for all things to decline into disorder) unavoidably sets in. Unless there are constant efforts to fight back against entropy, the codebase grows increasingly disordered. The code grows harder to understand and work with.

One common manifestation of entropy is the tendency for developers to hang new methods onto objects like ornaments on a Christmas tree. A developer is tasked with adding a new behavior. He or she goes looking for the object that seems like the most fitting home for that behavior, then adds it there. The new behavior doesn’t perfectly fit the object where it was placed, but the new code only makes the object 5% less cohesive, and it’s not clear where a better place for that behavior might be, so in it goes.

This ornament-hanging habit is never curtailed because no individual “offense” appears to be all that bad. This is the nature of entropy: disorder sets in not because anything bad was done but simply because no one is going out of their way to stave off disorder.

So, even though no individual change appears to be all that bad, the result of all these changes in aggregate is a surprisingly bad mess. The objects are huge. They confusingly mix unrelated ideas. Their essential points are obscured by incidental details. They’re virtually impossible to understand. They lack cohesion.

How can this problem be prevented?

How cohesion can be preserved

The first key to maintaining cohesion is to make a clear distinction between what’s essential and what’s incidental. More specifically, a distinction must be made between what’s essential and what’s incidental with respect to the object in question.

For example, let’s say I have a class called Appointment. The concerns of Appointment include, among other things, a start time, a client and some matters related to caching.

I would say that the start time and client are essential concerns of the appointment and that the caching is probably incidental. In the story of Appointment, start time and client are important highlights, whereas caching concerns are incidental details and should be tucked away in a footnote or appendix.

That explains how to identify incidental details conceptually but it doesn’t explain how to separate incidental details mechanically. So, how do we do that?

The primary way I do this is to simply move the incidental details into different objects. Let’s say for example that I have a Customer object with certain methods including one called balance.

Over time the balance calculation becomes increasingly complicated to the point that it causes Customer to lose cohesion. No problem: I can just move the guts of the balance method into a new object (a PORO) called CustomerBalance and delegate all the gory details of balance calculation to that object. Now Customer can once again focus on the essential points and forget about the incidental details.
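As a minimal sketch of that move, assuming a balance is simply the sum of a customer's unpaid line items: the LineItem struct, its fields and the amount_cents method name are hypothetical and only there to give the calculation something to work on.

```ruby
# Hypothetical line item; the fields are assumptions for illustration.
LineItem = Struct.new(:amount_cents, :paid)

# A PORO that owns the details of the balance calculation.
class CustomerBalance
  def initialize(line_items)
    @line_items = line_items
  end

  # Sum of all unpaid line items, in cents.
  def amount_cents
    @line_items.reject(&:paid).sum(&:amount_cents)
  end
end

class Customer
  attr_reader :name, :line_items

  def initialize(name:, line_items: [])
    @name = name
    @line_items = line_items
  end

  # Customer still answers "what is the balance?" but delegates the "how".
  def balance
    CustomerBalance.new(line_items).amount_cents
  end
end

customer = Customer.new(
  name: "Ada",
  line_items: [
    LineItem.new(5_000, false),
    LineItem.new(2_500, true),
    LineItem.new(1_000, false)
  ]
)
puts customer.balance # prints 6000
```

If the calculation later grows to cover credits, refunds or currencies, that growth happens inside CustomerBalance and Customer stays the same size.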

Now, in this case it made perfect sense to recognize the concept of a customer balance as a brand new abstraction. But it doesn’t always work out this way. In our earlier Appointment example, it’s maybe not so natural to take our caching concerns and conceive of them as a new abstraction. It’s not particularly clear how that would go.

What we can do in these cases, when we want to move an incidental detail out of an object but can’t put our finger on a befitting new abstraction, is use a mixin instead. I view mixins as a good way to hold a bit of code which has cohesion with itself but which doesn’t quite qualify as an abstraction and so doesn’t make sense as an object. For me, mixins usually don’t have standalone value, and they’re usually only ever “mixed in” to one object as opposed to being reusable.

(I could have said concern instead of mixin, but a) to me it’s a distinction without a meaningful difference, and b) concerns come along with some conceptual baggage that I didn’t want to bring into the picture here.)

So for our Appointment example, we could move the caching code into a mixin in order to get it out of Appointment so that Appointment could once again focus solely on its essential points and forget about its incidental details.
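Here is one way that could look, as a rough sketch rather than a definitive implementation. The original example doesn't say what is being cached, so the cached summary, the cache key format and the in-memory CACHE hash are all assumptions made up for illustration.

```ruby
# Hypothetical mixin holding the caching details. The in-memory hash stands in
# for whatever cache store a real application would use.
module AppointmentCaching
  CACHE = {}

  def cache_key
    "appointments/#{id}-#{start_time.to_i}"
  end

  # Returns the cached summary, computing and storing it on first use.
  def cached_summary
    CACHE[cache_key] ||= summary
  end

  def expire_cache
    CACHE.delete(cache_key)
  end
end

class Appointment
  include AppointmentCaching

  # The essential concerns: when the appointment is and who it is for.
  attr_reader :id, :start_time, :client

  def initialize(id:, start_time:, client:)
    @id = id
    @start_time = start_time
    @client = client
  end

  def summary
    "#{client} at #{start_time.strftime('%H:%M')}"
  end
end

appointment = Appointment.new(
  id: 1,
  start_time: Time.new(2024, 1, 15, 9, 30),
  client: "Ada"
)
puts appointment.cached_summary # prints "Ada at 09:30"
```

The mixin depends on Appointment providing id, start_time and summary, which is consistent with it being mixed into only one object rather than being reusable.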

Where to put these newly-sprouted files

When I make an object more cohesive by breaking out its incidental details into a new model file, you might wonder where I put that new file.

The short answer is that I put these files into app/models, with additional subfolders based on the meaning of the code.

So for the Appointment, I might have app/models/appointment.rb and app/models/scheduling/appointment_caching.rb, provided that the caching code is related specifically to scheduling. The rationale here is that the caching logic will only ever be relevant to scheduling whereas an appointment might be viewed in multiple contexts, e.g. sometimes scheduling and sometimes billing.

For the customer balance example, I might have app/models/customer.rb and app/models/billing/customer_balance.rb. Again, a customer balance is always a billing concern whereas a customer could be looked at through a billing lens or conceivably through some other sort of lens.

Note that even though appointment_caching.rb is a mixin or concern, I don’t put it in a concerns or mixins folder. That’s because I believe in organizing files by meaning rather than type. I find that doing so makes it easier to find what I want to find when I want to find it.
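One practical consequence of the subfolders, sketched below under the assumption that the app uses Rails' default autoloading: a file at app/models/scheduling/appointment_caching.rb is expected to define Scheduling::AppointmentCaching, so the folder name becomes a namespace. The method bodies here are hypothetical placeholders, and the three files are shown together as plain Ruby so the sketch runs on its own.

```ruby
# app/models/scheduling/appointment_caching.rb
module Scheduling
  module AppointmentCaching
    # Placeholder caching detail; the key format is an assumption.
    def cache_key
      "scheduling/appointments/#{id}"
    end
  end
end

# app/models/billing/customer_balance.rb
module Billing
  class CustomerBalance
    def initialize(amounts_in_cents)
      @amounts_in_cents = amounts_in_cents
    end

    def amount_cents
      @amounts_in_cents.sum
    end
  end
end

# app/models/appointment.rb
class Appointment
  include Scheduling::AppointmentCaching

  attr_reader :id

  def initialize(id:)
    @id = id
  end
end

puts Appointment.new(id: 7).cache_key                   # prints "scheduling/appointments/7"
puts Billing::CustomerBalance.new([100, 250]).amount_cents # prints 350
```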

Takeaways

  • A piece of code is cohesive if a) everything in it shares one single idea and b) it doesn’t mix incidental details with essential points.
  • Cohesion naturally erodes over time due to entropy.
  • The first key to maintaining cohesion is to make a clear distinction between what’s essential and what’s incidental.
  • Incidental details can be moved into either new objects or into mixins/concerns in order to help preserve cohesion.
