If we wanted to, we could, of course, write web applications in assembly code. Computers can understand assembly code just as well as Ruby or Python or any other language.
The reason we write programs in higher-level languages like Ruby or Python is that while assembly language is easy for computers to understand, it’s of course not easy for humans to understand.
High-level languages (Ruby, Python, Java, C++, etc.) provide a layer of abstraction. Instead of having to think about a bunch of low-level details that we don’t care about most of the time, we can specify the behavior of our programs at a higher, more abstracted level. Instead of having to expend mental energy on things like memory locations, we can focus on what our program actually does.
In addition to using the abstractions provided by high-level languages, we can also add our own abstractions. A function, for example, is an abstraction that hides low-level details. An object can serve this purpose as well.
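To make the idea of a function as an abstraction concrete, here is a minimal Ruby sketch. The method name `average` and the sample order totals are hypothetical, made up purely for illustration.

```ruby
# Hypothetical example: callers ask for "the average" without thinking
# about summing, counting, or guarding against an empty list.
def average(numbers)
  return 0.0 if numbers.empty?

  numbers.sum.to_f / numbers.size
end

order_totals = [12.50, 8.25, 21.00]
puts average(order_totals)
```

Code that calls `average` only needs to know what it returns, not how the result is computed.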
We’ll come back to some technical details regarding what abstraction is. First let’s gain a deeper understanding of what an abstraction is using an analogy.
Abstraction at McDonald’s
Let’s say I go to McDonald’s and decide that I want a Quarter Pounder Meal. The way I express my wishes to the cashier is by saying “Quarter Pounder Meal”. I don’t specify the details: that I want a fried beef patty between two buns with cheese, pickles, onions, ketchup and mustard, along with a side of potatoes peeled and cut into strips and deep-fried. Neither I nor the cashier cares about most of those details most of the time. It’s easier and more efficient for us to use a shorthand idea called a “Quarter Pounder Meal”.
The benefit of abstraction
As a customer, I care about a Quarter Pounder Meal at a certain level of abstraction. I don’t particularly care whether the ketchup goes on before the mustard or the mustard goes on before the ketchup. In fact, I don’t even really think about ketchup and mustard at all most of the time. I just know that I like Quarter Pounders and that’s what I usually get at McDonald’s, so that’s what I’ll get. For me to delve any further into the details would be a needless waste of brainpower. To me, that’s the benefit of abstraction: abstraction lets me go about my business without having to give or receive information that’s more detailed than I need or want. And of course the benefit of not having to work with low-level details is that it’s easier.
Levels of abstraction
Even though neither the customer nor the cashier wants to think about most of the low-level details of a Quarter Pounder Meal most of the time, it’s true that sometimes they do want to think about those details. If somebody doesn’t like onions, for example, they can drop down a level of abstraction and specify the detail that they would like their Quarter Pounder without onions. Another reason to drop down a level of abstraction may be that you don’t know what toppings come on a Quarter Pounder, and you want to know. So you can ask the cashier what comes on it and they can tell you. (Pickles, onions, ketchup and mustard.)
The cook cares about the Quarter Pounder Meal at one level of abstraction lower. When a cook gets an order for a Quarter Pounder, they have to physically assemble the ingredients, so they of course can’t ignore those details. But there are still lower-level details present that the cook doesn’t think about most of the time. For example, the cook probably doesn’t usually think about the process of pickling a cucumber and then slicing it, because those steps are already done by the time the cook is preparing the hamburger.
What would of course be wildly inappropriate is if I, as the customer, specified to the cashier how thick I wanted the pickles sliced, or that I wanted dijon mustard instead of yellow mustard, or that I wanted my burger cooked medium-rare. Those are details that I’m not even allowed to care about. (At least I assume so. I’ve never tried to order a Quarter Pounder with dijon mustard.)
Consistency in levels
Things tend to be easiest when people don’t jump willy-nilly from one level of abstraction to another. When I’m placing an order at McDonald’s, everything I tell the cashier is more or less a pre-defined menu item or some pre-agreed variation on that item (e.g. no onion). It would probably make things weird if I were to order a Quarter Pounder Meal and also ask the cashier to tell me the expiration dates on their containers of ketchup and mustard. The cashier is used to taking food orders and not answering low-level questions about ingredients. If we jump among levels of abstraction, it’s easy for the question to arise of “Hang on, what are we even talking about right now?” The exchange is clearer and easier to understand if we stick to one level of abstraction the whole time.
Abstraction in Rails
In the same way that abstraction can ease the cognitive burden when ordering a Quarter Pounder, abstraction can ease the cognitive burden when working with Rails apps.
Sadly, many Rails apps have a near-total lack of abstraction. Everything that has anything to do with a user gets shoved into app/models/user.rb, everything that has anything to do with an order gets shoved into app/models/order.rb, and the result is that every model file is a mixed bag of wildly varying levels of abstraction.
Soon we’ll discuss how to fix this. First let’s look at an anti-example.
Abstraction anti-example
Forem, the organization behind dev.to, makes its code publicly available on GitHub. At the risk of being impolite, I’m going to use a piece of their code as an example of a failure to take advantage of the benefits of abstraction.
Below is a small snippet from a file called app/models/article.rb. Take a scroll through this snippet, and I’ll meet you at the bottom.
class Article < ApplicationRecord
  # The trigger `update_reading_list_document` is used to keep the `articles.reading_list_document` column updated.
  #
  # Its body is inserted in a PostgreSQL trigger function and that joins the columns values
  # needed to search documents in the context of a "reading list".
  #
  # Please refer to https://github.com/jenseng/hair_trigger#usage in case you want to change or update the trigger.
  #
  # Additional information on how triggers work can be found in
  # => https://www.postgresql.org/docs/11/trigger-definition.html
  # => https://www.cybertec-postgresql.com/en/postgresql-how-to-write-a-trigger/
  #
  # Adapted from https://dba.stackexchange.com/a/289361/226575
  trigger
    .name(:update_reading_list_document).before(:insert, :update).for_each(:row)
    .declare("l_org_vector tsvector; l_user_vector tsvector") do
    <<~SQL
      NEW.reading_list_document :=
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.title, ''))), 'A') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_tag_list, ''))), 'B') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.body_markdown, ''))), 'C') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_user_name, ''))), 'D') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_user_username, ''))), 'D') ||
        setweight(to_tsvector('simple'::regconfig,
          unaccent(
            coalesce(
              array_to_string(
                -- cached_organization is serialized to the DB as a YAML string, we extract only the name attribute
                regexp_match(NEW.cached_organization, 'name: (.*)$', 'n'),
                ' '
              ),
              ''
            )
          )
        ), 'D');
    SQL
  end
end
Given that dev.to is largely a blogging site, the concept of an article must be one of the most central concepts in the application. I would imagine that the Article model would have a lot of concerns, and the 800-plus-line article.rb file, which contains a huge mix of apparently unrelated stuff, shows that Article does in fact have a lot of concerns connected to it.
Among these concerns, whatever this trigger thing does is obviously a very peripheral one. If you were unfamiliar with the Article model and wanted to see what it was all about, this database trigger code wouldn’t help you get the gist of the Article at all. It’s too peripheral and too low-level. The presence of the trigger code is not only not helpful, it’s distracting.
The trigger code is at a much lower level of abstraction than you would expect to see in the Article model.
The fix to this particular problem could be a very simple one: just move the trigger code out of article.rb and put it in a module somewhere.
class Article < ApplicationRecord
  include ArticleTriggers
end
The trigger code itself is not that voluminous, and I imagine it probably doesn’t need to be touched that often, so it’s probably most economical to just move that code as-is into ArticleTriggers without trying to improve it.
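Here is a rough sketch of how a module like ArticleTriggers could carry a class-level declaration. It is plain Ruby so that it runs on its own: `FakeRecord` and its one-argument `trigger` method are simplified stand-ins I made up for ApplicationRecord and the hair_trigger gem, not their real APIs.

```ruby
# Stand-in for ApplicationRecord plus hair_trigger, so this sketch runs
# without Rails. The real `trigger` builder takes a chain of calls.
class FakeRecord
  def self.trigger_names
    @trigger_names ||= []
  end

  def self.trigger(name)
    trigger_names << name
  end
end

# Hypothetical module that holds the trigger declarations.
module ArticleTriggers
  # Ruby calls this hook whenever a class includes the module.
  def self.included(base)
    base.class_eval do
      trigger :update_reading_list_document
    end
  end
end

class Article < FakeRecord
  include ArticleTriggers
end

p Article.trigger_names
```

In a real Rails app the same idea would more likely be written with `ActiveSupport::Concern` and an `included do ... end` block containing the original trigger code unchanged.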
Another anti-example
Here’s a different example which we’ll address in a little bit of a different way.
There are a couple of methods inside article.rb, evaluate_markdown and evaluate_front_matter.
class Article < ApplicationRecord
  def evaluate_markdown
    fixed_body_markdown = MarkdownProcessor::Fixer::FixAll.call(body_markdown || "")
    parsed = FrontMatterParser::Parser.new(:md).call(fixed_body_markdown)
    parsed_markdown = MarkdownProcessor::Parser.new(parsed.content, source: self, user: user)
    self.reading_time = parsed_markdown.calculate_reading_time
    self.processed_html = parsed_markdown.finalize
    if parsed.front_matter.any?
      evaluate_front_matter(parsed.front_matter)
    elsif tag_list.any?
      set_tag_list(tag_list)
    end
    self.description = processed_description if description.blank?
  rescue StandardError => e
    errors.add(:base, ErrorMessages::Clean.call(e.message))
  end

  def evaluate_front_matter(front_matter)
    self.title = front_matter["title"] if front_matter["title"].present?
    set_tag_list(front_matter["tags"]) if front_matter["tags"].present?
    self.published = front_matter["published"] if %w[true false].include?(front_matter["published"].to_s)
    self.published_at = parse_date(front_matter["date"]) if published
    set_main_image(front_matter)
    self.canonical_url = front_matter["canonical_url"] if front_matter["canonical_url"].present?
    update_description = front_matter["description"].present? || front_matter["title"].present?
    self.description = front_matter["description"] if update_description
    self.collection_id = nil if front_matter["title"].present?
    self.collection_id = Collection.find_series(front_matter["series"], user).id if front_matter["series"].present?
  end
end
These methods seem peripheral from the perspective of the Article model. They also seem related to each other, but not very related to anything else in Article.
These qualities suggest to me that this pair of methods is a good candidate for extraction out of Article in order to help keep Article at a consistent, high level of abstraction.
“Evaluate markdown” is pretty vague. Evaluate how? It’s not clear exactly what’s supposed to happen. That’s fine though. We can operate under the presumption that the job of evaluate_markdown is to clean up the article’s body. Here’s how we could change the code under that presumption.
class Article < ApplicationRecord
  def evaluate_markdown
    # `self.` is needed here: a bare `body_markdown = ...` would only create a local variable
    self.body_markdown = ArticleBody.new(body_markdown).cleaned
  end
end
With this new, finer-grained abstraction called ArticleBody, Article no longer has to be directly concerned with cleaning up the article’s body. Cleaning up the article’s body is a peripheral concern to Article. Understanding the detail of cleaning up the article’s body is neither necessary nor helpful to the task of trying to understand the essence of the Article model.
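For a sense of what ArticleBody itself might look like, here is a minimal sketch of it as a plain Ruby object. The cleaning rules (normalizing line endings and trimming trailing whitespace) are assumptions chosen for illustration; they are not what Forem’s MarkdownProcessor actually does.

```ruby
# Hypothetical PORO. The cleaning rules below are illustrative stand-ins,
# not the real fixes applied in Forem's codebase.
class ArticleBody
  def initialize(markdown)
    @markdown = markdown.to_s
  end

  # Returns the body with Unix line endings and no trailing whitespace.
  def cleaned
    @markdown
      .gsub("\r\n", "\n")
      .gsub(/[ \t]+$/, "")
      .strip
  end
end

puts ArticleBody.new("# Title  \r\n\r\nSome text.\t\n").cleaned
```

Because the class has no dependency on Active Record, it can be understood and tested in isolation from the Article model.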
Further abstraction
If we wanted to, we could conceivably take the contents of evaluate_markdown and evaluate_front_matter and change them to be at a higher level of abstraction.
Right now the bodies of those methods seem to deal at a very low level of abstraction. They deal with how to do the work rather than what the end product should be. In order to understand what evaluate_markdown does, we have to understand every detail of what evaluate_markdown does, because it’s just a mixed bag of low-level details.
If evaluate_markdown had abstraction, then we could take a glance at it and easily understand what it does because everything that happens would be expressed in the high-level terms of what rather than the low-level terms of how. I’m not up to the task of trying to refactor evaluate_markdown in this blog post, though, because I suspect what’s actually needed is a much deeper change and a different approach altogether, rather than just a superficial polish. Changes of that depth require time and tests.
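To illustrate the what-versus-how distinction on its own, here is a toy sketch. It is deliberately unrelated to Forem’s code and is not a refactoring of evaluate_markdown; the `Slug` class and all of its method names are hypothetical.

```ruby
# Hypothetical toy class, not taken from Forem.
class Slug
  def initialize(title)
    @title = title.to_s
  end

  # The "what": each step is named for its outcome.
  def to_s
    joined_with_hyphens(words(lowercased))
  end

  private

  # The "how": low-level details live below the public method.
  def lowercased
    @title.downcase
  end

  def words(text)
    text.scan(/[a-z0-9]+/)
  end

  def joined_with_hyphens(parts)
    parts.join("-")
  end
end

puts Slug.new("Abstraction in Rails!").to_s # prints "abstraction-in-rails"
```

A reader can grasp `to_s` at a glance and only drops down into the private methods when a specific detail matters.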
How I maintain a consistent level of abstraction in my Rails apps
I try not to let my Active Record models get cluttered up with peripheral concerns. When I add a new piece of behavior to my app, I usually put that behavior in one or more PORO models rather than an Active Record model. Or, sometimes, I put that behavior in a concern or mixin.
The point about PORO models is significant. In the Rails application that I maintain at my job, about two-thirds of my models are POROs. Don’t make the mistake of thinking that a Rails model has to be backed by Active Record.
Takeaways
- Abstraction is the ability to engage with an idea without having to be encumbered by its low-level details.
- The benefit of abstraction is that it’s easier on the brain.
- Active Record models can be made easier to understand by keeping peripheral concerns out of the Active Record models and instead putting them in concerns, mixins or finer-grained PORO models.
This is one of your best articles 🔥
Thanks!
I agree with Nicolas, your posts get better and better!
I really like the food comparison for abstractions, it makes it pretty obvious why it is bad to mix different levels together.
Thanks!
Really great article, the point about the McDonald’s abstraction really brings the whole concept together.
I have a question, you mentioned:
“In the Rails application that I maintain at my job, about two-thirds of my models are POROs”
I’m curious: why do you use models instead of service objects? Is there some benefit you get from using models not backed by ActiveRecord instead of the service object pattern?
Thanks!
I haven’t yet written all that I want to say about service objects, but here’s something I wrote that might help answer that question: https://www.codewithjason.com/code-without-service-objects/
In my opinion, there’s no such thing as “service objects”. What you’re describing is likely the Command pattern, and I feel that it serves a very specific and not at all generic purpose within a codebase. Unfortunately, people will just generate a huge volume of Command pattern objects which are then used basically like procedural functions in an imperative (rather than declarative) manner. It just doesn’t adhere to good design principles of OOP. I know Jason’s written a lot about this topic and I’m really on board with his observations.