What to do about bloated Rails Active Record models

by Jason Swett

Overview

It’s a common problem in Rails apps for Active Record models to get bloated as an application grows. This was a problem that I personally struggled with for a number of years. As I’ve gained experience, I’ve figured out a couple of tactics for addressing this problem, tactics that I feel have worked well. I’ll share them in this post.

First let’s talk about how and why this problem arises.

How bloated Active Record models arise

Some background on the Active Record pattern

Many Rails developers might not know that the Active Record pattern existed before Rails. The pattern was named by Martin Fowler in Patterns of Enterprise Application Architecture. The idea behind Active Record (in broad strokes) is that a database table has a corresponding class. For example, a patients table would have a Patient class. An instance of the class represents one row of data for the table. The class is endowed with capabilities like saving an object to the database as a new row, updating an existing row, finding records, and so on.
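To make the pattern concrete, here is a minimal, framework-free sketch. It is not how Rails implements Active Record: the in-memory ROWS array is a hypothetical stand-in for a patients table, and the save and find methods are simplified illustrations of the capabilities described above.

```ruby
# A toy, in-memory illustration of the Active Record pattern.
# This is NOT Rails' implementation; ROWS stands in for a "patients" table.
class Patient
  ROWS = [] # hypothetical stand-in for the database table

  attr_reader :id
  attr_accessor :name

  def initialize(name:, id: nil)
    @id = id
    @name = name
  end

  # Insert a new row, or update the existing row that backs this object.
  def save
    if @id.nil?
      @id = ROWS.size + 1
      ROWS << { id: @id, name: @name }
    else
      ROWS.find { |row| row[:id] == @id }[:name] = @name
    end
    self
  end

  # Look up one row and wrap it in an instance of the class.
  def self.find(id)
    row = ROWS.find { |r| r[:id] == id }
    row && new(**row)
  end
end

patient = Patient.new(name: "Ada").save # creates a row
patient.name = "Ada L."
patient.save                            # updates the same row
```

One class maps to one table, one instance maps to one row, and the persistence behavior lives on the class itself. That last part is what makes the pattern so convenient, and also what makes the class such a tempting place to put everything else.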

Rails’ specific implementation of the Active Record pattern is a big part of what gives Rails developers the ability to create so much functionality with so little code. Unfortunately, the Active Record pattern is also somewhat ripe for abuse by programmers who don’t have much experience writing structured code outside of frameworks.

An easy Active Record trap to fall into

There’s a common problem in Rails projects which can perhaps be summarized like this: Rails developers want to keep controllers from getting bloated (rightly), so they push as much domain logic as possible down to the Active Record models.

But because the Active Record models constitute a limited number of “buckets”, the domain logic code accumulates in the Active Record models. If behavior is added to an application ten times faster than database tables are added, then all that behavior piles up in a comparatively small number of Active Record classes. The bloat problem isn’t solved, it’s just moved from the controller layer to the model layer.

To be clear, this isn’t a weakness of the Active Record pattern. The root cause of the problem is inexperience. Many Rails developers aren’t aware of other code organization strategies outside of Active Record.

Here are two ways I combat this “bloated Active Record model” problem.

Tactic #1: objects

When I need to add a piece of behavior to an application, I often consider adding it straight to the Active Record model, especially early in the lifecycle of the application. But I tend to also consider adding the behavior as a new object.

Below is an example of when I found a piece of behavior to make sense as a separate object rather than as part of an Active Record model. In the application I work on at work, which is a medical application, we periodically download a file containing all the insurance charges we’ve accumulated since the last time we downloaded charges. At one point we had a new need: a way to see a list of all the past batches of charges that had been downloaded. For each batch, we wanted to see the total dollar amount as well as the unique insurance claim count.

I of course could have put the code for this feature in the Charge Active Record object. But this new feature was so peripheral that I didn’t want to clutter up Charge with the code for it. So I conceived of a new object called DownloadedChargeCollection which has methods total_balance and unique_claim_count. Here’s what that object looks like.

module Billing
  class DownloadedChargeCollection
    attr_reader :file_download

    def initialize(file_download)
      @file_download = file_download
    end

    def unique_claim_count
      charges.map(&:appointment).uniq.count
    end

    def total_balance
      Money.new(balances.sum)
    end

    def self.from(file_downloads)
      file_downloads.map { |fd| new(fd) }
    end

    private

    def charges
      Charge.where(file_download: @file_download).includes(
        :insurance_payments,
        :appointment
      )
    end

    def balances
      charges.map(&:balance_cents).compact
    end
  end
end

The way this object is used is that, in the view, we do collection.unique_claim_count and collection.total_balance. This makes the view code very easily understandable. Then if we want to dig into the details of DownloadedChargeCollection, that code is pretty understandable as well because the object is pretty small.
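For readers who want to experiment with this shape of object outside a Rails app, here is a stripped-down sketch. FakeCharge and ChargeBatch are hypothetical stand-ins, not code from the application described above: the charges are passed in directly instead of being queried, and the total is left as plain integer cents rather than a Money object.

```ruby
# Hypothetical stand-in for a Charge record, so the sketch runs without Rails.
FakeCharge = Struct.new(:appointment, :balance_cents)

# Simplified analogue of DownloadedChargeCollection: a small object that
# answers questions about one batch of charges.
class ChargeBatch
  def initialize(charges)
    @charges = charges
  end

  # Several charges can belong to one appointment, so count each once.
  def unique_claim_count
    @charges.map(&:appointment).uniq.count
  end

  # Charges without a balance are skipped rather than treated as zero.
  def total_balance_cents
    @charges.map(&:balance_cents).compact.sum
  end
end

batch = ChargeBatch.new([
  FakeCharge.new(:appointment_a, 1_500),
  FakeCharge.new(:appointment_a, 2_500),
  FakeCharge.new(:appointment_b, nil)
])

# Roughly what the calling code in a view would look like:
puts "#{batch.unique_claim_count} claims, #{batch.total_balance_cents} cents"
# prints "2 claims, 4000 cents"
```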

Designing good objects

Good object design is a subjective art. As for me, I like to design objects that are crisp abstractions that represent a “thing”. That’s why the above object is called DownloadedChargeCollection as opposed to DownloadedChargeManager or DownloadAllChargesService or something like that. I think class names that end in “-er” are a code smell because it often means that the object represents a fuzzy or confused idea.

I also like to write objects’ code in a declarative style as opposed to an imperative style. Notice how the method names are unique_claim_count and total_balance (nouns) rather than e.g. get_unique_claim_count and calculate_total_balance (commands). I prefer naming methods for what they return rather than for what they do. This is also why I prefer to name my objects as nouns rather than commands.
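The contrast might be sketched as follows. Both classes are hypothetical illustrations (they are not from the application discussed in this post) that compute the same number in the two styles.

```ruby
# Hypothetical value object used by both versions below.
LineItem = Struct.new(:description, :amount_cents)

# Imperative style: the class is named for a job ("-er") and its method is a
# command. The caller has to remember to run the command before reading.
class InvoiceTotalCalculator
  attr_reader :total_cents

  def initialize(line_items)
    @line_items = line_items
  end

  def calculate_total
    @total_cents = @line_items.sum(&:amount_cents)
  end
end

# Declarative style: the class is named for a thing and its method is named
# for what it returns. There is no ordering for the caller to get wrong.
class Invoice
  def initialize(line_items)
    @line_items = line_items
  end

  def total_cents
    @line_items.sum(&:amount_cents)
  end
end

items = [LineItem.new("Office visit", 2_000), LineItem.new("Lab work", 1_000)]
Invoice.new(items).total_cents
```

With the imperative version, total_cents is nil until calculate_total has been called, which is the kind of hidden sequencing that noun-style methods tend to avoid.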

Coming up with abstractions can feel hard and unnatural if you don’t have much experience with it. One of the biggest abstraction breakthroughs for me was when I realized that an object didn’t have to represent something that already had a name, it could be a new invention. For an example of an object like this, Rails defines an object called ActionController::UnpermittedParameters. That’s not necessarily all that natural of an idea. The idea only exists because someone decided it exists. Similar story with my DownloadedChargeCollection. It’s a totally made-up concept that only exists because I decided it exists, but that doesn’t make it any less valid or useful of an abstraction.

Tactic #2: mixins/concerns

Sometimes I need to add a piece of behavior which is tightly coupled to the attributes of an Active Record model (or other unique capabilities of an Active Record model like scopes) but is highly peripheral to the model. It’s not part of the “essence” of the model.

In those cases a separate object wouldn’t necessarily make a lot of sense due to the tight coupling with the Active Record object. My new object would be chock full of “feature envy” and therefore probably be hard to understand. But I also don’t want to give up and just put the new behavior straight in the Active Record model because that would hurt the understandability of the Active Record model. In cases like this I tend to reach for a mixin or concern.

Below is an example of this. In this case I needed to endow an Appointment model with two scopes and an enum. Putting this code in a concern has two benefits: 1) it keeps the code out of the Appointment model so that I don’t feel like I have to understand this queueing-related code in order to understand Appointment in general and 2) it shows me that my enum and two scopes are all related to one another, something that would have been less clear (probably not clear at all) if I had put the code straight into the Appointment model.

module InReviewAppointmentQueue
  extend ActiveSupport::Concern

  included do
    enum subqueue: %i(clean holding)

    scope :missing_insurance, -> do
      joins(:patient).merge(
        Patient.left_joins(:insurance_accounts).where("insurance_accounts.id is null")
      )
    end

    scope :charge_queue_new, -> do
      where(subqueue: nil) - missing_insurance
    end
  end
end

Reusable concerns vs. one-off concerns

It seems to me that most of the concerns I see other people write have been written for the purpose of extracting common behavior with a goal of DRYing up code. There might be a concern called Archivable, for example, that adds archiving behavior to any class that includes it.

To me, DRYing up code is just one use case for concerns, and not even the main use case. Most of my concerns are only useful to one class. My goal with these concerns isn’t to DRY up my code but rather to hide details and to group related pieces of code together.

Concerns vs. mixins

Sometimes, when I extract peripheral behavior out of a model, I put that behavior into a concern. But at one point I realized that a concern is nothing but a Ruby mixin with a little bit of DSL syntax to make certain things easier, and that sometimes I have no need at all for this DSL syntax. If there’s no need for it, I don’t see why I should use it. So lately I’ve been favoring plain old Ruby mixins over concerns.
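As a rough illustration of the plain-mixin option, here is a sketch that runs without Rails. The BillingStatus module and this pared-down Appointment class are hypothetical. The assumption is that the extracted behavior consists only of instance methods, so nothing like the included block used for scopes and enums is needed.

```ruby
# A plain Ruby mixin: a module of instance methods, no concern DSL.
# It relies on the including class to provide insured? and cancelled?.
module BillingStatus
  def billable?
    insured? && !cancelled?
  end

  def billing_label
    billable? ? "Ready to bill" : "Not billable"
  end
end

# Hypothetical, pared-down stand-in for an Active Record model.
class Appointment
  include BillingStatus

  def initialize(insured:, cancelled:)
    @insured = insured
    @cancelled = cancelled
  end

  def insured?
    @insured
  end

  def cancelled?
    @cancelled
  end
end

Appointment.new(insured: true, cancelled: false).billing_label
# returns "Ready to bill"
```

When the extracted code does need class-level macros such as scope or enum, the included block of a concern is the more convenient tool. When it doesn't, a module like this one does the same grouping job with less machinery.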

I’ve also grown a distaste for the idea of an app/models/concerns directory. I use a lot of namespaces in my application, sometimes nested two deep. If I have a concern that relates to something in the Billing::Eligibility namespace, for example, I’d rather put that concern in app/models/billing/eligibility/my_concern.rb than app/models/concerns/billing/eligibility/my_concern.rb. The latter choice would require me to mirror my whole directory structure inside app/models/concerns and also make it less obvious which model files are related to which concerns. (And again, I also often choose to use a regular old Ruby mixin rather than a concern.)

Criticisms of concerns

You can find a lot of criticisms of concerns online: that concerns are inheritance, and composition is better than inheritance; that concerns don’t necessarily remove dependencies, they just spread them across multiple files; that concerns create circular dependencies; and that when code is spread across multiple files, it can be unclear where methods are defined.

I address all of these criticisms in a separate post called “When used intelligently, Rails concerns are great”. The TL;DR is that concerns can be used either well or poorly, just like any other tool.

A word about service objects

The idea of “service objects” (which means different things depending on who you ask) seems to have grown in popularity in the Rails community in recent years. The most commonly-accepted definition of “service object” seems to be something roughly equivalent to the command pattern.

When service objects are bad

I think the command pattern can have its place. I’ve made use of some small and simple command pattern objects myself. The trouble with service objects (which again is usually the command pattern by a different name, as far as I can tell) is when inexperienced developers reach for service objects reflexively, as a perceived panacea, out of ignorance of the other options available (like regular old OOP abstractions). Not everything is best expressed as a command.

Another problem with service objects, as I’ve already mentioned, is that the term “service object” means different things to different people. When a concept is vague and has multiple meanings, I’d call that a “concept smell”.

When service objects (or rather, the command pattern) can be good

In general I prefer a declarative coding style over an imperative style. But a certain amount of imperative coding is necessary at some point because at some point your declarative code has to get used to actually do something. So maybe you can write 95% of your code in a declarative style but not 100%. The “tip of the pyramid” has to be imperative.

I think service objects—or I think more accurately, the command pattern—can be a decent way to package up the proportion of code in an application that needs to be expressed imperatively. It’s certainly better than stuffing it all in a controller.
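In miniature, that division of labor might look like the sketch below. Reminder and SendReminders are hypothetical names invented for illustration, and the outbox array is an assumed stand-in for whatever side effect the real command would have.

```ruby
# Declarative object: describes what a reminder is. No side effects.
class Reminder
  def initialize(patient_name, time)
    @patient_name = patient_name
    @time = time
  end

  def message
    "Hi #{@patient_name}, your appointment is at #{@time}."
  end
end

# Imperative command object: the thin "tip of the pyramid" that actually
# makes something happen, delegating the details to declarative objects.
class SendReminders
  def initialize(reminders, outbox:)
    @reminders = reminders
    @outbox = outbox
  end

  def perform
    @reminders.each { |reminder| @outbox << reminder.message }
  end
end

outbox = [] # stand-in for an email or SMS gateway
SendReminders.new([Reminder.new("Ada", "9:00")], outbox: outbox).perform
```

The command stays short because it only sequences work. Everything worth reading closely lives in the noun-style object it calls.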

Here’s an example of an imperative object I wrote called RemittanceFileParser. The style is imperative but the object is very simple. Most of the “real” work is pushed down to ElectronicRemittanceAdviceFile, which is written in a declarative style. (ElectronicRemittanceAdviceFile in turn delegates its logic to finer-grained declarative objects.)

class RemittanceFileParser
  attr_reader :results

  def initialize(content: nil, insurance_deposit: nil)
    @content = content
    @results = NiceEDI::ElectronicRemittanceAdviceFile.new(@content).parse
    @insurance_deposit = insurance_deposit
  end

  def perform
    ActiveRecord::Base.transaction do
      @results[:claim_payment_items].each do |claim_payment_item|
        claim_payment_item[:services].each do |service|
          save_insurance_payment!(
            service: service,
            claim_payment_item: claim_payment_item,
            remittance_amount: @results[:remittance_amount]
          )
        end
      end
    end
  end

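  # Converts a decimal string such as "12.34" to integer cents (1234).
  # Rational arithmetic avoids floating-point rounding errors.
  def string_to_cents(value)
    (value.to_r * 100).to_i
  end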

  def save_insurance_payment!(service:, claim_payment_item:, remittance_amount:)
    @insurance_deposit.insurance_payments.create!(
      service_amount_cents: string_to_cents(service[:service_amount]),
      date_of_service: service[:date_of_service],
      cpt_code_freeform: service[:cpt_code],
      npi_code: claim_payment_item[:npi_code],
      patient_control_number: claim_payment_item[:patient_control_number],
      patient_first_name: claim_payment_item[:patient_first_name],
      patient_last_name: claim_payment_item[:patient_last_name],
      ma18_code_present: claim_payment_item[:ma18_code_present],
      medicare_secondary_payer_name: claim_payment_item[:medicare_secondary_payer_name],
      remittance_amount_cents: string_to_cents(remittance_amount)
    )
  end
end

So I think the takeaway here is: it’s fine and even necessary to use imperative code, but putting all your model code into imperative-style objects as a way of life is probably a mistake. I find it better, to the extent possible, to express my domain concepts as small, declarative objects.

The liberating realization that “you’re on your own”

Coding without frameworks

Imagine that you want to write a command-line Ruby program (no framework) that simulates, say, the Wheel of Fortune game show. As you add more and more code, you realize that Wheel of Fortune is actually pretty complicated, and it takes a lot of code to replicate it.

Because there’s a lot of logic, you wouldn’t be able to just put all your code in one big procedural file. You’d create a confusing mess for yourself pretty quickly if you did. You’d have to impose some structure somehow. And because you’re not using a framework, you’d have to come up with that structure yourself.

How would you structure this code? For me, I’d use the principles of object-oriented programming. I would compose my program of small, crisply-defined, declarative objects. If the program got big enough, I’d create some namespaces to make it easier to see what’s related to what. I might make use of mixins as well. But more than anything else, I’d use objects and OOP principles.
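One such object might look like the sketch below. It is only an assumption about how a small slice of the game could be modeled: the WheelOfFortune namespace, the Puzzle class and its method names are all invented for illustration.

```ruby
module WheelOfFortune
  # A small declarative object: a phrase plus the letters guessed so far.
  class Puzzle
    def initialize(phrase, guessed_letters = [])
      @phrase = phrase.upcase
      @guessed_letters = guessed_letters.map(&:upcase)
    end

    # Returns a new puzzle with one more guessed letter; nothing is mutated.
    def with_guess(letter)
      Puzzle.new(@phrase, @guessed_letters + [letter])
    end

    def occurrences_of(letter)
      @phrase.count(letter.upcase)
    end

    # The board as contestants see it: unguessed letters shown as "_".
    def board
      @phrase.chars.map { |char| hidden?(char) ? "_" : char }.join
    end

    def solved?
      !board.include?("_")
    end

    private

    # Only letters can be hidden; spaces and punctuation always show.
    def hidden?(char)
      char.match?(/[A-Z]/) && !@guessed_letters.include?(char)
    end
  end
end

puzzle = WheelOfFortune::Puzzle.new("Ruby on Rails").with_guess("r")
puzzle.board # returns "R___ __ R____"
```

Every method is named for what it returns, and the namespace leaves room for sibling objects (a wheel, a contestant, a round) as the program grows.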

Tiny Rails apps

Now imagine a tiny Rails app that just has a few CRUD interfaces. Unlike our Wheel of Fortune Ruby program which is 0% framework code, this Rails app would be almost 100% framework code. You wouldn’t need to make a single design decision. You’d just need to run rails g scaffold a couple times and maybe add a couple lines for associations.

Large Rails apps

Lastly, imagine a huge Rails app with a lot of complicated domain logic. With an app like this, Rails can only help you so much. Rails can help abstract away common jobs like handling HTTP requests and talking to the database but the framework can’t help with the singularly unique domain logic of your application. No framework ever could.

In order to keep your domain logic organized and sufficiently easy to understand, you of course need not just tools but skills. You’re past the point where Rails can help you structure your code and so you need to impose the structure yourself.

Learning design skills (like the principles of OOP for example) is of course not easy. You’re never done learning. But hopefully the realization that design skills, not Rails, are the key to building maintainable Rails apps is a useful one. It was for me.

Takeaways

  • The “bloated Active Record model” problem often arises when programmers follow the “skinny controllers, fat models” principle and allow all the domain logic to accumulate in Active Record models.
  • A “model” doesn’t have to mean an Active Record model, but can be any piece of code that models a concept.
  • A couple good ways to organize model code are to use objects and mixins.
  • Using service objects isn’t always necessarily a bad idea, but reflexively using service objects out of ignorance of other code organization options probably is.
  • Once a Rails application grows beyond a certain size, you can no longer rely on Rails itself to help keep your design sound but must rely on your own design skills.
