
What to do about bloated Rails Active Record models

Overview

It’s a common problem in Rails apps for Active Record models to get bloated as an application grows. This was a problem that I personally struggled with for a number of years. As I’ve gained experience, I’ve figured out a couple of tactics for addressing this problem, tactics that I feel have worked well. I’ll share them in this post.

First let’s talk about how and why this problem arises.

How bloated Active Record models arise

Some background on the Active Record pattern

Many Rails developers might not know that the Active Record pattern existed before Rails. The pattern was named by Martin Fowler in Patterns of Enterprise Application Architecture. The idea behind Active Record (in broad strokes) is that a database table has a corresponding class. For example, a patients table would have a Patient class. An instance of the class represents one row of data for the table. The class is endowed with capabilities like saving an object to the database as a new row, updating an existing row, finding records, and so on.
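To make the pattern concrete, here is a toy sketch in plain Ruby. It is not how Rails implements Active Record: the "table" is an in-memory Hash, and the names `write_row` and `count` are made up for illustration. The point is only that one class stands for one table and one instance stands for one row.

```ruby
# Toy Active Record: the "table" is an in-memory Hash, not a real database.
class Patient
  attr_reader :id
  attr_accessor :name

  @rows = {}     # stand-in for the patients table: id => row
  @next_id = 1

  class << self
    # Find one row by id and wrap it in an object (nil if there is no such row).
    def find(id)
      row = @rows[id]
      row && new(id: id, **row)
    end

    # Insert a row (when id is nil) or overwrite an existing one; returns the id.
    def write_row(id, row)
      if id.nil?
        id = @next_id
        @next_id += 1
      end
      @rows[id] = row
      id
    end

    def count
      @rows.size
    end
  end

  def initialize(name:, id: nil)
    @name = name
    @id = id
  end

  # Save this object as a new row, or update the row it already represents.
  def save
    @id = self.class.write_row(@id, { name: name })
    true
  end
end

patient = Patient.new(name: "Ada")
patient.save                        # inserts a new row
patient.name = "Ada Lovelace"
patient.save                        # updates that same row
puts Patient.find(patient.id).name  # prints "Ada Lovelace"
puts Patient.count                  # prints 1
```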

Rails’ specific implementation of the Active Record pattern is a big part of what gives Rails developers the ability to create so much functionality with so little code. Unfortunately, the Active Record pattern is also somewhat ripe for abuse by a programmer who doesn’t have much experience writing structured code outside of frameworks.

An easy Active Record trap to fall into

There’s a common problem in Rails projects which can perhaps be summarized like this: Rails developers want to keep controllers from getting bloated (rightly), so they push as much domain logic as possible down to the Active Record models.

But because the Active Record models constitute a limited number of “buckets”, the domain logic code accumulates in the Active Record models. If behavior is added to an application ten times faster than database tables are added, then all that behavior piles up in a comparatively small number of Active Record classes. The bloat problem isn’t solved, it’s just moved from the controller layer to the model layer.

To be clear, this isn’t a weakness of the Active Record pattern. The root cause of the problem is inexperience. Many Rails developers aren’t aware of other code organization strategies outside of Active Record.

Here are two ways I combat this “bloated Active Record model” problem.

Tactic #1: objects

When I need to add a piece of behavior to an application, I often consider adding it straight to the Active Record model, especially early in the lifecycle of the application. But I tend to also consider adding the behavior as a new object.

Below is an example of when I found a piece of behavior to make sense as a separate object rather than as part of an Active Record model. In the application I work on at work, which is a medical application, we periodically download a file containing all the insurance charges we’ve accumulated since the last time we downloaded charges. At one point we had a new need, a way to see a list of all the past batches of charges that had been downloaded. And for each item, we wanted to see the total dollar amount for that batch as well as the unique insurance claim count for that batch.

I of course could have put the code for this feature in the Charge Active Record object. But this new feature was so peripheral that I didn’t want to clutter up Charge with the code for it. So I conceived of a new object called DownloadedChargeCollection which has methods total_balance and unique_claim_count. Here’s what that object looks like.

module Billing
  class DownloadedChargeCollection
    attr_reader :file_download

    def initialize(file_download)
      @file_download = file_download
    end

    def unique_claim_count
      charges.map(&:appointment).uniq.count
    end

    def total_balance
      Money.new(balances.sum)
    end

    def self.from(file_downloads)
      file_downloads.map { |fd| new(fd) }
    end

    private

    def charges
      Charge.where(file_download: @file_download).includes(
        :insurance_payments,
        :appointment
      )
    end

    def balances
      charges.map(&:balance_cents).compact
    end
  end
end

The way this object is used is that, in the view, we do collection.unique_claim_count and collection.total_balance. This makes the view code very easily understandable. Then if we want to dig into the details of DownloadedChargeCollection, that code is pretty understandable as well because the object is pretty small.
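For readers who want to try the idea outside of Rails, here is a minimal plain-Ruby sketch. `ChargeBatch` and the `Charge` struct are hypothetical stand-ins (no database, no Money gem), and the sample numbers are invented. The shape is the same, though: the caller just asks a small object for the values it needs.

```ruby
# Plain-Ruby stand-ins so the sketch runs without Rails or a database.
Charge = Struct.new(:appointment_id, :balance_cents, keyword_init: true)

# Hypothetical, simplified cousin of DownloadedChargeCollection.
class ChargeBatch
  def initialize(charges)
    @charges = charges
  end

  # Several charges can belong to one appointment, so count distinct appointments.
  def unique_claim_count
    @charges.map(&:appointment_id).uniq.count
  end

  # Charges without a balance are skipped rather than treated as an error.
  def total_balance_cents
    @charges.map(&:balance_cents).compact.sum
  end
end

batches = [
  ChargeBatch.new([
    Charge.new(appointment_id: 1, balance_cents: 5000),
    Charge.new(appointment_id: 1, balance_cents: 2500),
    Charge.new(appointment_id: 2, balance_cents: nil),
    Charge.new(appointment_id: 3, balance_cents: 1000)
  ])
]

# Roughly what the view does: ask each collection for the values it needs.
batches.each do |batch|
  puts format("%d claims, $%.2f", batch.unique_claim_count, batch.total_balance_cents / 100.0)
end
# prints "3 claims, $85.00"
```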

Designing good objects

Good object design is a subjective art. As for me, I like to design objects that are crisp abstractions that represent a “thing”. That’s why the above object is called DownloadedChargeCollection as opposed to DownloadedChargeManager or DownloadAllChargesService or something like that. I think class names that end in “-er” are a code smell because they often mean that the object represents a fuzzy or confused idea.

I also like to write objects’ code in a declarative style as opposed to an imperative style. Notice how the method names are unique_claim_count and total_balance (nouns) rather than e.g. get_unique_claim_count and calculate_total_balance (commands). I prefer naming methods for what they return rather than for what they do. This is also why I prefer to name my objects as nouns rather than commands.

Coming up with abstractions can feel hard and unnatural if you don’t have much experience with it. One of the biggest abstraction breakthroughs for me was when I realized that an object didn’t have to represent something that already had a name, it could be a new invention. For an example of an object like this, Rails defines an object called ActionController::UnpermittedParameters. That’s not necessarily all that natural of an idea. The idea only exists because someone decided it exists. Similar story with my DownloadedChargeCollection. It’s a totally made-up concept that only exists because I decided it exists, but that doesn’t make it any less valid or useful of an abstraction.

Tactic #2: mixins/concerns

Sometimes I need to add a piece of behavior which is tightly coupled to the attributes of an Active Record model (or other unique capabilities of an Active Record model like scopes) but is highly peripheral to the model. It’s not part of the “essence” of the model.

In those cases a separate object wouldn’t necessarily make a lot of sense due to the tight coupling with the Active Record object. My new object would be chock full of “feature envy” and therefore probably be hard to understand. But I also don’t want to give up and just put the new behavior straight in the Active Record model because that would hurt the understandability of the Active Record model. In cases like this I tend to reach for a mixin or concern.

Below is an example of this. In this case I needed to endow an Appointment model with two scopes and an enum. Putting this code in a concern has two benefits: 1) it keeps the code out of the Appointment model so that I don’t feel like I have to understand this queueing-related code in order to understand Appointment in general and 2) it shows me that my enum and two scopes are all related to one another, something that would have been less clear (probably not clear at all) if I had put the code straight into the Appointment model.

module InReviewAppointmentQueue
  extend ActiveSupport::Concern

  included do
    enum subqueue: %i(clean holding)

    scope :missing_insurance, -> do
      joins(:patient).merge(
        Patient.left_joins(:insurance_accounts).where("insurance_accounts.id is null")
      )
    end

    scope :charge_queue_new, -> do
      where(subqueue: nil) - missing_insurance
    end
  end
end

Reusable concerns vs. one-off concerns

It seems to me that most of the concerns I see other people write have been written for the purpose of extracting common behavior with a goal of DRYing up code. There might be a concern called Archivable, for example, that adds archiving behavior to any class that includes it.

To me, DRYing up code is just one use case for concerns, and not even the main use case. Most of my concerns are only useful to one class. My goal with these concerns isn’t to DRY up my code but rather to hide details and to group related pieces of code together.

Concerns vs. mixins

Sometimes, when I extract peripheral behavior out of a model, I put that behavior into a concern. But at one point I realized that a concern is nothing but a Ruby mixin with a little bit of DSL syntax to make certain things easier, and that sometimes I have no need at all for this DSL syntax. If there’s no need for it, I don’t see why I should use it. So lately I’ve been favoring plain old Ruby mixins over concerns.
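As a rough illustration of a mixin that needs no concern DSL, here is a plain-Ruby sketch. The module name, its methods, and the simplified `Appointment` class are all hypothetical. Because the module only adds instance methods, a bare `include` is enough and nothing from ActiveSupport is involved.

```ruby
# Hypothetical one-off mixin: peripheral, queue-related behavior for one class.
module AppointmentQueueStatus
  # An appointment is in the "new" charge queue when it has no subqueue
  # and the patient has insurance on file.
  def in_charge_queue?
    subqueue.nil? && insured?
  end

  def queue_label
    return "new" if in_charge_queue?

    (subqueue || "missing insurance").to_s
  end
end

# Simplified stand-in for the Active Record model (no database here).
class Appointment
  include AppointmentQueueStatus

  attr_reader :subqueue

  def initialize(subqueue: nil, insured: true)
    @subqueue = subqueue
    @insured = insured
  end

  def insured?
    @insured
  end
end

puts Appointment.new.queue_label                      # prints "new"
puts Appointment.new(subqueue: :holding).queue_label  # prints "holding"
puts Appointment.new(insured: false).queue_label      # prints "missing insurance"
```

A concern earns its keep when the module has to run class-level code (scopes, enums, callbacks) at include time. When it doesn't, the plain module above does the same job.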

I’ve also grown a distaste for the idea of an app/models/concerns directory. I use a lot of namespaces in my application, sometimes nested two deep. If I have a concern that relates to something in the Billing::Eligibility namespace, for example, I’d rather put that concern in app/models/billing/eligibility/my_concern.rb than app/models/concerns/billing/eligibility/my_concern.rb. The latter choice would require me to mirror my whole directory structure inside app/models/concerns and also make it less obvious which model files are related to which concerns. (And again, I also often choose to use a regular old Ruby mixin rather than a concern.)

Criticisms of concerns

You can find a lot of criticisms of concerns online: that concerns are inheritance, and composition is better than inheritance; that concerns don’t necessarily remove dependencies, they just spread them across multiple files; that concerns create circular dependencies; and that when code is spread across multiple files, it can be unclear where methods are defined.

I address all of these criticisms in a separate post called “When used intelligently, Rails concerns are great”. The TL;DR is that concerns can be used either well or poorly, just like any other tool.

A word about service objects

The idea of “service objects” (which means different things depending on who you ask) seems to have grown in popularity in the Rails community in recent years. The most commonly-accepted definition of “service object” seems to be something roughly equivalent to the command pattern.

When service objects are bad

I think the command pattern can have its place. I’ve made use of some small and simple command pattern objects myself. The trouble with service objects (which again is usually the command pattern by a different name, as far as I can tell) is when inexperienced developers reach for service objects reflexively, as a perceived panacea, out of ignorance of the other options available (like regular old OOP abstractions). Not everything is best expressed as a command.

Another problem with service objects, as I’ve already mentioned, is that the term “service object” means different things to different people. When a concept is vague and has multiple meanings, I’d call that a “concept smell”.

When service objects (or rather, the command pattern) can be good

In general I prefer a declarative coding style over an imperative style. But a certain amount of imperative coding is necessary at some point because at some point your declarative code has to get used to actually do something. So maybe you can write 95% of your code in a declarative style but not 100%. The “tip of the pyramid” has to be imperative.

I think service objects—or I think more accurately, the command pattern—can be a decent way to package up the proportion of code in an application that needs to be expressed imperatively. It’s certainly better than stuffing it all in a controller.

Here’s an example of an imperative object I wrote called RemittanceFileParser. The style is imperative but the object is very simple. Most of the “real” work is pushed down to ElectronicRemittanceAdviceFile, which is written in a declarative style. (ElectronicRemittanceAdviceFile in turn delegates its logic to finer-grained declarative objects.)

class RemittanceFileParser
  attr_reader :results

  def initialize(content: nil, insurance_deposit: nil)
    @content = content
    @results = NiceEDI::ElectronicRemittanceAdviceFile.new(@content).parse
    @insurance_deposit = insurance_deposit
  end

  def perform
    ActiveRecord::Base.transaction do
      @results[:claim_payment_items].each do |claim_payment_item|
        claim_payment_item[:services].each do |service|
          save_insurance_payment!(
            service: service,
            claim_payment_item: claim_payment_item,
            remittance_amount: @results[:remittance_amount]
          )
        end
      end
    end
  end

  def string_to_cents(value)
    (value.to_r * 100).to_i
  end

  def save_insurance_payment!(service:, claim_payment_item:, remittance_amount:)
    @insurance_deposit.insurance_payments.create!(
      service_amount_cents: string_to_cents(service[:service_amount]),
      date_of_service: service[:date_of_service],
      cpt_code_freeform: service[:cpt_code],
      npi_code: claim_payment_item[:npi_code],
      patient_control_number: claim_payment_item[:patient_control_number],
      patient_first_name: claim_payment_item[:patient_first_name],
      patient_last_name: claim_payment_item[:patient_last_name],
      ma18_code_present: claim_payment_item[:ma18_code_present],
      medicare_secondary_payer_name: claim_payment_item[:medicare_secondary_payer_name],
      remittance_amount_cents: string_to_cents(remittance_amount)
    )
  end
end
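A side note on string_to_cents: it converts through a Rational (to_r) rather than a Float, which keeps the arithmetic exact for decimal strings. The standalone sketch below repeats the method outside the class so it can be run on its own; the sample amounts are made up.

```ruby
# Same conversion as in RemittanceFileParser, lifted out so it runs standalone.
def string_to_cents(value)
  (value.to_r * 100).to_i
end

amount = "19.99"

# "19.99".to_r is exactly 1999/100, so multiplying by 100 gives exactly 1999.
puts string_to_cents(amount)  # prints 1999

# A Float cannot represent most decimal fractions exactly, so the product can
# land just below the intended value and to_i would then truncate it.
puts((amount.to_f * 100).to_i)

# to_i truncates toward zero, so fractions of a cent are dropped, not rounded.
puts string_to_cents("10.005")  # prints 1000
```

One thing to be aware of: String#to_r returns zero for text it can't parse, so a malformed amount would silently become 0 cents rather than raise.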

So I think the takeaway here is: it’s fine and even necessary to use imperative code, but putting all your model code into imperative-style objects as a way of life is probably a mistake. I find it better, to the extent possible, to express my domain concepts as small, declarative objects.
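Here is a small plain-Ruby sketch of that split, with every name invented for illustration (it is not code from the application described above). `RemittanceLine` is declarative: it only answers questions about a piece of data. `RecordRemittanceLines` is the thin imperative layer that actually makes something happen, with an Array standing in for the database.

```ruby
# Declarative: describes what a remittance line is. Nothing here has side effects.
class RemittanceLine
  def initialize(raw)
    @raw = raw
  end

  def patient_name
    "#{@raw[:first_name]} #{@raw[:last_name]}"
  end

  def amount_cents
    (@raw[:amount].to_r * 100).to_i
  end
end

# Imperative: the thin "tip of the pyramid" that performs the work.
class RecordRemittanceLines
  def initialize(raw_lines, ledger)
    @lines = raw_lines.map { |raw| RemittanceLine.new(raw) }
    @ledger = ledger  # any object that responds to <<; an Array in this sketch
  end

  def perform
    @lines.each do |line|
      @ledger << { patient: line.patient_name, cents: line.amount_cents }
    end
  end
end

ledger = []
RecordRemittanceLines.new(
  [{ first_name: "Ada", last_name: "Lovelace", amount: "12.50" }],
  ledger
).perform

puts ledger.first[:patient]  # prints "Ada Lovelace"
puts ledger.first[:cents]    # prints 1250
```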

The liberating realization that “you’re on your own”

Coding without frameworks

Imagine that you want to write a command-line Ruby program (no framework) that simulates, say, the Wheel of Fortune game show. As you add more and more code, you realize that Wheel of Fortune is actually pretty complicated, and it takes a lot of code to replicate it.

Because there’s a lot of logic, you wouldn’t be able to just put all your code in one big procedural file. You’d create a confusing mess for yourself pretty quickly if you did. You’d have to impose some structure somehow. And because you’re not using a framework, you’d have to come up with that structure yourself.

How would you structure this code? For me, I’d use the principles of object-oriented programming. I would compose my program of small, crisply-defined, declarative objects. If the program got big enough, I’d create some namespaces to make it easier to see what’s related to what. I might make use of mixins as well. But more than anything else, I’d use objects and OOP principles.
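As one possible starting point (a sketch only, with object names and wedge values I made up), the game might begin with a couple of small objects like these. Each one is named for a thing, and its methods are named for what they return.

```ruby
# Hypothetical objects for a command-line Wheel of Fortune.
class Puzzle
  def initialize(phrase)
    @phrase = phrase.upcase
    @guessed = []
  end

  # Record a guessed letter and return how many times it appears in the phrase.
  def guess(letter)
    letter = letter.upcase
    @guessed << letter unless @guessed.include?(letter)
    @phrase.count(letter)
  end

  # The board as contestants see it: letters not yet guessed show as "_".
  def board
    @phrase.chars.map { |char| hidden?(char) ? "_" : char }.join
  end

  def solved?
    !board.include?("_")
  end

  private

  def hidden?(char)
    char.match?(/[A-Z]/) && !@guessed.include?(char)
  end
end

class Wheel
  WEDGES = [500, 600, 700, :bankrupt, :lose_a_turn].freeze

  # The random source is injectable so a spin can be made repeatable in tests.
  def initialize(random: Random.new)
    @random = random
  end

  def spin
    WEDGES.sample(random: @random)
  end
end

puzzle = Puzzle.new("Ruby on Rails")
puts puzzle.guess("r")  # prints 2
puts puzzle.board       # prints "R___ __ R____"
puts puzzle.solved?     # prints false
```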

Tiny Rails apps

Now imagine a tiny Rails app that just has a few CRUD interfaces. Unlike our Wheel of Fortune Ruby program which is 0% framework code, this Rails app would be almost 100% framework code. You wouldn’t need to make a single design decision. You’d just need to run rails g scaffold a couple times and maybe add a couple lines for associations.

Large Rails apps

Lastly, imagine a huge Rails app with a lot of complicated domain logic. With an app like this, Rails can only help you so much. Rails can help abstract away common jobs like handling HTTP requests and talking to the database but the framework can’t help with the singularly unique domain logic of your application. No framework ever could.

In order to keep your domain logic organized and sufficiently easy to understand, you of course need not just tools but skills. You’re past the point where Rails can help you structure your code and so you need to impose the structure yourself.

Learning design skills (like the principles of OOP for example) is of course not easy. You’re never done learning. But hopefully the realization that design skills, not Rails, are the key to building maintainable Rails apps is a useful one. It was for me.

Takeaways

  • The “bloated Active Record model” problem often arises when programmers follow the “skinny controllers, fat models” principle and allow all the domain logic to accumulate in Active Record models.
  • A “model” doesn’t have to mean an Active Record model, but can be any piece of code that models a concept.
  • A couple good ways to organize model code are to use objects and mixins.
  • Using service objects isn’t always necessarily a bad idea, but reflexively using service objects out of ignorance of other code organization options probably is.
  • Once a Rails application grows beyond a certain size, you can no longer rely on Rails itself to help keep your design sound but must rely on your own design skills.

How I keep my Rails controllers organized

The problem to be solved

As a Rails application grows, its controllers tend to accumulate actions beyond the seven RESTful actions (index, show, new, edit, create, update and destroy). The more “custom” actions there are, the harder it can be to understand and work with the controller.

Here are three tactics I use to keep my Rails controllers organized.

First, a note about “skinny controllers, fat models”

The concept of “skinny controllers, fat models” is well-known in the Rails community at this point. For the sake of people who are new to Rails, I want to mention that one good way to keep controllers small is to put complex business logic in models rather than controllers. For more on this topic, I might suggest my posts “What is a Rails model?” and (since service objects are a common but I think misguided recommendation) “How I code without service objects”, as well as the original skinny controllers/fat models post by Jamis Buck.

But even if you put as much business logic as possible in models rather than controllers, you’re still left with some challenges regarding overly large controllers. Here are the main three tactics I use to address these challenges.

Tactic 1: extracting a “hidden resource”

Sometimes, when a controller collects too many custom actions, it’s a sign that there’s some “hidden resource” that’s waiting to be identified and abstracted.

For example, in the application that I work on at work, we have an Active Record model called Message which exists so that internal employees who use the application can message each other. At one point we added the concept of a PTO request, which under the hood is really just a Message, but created through a different UI than regular messages.

We could have put these PTO-request-related actions right inside the main MessagesController but that would have made MessagesController too big and muddy. Instead of just being about regular messages, MessagesController would contain some code about regular messages, some code about PTO requests, and some code that relates to both things. So we didn’t want to do that.

Instead, we created a separate controller called PTORequestsController. Even though we decided to have a resource called a PTO request, we didn’t create a separate Active Record model for that. The PTORequestsController just uses the Message model. Here’s what the controller looks like.

module Messaging
  class PTORequestsController < ApplicationController
    def new
      @message = Message.new
    end

    def create
      # Build the body from the submitted params; @message is not assigned yet at this point.
      @message = Message.new(
        message_params.merge(
          sender: current_user,
          recipients: [User.practice_administrator],
          body: "PTO request from #{current_user.first_name} #{current_user.last_name}\n\n#{message_params[:body]}"
        )
      )

      if @message.save
        @message.send_any_email_notifications(@message)
        redirect_to submitted_messaging_pto_requests_path
      else
        render :new
      end
    end

    def submitted
    end

    private

    def message_params
      params.require(:messaging_message).permit(:body)
    end
  end
end

Sometimes, like in this case, the “hidden resource” is pretty obvious from the start and so the original controller can just be left untouched. Other times (and probably more commonly), the original controller slowly grows over time and then a “last straw” moment prompts us to identify a hidden resource and move that resource to a new controller. Sometimes it’s easy to identify a hidden resource and sometimes it’s not.

Tactic 2: same resource, different “lenses”

To review the “hidden resource” example, we had one resource (messages) and then we added a new resource (PTO requests). Making the distinction between “regular messages” and PTO requests allowed us to think about and work with the two resources separately and to keep their code in separate places, lowering the overall cognitive cost of the code.

This second tactic applies to a different scenario. Sometimes we don’t want to have a different resource but rather we want to treat the same resource differently. I’ll give another example from the application that I work on at work.

In this application, which is an application for running a medical clinic, we have the concept of an appointment, which can have different meanings in different contexts. For example, in a scheduling context, we care about what time the appointment is for. In a clinical context, we care about the notes the doctor makes regarding the patient’s condition. In a billing context, we care about the charges associated with the appointment.

Early in the application’s life, it was okay for there just to be one single AppointmentsController. But over time AppointmentsController started to get cluttered and harder to understand. So we added a couple new controllers, Billing::AppointmentsController and Chart::AppointmentsController, so that each of these concerns could be dealt with separately. As I’m writing this post, I even realize that it would probably be smart for us to rename AppointmentsController to Schedule::AppointmentsController because almost everything that’s in AppointmentsController is related to scheduling.

Unlike the case of messages and PTO requests, the idea here is not to come up with a new resource but rather to look at the same resource through different lenses. There’s no separate model called Billing::Appointment or Chart::Appointment. The benefit comes from being able to have separate places to deal with separate contexts for the same model.
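The shape of this idea can be sketched in plain Ruby. Everything below is a stand-in so the example runs outside Rails: the empty ApplicationController, the Appointment struct, and especially the `show` methods that take an argument (real Rails actions take none and would load the record from params). What carries over is the structure: one model, and one namespaced controller per context.

```ruby
# Stand-ins so the sketch runs outside Rails.
class ApplicationController; end
Appointment = Struct.new(:starts_at, :charges_cents, keyword_init: true)

module Schedule
  # Scheduling lens: cares about when the appointment is.
  class AppointmentsController < ApplicationController
    def show(appointment)
      "Starts at #{appointment.starts_at}"
    end
  end
end

module Billing
  # Billing lens: cares about the charges on the very same appointment.
  class AppointmentsController < ApplicationController
    def show(appointment)
      format("Charges: $%.2f", appointment.charges_cents.sum / 100.0)
    end
  end
end

appointment = Appointment.new(starts_at: "2:30pm", charges_cents: [5000, 2500])
puts Schedule::AppointmentsController.new.show(appointment)  # prints "Starts at 2:30pm"
puts Billing::AppointmentsController.new.show(appointment)   # prints "Charges: $75.00"
```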

Tactic 3: dealing with collections separately

This one is a simple one but useful and common enough to be worth mentioning. Sometimes I end up with controllers with actions like bulk_create, bulk_update, etc. in addition to the regular create and update actions. In this case I often create a “collections” controller.

For example, in my application I have a Billing::InsurancePaymentsController and also a Billing::InsurancePaymentCollectionsController. Here’s what the latter controller looks like.

module Billing
  class InsurancePaymentCollectionsController < ApplicationController
    before_action { authorize %i(billing insurance_payment) }

    def create
      @insurance_deposit = InsuranceDeposit.find_by(id: params[:insurance_deposit_id])

      if params[:file].present?
        RemittanceFileParser.new(
          content: params[:file].read,
          insurance_deposit: @insurance_deposit
        ).perform

        redirect_to new_billing_insurance_deposit_insurance_deposit_reconciliation_path(
          insurance_deposit_id: @insurance_deposit.id
        )
      else
        redirect_to request.referer
      end
    end

    def destroy
      InsurancePayment.where(id: params[:ids]).destroy_all
      redirect_to request.referer
    end
  end
end

Takeaways

  • Controllers often have a tendency to grow over time. When they do, they usually become hard to understand.
  • It’s helpful to put as much business logic as possible in models rather than controllers.
  • A large controller can sometimes be made smaller by extracting a “hidden resource” that uses the same Active Record model but clothed in a different idea.
  • Another way that large controllers can sometimes be broken up is to think about looking at the same resource through different lenses.
  • It can also sometimes be helpful to deal with “bulk actions” in a separate controller.

How to have a productive programming day

Why productivity is desirable

The way I look at productivity is this: if I have to be at work all day, I might as well get as much done in that chunk of time as I can.

It’s not about running myself ragged. It’s not about “hustling”. It’s simply about not being wasteful with my time.

In fact, my goal is to accomplish more than average developers do, but with less time and effort.

Why bother being productive? Because the more work I’m able to get done, the faster I’ll be able to learn, the smarter I’ll get, the better my job will be, the better my career will be, the more money I’ll earn, and so on. And if I can achieve all that at the same time as reducing effort, then it’s an obvious win.

Here are some high-level productivity tips I’ve learned over the years.

Start your day early

The most helpful productivity tip I’ve learned in life sounds dumb but it actually works: get up early and go to bed early. Ben Franklin was right about that one.

I’m not sure why this tactic works. One hypothesis I have is that if you start your day earlier than other people, then you’re starting your day before other people start interrupting you.

In any case, I don’t care why it works, I just care that it works.

Start real work right away

The first hour of the day has been called the rudder of the day. I’ve found that if my first hour is a productive one, then the rest of the day will be productive also, all the way through. Conversely, if my first hour is lazy and unfocused, the entire rest of the day will be lazy and unfocused as well.

Don’t start your day by checking email, reading news, scrolling Twitter, or anything like that. These activities can cause a state that Mihaly Csikszentmihalyi calls “psychic entropy”. In other words, those activities scramble your brain.

Instead, start doing real work, immediately when you sit down at your computer. It sets a good tone for the day.

Start with something highly tractable

The more ambiguous or open-ended a task is, the harder it will be to get straight to work on that task, and the harder it will be to get that first hour of solid productivity under your belt. So don’t start with a task like that. Start with a task that’s sharply defined.

This is actually not necessarily very easy. In order to have a sharply defined task, someone (possibly you) has to have done some work in advance. In the best case, you have a fine-grained to-do list laid out, which originated from a small and crisp user story, which is part of a clearly-defined project.

We’ll talk more in a moment about how you can ensure that you always have highly tractable work to start your day with. First, a few more “meta” tips.

When possible, exercise before work

On the days when I exercise before work, I often feel sharper and more alert than when I don’t. I don’t drink coffee on a daily basis anymore, but when I used to, I would feel less of a need for coffee in the morning on the days when I exercised beforehand.

Since I hate wasting time, I don’t like to go to the gym, which seems like a really time-inefficient way to exercise. Instead I like to ride my bike to the office. I also have some dumbbells in my office so I can lift weights throughout the day. Obviously, you can do whatever type of exercise works for you.

Don’t eat too much

Eating too much can be an energy-killer. I’ve found that eating too much can make me tired, lazy, and just put me in a worse mood. And of course, eating too much habitually can make you fat. That’s obviously bad.

I used to go out to eat with co-workers for lunch almost every day. Then, around 2pm, the greasy tacos or burger and fries I ate for lunch would catch up to me and I would get so sleepy that I would want to die.

Later in life I started packing a lunch instead of going out to eat. Restaurants always give you too much food. Homemade food is usually not as bad for you as restaurant food. I didn’t feel as bad after a packed lunch as I used to from restaurant food, although I would still get tired a lot, especially if my lunch happened to be leftovers from a heavy dinner like meatloaf. I typically felt pretty good after lunch if I ate a salad instead.

Here’s what I do today. You might think this is crazy, but it actually works out really well for me. I just don’t eat anything until dinnertime. In other words, I skip both breakfast and lunch. I’ve found that I don’t get nearly as hungry as I would expect. Ironically, I find myself much less distracted by hunger throughout the day than I was when I ate lunch. I also feel much more alert than when I used to eat food during the day. Plus, as you might expect, I’ve lost some weight as an added bonus.

Don’t drink too much caffeine

With caffeine, the productivity highs are higher but the lows are lower, at least for me. I personally seem to be especially susceptible to the lows of caffeine. When I switched from coffee to tea (black tea, which still contains caffeine but much less than coffee), I noticed that I slept better and felt better during the day. As a result of the fact that I’m not jacked up half the time and lethargic half the time, I feel like I’m smarter on average than when I used to ride the caffeine rollercoaster.

In my experience, the biggest drawback of caffeine is that it negatively affects my sleep.

When possible, keep email, Slack, Twitter, etc. closed

Distractions and interruptions are obviously bad for productivity. Keep these things closed when you can. Despite how obvious this advice sounds, my perception is that a lot of people don’t follow it.

Keep your browser tabs to a minimum

Each browser tab you have open has costs. First, a browser tab costs attention. When you have a tab open, you’re assigning a little bit of “mental RAM” to that tab. That’s a little bit of precious mental RAM that can’t be used for something else, something more useful.

A browser tab often also costs time. I wish I had a dollar for every time I was sitting with a student or co-worker and they clicked through their various tabs, trying to find the tab they were interested in. So wasteful.

Instead of keeping a bunch of browser tabs open on the off chance that you’ll need to get back to their contents at some point, just close them. The net cost of re-finding any content you need later is way less than the net cost of always keeping a bunch of browser tabs open.

Work on one thing at a time

The fastest way to get a bunch of things done is to work on one thing at a time.

When you pause task A in order to work on task B, you’re giving yourself an opportunity to forget the details of task A. Then, when you resume task A, you have to refamiliarize yourself with the details of that task. That’s a waste. You could have just loaded those details into your head once instead of twice.

Of course, it’s not always possible to keep on a task until it’s all the way done. Sometimes you have to pause to wait for feedback or for a long-running command, for example.

In these situations I’ve found a way to mitigate the context-switching costs. If I have to switch to a different task, then instead of switching to a different programming task, I’ll work on something that’s entirely different in nature. For example, if I expect to have to wait just a few minutes or up to an hour, I might use that time to read a programming book or work on a blog post. If the task I switch to is totally different, it doesn’t compete for headspace the way a similar task would.

If for some reason I have to set down a task for a few hours or more, then I usually just suck it up and switch to a different programming task. But that’s a last resort, not a Plan A. And the need to switch tasks can be minimized by using good development practices like small, crisp user stories that are shovel-ready by the time developers start to work on them.

Practice automated testing

The productivity benefits of automated testing are numerous. I’ll list some of them.

First and most obvious, writing automated tests saves you from having to do as much manual testing. Now that I know about testing, I dread the idea of coding a feature the old way, where I have to perform a series of manual testing actions after each change I make.

Second, testing forces you to articulate and justify every piece of code you add. Testing makes it harder to violate YAGNI. YAGNI violations are pure waste.

Third, testing often helps you to think of all the use cases you need to exercise for your feature. When writing tests, it actually becomes fun to try to exhaustively list all the scenarios under which your feature could possibly fail.

Fourth, the test suite that results from testing helps protect against regressions. Regressions cost time.

Lastly, testing has a tendency to improve the understandability of code. The reason is that code that’s easy to test often takes the form of small and loosely-coupled classes and methods. It also just so happens that small and loosely-coupled classes and methods are easier to understand than large and interwoven classes and methods.

Try hard to write code that’s easy to understand

As Bob Martin has said, “the only way to go fast is to go well”. After all, the reason we call good code “good code” is because good code is faster and less expensive to work with than bad code.

Remember not to fall into the fallacy that you can gain speed by cutting corners. Every extra hour that you spend doing worthwhile refactoring (key word “worthwhile”) saves three hours of future confusion.

Always know what you’re working on

One fairly guaranteed way to not accomplish much is to not even know what you’re trying to accomplish.

The best way of keeping track of what you’re working on is to write down what you’re trying to achieve. This can be a written statement in an automated test (another benefit of testing) or just a note in a note-taking program or even a note on a piece of paper.

The more specific your objective is, the easier it will be to accomplish. Half the difficulty in doing any piece of work is determining exactly what needs to be done.

Keep a to-do list

Mental RAM is a precious resource. It’s wasteful to use up your mental RAM by trying to remember all the things you have to do. Instead, write those things down.

This is another practice that sounds simple and obvious but is often not followed.

End the day with a plan for tomorrow

The ideal morning is one where you can sit down and immediately begin working. If you have to begin your day by making a difficult decision—the decision of what to work on among the infinite possibilities in front of you—then your morning is probably going to go worse, and your day will probably go worse as a result.

So, each day, try to make at least a vague note for the next day to remind yourself what you want to work on tomorrow. If you don’t want to box yourself in, remind yourself that you can always change your mind.

End the day with a deliberate loose end

I used to prefer to stop working when I reached a good “stopping point”. These days I deliberately avoid stopping at a good stopping point.

Instead, I leave a loose end that I can pick up on the next day. One tactic I like to use is to write a failing test so that my obvious first task for the next morning is to get that test to pass.

Takeaways

  • If you have to be at work all day, might as well get as much done as possible.
  • Get up early and go to bed early. For some reason it helps you accomplish more, even if you work the same amount of time.
  • When you sit down in the morning, start real work right away. Don’t start with email, news or social media.
  • Start with a piece of work that’s tractable rather than something ambiguous. This will help you gain momentum faster.
  • When possible, exercise before work. It will probably increase your cognitive capabilities for the day.
  • Don’t eat too much during the day. Especially avoid “heavy” foods. Eating too much can kill your mood, energy and cognitive abilities.
  • Don’t drink too much caffeine. Caffeine provides a “local high” but in my experience the drawbacks of too much caffeine, including especially the negative impact on sleep, make high caffeine intake not worth it.
  • Keep email, Slack, Twitter, etc. closed.
  • Keep your browser tabs to a minimum. They’re not worth what they cost.
  • The fastest way to get a bunch of things done is to work on one thing at a time.
  • Practice test-driven development. TDD helps protect from regressions, helps improve the understandability of your code, and helps keep you focused.
  • Try to write code that’s easy to understand. It’s faster to work with easy-to-understand code than hard-to-understand code. Even though it can take more time and effort to write clean code than messy code, the net effect is a great time savings.
  • Always know what you’re working on. You’re unlikely to accomplish much when you don’t even know what you’re trying to accomplish.
  • Keep a to-do list rather than trying to hold all your to-dos in your head, wasting precious “mental RAM”.
  • End the day with a plan for tomorrow so that you don’t have to spend the first part of tomorrow figuring out what you’re going to do.
  • End the day with a deliberate loose end so that it’s easy to hit the ground running the next day.

Understanding Factory Bot syntax by coding your own Factory Bot

What we’re going to do and why

When you look at a Factory Bot factory definition, the syntax might look somewhat mysterious. Here’s an example of such a factory.

FactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

The goal of this tutorial is to demystify this syntax. The way we’ll do this is to write our own implementation of Factory Bot from scratch. Or, more precisely, we’ll write an implementation of something that behaves indistinguishably from Factory Bot for a few narrow use cases.

Concepts we’ll learn about

Blocks

Factory Bot syntax makes heavy use of blocks. In this post we’ll learn a little bit about how blocks work.

Message sending

We’ll learn the nuanced distinction between calling a method on an object and sending a message to an object.

method_missing

We’ll learn how to use Ruby’s method_missing feature so that we can define methods for objects dynamically.

Our objective

Our goal with this post will be to write some code that makes the factory below actually work. Notice that this factory is indistinguishable from a Factory Bot factory except for the fact that it starts with MyFactoryBot rather than FactoryBot.

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

In addition to this factory code, we’ll also need some code that exercises the factory.

Exercising the factory

Here’s some code that will exercise our factory to make sure it actually works. Just like how our factory definition mirrors an actual Factory Bot factory definition, the below code mirrors how we would use a Factory Bot factory.

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

After we get our factory working properly, the above code should produce the following output.

First name: John
Last name: Smith

Let’s get started.

How to follow along

If you’d like to code along with this tutorial, I have a sample project that you can use.

The code in this tutorial depends on a certain Rails project, so I’ve created a GitHub repo at https://github.com/jasonswett/my_factory_bot where there’s a Dockerized Rails app, the same exact Rails app that I used to write this tutorial. You’ll find instructions to set up the project on the GitHub page.

The approach

We’ll be writing our “Factory Bot clone” code using a (silly) development methodology that I call “error-driven development”. Error-driven development works like this: you write a piece of code and try to run it. If you get an error when you run the code, you write just enough code to fix that particular error, and nothing more. You repeat this process until you have the result you want.

The reason I sometimes like to code this way is that it prevents me from writing any code that hasn’t been (manually) tested. Surprisingly enough, this “methodology” actually works pretty well.

Building the factory

The first thing to do is to create a file called my_factory_bot.rb and put it at the Rails project root.

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Then we’ll run the code like this:

$ rails runner my_factory_bot.rb

The first thing we’ll see is an error saying that MyFactoryBot is not defined.

uninitialized constant MyFactoryBot (NameError)

This is of course true. We haven’t yet defined something called MyFactoryBot. So, in the spirit of practicing “error-driven development”, let’s write enough code to make this particular error go away and nothing more.

class MyFactoryBot
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Now, if we run the code again (using the same command as above), we get a different error.

undefined method `define' for MyFactoryBot:Class (NoMethodError)

This is also true. The MyFactoryBot class doesn’t have a method called define. So let’s define it.

class MyFactoryBot
  def self.define
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Now we get a new error.

undefined method `create' for MyFactoryBot:Class (NoMethodError)

This of course comes from the user = MyFactoryBot.create(:user) line. Let’s define a create method in order to make this error go away. Since we’re passing in an argument, :user, when we call create, we’ll need to specify a parameter for the create method. I’m calling the parameter model_sym since it’s a symbol that corresponds to the model that the factory is targeting.

class MyFactoryBot
  def self.define
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Now we get an error for the next line.

undefined method `first_name' for nil:NilClass (NoMethodError)

We’ll deal with this error, but not just yet, because this one will be a little bit tricky, and there are some other things that will make sense to do first. Let’s temporarily comment out the lines that call first_name and last_name.

class MyFactoryBot
  def self.define
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

Now, if we run the file again, we get no errors.

Making it so the factory block gets called

Right now, the block inside of MyFactoryBot.define isn’t getting used at all. Let’s add block.call to the define method so that the block gets called.

class MyFactoryBot
  def self.define(&block)
    block.call
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

Now when we run the file we get the following error.

undefined method `factory' for main:Object (NoMethodError)

It makes sense that we would get an error that says “undefined method factory”. We of course haven’t defined any method called factory.
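
As an aside, here’s a minimal sketch of how a method can capture a block with &block and invoke it later with call. The helper names capture and ignore_block are made up for illustration. They aren’t part of Factory Bot or of our MyFactoryBot class.

```ruby
# Hypothetical helper: returns the block it's given as a Proc object.
def capture(&block)
  block
end

# Nothing inside the block runs until we call it.
greeting = capture { "hello from the block" }

puts greeting.class # Proc
puts greeting.call  # hello from the block

# A method that never calls its block silently ignores it, which is
# why an empty define method produced no errors earlier.
def ignore_block
end

result = ignore_block { raise "this never runs" }
puts result.inspect # nil
```

The takeaway is that passing a block to a method doesn’t run the block. The method has to call it.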

The receiver for the “factory” method

Notice how the error message says undefined method `factory' for main:Object. What’s the main:Object part all about?

Sending a message vs. calling a method

In Ruby you’ll often hear people talk about “sending a message to an object” rather than “calling a method on an object”. The distinction between these things is subtle but significant.

Consider the following bit of code:

a = Array.new(5) # [nil, nil, nil, nil, nil]
a.to_s
a.to_i

The a variable is an instance of the Array class. When we do a.to_s, we’re sending the to_s message to the a object. The a object will happily respond to the to_s message and return a stringified version of the array: "[nil, nil, nil, nil, nil]"

The a object does not respond to (for example) the to_i message. If we send to_i to a, we get an error:

undefined method `to_i' for [nil, nil, nil, nil, nil]:Array

Notice the format of the last part of the error message. It’s the value of the receiving object, then a colon, then the class of the receiving object.
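
Here’s a small sketch that makes the “message” idea a bit more tangible, using Ruby’s built-in send and respond_to? methods on the same array. The variable names are just for illustration.

```ruby
a = Array.new(5) # [nil, nil, nil, nil, nil]

# Dot syntax and send deliver the same message to the same receiver.
dot_result  = a.to_s
send_result = a.send(:to_s)
puts dot_result == send_result # true

# respond_to? asks an object whether it understands a message.
puts a.respond_to?(:to_s) # true
puts a.respond_to?(:to_i) # false

# Sending a message the receiver doesn't understand raises NoMethodError.
begin
  a.send(:to_i)
rescue NoMethodError => e
  error_name = e.name # the name of the message that was sent
  puts "Array rejected #{error_name}"
end
```

Because the message name passed to send is just a symbol or string, it can be decided at runtime, which will come in handy later.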

Understanding main:Object

In Ruby, every message that gets passed has a receiver, even when it doesn’t seem like there would be. When we do e.g. a.to_s, the receiver is obvious: it’s a. What about when we just call e.g. puts?

When we send a message that doesn’t explicitly have a receiver object, the receiver is whatever self is at that point. At the top level of a file, self is a special object called main. That’s why when we call factory we get an error message that says undefined method `factory' for main:Object. The interpreter sends the factory message to main because we’re not explicitly specifying any other receiver object.
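
To make this concrete, here’s a minimal sketch. It relies on the fact that a message without an explicit receiver goes to self, and that at the top level of a file self is main. The Greeter class is hypothetical and exists only for this illustration.

```ruby
# TOPLEVEL_BINDING.receiver is the object that self refers to at the
# top level of a script. Its string form is "main".
main_object = TOPLEVEL_BINDING.receiver
puts main_object.inspect # main
puts main_object.class   # Object

# Inside an instance method, self is the instance, so a message sent
# without an explicit receiver goes to that instance instead.
class Greeter # hypothetical class, for illustration only
  def name
    "greeter"
  end

  def describe
    # No explicit receiver for the first call: Ruby sends name to self.
    "implicit: #{name}, explicit: #{self.name}"
  end
end

description = Greeter.new.describe
puts description # implicit: greeter, explicit: greeter
```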

Changing the receiver of the factory message

If we want our program to work, we’re going to have to change the receiver of factory from main to something else. If the receiver were just main, then our factory method would have to just be defined out in the open, not as part of any object. If the factory method is not defined as part of any object, then it can’t easily share any data with any object, and we’ll have a pretty tough time.

We can change the receiver of the factory message by using a Ruby method called instance_exec.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

The instance_exec method will execute our block in the context of self. We can see that now, when we run our file, our error message has changed. The receiver object is no longer main. It’s now MyFactoryBot.

undefined method `factory' for MyFactoryBot:Class (NoMethodError)
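
If instance_exec is new to you, here’s a small sketch of it in isolation. The Recorder class and its record_message method are hypothetical names chosen for this example.

```ruby
class Recorder # hypothetical class, for illustration only
  def record_message(message)
    (@messages ||= []) << message
  end

  def messages
    @messages || []
  end
end

recorder = Recorder.new

# The block sends record_message without an explicit receiver.
block = proc { record_message("sent without an explicit receiver") }

# instance_exec runs the block with self set to recorder, so the
# record_message message is delivered to recorder.
recorder.instance_exec(&block)

puts recorder.messages.inspect
```

The block itself never changes. The only thing that changes is which object plays the role of self while the block runs.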

Let’s add the factory method to MyFactoryBot.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

Now we don’t get any errors.

We know we’ll also need to invoke the block inside of the factory call, so let’s use another instance_exec inside of the factory method to do that.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

Now it’s complaining, unsurprisingly, that there’s no method called first_name.

undefined method `first_name' for MyFactoryBot:Class (NoMethodError)

Adding the first_name and last_name methods

We of course need to define methods called first_name and last_name somewhere. We could conceivably add them on the MyFactoryBot class, but it would probably work out better to have a separate instance for each factory that we define, since of course a real application will have way more than just one factory.

Let’s make it so that the factory method creates an instance of a new class called MyFactory and then invokes the block on MyFactory.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    factory = MyFactory.new
    factory.instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

class MyFactory
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

We of course still haven’t actually defined a method called first_name, so we still get an error about that, although the receiver is now an instance of MyFactory rather than MyFactoryBot.

undefined method `first_name' for #<MyFactory:0x0000ffff9a955ae8> (NoMethodError)

Let’s define a first_name and last_name method on MyFactory in order to make this error message go away.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    factory = MyFactory.new
    factory.instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

class MyFactory
  def first_name
  end

  def last_name
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"

Now we get no errors.

Providing values for first_name and last_name

Let’s now come back and uncomment the last two lines of the file, the lines that output values for first_name and last_name.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    factory = MyFactory.new
    factory.instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

class MyFactory
  def first_name
  end

  def last_name
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

We get an error.

undefined method `first_name' for nil:NilClass (NoMethodError)

Remember that every message in Ruby has a receiver. Apparently in this case the receiver for first_name when we do user.first_name is nil. In other words, user is nil. That’s obviously not going to work out. It does make sense, though, because MyFactoryBot.create(:user) has no return value.

Let’s try making it so that MyFactoryBot.create returns our instance of MyFactory.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factory = MyFactory.new
    @factory.instance_exec(&block)
  end

  def self.create(model_sym)
    @factory
  end
end

class MyFactory
  def first_name
  end

  def last_name
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Now there are no errors, but there are also no values present in the output we see.

First name: 
Last name:

Let’s have a closer look and see exactly what user is.

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

The user object is an instance of MyFactory.

MyFactory
First name: 
Last name:

This was maybe a good intermediate step but we of course want user to be an instance of our Active Record User class, just like with the real Factory Bot. Let’s change MyFactoryBot.create so that it returns an instance of User.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factory = MyFactory.new
    @factory.instance_exec(&block)
  end

  def self.create(model_sym)
    @factory.user
  end
end

class MyFactory
  attr_reader :user

  def initialize
    @user = User.new
  end

  def first_name
  end

  def last_name
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

This gives no errors but there are still no values present.

In order for the first_name and last_name methods to return values, let’s make it so each one calls the block that it’s given.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factory = MyFactory.new
    @factory.instance_exec(&block)
  end

  def self.create(model_sym)
    @factory.user
  end
end

class MyFactory
  attr_reader :user

  def initialize
    @user = User.new
  end

  def first_name(&block)
    @user.first_name = block.call
  end

  def last_name(&block)
    @user.last_name = block.call
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"

Now we see “John” and “Smith” as our first and last name values.

User
First name: John
Last name: Smith

Generalizing the factory

What if we wanted to add another attribute to the factory, such as email? Right now it wouldn’t work.

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
    email { "john.smith@example.com" }
  end
end

Obviously we can’t just have hard-coded first_name and last_name methods in our factory. We need to make it so our factory can respond to any messages that are sent to it (provided of course that those messages correspond to actual attributes on our Active Record models).

Let’s take the first step toward generalizing our methods. Instead of @user.first_name = block.call, we’ll do @user.send("first_name=", block.call), which is equivalent.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factory = MyFactory.new
    @factory.instance_exec(&block)
  end

  def self.create(model_sym)
    @factory.user
  end
end

class MyFactory
  attr_reader :user

  def initialize
    @user = User.new
  end

  def first_name(&block)
    @user.send("first_name=", block.call)
  end

  def last_name(&block)
    @user.send("last_name=", block.call)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
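
To see that equivalence outside of Rails, here’s a minimal sketch. Person is a hypothetical Struct standing in for the Active Record User model.

```ruby
# Hypothetical stand-in for the Active Record User model.
Person = Struct.new(:first_name, :last_name)

person = Person.new

# Ordinary assignment syntax...
person.first_name = "John"

# ...is sugar for sending the first_name= message with one argument.
person.send("last_name=", "Smith")

# Because the message name is just a string, it can be built at runtime.
attr = :first_name
person.send("#{attr}=", "Jane")

puts person.first_name # Jane
puts person.last_name  # Smith
```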

We can go even further than this. Rather than having methods called first_name and last_name, we can use Ruby’s method_missing to dynamically respond to any message that gets sent.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factory = MyFactory.new
    @factory.instance_exec(&block)
  end

  def self.create(model_sym)
    @factory.user
  end
end

class MyFactory
  attr_reader :user

  def initialize
    @user = User.new
  end

  def method_missing(attr, *args, &block)
    # If the message that's sent is e.g. first_name, then
    # the value of attr will be :first_name, and the value
    # of "#{attr}=" will be "first_name=".
    @user.send("#{attr}=", block.call)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
    email { "john.smith@example.com" }
  end
end

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
puts "Email: #{user.email}"

If we run our file again, we can see that our method_missing code handles not only first_name and last_name but email as well.

User
First name: John
Last name: Smith
Email: john.smith@example.com
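
One caveat: a bare method_missing like ours swallows every unknown message, typos included. Here’s a hedged sketch of a more defensive variant that only handles messages matching a setter, falls back to super otherwise, and defines respond_to_missing? alongside it. Account and AttributeRecorder are hypothetical names, and a Struct stands in for the Active Record model. This is not how Factory Bot itself is implemented.

```ruby
# Hypothetical stand-in for an Active Record model.
Account = Struct.new(:first_name, :email)

class AttributeRecorder # hypothetical name
  attr_reader :record

  def initialize(record)
    @record = record
  end

  def method_missing(name, *args, &block)
    setter = "#{name}="
    # Only handle messages that map to a setter and come with a block.
    # Everything else falls through to the default NoMethodError.
    if block && @record.respond_to?(setter)
      @record.send(setter, block.call)
    else
      super
    end
  end

  # Lets respond_to? report the attribute names method_missing handles.
  def respond_to_missing?(name, include_private = false)
    @record.respond_to?("#{name}=") || super
  end
end

recorder = AttributeRecorder.new(Account.new)
recorder.email { "john.smith@example.com" }
puts recorder.record.email # john.smith@example.com

begin
  recorder.nickname { "Johnny" }
rescue NoMethodError => e
  rejected = e.name
  puts "Rejected: #{rejected}" # Rejected: nickname
end
```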

More generalization

What if we want to have more factories besides just one for User? I happen to have a model in my Rails app called Website. What if I wanted to have a factory for that?

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

MyFactoryBot.define do
  factory :website do
    name { "Google" }
    url { "www.google.com" }
  end
end

Right now, it wouldn’t work because I have the User class hard-coded in my factory.

undefined method `name=' for #<User:0x0000ffff8f950d28> (NoMethodError)

Let’s make it so that rather than hard-coding User, we set the class dynamically. Let’s also make it so that the argument you pass to the factory method (e.g. :user or :website) retrieves the appropriate factory. We can accomplish this by putting our factories into a hash.

class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    @factories ||= {}
    @factories[model_sym] = MyFactory.new(model_sym)
    @factories[model_sym].instance_exec(&block)
  end

  def self.create(model_sym)
    @factories[model_sym].record
  end
end

class MyFactory
  attr_reader :record

  def initialize(model_sym)
    @record = model_sym.to_s.classify.constantize.new
  end

  def method_missing(attr, *args, &block)
    @record.send("#{attr}=", block.call)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

MyFactoryBot.define do
  factory :website do
    name { "Google" }
    url { "www.google.com" }
  end
end

user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
puts

website = MyFactoryBot.create(:website)
puts website.class.name
puts "Name: #{website.name}"
puts "URL: #{website.url}"

Now, when we run our file, both factories work.

User
First name: John
Last name: Smith

Website
Name: Google
URL: www.google.com
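
The classify and constantize methods come from Active Support, so they’re only available inside Rails (or with Active Support loaded). Here’s a rough plain-Ruby approximation of what that line does for simple names. The class_for helper and the Struct-based User and Website classes are hypothetical stand-ins, and unlike classify this sketch does not singularize plurals.

```ruby
# Hypothetical plain-Ruby stand-ins for the Active Record models.
User = Struct.new(:first_name, :last_name)
Website = Struct.new(:name, :url)

# Rough approximation of model_sym.to_s.classify.constantize for
# simple names: :user becomes "User", which is looked up as a constant.
def class_for(model_sym)
  class_name = model_sym.to_s.split("_").map(&:capitalize).join
  Object.const_get(class_name)
end

# One entry per factory name, mirroring the idea of the @factories hash.
registry = {
  user: class_for(:user).new("John", "Smith"),
  website: class_for(:website).new("Google", "www.google.com")
}

puts class_for(:user)         # User
puts registry[:website].class # Website
puts registry[:website].url   # www.google.com
```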

Takeaways

  • Factory Bot syntax (like the syntax of other DSLs) isn’t magic. It’s just Ruby.
  • Blocks can be a powerful way to make your code more expressive and understandable.
  • In Ruby, it’s often more useful to talk about “sending messages to objects” rather than “calling methods on objects”.
  • Every message sent in Ruby has a receiver. When a receiver is not explicitly specified, the receiver is self, which at the top level of a file is a special object called main.
  • The method_missing method can allow your objects to respond to messages dynamically.

Close your browser tabs

When I teach programming classes or pair program with colleagues, I often encounter people with a whole bunch of browser tabs open at once.

I want to explain to you why having too many tabs open is bad.

The cost of tabs

Tabs cost mental RAM

As we work, we’re always juggling thoughts in our head. I call this our “mental RAM”. It’s a finite resource.

Each tab you have open occupies some space not only on your browser screen but also in your brain. There’s a part of you that wants to remember “don’t forget, I had that one Stack Overflow answer open in a tab in case I need to go back to it”. You might not be consciously aware of it but that open tab is taking up part of your precious mental RAM, RAM that could and should be used to do actual work, but instead is being wasted on thinking about tabs.

Tabs cost wasted moves

I’ve often had the experience where I’m pairing with someone and their open tabs cause them to go from tab to tab and go “whoops, not that one”, “whoops, not that one”, “whoops…” Each time this happens, my blood pressure rises a little bit. Maybe if you didn’t have 36 tabs open you wouldn’t be wasting so much of my time and yours.

Why people keep tabs open

Presumably, the justification for keeping a tab open is that 1) you think you’ll need the tab’s contents later, and 2) you think it will cost more to re-find the contents of that tab than to keep the tab open.

You’ll need it later (probably wrong)

Most of the tabs people keep open are never actually needed later.

It’s too costly to re-find the tab’s contents later (probably wrong)

Even if you do need the tab’s contents later, it’s usually way cheaper just to re-open the website.

So, close your fucking tabs!

If you have any tabs open right now that aren’t directly related to what you’re currently working on, I invite you to close them. They’re probably costing you more than they’re worth.

How I organize my Rails apps

Overview

Influenced by the experiences I’ve had over many years of building and maintaining Rails applications, combined with my experiences using other technologies, I’ve developed some ways of structuring Rails applications that have worked out pretty well for me.

Some of my organizational tactics follow conventional wisdom, like keeping controllers thin. Others are ones I haven’t really seen in others’ applications but wish I would.

Here’s an overview of the topics I touch on in this post.

  • Controllers
  • Namespaces
  • Models
  • ViewComponents
  • The lib folder
  • Concerns
  • Background jobs
  • JavaScript
  • Tests
  • Service objects
  • How I think about Rails code organization in general

Let’s start with controllers.

Controllers

The most common type of controller in most Rails applications is a controller that’s based on an Active Record resource. For example, if there’s a customers database table then there will also be a Customer model class and a controller called CustomersController.

The problem

Controllers can start to get nasty when there get to be too many “custom” actions beyond the seven RESTful actions of index, new, create, edit, update, show and destroy.

Let’s say we have a CustomersController that, among other things, allows the user to send messages about a customer (by creating instances of a Message, let’s say). The relevant actions might be called new_message and create_message. This is maybe not that bad, but it clutters up the controller a little, and if you have enough custom actions on a controller then the controller can get pretty messy and hard to comprehend.

The solution

What I like to do in these scenarios is create a “custom” controller called e.g. CustomerMessagesController. There’s no database table called customer_messages or class called CustomerMessage. The concept of a “customer message” is just something I made up. But now that this idea exists, my CustomersController#new_message and CustomersController#create_message actions can become CustomerMessagesController#new and CustomerMessagesController#create. I find this much tidier.

And as long as I’m at it, I’ll even create a PORO (plain old Ruby object) called CustomerMessage where I can handle the business of creating a new customer message, so as not to clutter up either Customer or Message with stuff that’s really not all that relevant to either of those classes. I might put a create or create! method on CustomerMessage which creates the appropriate Message for me.

Furthermore, I’ll also often put include ActiveModel::Model into my PORO so that I can bind the PORO to a form as though it were a regular old Active Record model.
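
Here’s a rough sketch of what such a PORO might look like. To keep it runnable without Rails, Customer and Message are hypothetical Struct stand-ins for the Active Record models, and the include ActiveModel::Model line is left out. The attribute names and the blank-body check are assumptions made for illustration.

```ruby
# Hypothetical stand-ins for the Active Record models, so the sketch
# runs without Rails.
Customer = Struct.new(:name)
Message = Struct.new(:customer, :body)

# Plain old Ruby object. In a Rails app this class could also
# include ActiveModel::Model so that it can back a form.
class CustomerMessage
  attr_reader :customer, :body

  def initialize(customer:, body:)
    @customer = customer
    @body = body
  end

  # Builds the Message. In a real app this would more likely call
  # Message.create! so that the record gets saved.
  def create!
    raise ArgumentError, "body can't be blank" if body.to_s.strip.empty?

    Message.new(customer, body)
  end
end

customer = Customer.new("Acme")
message = CustomerMessage.new(customer: customer, body: "Invoice sent").create!
puts message.body          # Invoice sent
puts message.customer.name # Acme
```

The controller’s create action then only has to build a CustomerMessage and call create! on it, and neither Customer nor Message needs to know about this workflow.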

Namespaces

Pieces of code are easier to understand when they don’t require you to also understand other pieces of code as a prerequisite. To use an extreme example to illustrate the point, it would obviously be impossible to understand a program so tangled with dependencies that understanding any of it required understanding all of it.

So, anything we can do to make small chunks of our programs understandable in isolation is typically going to make our programs easier to work with.

Namespaces serve as a signal that certain parts of the application are more related to each other than they are to anything else. For example, in the application I work on at work, I have a namespace called Billing. I have another namespace called Schedule. A developer who’s new to the codebase could look at the Billing and Schedule namespaces and rightly assume that when they’re thinking about one, they can mostly ignore the other.

Contexts

Some of my models are sufficiently fundamental that it doesn’t make sense to put them into any particular namespace. I have a model called Appointment that’s like this. An Appointment is obviously a scheduling concern a lot of the time, but just as often it’s a clinical concern or a billing concern. An appointment can’t justifiably be “owned” by any one namespace.

This doesn’t mean I can’t still benefit from namespaces though. I have a controller called Billing::AppointmentsController which views appointments through a billing lens. I have another controller called Chart::AppointmentsController which views appointments through a clinical lens. For scheduling, we have two calendar views, one that shows one day at a time and one that shows one month at a time. So I have two controllers for that: Schedule::ByDayCalendar::AppointmentsController and Schedule::ByMonthCalendar::AppointmentsController. Imagine trying to cram all this stuff into a single AppointmentsController. This idea of having namespaced contexts for broad models has been very useful.
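To make the "same model, different lenses" idea concrete, here is a small plain-Ruby sketch. Everything in it is an assumption for illustration: the Appointment struct and its attributes are hypothetical, and the controllers are plain classes that return hashes rather than real Rails controllers that inherit from ApplicationController and render views.

```ruby
# Hypothetical stand-in for the Appointment model.
Appointment = Struct.new(
  :patient_name, :starts_at, :balance_cents, :diagnosis,
  keyword_init: true
)

module Billing
  # Views an appointment through a billing lens.
  class AppointmentsController
    def show(appointment)
      { patient: appointment.patient_name, balance_cents: appointment.balance_cents }
    end
  end
end

module Chart
  # Views the same appointment through a clinical lens.
  class AppointmentsController
    def show(appointment)
      { patient: appointment.patient_name, diagnosis: appointment.diagnosis }
    end
  end
end
```

Both classes are called AppointmentsController, but the namespace tells a reader which concern each one serves, and neither has to carry the other's logic.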

Models

I of course keep my models in app/models just like everybody else. What’s maybe a little less common is the way I conceive of models. I don’t just think of models as classes that inherit from ApplicationRecord. To me, a model is anything that models something.

So a lot of the models I keep in app/models are just POROs. According to a count I did while writing this post, I have 115 models in app/models that inherit from ApplicationRecord and 439 that don’t. So that’s about 20% Active Record models and 80% POROs.

ViewComponents

Thanks to the structural devices that Rails provides natively (controllers, models, views, concerns, etc.) combined with the structural devices I’ve imposed myself (namespaces, a custom model structure), I’ve found that most code in my Rails apps can easily be placed in a fitting home.

One exception to this for a long time for me was view-related logic. View-related logic is often too voluminous and detail-oriented to comfortably live in the view, but too tightly coupled with the DOM or other particulars of the view to comfortably live in a model, or anywhere else. The view-related code created a disturbance wherever it lived.

The solution I ultimately settled on for this problem is ViewComponents. In my experience, ViewComponents can provide a tidy way to package up a piece of non-trivial view-related logic in a way that allows me to maintain a consistent level of abstraction in both my views and my models.

The lib folder

My rough rule of thumb is that if a piece of code could conceivably be extracted into a gem and used in any application, I put it in lib. Things that end up in lib for me include custom form builders, custom API wrappers, custom generators and very general utility classes.

Concerns

In a post of DHH’s regarding concerns, he says “Concerns are also a helpful way of extracting a slice of model that doesn’t seem part of its essence”. I think that’s a great way to put it and that’s how I use concerns as well.

Like any programming device, concerns can be abused or used poorly. I sometimes come across criticisms of concerns, but to me what’s being criticized is not exactly concerns but bad concerns. If you’re interested in what those criticisms are and how I write concerns, I wrote a post about it here.

Background jobs

I keep my background job workers very thin, just like controllers. It’s my belief that workers shouldn’t do things, they should only call things. Background job workers are a mechanical device, not a code organization device.
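Here is a minimal sketch of what "call things, don't do things" might look like. The class names are hypothetical, and the worker is a plain Ruby class; in a real app it would include or inherit from whatever your background job library provides.

```ruby
# Hypothetical PORO that holds the actual behavior.
class InvoiceReminder
  def initialize(invoice_id)
    @invoice_id = invoice_id
  end

  def deliver
    "Reminder sent for invoice #{@invoice_id}"
  end
end

# The worker does nothing itself; it only calls the model.
class InvoiceReminderWorker
  def perform(invoice_id)
    InvoiceReminder.new(invoice_id).deliver
  end
end
```

Because the behavior lives in InvoiceReminder, it can be tested with an ordinary model spec, without involving the job queue at all.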

JavaScript

I use JavaScript as little as possible. Not because I particularly have anything against JavaScript, but because the less “dynamic” an application is, and the fewer technologies it involves, the easier I find it to understand.

When I do write JavaScript, I use a lot of POJOs (plain old JavaScript objects). I use Stimulus to help keep things organized. To test my JavaScript code, I exercise it using system specs. The way I see it, it’s immaterial from a testing perspective whether I implement my features using JavaScript or Ruby. System specs can exercise it all just fine.

Tests

Being “the Rails testing guy”, I of course write a lot of tests. I use RSpec not because I necessarily think it’s the best testing framework from a technical perspective but rather just to swim with the current. I practice TDD a lot of the time but not all the time. Most of my tests are model specs and system specs.

If you’re curious to learn more about how I do testing, you can read my many Rails testing articles or check out my Rails testing book.

Service objects

Since service objects are apparently so popular these days, I feel compelled to mention that I don’t use service objects. Instead, I use regular old OOP. What a lot of people might model as procedural service object code, I model as declarative objects. I write more about this here.

How I think about Rails code organization in general

I see Rails as an amazingly powerful tool to save me from repetitive work via its conventions. Rails also provides really nice ways of organizing certain aspects of code: controllers, views, ORM, database connections, migrations, and many other things.

At the same time, the benefits that Rails provides have a limit. Once you find yourself past that limit (which inevitably does happen if you have a non-trivial application) you either need to provide some structure of your own or you’re likely going to end up with a mess. Specifically, once your model layer grows fairly large, Rails is no longer going to help you very much.

The way I’ve chosen to organize my model code is to use OOP. Object-oriented programming is obviously a huge topic and so I won’t try to convey here what I think OOP is all about. But I think if a Rails developer learns good OOP principles, and applies them to their Rails codebase, specifically in the model layer, then it can go a long way toward keeping a Rails app organized, perhaps more than anything else.

How and why to Dockerize your Rails app’s database

Why to Dockerize your database

Dockerizing helps ease the pain of dependencies

Getting a new developer set up with a Rails app (or any app) can be tedious. Part of the tedium is the chore of manually installing all the dependencies: Ruby, RVM, Rails, PostgreSQL, Redis, etc.

It would be really nice if that developer could just run a single command and have all that app’s dependencies installed on their computer rather than having to install dependencies manually.

This ideal is in fact possible if you fully Dockerize your app. If you want to, you can create a Docker setup that will make it so you don’t have to install anything manually: no Ruby, no Rails, no RVM, no PostgreSQL, no Redis, nothing. It’s very nice.

Fully Dockerizing has drawbacks

Unfortunately, fully Dockerizing a Rails app isn’t without trade-offs. When working with a Dockerized app, there’s a performance hit, there are some issues with using binding.pry, and running system specs/system tests in such a way that you can see them run in a browser is next to impossible.

None of these obstacles is insurmountable, but if you don’t want to deal with these issues, you can choose to Dockerize just some of your app’s dependencies instead of all of them.

Partial Dockerization

The Docker setup I use at work is a hybrid approach. I let Docker handle my PostgreSQL and Redis dependencies. I install all my other dependencies manually. This makes it so I don’t have to live with the downsides of full Dockerization but I still get to skip installing some of my dependencies. Any dependency I can skip is a win.

The example I’m going to show you shortly is an even simpler case. Rather than Dockerizing PostgreSQL and Redis, we’re only going to Dockerize PostgreSQL. I’m doing it this way in the interest of showing the simplest possible example.

Dockerizing for development vs. production

I want to add a note for clarity. The Docker setups I’ve been discussing so far are all development setups. There are two ways to Dockerize an app: for a development environment and for a production environment. Development environments and production environments of course have vastly different needs and so a different Docker setup is required for each. In a production environment we wouldn’t run PostgreSQL and a Rails server on the same machine. We’d have a separate database server instead. So I want to be clear that this Docker setup is for development only.

How to Dockerize your database

In order to Dockerize our database we’re going to use Docker Compose. Docker Compose is a tool that a) lets you specify and configure your development environment’s dependencies, b) installs those dependencies for you, and c) runs those dependencies.

Initializing the Rails app

Before we do anything Docker-related, let’s initialize a new Rails app that uses PostgreSQL.

$ rails new my_app -d postgresql

Adding a Docker Compose config file

Here’s the Docker Compose config file. It’s called docker-compose.yml and goes at the project root. This file, again, is what specifies our development environment’s dependencies. I’ve annotated the file to help you understand what’s what.

# docker-compose.yml
---
version: '3.8'

# The "services" directive lists all the services your
# app depends on. In this case there's only one: PostgreSQL.
services:

  # We give each service an arbitrary name. I've called
  # our PostgreSQL service "postgres".
  postgres:

    # Docker Hub hosts images of common services for
    # people to use. postgres:13.1-alpine is an image
    # based on Alpine Linux, a very lightweight Linux
    # distribution that people often use when
    # Dockerizing development environments.
    image: postgres:13.1-alpine

    # PostgreSQL has to put its data somewhere. Here
    # we're saying to put the data in /var/lib/postgresql/data.
    # The "delegated" part specifies the strategy for
    # syncing the container's data with our host machine.
    # (Another option would be "cached".)
    volumes:
      - postgresql:/var/lib/postgresql/data:delegated

    # This says to make our PostgreSQL service available
    # on port 5432.
    ports:
      - "127.0.0.1:5432:5432"

    # This section specifies any environment variables
    # that we want to exist on our Docker container.
    environment:
      # Use "my_app" as our PostgreSQL username.
      POSTGRES_USER: my_app

      # Set POSTGRES_HOST_AUTH_METHOD to "trust" to
      # allow passwordless authentication.
      POSTGRES_HOST_AUTH_METHOD: trust

volumes:
  postgresql:
  storage:

Next we’ll have to change config/database.yml ever so slightly in order to get it to be able to talk to our PostgreSQL container. We need to set the username to my_app and set the host to 127.0.0.1.

default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

  # This must match what POSTGRES_USER was set to in docker-compose.yml.
  username: my_app

  # This must be 127.0.0.1 and not localhost.
  host: 127.0.0.1

development:
  <<: *default
  database: my_app_development

test:
  <<: *default
  database: my_app_test

production:
  <<: *default
  database: my_app_production
  username: my_app
  password: <%= ENV['DB_PASSWORD'] %>
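If the <%= %> tags in this file look unfamiliar: Rails evaluates database.yml as ERB before parsing it as YAML. The following is a small plain-Ruby sketch of that two-step process, using a trimmed-down, hypothetical fragment of the file. The load_database_config helper is made up for illustration and is not how Rails itself is implemented.

```ruby
require "erb"
require "yaml"

# A trimmed-down, hypothetical database.yml fragment.
TEMPLATE = <<~CONFIG
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  username: my_app
  host: 127.0.0.1
CONFIG

# Evaluate the ERB tags first, then parse the result as YAML.
def load_database_config(template)
  YAML.safe_load(ERB.new(template).result)
end

config = load_database_config(TEMPLATE)
config["pool"] # 5 unless RAILS_MAX_THREADS is set in the environment
```

The block passed to ENV.fetch supplies the fallback value, so the pool size is 5 unless the RAILS_MAX_THREADS environment variable says otherwise.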

init.sql

If we put a file called init.sql at the project root, Docker will find it and execute it. It’s necessary to have an SQL script that creates a user called my_app or else PostgreSQL will give us an error saying (truthfully) that there’s no user called my_app.

CREATE USER my_app SUPERUSER;

It’s very important that the init.sql file is in place before we proceed. If the init.sql file is not in place or not correct, it can be a difficult error to recover from.

Using the Dockerized database

Run docker-compose up to start the PostgreSQL service.

$ docker-compose up

Now we can create the database.

$ rails db:create

As long as the creation completed successfully, we can connect to the database.

$ rails db

Now we’re connected to a PostgreSQL database without having had to actually install PostgreSQL.

Rubyists to follow on Twitter

If you’d like to keep your finger on the pulse in the Ruby world, here’s a list of Rubyists you can follow. The list is in alphabetical order by last name.

The list

Bozhidar Batsov
Author of RuboCop.

Rob Bazinet
Rubyist, blogger.

Nate Berkopec
Rails performance expert, author of the Complete Guide to Rails Performance.

Mike Buckbee
Creator of Expedited Security.

Jason Charnes
Co-host of the Remote Ruby podcast.

Ken Collins
Principal Engineer at Custom Ink.

Dave Copeland
Author of Sustainable Web Development with Ruby on Rails.

Peter Cooper
Creator of Ruby Weekly.

Andy Croll
Organizer of Brighton Ruby.

Andrew Culver
Creator of Bullet Train.

Vladimir Dementyev
Developer at Evil Martians, blogger and conference speaker.

Andrea Fomera
Creator of Learn Hotwire by Building a Forum.

Rafael França
Rails core team member.

Noah Gibbs
Author of Rebuilding Rails.

Justin Gordon
Founder of ShakaCode.

Avdi Grimm
Creator of Ruby Tapas.

Kirk Haines
Principal DevRel at New Relic.

David Heinemeier Hansson
Creator of Ruby on Rails.

Michael Hartl
Creator of The Ruby on Rails Tutorial.

Nate Hopkins
Creator of CableReady and StimulusReflex.

Jemma Issroff
Writes and speaks on Ruby garbage collection, writes weekly tips for Ruby Weekly.

Ross Kaffenberger
Rubyist and blogger.

Ufuk Kayserilioglu
Production Engineering Manager at Shopify.

Dave Kimura
Creator of Drifting Ruby.

Brittany Martin
Host of the 5by5 Ruby on Rails Podcast.

Joe Masilotti
Creator of Mugshot Bot.

Andrew Mason
Co-host of the Remote Ruby podcast.

Yukihiro Matsumoto AKA Matz
Creator of the Ruby language.

Dan Mayer
Rubyist, blogger.

Adam McCrea
Creator of Rails Autoscale.

Sandi Metz
Author of POODR and 99 Bottles of OOP.

Charles Nutter
Co-lead of the JRuby project.

Chris Oliver
Creator of Go Rails, Jumpstart, HatchBox and Rails Bytes. Co-host of the Remote Ruby podcast.

Aaron Patterson
Teller of jokes.

Penelope
Director at Ruby Central, blogger.

Mike Perham
Creator of Sidekiq.

Chad Pytel
COO of thoughtbot.

Noel Rappin
Author of Modern Front-End Development for Rails and other Rails-related books.

Frank Rietta
CEO of Rietta Inc.

Mike Rogers
I know Mike best for his Docker-Rails template.

Robby Russell
CEO of Planet Argon.

Richard Schneeman
Creator of CodeTriage, a maintainer of Puma.

Colleen Schnettler
Creator of Simple File Upload.

Justin Searls
Co-founder of Test Double.

Prathamesh Sonpatki
Prolific blogger and conference speaker.

Jesse Spevack
Rubyist, conference speaker.

Kelly Sutton
Blogger, conference speaker, engineer at Gusto.

Matt Swanson
Creator of Boring Rails.

Ernesto Tagwerker
Founder of FastRuby.io and OmbuLabs.

Eileen Uchitelle
Principal Engineer at GitHub, Rails core team member, conference speaker.

Brandon Weaver
Rubyist, blogger, conference speaker.

Jared White
Rubyist, blogger.

Samuel Williams
Member of Ruby core team.

Josh Wood
Co-founder of Honeybadger, co-host of FounderQuest.

Jake Yesbeck
Rubyist, blogger.

Ender Ahmet Yurt
Co-Organizer of Ruby Turkey.

Rob Zolkos
Rails developer, prolific help-provider in the Ruby on Rails Slack.

Jason Swett
That would be me.

A VCR + WebMock “hello world” tutorial

Introduction

VCR and WebMock are tools that help deal with challenges related to tests that make network requests. In this post I’ll explain what VCR and WebMock are and then show a “hello world”-level example of using the two tools.

Why VCR and WebMock exist

Determinism

One of the principles of testing is that tests should be deterministic. The passing or failing of a test should be determined by the content of the application code and nothing else. All the tests in the test suite should pass regardless of what order they were run in, what time of day they were run, or any other factor.

For this reason we have to run tests in their own encapsulated world. If we want determinism, we can’t let tests talk to the network, because the network (and things in the network) are susceptible to change and breakage.

Imagine an app that talks to a third-party API. Imagine that the tests hit that API each time they’re run. Now imagine that on one of the test runs, the third-party API happens to go down for a moment, causing our tests to fail. Our test failure is illogical because the tests are telling us our code is broken, but it’s not our code that’s broken, it’s the outside world that’s broken.

If we’re not to talk to the network, we need a way to simulate our interactions with the network so that our application can still behave normally. This is where tools like VCR and WebMock come in.
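To make the idea of simulating the network concrete, here is a hand-rolled sketch in plain Ruby. Note that this is not how VCR or WebMock work (they intercept requests at the HTTP library level). Instead, the sketch simply passes in a fake "fetcher" object. The ProviderSearch class, the example.com URL and the fetcher interface are all assumptions made up for illustration.

```ruby
require "json"

# Hypothetical search class. The fetcher is anything that responds to
# call(url) and returns a JSON string; production code would pass in
# something that performs a real HTTP request.
class ProviderSearch
  def initialize(fetcher)
    @fetcher = fetcher
  end

  def npi_numbers(last_name)
    body = @fetcher.call("https://example.com/providers?last_name=#{last_name}")
    JSON.parse(body).fetch("results", []).map { |result| result["number"] }
  end
end

# In a test we hand in a canned response instead of touching the network.
fake_fetcher = ->(_url) { '{"results": [{"number": 1386765287}]}' }
ProviderSearch.new(fake_fetcher).npi_numbers("fuhrman")
```

Because the canned response never changes, the result is the same on every run, no matter what state the real API is in. Tools like VCR and WebMock give us the same determinism without requiring us to restructure our application code this way.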

Production data

We also don’t want tests to alter actual production data. It would obviously be bad if we for example wrote a test for deleting users and then that test deleted real production users. So another benefit of tools like VCR and WebMock is that they save us from having to touch real production data.

The difference between VCR and WebMock

VCR is a tool that will record your application’s HTTP interactions and play them back later. Very little code is necessary. VCR tricks your application into thinking it’s receiving responses from the network when really the application is just receiving prerecorded VCR data.

WebMock, on the other hand, has no feature for recording and replaying HTTP interactions in the way that VCR does, although HTTP interactions can still be faked. Unlike VCR’s record/playback features, WebMock’s network stubbing is more code-based and fine-grained. In this tutorial we’re going to take a very basic look at WebMock and VCR to show a “hello world” level usage.

What we’re going to do

This post will serve as a simple illustration of how to use VCR and WebMock to meet the need that these tools were designed to meet: running tests that hit the network without actually hitting the network.

If you’d like to follow along with this tutorial, I’d suggest first setting up a Rails app for testing according to the instructions in this post.

The scenario

We’re going to write a small search feature that hits a third-party API. We’re also going to write a test that exercises that search feature and therefore hits the third-party API as well.

WebMock

Once we have our test in place we’ll install and configure WebMock. This will disallow any network requests. As a result, our test will stop working.

VCR

Lastly, we’ll install and configure VCR. VCR knows how to talk with WebMock. Because of this, VCR and WebMock can come to an agreement together that it’s okay for our test to hit the network under certain controlled conditions. VCR will record the HTTP interactions that occur during the test and then, on any subsequent runs of the test, VCR will use the recorded interactions rather than making fresh HTTP interactions for each run.

The feature

The feature we’re going to write for this tutorial is one that searches the NPI registry, a government database of healthcare providers. The user can type a provider’s first and last name, hit Search, and then see any matches.

Below is the controller code.

# app/controllers/npi_searches_controller.rb

require "net/http"

# Named NpiSearchesController so that it matches the file name under
# Rails' default inflection rules.
class NpiSearchesController < ApplicationController
  ENDPOINT_URL = "https://npiregistry.cms.hhs.gov/api"
  TARGET_VERSION = "2.1"

  def new
    @results = []
    return unless params[:first_name].present? || params[:last_name].present?

    query_string = {
      first_name: params[:first_name],
      last_name: params[:last_name],
      version: TARGET_VERSION,
      address_purpose: ""
    }.to_query

    uri = URI("#{ENDPOINT_URL}/?#{query_string}")
    response = Net::HTTP.get_response(uri)

    # Fall back to an empty list if the response has no "results" key.
    @results = JSON.parse(response.body)["results"] || []
  end
end

Here’s the template that goes along with this controller action.

<!-- app/views/npi_searches/new.html.erb -->

<h1>New NPI Search</h1>

<%= form_tag new_npi_search_path, method: :get do %>
  <%= text_field_tag :first_name, params[:first_name], class: "p-1" %>
  <%= text_field_tag :last_name, params[:last_name], class: "p-1" %>
  <%= submit_tag "Search", class: "p-1" %>
<% end %>

<% @results.each do |result| %>
  <div>
    <%= result["basic"]["name_prefix"] %>
    <%= result["basic"]["first_name"] %>
    <%= result["basic"]["last_name"] %>
    <%= result["number"] %>
  </div>
<% end %>

The test

Our test will be a short system spec that types “joel” and “fuhrman” into the first and last name fields respectively, clicks Search, then asserts that Joel Fuhrman’s NPI code (a unique identifier for a healthcare provider) shows up on the page.

# spec/system/npi_search_spec.rb

require "rails_helper"

RSpec.describe "NPI search", type: :system do
  it "shows the physician's NPI number" do
    visit new_npi_search_path
    fill_in "first_name", with: "joel"
    fill_in "last_name", with: "fuhrman"
    click_on "Search"

    # 1386765287 is the NPI code for Dr. Joel Fuhrman
    expect(page).to have_content("1386765287")
  end
end

If we run this test at this point, it passes.

Installing and Configuring WebMock

We don’t want our tests to be able to just make network requests willy-nilly. We can install WebMock so that no HTTP requests can be made without our noticing.

First we add the webmock gem to our Gemfile.

# Gemfile

group :development, :test do
  gem "webmock"
end

Second, we can create a new file at spec/support/webmock.rb.

# spec/support/webmock.rb

# This line makes it so WebMock and RSpec know how to talk to each other.
require "webmock/rspec"

# This line disables HTTP requests, with the exception of HTTP requests
# to localhost.
WebMock.disable_net_connect!(allow_localhost: true)

Remember that files in spec/support won’t get loaded unless the line in spec/rails_helper.rb that loads them is uncommented.

# spec/rails_helper.rb

# Make sure to uncomment this line
Dir[Rails.root.join('spec', 'support', '**', '*.rb')].sort.each { |f| require f }

Seeing the test fail

If we run our test again now that WebMock is installed, it will fail, saying “Real HTTP connections are disabled”. We also get some instructions on how to stub this request if we like. We’re not going to do that, though, because we’re going to use VCR instead.

Failures:

  1) NPI search shows the physician's NPI number
   Failure/Error: response = Net::HTTP.get_response(uri)

   WebMock::NetConnectNotAllowedError:
     Real HTTP connections are disabled. Unregistered request: GET https://npiregistry.cms.hhs.gov/api/?address_purpose=&first_name=joel&last_name=fuhrman&version=2.1 with headers {'Accept'=>'*/*', 'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3', 'Host'=>'npiregistry.cms.hhs.gov', 'User-Agent'=>'Ruby'}

     You can stub this request with the following snippet:

     stub_request(:get, "https://npiregistry.cms.hhs.gov/api/?address_purpose=&first_name=joel&last_name=fuhrman&version=2.1").
       with(
         headers: {
           'Accept'=>'*/*',
           'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3',
           'Host'=>'npiregistry.cms.hhs.gov',
           'User-Agent'=>'Ruby'
           }).
         to_return(status: 200, body: "", headers: {})

       ============================================================

Installing and Configuring VCR

First we’ll add the vcr gem to our Gemfile.

# Gemfile

group :development, :test do
  gem 'vcr'
end

Next we’ll add the following config file. I’ve added annotations so you can understand what each line is.

# spec/support/vcr.rb

VCR.configure do |c|
  # This is the directory where VCR will store its "cassettes", i.e. its
  # recorded HTTP interactions.
  c.cassette_library_dir = "spec/cassettes"

  # This line makes it so VCR and WebMock know how to talk to each other.
  c.hook_into :webmock

  # This line makes VCR ignore requests to localhost. This is necessary
  # even if WebMock's allow_localhost is set to true.
  c.ignore_localhost = true

  # ChromeDriver will make requests to chromedriver.storage.googleapis.com
  # to (I believe) check for updates. These requests will just show up as
  # noise in our cassettes unless we tell VCR to ignore these requests.
  c.ignore_hosts "chromedriver.storage.googleapis.com"
end

Adding VCR to our test

Now we can add VCR to our test by adding a block to it that starts with VCR.use_cassette "npi_search" do. The npi_search part is just arbitrary and tells VCR what to call our cassette.

# spec/system/npi_search_spec.rb

require "rails_helper"

RSpec.describe "NPI search", type: :system do
  it "shows the physician's NPI number" do
    VCR.use_cassette "npi_search" do # <---------------- add this
      visit new_npi_search_path
      fill_in "first_name", with: "joel"
      fill_in "last_name", with: "fuhrman"
      click_on "Search"

      expect(page).to have_content("1386765287")
    end
  end
end

Last time we ran this test it failed because WebMock was blocking its HTTP request. If we run the test now, it will pass, because VCR and WebMock together are allowing the HTTP request to happen.

If we look in the spec/cassettes directory after running this test, we’ll see that there’s a new file there called npi_search.yml. Its contents look like the following.

---
http_interactions:
- request:
    method: get
    uri: https://npiregistry.cms.hhs.gov/api/?address_purpose=&first_name=joel&last_name=fuhrman&version=2.1
    body:
      encoding: US-ASCII
      string: ''
    headers:
      Accept-Encoding:
      - gzip;q=1.0,deflate;q=0.6,identity;q=0.3
      Accept:
      - "*/*"
      User-Agent:
      - Ruby
      Host:
      - npiregistry.cms.hhs.gov
  response:
    status:
      code: 200
      message: OK
    headers:
      Date:
      - Sat, 17 Apr 2021 11:13:41 GMT
      Content-Type:
      - application/json
      Strict-Transport-Security:
      - max-age=31536000; includeSubDomains
      Set-Cookie:
      - TS017b4e40=01acfeb9489bd3c233ef0e8a55b458849e619bdc886c02193c4772ba662379fa1f8493887950c06233f28bbbaac373afba8b58b00f;
        Path=/; Domain=.npiregistry.cms.hhs.gov
      Transfer-Encoding:
      - chunked
    body:
      encoding: UTF-8
      string: '{"result_count":1, "results":[{"enumeration_type": "NPI-1", "number":
        1386765287, "last_updated_epoch": 1183852800, "created_epoch": 1175472000,
        "basic": {"name_prefix": "DR.", "first_name": "JOEL", "last_name": "FUHRMAN",
        "middle_name": "H", "credential": "MD", "sole_proprietor": "YES", "gender":
        "M", "enumeration_date": "2007-04-02", "last_updated": "2007-07-08", "status":
        "A", "name": "FUHRMAN JOEL"}, "other_names": [], "addresses": [{"country_code":
        "US", "country_name": "United States", "address_purpose": "LOCATION", "address_type":
        "DOM", "address_1": "4 WALTER E FORAN BLVD", "address_2": "SUITE 409", "city":
        "FLEMINGTON", "state": "NJ", "postal_code": "088224664", "telephone_number":
        "908-237-0200", "fax_number": "908-237-0210"}, {"country_code": "US", "country_name":
        "United States", "address_purpose": "MAILING", "address_type": "DOM", "address_1":
        "4 WALTER E FORAN BLVD", "address_2": "SUITE 409", "city": "FLEMINGTON", "state":
        "NJ", "postal_code": "088224664", "telephone_number": "908-237-0200", "fax_number":
        "908-237-0210"}], "taxonomies": [{"code": "207Q00000X", "desc": "Family Medicine",
        "primary": true, "state": "NJ", "license": "25MA05588600"}], "identifiers":
        []}]}'
  recorded_at: Sat, 17 Apr 2021 11:13:41 GMT
recorded_with: VCR 6.0.0

Each time this test is run, VCR will ask, “Is there a cassette called npi_search?” If not, VCR will allow the HTTP request to go out, and a new cassette will be recorded for that HTTP request. If there is an existing cassette called npi_search, VCR will block the HTTP request and just use the recorded cassette in its place.
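The record-then-replay decision described above can be sketched in a few lines of plain Ruby. To be clear, this is a toy and not how VCR is actually implemented. The ToyCassette class is made up for illustration and stores an arbitrary hash rather than a full HTTP interaction.

```ruby
require "yaml"
require "tmpdir"

# Toy recorder: only illustrates the record-then-replay idea.
class ToyCassette
  def initialize(dir, name)
    @path = File.join(dir, "#{name}.yml")
  end

  # Yields to make the "real" request only when no recording exists yet.
  def fetch
    return YAML.safe_load(File.read(@path)) if File.exist?(@path)

    response = yield
    File.write(@path, YAML.dump(response))
    response
  end
end

real_requests = 0
Dir.mktmpdir do |dir|
  cassette = ToyCassette.new(dir, "npi_search")

  # The first call records; the second call replays from the file.
  2.times do
    cassette.fetch do
      real_requests += 1
      { "status" => 200, "body" => "hello" }
    end
  end
end
# real_requests is 1 here: the second call never ran the block.
```

The second call returns the recorded response without running the block, which mirrors how a test backed by an existing cassette never reaches the real API.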

Takeaways

  • Tests should be deterministic. The passing or failing of a test should be determined by the content of the application code and nothing else.
  • We don’t want tests to be able to alter production data.
  • WebMock can police our app to stop it from making external network requests.
  • VCR can record our tests’ network interactions for playback later.

Rails model spec tutorial, part two

Prerequisites

If you’d like to follow along in this tutorial, I recommend first setting up a Rails application according to my “how I set up a Rails application for testing” post. (To make things easier, you could use Instant Rails, a tool I created for generating Rails applications.) It doesn’t matter what you call the application.

The tutorial

Learning objectives

Before we list the learning objectives for Part Two of the tutorial, let’s review the learning objectives for Part One.

  1. How to come up with test cases for a model based on the model’s desired behavior
  2. How to translate those test cases into actual working test code, in a methodical and repeatable manner
  3. How to use a test-first approach to make it easier both to write the tests and to write the application code

In Part One we dealt with plain old Ruby objects (POROs) rather than actual Rails models. Without having done that, it might be unclear where the testing principles stopped and the Rails-specific work began.

In Part Two of the tutorial we’ll layer on the Rails work so you can easily tell which parts are which.

The scenario

We’ll be working on the exact same scenario as Part One: normalizing messy phone numbers. We’ll even be using all the exact same test cases. The reason we’re keeping those things the same is to show the Rails-models-versus-POROs differences in sharp relief.

The PhoneNumber model

As mentioned in the Prerequisites section, you’ll need to create a fresh Rails application according to my “how I set up a Rails application for testing” post.

Once we’ve done that, we can generate a new model called PhoneNumber.

$ rails g model phone_number value:string
$ rails db:migrate

This will automatically generate a test file at spec/models/phone_number_spec.rb.

# spec/models/phone_number_spec.rb

require "rails_helper"

RSpec.describe PhoneNumber, type: :model do
  pending "add some examples to (or delete) #{__FILE__}"
end

Our first test case

Just like the first test case in Part One, the first test here will verify that a number like 555-856-8075 gets stripped down to 5558568075.

Unlike the test in Part One where instantiating a PhoneNumber object just involved doing PhoneNumber.new, we’ll be using Factory Bot in this test to create an instance of the PhoneNumber model. We could have gotten away with just doing PhoneNumber.new(value: "555-856-8075") instead of using Factory Bot in this instance, but using Factory Bot is so common in Rails model tests that I wanted to show an example using it.

# spec/models/phone_number_spec.rb

require "rails_helper"

RSpec.describe PhoneNumber, type: :model do
  context "phone number contains dashes" do
    it "strips out the dashes" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "555-856-8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end
end

If we run this test it will fail because we haven’t yet added any code to strip out dashes.

Failures:

  1) PhoneNumber phone number contains dashes strips out the dashes
     Failure/Error: expect(phone_number.value).to eq("5558568075")
     
       expected: "5558568075"
            got: "555-856-8075"
     
       (compared using ==)

Let’s strip out the non-numeric characters via a before_validation callback.

# app/models/phone_number.rb

class PhoneNumber < ApplicationRecord
  before_validation :strip_non_numeric_from_value

  def strip_non_numeric_from_value
    self.value = self.value.gsub(/\D/, "")
  end
end

If we run the test again, it passes.

The other two formats

Now let’s add a test scenario for the format of (555) 856-8075.

require "rails_helper"

RSpec.describe PhoneNumber, type: :model do
  context "phone number contains dashes" do
    it "strips out the dashes" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "555-856-8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end

  context "phone number contains parentheses" do
    it "strips out the non-numeric characters" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "(555) 856-8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end
end

If we run this test, we’ll see that it already passes, thanks to the before_validation callback we added above.

Let’s now add the final format, +1 555 856 8075.

require "rails_helper"

RSpec.describe PhoneNumber, type: :model do
  context "phone number contains dashes" do
    it "strips out the dashes" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "555-856-8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end

  context "phone number contains parentheses" do
    it "strips out the non-numeric characters" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "(555) 856-8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end

  context "phone number contains country code" do
    it "strips out the country code" do
      phone_number = FactoryBot.create(
        :phone_number,
        value: "+1 555 856 8075"
      )

      expect(phone_number.value).to eq("5558568075")
    end
  end
end

This one does not pass. Even though we’re stripping out non-numeric characters, we’re not stripping out country codes.

Failures:

  1) PhoneNumber phone number contains country code strips out the country code
     Failure/Error: expect(phone_number.value).to eq("5558568075")
     
       expected: "5558568075"
            got: "15558568075"
     
       (compared using ==)
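
The extra leading digit is easy to reproduce in plain Ruby, without Rails. As a quick sketch, stripping the non-digits from the third format leaves eleven digits, with the country code's 1 still at the front:

```ruby
# Stripping non-digits alone leaves the country code's "1" in place.
digits = "+1 555 856 8075".gsub(/\D/, "")

puts digits        # 15558568075
puts digits.length # 11
```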

We can make this test pass using the same exact code we used in Part One.

class PhoneNumber < ApplicationRecord
  before_validation :strip_non_numeric_from_value

  def strip_non_numeric_from_value
    self.value = self.value.gsub(/\D/, "")
      .split("")
      .last(10)
      .join
  end
end

And also just like in Part One, we don’t want the magic number of 10 sitting there. Let’s assign that number to a constant.

class PhoneNumber < ApplicationRecord
  EXPECTED_NUMBER_OF_DIGITS = 10

  before_validation :strip_non_numeric_from_value

  def strip_non_numeric_from_value
    self.value = self.value.gsub(/\D/, "")
      .split("")
      .last(EXPECTED_NUMBER_OF_DIGITS)
      .join
  end
end
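
The callback's logic can also be tried on its own, outside of Rails. The following is a minimal plain-Ruby sketch, assuming a hypothetical normalize_phone_number helper (it is not part of the tutorial's model) that applies the same steps to each of the three formats.

```ruby
# Plain-Ruby sketch of the callback's logic; no Rails required.
# normalize_phone_number is a hypothetical helper used only for illustration.
EXPECTED_NUMBER_OF_DIGITS = 10

def normalize_phone_number(raw)
  # Drop every non-digit, then keep only the trailing ten digits.
  # chars gives the same result as split("") in the model code.
  raw.gsub(/\D/, "")
     .chars
     .last(EXPECTED_NUMBER_OF_DIGITS)
     .join
end

["555-856-8075", "(555) 856-8075", "+1 555 856 8075"].each do |raw|
  puts normalize_phone_number(raw)
end
# Each of the three lines printed is 5558568075
```

One thing worth noticing about this approach: last returns the whole array when there are fewer than ten digits, so a seven-digit input such as 856-8075 would come out as 8568075 rather than being rejected.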

Refactoring

There’s a small way we can make our test a little tidier.

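Rather than repeatedly creating a new phone_number variable using FactoryBot.create, we can DRY up our code a little by putting the FactoryBot.create in a let! block at the beginning and then updating the phone number value for each test. Note that this relies on the phone_number factory supplying a default value, because the callback calls gsub on value and would raise a NoMethodError if value were nil.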

require "rails_helper"

RSpec.describe PhoneNumber, type: :model do
  let!(:phone_number) do
    FactoryBot.create(:phone_number)
  end

  context "phone number contains dashes" do
    before { phone_number.update!(value: "555-856-8075") }

    it "strips out the dashes" do
      expect(phone_number.value).to eq("5558568075")
    end
  end

  context "phone number contains parentheses" do
    before { phone_number.update!(value: "(555) 856-8075") }

    it "strips out the non-numeric characters" do
      expect(phone_number.value).to eq("5558568075")
    end
  end

  context "phone number contains country code" do
    before { phone_number.update!(value: "+1 555 856 8075") }

    it "strips out the country code" do
      expect(phone_number.value).to eq("5558568075")
    end
  end
end

Takeaways

Rails model tests can be written by coming up with a list of desired behaviors and translating that list into test code.

When learning how to write Rails model tests, it can be helpful to first do some tests with plain old Ruby objects (POROs) for practice.

Writing tests before we write the application code can make the process of writing the application code easier.