As a Rails application grows, its controllers tend to accumulate actions beyond the seven RESTful actions (index, show, new, edit, create, update and destroy). The more “custom” actions there are, the harder it can be to understand and work with the controller.
Here are three tactics I use to keep my Rails controllers organized.
First, a note about “skinny controllers, fat models”
The concept of “skinny controllers, fat models” is well-known in the Rails community at this point. For the sake of people who are new to Rails, I want to mention that one good way to keep controllers small is to put complex business logic in models rather than controllers. For more on this topic, I’d suggest my posts “What is a Rails model?” and (since service objects are a common but, I think, misguided recommendation) “How I code without service objects”, as well as the original skinny controllers/fat models post by Jamis Buck.
But even if you put as much business logic as possible in models rather than controllers, you’re still left with some challenges regarding overly large controllers. Here are the main three tactics I use to address these challenges.
Tactic 1: extracting a “hidden resource”
Sometimes, when a controller collects too many custom actions, it’s a sign that there’s some “hidden resource” that’s waiting to be identified and abstracted.
For example, in the application that I work on at work, we have an Active Record model called Message which exists so that internal employees who use the application can message each other. At one point we added the concept of a PTO request, which under the hood is really just a Message, but created through a different UI than regular messages.
We could have put these PTO-request-related actions right inside the main MessagesController but that would have made MessagesController too big and muddy. Instead of just being about regular messages, MessagesController would contain some code about regular messages, some code about PTO requests, and some code that relates to both things. So we didn’t want to do that.
Instead, we created a separate controller called PTORequestsController. Even though we decided to have a resource called a PTO request, we didn’t create a separate Active Record model for that. The PTORequestsController just uses the Message model. Here’s what the controller looks like.
module Messaging
  class PTORequestsController < ApplicationController
    def new
      @message = Message.new
    end

    def create
      @message = Message.new(
        message_params.merge(
          sender: current_user,
          recipients: [User.practice_administrator],
          # Build the body from the submitted params; @message isn't assigned yet at this point.
          body: "PTO request from #{current_user.first_name} #{current_user.last_name}\n\n#{message_params[:body]}"
        )
      )

      if @message.save
        @message.send_any_email_notifications(@message)
        redirect_to submitted_messaging_pto_requests_path
      else
        render :new
      end
    end

    def submitted
    end

    private

    def message_params
      params.require(:messaging_message).permit(:body)
    end
  end
end
Sometimes, like in this case, the “hidden resource” is pretty obvious from the start and so the original controller can just be left untouched. Other times (and probably more commonly), the original controller slowly grows over time and then a “last straw” moment prompts us to identify a hidden resource and move that resource to a new controller. Sometimes it’s easy to identify a hidden resource and sometimes it’s not.
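For context, here is a sketch of the routing that a controller like this implies. It is an assumption inferred from the submitted_messaging_pto_requests_path helper used above, not a copy of our actual routes file. The acronym inflection is likewise an assumption: Rails needs something like it in order to map pto_requests to PTORequestsController rather than PtoRequestsController.

```ruby
# config/initializers/inflections.rb (assumed)
ActiveSupport::Inflector.inflections(:en) do |inflect|
  inflect.acronym "PTO"
end

# config/routes.rb (assumed)
Rails.application.routes.draw do
  namespace :messaging do
    # No separate model: this resource is backed by the Message model.
    resources :pto_requests, only: %i[new create] do
      get :submitted, on: :collection
    end
  end
end
```

The point of the sketch is that the new resource exists only at the routing and controller level. Nothing about the database or the Message model has to change.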
Tactic 2: same resource, different “lenses”
To review the “hidden resource” example, we had one resource (messages) and then we added a new resource (PTO requests). Making the distinction between “regular messages” and PTO requests allowed us to think about and work with the two resources separately and keep their code in separate places, lowering the overall cognitive cost of the code.
This second tactic applies to a different scenario. Sometimes we don’t want to have a different resource but rather we want to treat the same resource differently. I’ll give another example from the application that I work on at work.
In this application, which is an application for running a medical clinic, we have the concept of an appointment, which can have different meanings in different contexts. For example, in a scheduling context, we care about what time the appointment is for. In a clinical context, we care about the notes the doctor makes regarding the patient’s condition. In a billing context, we care about the charges associated with the appointment.
Early in the application’s life, it was okay for there just to be one single AppointmentsController. But over time AppointmentsController started to get cluttered and harder to understand. So we added a couple new controllers, Billing::AppointmentsController and Chart::AppointmentsController, so that each of these concerns could be dealt with separately. As I’m writing this post, I even realize that it would probably be smart for us to rename AppointmentsController to Schedule::AppointmentsController because almost everything that’s in AppointmentsController is related to scheduling.
Unlike the case of messages and PTO requests, the idea here is not to come up with a new resource but rather to look at the same resource through different lenses. There’s no separate model called Billing::Appointment or Chart::Appointment. Both Billing::AppointmentsController and Chart::AppointmentsController work with the same appointment model. The benefit comes from being able to have separate places to deal with separate contexts for the same model.
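Here is a sketch of how the routing for these lenses might look. The namespaces mirror the controller names mentioned above, but the specific actions listed for each one are assumptions chosen for illustration.

```ruby
# config/routes.rb (assumed): one appointment model, three controllers
Rails.application.routes.draw do
  # AppointmentsController: the scheduling lens
  resources :appointments

  # Billing::AppointmentsController: the billing lens
  namespace :billing do
    resources :appointments, only: %i[index show]
  end

  # Chart::AppointmentsController: the clinical lens
  namespace :chart do
    resources :appointments, only: %i[show update]
  end
end
```

Each controller would load the same records and differ only in which aspect of the appointment it deals with.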
Tactic 3: dealing with collections separately
This one is a simple one but useful and common enough to be worth mentioning. Sometimes I end up with controllers with actions like bulk_create, bulk_update, etc. in addition to the regular create and update actions. In this case I often create a “collections” controller.
For example, in my application I have a Billing::InsurancePaymentsController and also a Billing::InsurancePaymentCollectionsController. Here’s what the latter controller looks like.
module Billing
  class InsurancePaymentCollectionsController < ApplicationController
    before_action { authorize %i(billing insurance_payment) }

    def create
      @insurance_deposit = InsuranceDeposit.find_by(id: params[:insurance_deposit_id])

      if params[:file].present?
        RemittanceFileParser.new(
          content: params[:file].read,
          insurance_deposit: @insurance_deposit
        ).perform

        redirect_to new_billing_insurance_deposit_insurance_deposit_reconciliation_path(
          insurance_deposit_id: @insurance_deposit.id
        )
      else
        redirect_to request.referer
      end
    end

    def destroy
      InsurancePayment.where(id: params[:ids]).destroy_all
      redirect_to request.referer
    end
  end
end
Takeaways
Controllers often have a tendency to grow over time. When they do, they usually become hard to understand.
It’s helpful to put as much business logic as possible in models rather than controllers.
A large controller can sometimes be made smaller by extracting a “hidden resource” that uses the same Active Record model but clothed in a different idea.
Another way that large controllers can sometimes be broken up is to think about looking at the same resource through different lenses.
It can also sometimes be helpful to deal with “bulk actions” in a separate controller.
I’ve decided to rename my podcast, formerly known as Rails with Jason, to Code with Jason.
Such a pivotal event in world history of course calls for some commentary. Here’s what you can expect to be different on the show, what you can expect to stay the same, and why I’ve decided to make this change.
What will change
As the name change implies, the scope of the podcast will no longer be limited to just Rails. This is actually less of a change of direction and more of an acknowledgement of existing reality. It’s already the case that probably half or more of my content is not Rails-specific.
One consequence of this change that I’m excited about is that I’ll be able to dramatically broaden the palette of guests that I can have on the show. When the podcast was explicitly Rails-focused, I had reservations about inviting “big names” from outside the Rails world on the show because I didn’t imagine that being on a Rails podcast would necessarily fit into their plans. Now that that limit is gone, I’ll be much more comfortable inviting any guest at all.
What will stay the same
The format of the show will still be the exact same. I’ll still have guests on for (hopefully) interesting technical conversations.
I’m still going to talk about Ruby on Rails. I myself am still a Rails developer and I expect to remain so indefinitely. If you’re a Rails developer and you’re wondering if the show will still be relevant to you, my hope and expectation is that it will.
Lastly, I’ll of course continue to charm you with my hilarious jokes and dazzle you with my towering intellect. As if I could stop if I wanted to.
Why I’m making this change
My motivation for expanding the scope of my podcast is pretty simple. I want to reach more people and broaden the possibilities for what the show can be. The more programmers I can help and the more I can help them, and the broader the field of topics that can be explored, the more fun and worthwhile of an endeavor this will be for me.
If you’re a listener of the podcast, thanks for listening so far. I hope you’ll join me in this next chapter of the show.
The way I look at productivity is this: if I have to be at work all day, I might as well get as much done in that chunk of time as I can.
It’s not about running myself ragged. It’s not about “hustling”. It’s simply about not being wasteful with my time.
In fact, my goal is to accomplish more than average developers do, but with less time and effort.
Why bother being productive? Because the more work I’m able to get done, the faster I’ll be able to learn, the smarter I’ll get, the better my job will be, the better my career will be, the more money I’ll earn, and so on. And if I can achieve all that at the same time as reducing effort, then it’s an obvious win.
Here are some high-level productivity tips I’ve learned over the years.
Start your day early
The most helpful productivity tip I’ve learned in life sounds dumb but it actually works: get up early and go to bed early. Ben Franklin was right about that one.
I’m not sure why this tactic works. One hypothesis I have is that if you start your day earlier than other people, then you’re starting your day before other people start interrupting you.
In any case, I don’t care why it works, I just care that it works.
Start real work right away
The first hour of the day has been called the rudder of the day. I’ve found that if my first hour is a productive one, then the rest of the day will be productive also, all the way through. Conversely, if my first hour is lazy and unfocused, the entire rest of the day will be lazy and unfocused as well.
Don’t start your day by checking email, reading news, scrolling Twitter, or anything like that. These activities can cause a state that Mihaly Csikszentmihalyi calls “psychic entropy”. In other words, those activities scramble your brain.
Instead, start doing real work, immediately when you sit down at your computer. It sets a good tone for the day.
Start with something highly tractable
The more ambiguous or open-ended a task is, the harder it will be to get straight to work on that task, and the harder it will be to get that first hour of solid productivity under your belt. So don’t start with a task like that. Start with a task that’s sharply defined.
This isn’t necessarily easy. In order to have a sharply defined task, someone (possibly you) has to have done some work in advance. In the best case, you have a fine-grained to-do list laid out, which originated from a small and crisp user story, which is part of a clearly-defined project.
We’ll talk more in a moment about how you can ensure that you always have highly tractable work to start your day with. First, a few more “meta” tips.
When possible, exercise before work
On the days when I exercise before work, I often feel sharper and more alert than when I don’t. I don’t drink coffee on a daily basis anymore, but when I used to, I would feel less of a need for coffee in the morning on the days when I exercised beforehand.
Since I hate wasting time, I don’t like to go to the gym, which seems like a really time-inefficient way to exercise. Instead I like to ride my bike to the office. I also have some dumbbells in my office so I can lift weights throughout the day. Obviously, you can do whatever type of exercise works for you.
Don’t eat too much
Eating too much can be an energy-killer. I’ve found that eating too much can make me tired, lazy, and just put me in a worse mood. And of course, eating too much habitually can make you fat. That’s obviously bad.
I used to go out to eat with co-workers for lunch almost every day. Then, around 2pm, the greasy tacos or burger and fries I ate for lunch would catch up to me and I would get so sleepy that I would want to die.
Later in life I started packing a lunch instead of going out to eat. Restaurants always give you too much food. Homemade food is usually not as bad for you as restaurant food. I didn’t feel as bad after a packed lunch as I used to from restaurant food, although I would still get tired a lot, especially if my lunch happened to be leftovers from a heavy dinner like meatloaf. I typically felt pretty good after lunch if I ate a salad instead.
Here’s what I do today. You might think this is crazy, but it actually works out really well for me. I just don’t eat anything until dinnertime. In other words, I skip both breakfast and lunch. I’ve found that I don’t get nearly as hungry as I would expect. Ironically, I find myself much less distracted by hunger throughout the day than I used to when I used to eat lunch. I also feel much more alert than when I used to eat food during the day. Plus, as you might expect, I’ve lost some weight as an added bonus.
Don’t drink too much caffeine
With caffeine, the productivity highs are higher but the lows are lower, at least for me. I personally seem to be especially susceptible to the lows of caffeine. When I switched from coffee to tea (black tea, which still contains caffeine but much less than coffee), I noticed that I slept better and felt better during the day. As a result of the fact that I’m not jacked up half the time and lethargic half the time, I feel like I’m smarter on average than when I used to ride the caffeine rollercoaster.
In my experience, the biggest drawback of caffeine is that it negatively affects my sleep.
When possible, keep email, Slack, Twitter, etc. closed
Distractions and interruptions are obviously bad for productivity. Keep these things closed when you can. Despite how obvious this advice sounds, my perception is that a lot of people don’t follow it.
Keep your browser tabs to a minimum
Each browser tab you have open has costs. First, a browser tab costs attention. When you have a tab open, you’re assigning a little bit of “mental RAM” to that tab. That’s a little bit of precious mental RAM that can’t be used for something else, something more useful.
A browser tab often also costs time. I wish I had a dollar for every time I was sitting with a student or co-worker and they clicked through their various tabs, trying to find the tab they were interested in. So wasteful.
Instead of keeping a bunch of browser tabs open on the off chance that you’ll need to get back to their contents at some point, just close them. The net cost of re-finding any content you need later is way less than the net cost of always keeping a bunch of browser tabs open.
Work on one thing at a time
The fastest way to get a bunch of things done is to work on one thing at a time.
When you pause task A in order to work on task B, you’re giving yourself an opportunity to forget the details of task A. Then, when you resume task A, you have to refamiliarize yourself with the details of that task. That’s a waste. You could have just loaded those details into your head once instead of twice.
Of course, it’s not always possible to keep on a task until it’s all the way done. Sometimes you have to pause to wait for feedback or for a long-running command, for example.
In these situations I’ve found a way to mitigate the context-switching costs. If I have to switch to a different task, then instead of switching to a different programming task, I’ll work on something that’s entirely different in nature. For example, if I expect to have to wait just a few minutes or up to an hour, I might use that time to read a programming book or work on a blog post. If the task I switch to is totally different, it doesn’t compete for headspace the way a similar task would.
If for some reason I have to set down a task for a few hours or more, then I usually just suck it up and switch to a different programming task. But that’s a last resort, not a Plan A. And the need to switch tasks can be minimized by using good development practices like small, crisp user stories that are shovel-ready by the time developers start to work on them.
Practice automated testing
The productivity benefits of automated testing are numerous. I’ll list some of them.
First and most obvious, writing automated tests saves you from having to do as much manual testing. Now that I know about testing, I dread the idea of coding a feature the old way, where I have to perform a series of manual testing actions after each change I make.
Second, testing forces you to articulate and justify every piece of code you add. Testing makes it harder to violate YAGNI (“you aren’t gonna need it”). YAGNI violations are pure waste.
Third, testing often helps you to think of all the use cases you need to exercise for your feature. When writing tests, it actually becomes fun to try to exhaustively list all the scenarios under which your feature could possibly fail.
Fourth, the test suite that results from testing helps protect against regressions. Regressions cost time.
Lastly, testing has a tendency to improve the understandability of code. The reason is that code that’s easy to test often takes the form of small and loosely-coupled classes and methods. It also just so happens that small and loosely-coupled classes and methods are easier to understand than large and interwoven classes and methods.
Try hard to write code that’s easy to understand
As Bob Martin has said, “the only way to go fast is to go well”. After all, the reason we call good code “good code” is because good code is faster and less expensive to work with than bad code.
Remember not to fall into the fallacy that you can gain speed by cutting corners. Every extra hour that you spend doing worthwhile refactoring (key word “worthwhile”) saves three hours of future confusion.
Always know what you’re working on
One fairly guaranteed way to not accomplish much is to not even know what you’re trying to accomplish.
The best way of keeping track of what you’re working on is to write down what you’re trying to achieve. This can be a written statement in an automated test (another benefit of testing) or just a note in a note-taking program or even a note on a piece of paper.
The more specific your objective is, the easier it will be to accomplish. Half the difficulty in doing any piece of work is determining exactly what needs to be done.
Keep a to-do list
Mental RAM is a precious resource. It’s wasteful to use up your mental RAM by trying to remember all the things you have to do. Instead, write those things down.
This is another practice that sounds simple and obvious but is often not followed.
End the day with a plan for tomorrow
The ideal morning is one where you can sit down and immediately begin working. If you have to begin your day by making a difficult decision—the decision of what to work on among the infinite possibilities in front of you—then your morning is probably going to go worse, and your day will probably go worse as a result.
So, each day, try to make at least a vague note for the next day to remind yourself what you want to work on tomorrow. If you don’t want to box yourself in, remind yourself that you can always change your mind.
End the day with a deliberate loose end
I used to prefer to stop working when I reached a good “stopping point”. These days I deliberately avoid stopping at a good stopping point.
Instead, I leave a loose end that I can pick up on the next day. One tactic I like to use is to write a failing test so that my obvious first task for the next morning is to get that test to pass.
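As a sketch of what such a loose end might look like, here is a Minitest test for a class that doesn’t exist yet. InvoiceTotal, its constructor arguments and with_tax_cents are all hypothetical names invented for this example.

```ruby
require "minitest"

# Hypothetical loose end: the last thing written before going home.
# InvoiceTotal is not defined anywhere yet, so this test cannot pass
# until tomorrow's first piece of work is done.
class InvoiceTotalTest < Minitest::Test
  def test_total_includes_tax
    total = InvoiceTotal.new(subtotal_cents: 10_000, tax_percent: 8)

    assert_equal 10_800, total.with_tax_cents
  end
end
```

Running the file with ruby -r minitest/autorun in front of the file name executes the test. The failure it reports is the first task for the next morning.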
Takeaways
If you have to be at work all day, might as well get as much done as possible.
Get up early and go to bed early. For some reason it helps you accomplish more, even if you work the same amount of time.
When you sit down in the morning, start real work right away. Don’t start with email, news or social media.
Start with a piece of work that’s tractable rather than something ambiguous. This will help you gain momentum faster.
When possible, exercise before work. It will probably increase your cognitive capabilities for the day.
Don’t eat too much during the day. Especially avoid “heavy” foods. Eating too much can kill your mood, energy and cognitive abilities.
Don’t drink too much caffeine. Caffeine provides a “local high” but in my experience the drawbacks of too much caffeine, including especially the negative impact on sleep, make high caffeine intake not worth it.
Keep email, Slack, Twitter, etc. closed.
Keep your browser tabs to a minimum. They’re not worth what they cost.
The fastest way to get a bunch of things done is to work on one thing at a time.
Practice test-driven development. TDD helps protect from regressions, helps improve the understandability of your code, and helps keep you focused.
Try to write code that’s easy to understand. It’s faster to work with easy-to-understand code than hard-to-understand code. Even though it can take more time and effort to write clean code than messy code, the net effect is a great time savings.
Always know what you’re working on. You’re unlikely to accomplish much when you don’t even know what you’re trying to accomplish.
Keep a to-do list rather than trying to hold all your to-dos in your head, wasting precious “mental RAM”.
End the day with a plan for tomorrow so that you don’t have to spend the first part of tomorrow figuring out what you’re going to do.
End the day with a deliberate loose end so that it’s easy to hit the ground running the next day.
When you look at a Factory Bot factory definition, the syntax might look somewhat mysterious. Here’s an example of such a factory.
FactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end
The goal of this tutorial is to demystify this syntax. The way we’ll do this is to write our own implementation of Factory Bot from scratch. Or, more precisely, we’ll write an implementation of something that behaves indistinguishably from Factory Bot for a few narrow use cases.
Concepts we’ll learn about
Blocks
Factory Bot syntax makes heavy use of blocks. In this post we’ll learn a little bit about how blocks work.
Message sending
We’ll learn the nuanced distinction between calling a method on an object and sending a message to an object.
method_missing
We’ll learn how to use Ruby’s method_missing feature so that we can define methods for objects dynamically.
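As a preview, here is a minimal sketch of method_missing on its own. It is not the implementation this tutorial builds, and AttributeRecorder is a hypothetical class invented for the example. It treats any unknown message that comes with a block as an attribute definition.

```ruby
# Hypothetical class, not part of Factory Bot: it records any message it
# doesn't already understand, along with the value of the block passed to it.
class AttributeRecorder
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  # Ruby calls method_missing when no method matches the message name.
  def method_missing(name, *args, &block)
    return super unless block # without a block, fall back to the usual NoMethodError

    @attributes[name] = block.call
  end

  # Any name could be an attribute, so report that we respond to everything.
  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

recorder = AttributeRecorder.new
recorder.first_name { "John" }
recorder.last_name { "Smith" }

puts recorder.attributes[:first_name] # prints John
puts recorder.attributes[:last_name]  # prints Smith
```

Notice that first_name and last_name are never defined with def. The object decides at runtime how to respond to them.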
Our objective
Our goal with this post will be to write some code that makes the factory below actually work. Notice that this factory is indistinguishable from a Factory Bot factory except for the fact that it starts with MyFactoryBot rather than FactoryBot.
MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end
In addition to this factory code, we’ll also need some code that exercises the factory.
Exercising the factory
Here’s some code that will exercise our factory to make sure it actually works. Just like how our factory definition mirrors an actual Factory Bot factory definition, the below code mirrors how we would use a Factory Bot factory.
user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
After we get our factory working properly, the above code should produce the following output.
First name: John
Last name: Smith
Let’s get started.
How to follow along
If you’d like to code along with this tutorial, I have a sample project that you can use.
The code in this tutorial depends on a certain Rails project, so I’ve created a GitHub repo at https://github.com/jasonswett/my_factory_bot where there’s a Dockerized Rails app, the same exact Rails app that I used to write this tutorial. You’ll find instructions to set up the project on the GitHub page.
The approach
We’ll be writing our “Factory Bot clone” code using a (silly) development methodology that I call “error-driven development”. Error-driven development works like this: you write a piece of code and try to run it. If you get an error when you run the code, you write just enough code to fix that particular error, and nothing more. You repeat this process until you have the result you want.
The reason I sometimes like to code this way is that it prevents me from writing any code that hasn’t been (manually) tested. Surprisingly enough, this “methodology” actually works pretty well.
Building the factory
The first thing to do is to create a file called my_factory_bot.rb and put it at the Rails project root.
MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Then we’ll run the code like this:
$ rails runner my_factory_bot.rb
The first thing we’ll see is an error saying that MyFactoryBot is not defined.
uninitialized constant MyFactoryBot (NameError)
This is of course true. We haven’t yet defined something called MyFactoryBot. So, in the spirit of practicing “error-driven development”, let’s write enough code to make this particular error go away and nothing more.
class MyFactoryBot
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Now, if we run the code again (using the same command as above), we get a different error.
undefined method `define' for MyFactoryBot:Class (NoMethodError)
This is also true. The MyFactoryBot class doesn’t have a method called define. So let’s define it.
class MyFactoryBot
  def self.define
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Now we get a new error.
undefined method `create' for MyFactoryBot:Class (NoMethodError)
This of course comes from the user = MyFactoryBot.create(:user) line. Let’s define a create method in order to make this error go away. Since we’re passing in an argument, :user, when we call create, we’ll need to specify a parameter for the create method. I’m calling the parameter model_sym since it’s a symbol that corresponds to the model that the factory is targeting.
class MyFactoryBot
  def self.define
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Now we get an error for the next line.
undefined method `first_name' for nil:NilClass (NoMethodError)
We’ll deal with this error, but not just yet, because this one will be a little bit tricky, and there are certain other things that will make sense to do first. Let’s temporarily comment out the lines that call first_name and last_name.
class MyFactoryBot
  def self.define
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
Now, if we run the file again, we get no errors.
Making it so the factory block gets called
Right now, the block passed to MyFactoryBot.define isn’t getting used at all. Let’s add block.call to the define method so that the block gets called.
class MyFactoryBot
  def self.define(&block)
    block.call
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
Now when we run the file we get the following error.
undefined method `factory' for main:Object (NoMethodError)
It makes sense that we would get an error that says “undefined method factory”. We of course haven’t defined any method called factory.
The receiver for the “factory” method
Notice how the error message says undefined method `factory' for main:Object. What’s the main:Object part all about?
Sending a message vs. calling a method
In Ruby you’ll often hear people talk about “sending a message to an object” rather than “calling a method on an object”. The distinction between these things is subtle but significant.
Suppose a is an array of five nil values (created with, say, Array.new(5)). The a variable is an instance of the Array class. When we do a.to_s, we’re sending the to_s message to the a object. The a object will happily respond to the to_s message and return a stringified version of the array: "[nil, nil, nil, nil, nil]"
The a object does not respond to (for example) the to_i method. If we send to_i to a, we get an error:
undefined method `to_i' for [nil, nil, nil, nil, nil]:Array
Notice the format of the last part of the error message. It follows the pattern “value of receiving object:class of receiving object”, which here is [nil, nil, nil, nil, nil]:Array.
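The idea can be sketched in a few lines of plain Ruby. The way the array is constructed here (Array.new(5)) is an assumption. Since the exact wording of the error message varies between Ruby versions, the sketch inspects the exception object rather than its text.

```ruby
a = Array.new(5) # five nil elements (assumed construction of the array)

# Dot syntax and an explicit send deliver the same message to the same receiver.
puts a.to_s        # prints [nil, nil, nil, nil, nil]
puts a.send(:to_s) # prints [nil, nil, nil, nil, nil]

# The receiving object decides whether it can respond to a given message.
puts a.respond_to?(:to_s) # prints true
puts a.respond_to?(:to_i) # prints false

begin
  a.to_i
rescue NoMethodError => e
  puts e.name               # prints to_i (the message that was sent)
  puts e.receiver.equal?(a) # prints true (the object it was sent to)
end
```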
Understanding main:Object
In Ruby, every message that gets passed has a receiver, even when it doesn’t seem like there would be. When we do e.g. a.to_s, the receiver is obvious: it’s a. What about when we just call e.g. puts?
When we send a message that doesn’t explicitly have a receiver object, the receiver is self, which at the top level of a program is a special object called main. That’s why when we call factory we get an error message that says undefined method `factory' for main:Object. The interpreter sends the factory message to main because we’re not explicitly specifying any other receiver object.
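Here is a small sketch for poking at main directly. It uses TOPLEVEL_BINDING.receiver, which returns the object that acts as self at the top level of a program.

```ruby
# The object that is self at the top level of a Ruby program.
main = TOPLEVEL_BINDING.receiver

puts main.inspect # prints main
puts main.class   # prints Object

# puts is a private method that every object gets from Kernel, which is why
# a bare "puts" works at the top level without an explicit receiver.
puts Kernel.private_instance_methods.include?(:puts) # prints true
```

This is where the main:Object in the error message comes from: main is the receiver’s value and Object is its class.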
Changing the receiver of the factory message
If we want our program to work, we’re going to have to change the receiver of factory from main to something else. If the receiver were just main, then our factory method would have to just be defined out in the open, not as part of any object. If the factory method is not defined as part of any object, then it can’t easily share any data with any object, and we’ll have a pretty tough time.
We can change the receiver of the factory message by using a Ruby method called instance_exec.
class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
The instance_exec method will execute our block in the context of self, which inside a class method like define is the MyFactoryBot class itself. We can see that now, when we run our file, our error message has changed. The receiver object is no longer main. It’s now MyFactoryBot.
undefined method `factory' for MyFactoryBot:Class (NoMethodError)
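To see the difference between block.call and instance_exec in isolation, here is a small sketch. ReceiverDemo is a hypothetical class that exists only for this comparison.

```ruby
# Hypothetical class used only to compare the two ways of running a block.
class ReceiverDemo
  def self.with_call(&block)
    block.call # self inside the block stays whatever it was where the block was written
  end

  def self.with_instance_exec(&block)
    instance_exec(&block) # self inside the block becomes ReceiverDemo
  end
end

outer_self = self
whoami = proc { self }

puts ReceiverDemo.with_call(&whoami).equal?(outer_self)            # prints true
puts ReceiverDemo.with_instance_exec(&whoami).equal?(ReceiverDemo) # prints true
```

The block is the same in both calls. Only the object that receives the messages sent inside it changes.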
Let’s add the factory method to MyFactoryBot.
class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
Now we don’t get any errors.
We know we’ll also need to invoke the block inside of the factory call, so let’s use another instance_exec inside of the factory method to do that.
class MyFactoryBot
  def self.define(&block)
    instance_exec(&block)
  end

  def self.factory(model_sym, &block)
    instance_exec(&block)
  end

  def self.create(model_sym)
  end
end

MyFactoryBot.define do
  factory :user do
    first_name { "John" }
    last_name { "Smith" }
  end
end

user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
Now it’s complaining, unsurprisingly, that there’s no method called first_name.
undefined method `first_name' for MyFactoryBot:Class (NoMethodError)
Adding the first_name and last_name methods
We of course need to define methods called first_name and last_name somewhere. We could conceivably add them on the MyFactoryBot class, but it would probably work out better to have a separate instance for each factory that we define, since of course a real application will have way more than just one factory.
Let’s make it so that the factory method creates an instance of a new class called MyFactory and then invokes the block on MyFactory.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
factory = MyFactory.new
factory.instance_exec(&block)
end
def self.create(model_sym)
end
end
class MyFactory
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
We of course still haven’t actually defined a method called first_name, so we still get an error about that, although the receiver is now MyFactory rather than MyFactoryBot.
undefined method `first_name' for #<MyFactory:0x0000ffff9a955ae8> (NoMethodError)
Let’s define a first_name and last_name method on MyFactory in order to make this error message go away.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
factory = MyFactory.new
factory.instance_exec(&block)
end
def self.create(model_sym)
end
end
class MyFactory
def first_name
end
def last_name
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
#puts "First name: #{user.first_name}"
#puts "Last name: #{user.last_name}"
Now we get no errors.
Providing values for first_name and last_name
Let’s now come back and uncomment the last two lines of the file, the lines that output values for first_name and last_name.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
factory = MyFactory.new
factory.instance_exec(&block)
end
def self.create(model_sym)
end
end
class MyFactory
def first_name
end
def last_name
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
We get an error.
undefined method `first_name' for nil:NilClass (NoMethodError)
Remember that every message in Ruby has a receiver. Apparently in this case the receiver for first_name when we do user.first_name is nil. In other words, user is nil. That’s obviously not going to work out. It does make sense, though, because MyFactoryBot.create has an empty body and so returns nil.
Let’s try making it so that MyFactoryBot.create returns our instance of MyFactory.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factory = MyFactory.new
@factory.instance_exec(&block)
end
def self.create(model_sym)
@factory
end
end
class MyFactory
def first_name
end
def last_name
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Now there are no errors, but there are also no values present in the output we see.
First name:
Last name:
Let’s have a closer look and see exactly what user is.
This was maybe a good intermediate step but we of course want user to be an instance of our Active Record User class, just like the real Factory Bot does it. Let’s change MyFactoryBot.create so that it returns an instance of User.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factory = MyFactory.new
@factory.instance_exec(&block)
end
def self.create(model_sym)
@factory.user
end
end
class MyFactory
attr_reader :user
def initialize
@user = User.new
end
def first_name
end
def last_name
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
This gives no errors but there are still no values present.
In order for the first_name and last_name methods to return values, let’s make it so each one calls the block that it’s given.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factory = MyFactory.new
@factory.instance_exec(&block)
end
def self.create(model_sym)
@factory.user
end
end
class MyFactory
attr_reader :user
def initialize
@user = User.new
end
def first_name(&block)
@user.first_name = block.call
end
def last_name(&block)
@user.last_name = block.call
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
Now we see “John” and “Smith” as our first and last name values.
User
First name: John
Last name: Smith
Generalizing the factory
What if we wanted to add another attribute, like the email line below? It wouldn’t work.
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
email { "john.smith@example.com" }
end
end
Obviously we can’t just have hard-coded first_name and last_name methods in our factory. We need to make it so our factory can respond to any messages that are sent to it (provided of course that those messages correspond to actual attributes on our Active Record models).
Let’s take the first step toward generalizing our methods. Instead of @user.first_name = block.call, we’ll do @user.send("first_name=", block.call), which is equivalent.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factory = MyFactory.new
@factory.instance_exec(&block)
end
def self.create(model_sym)
@factory.user
end
end
class MyFactory
attr_reader :user
def initialize
@user = User.new
end
def first_name(&block)
@user.send("first_name=", block.call)
end
def last_name(&block)
@user.send("last_name=", block.call)
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
We can go even further than this. Rather than having methods called first_name and last_name, we can use Ruby’s method_missing to dynamically respond to any message that gets sent.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factory = MyFactory.new
@factory.instance_exec(&block)
end
def self.create(model_sym)
@factory.user
end
end
class MyFactory
attr_reader :user
def initialize
@user = User.new
end
def method_missing(attr, *args, &block)
# If the message that's sent is e.g. first_name, then
# the value of attr will be :first_name, and the value
# of "#{attr}=" will be "first_name=".
@user.send("#{attr}=", block.call)
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
email { "john.smith@example.com" }
end
end
user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
puts "Email: #{user.email}"
If we run our file again, we can see that our method_missing code handles not only first_name and last_name but email as well.
User
First name: John
Last name: Smith
Email: john.smith@example.com
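One thing to be aware of with method_missing is that, as written above, it will try to handle every unknown message, including typos. Here’s a hedged sketch of a slightly more defensive variation. It isn’t what the real Factory Bot does, just one way to do it. StandInUser is a Struct I’m using in place of an Active Record model so the example runs without Rails, and GuardedFactory is a made-up name. The factory only handles a message if a block was given and the record has a matching setter. Otherwise it defers to super, which raises the usual NoMethodError. It also defines respond_to_missing?, which is the conventional companion to method_missing.

```ruby
# Stand-in for an Active Record model (an assumption for this sketch).
StandInUser = Struct.new(:first_name, :last_name, :email)

class GuardedFactory
  attr_reader :record

  def initialize(record)
    @record = record
  end

  def method_missing(attr, *args, &block)
    setter = "#{attr}="

    if block && @record.respond_to?(setter)
      @record.public_send(setter, block.call)
    else
      # Not an attribute we know about, so fall back to Ruby's
      # default behavior, which raises NoMethodError.
      super
    end
  end

  # Keeps respond_to? consistent with what method_missing handles.
  def respond_to_missing?(attr, include_private = false)
    @record.respond_to?("#{attr}=") || super
  end
end

factory = GuardedFactory.new(StandInUser.new)

factory.instance_exec do
  first_name { "John" }
  email { "john.smith@example.com" }
end

begin
  factory.instance_exec { nickname { "Johnny" } }
rescue NoMethodError => e
  error = e
end

puts factory.record.first_name # John
puts error.class               # NoMethodError
```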
More generalization
What if we want to have more factories besides just one for User? I happen to have a model in my Rails app called Website. What if I wanted to have a factory for that?
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
MyFactoryBot.define do
factory :website do
name { "Google" }
url { "www.google.com" }
end
end
Right now, it wouldn’t work because I have the User class hard-coded in my factory.
undefined method `name=' for #<User:0x0000ffff8f950d28> (NoMethodError)
Let’s make it so that rather than hard-coding User, we set the class dynamically. Let’s also make it so that the argument you pass to the factory method (e.g. :user or :website) retrieves the appropriate factory. We can accomplish this by putting our factories into a hash.
class MyFactoryBot
def self.define(&block)
instance_exec(&block)
end
def self.factory(model_sym, &block)
@factories ||= {}
@factories[model_sym] = MyFactory.new(model_sym)
@factories[model_sym].instance_exec(&block)
end
def self.create(model_sym)
@factories[model_sym].record
end
end
class MyFactory
attr_reader :record
def initialize(model_sym)
@record = model_sym.to_s.classify.constantize.new
end
def method_missing(attr, *args, &block)
@record.send("#{attr}=", block.call)
end
end
MyFactoryBot.define do
factory :user do
first_name { "John" }
last_name { "Smith" }
end
end
MyFactoryBot.define do
factory :website do
name { "Google" }
url { "www.google.com" }
end
end
user = MyFactoryBot.create(:user)
puts user.class.name
puts "First name: #{user.first_name}"
puts "Last name: #{user.last_name}"
puts
website = MyFactoryBot.create(:website)
puts website.class.name
puts "Name: #{website.name}"
puts "URL: #{website.url}"
Now, when we run our file, both factories work.
User
First name: John
Last name: Smith

Website
Name: Google
URL: www.google.com
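One difference from the real Factory Bot is worth pointing out. In the code above, the record is built once, when the factory is defined, so every call to create for a given factory returns that same object. Here’s a sketch of one possible variation that stores the block instead and builds a fresh record on each create. The names SketchFactoryBot and SketchFactory are made up, Website is a Struct standing in for the Active Record model so the example runs without Rails, and the split/capitalize/const_get line is a rough stand-in for classify.constantize that only handles simple names.

```ruby
# Stand-in for an Active Record model (an assumption for this sketch).
Website = Struct.new(:name, :url)

class SketchFactoryBot
  @definitions = {}

  def self.define(&block)
    instance_exec(&block)
  end

  # Store the block rather than running it right away.
  def self.factory(model_sym, &block)
    @definitions[model_sym] = block
  end

  # Build a new record every time, then apply the stored block to it.
  def self.create(model_sym)
    class_name = model_sym.to_s.split("_").map(&:capitalize).join
    builder = SketchFactory.new(Object.const_get(class_name).new)
    builder.instance_exec(&@definitions.fetch(model_sym))
    builder.record
  end
end

class SketchFactory
  attr_reader :record

  def initialize(record)
    @record = record
  end

  def method_missing(attr, *args, &block)
    return super unless block && @record.respond_to?("#{attr}=")

    @record.public_send("#{attr}=", block.call)
  end

  def respond_to_missing?(attr, include_private = false)
    @record.respond_to?("#{attr}=") || super
  end
end

SketchFactoryBot.define do
  factory :website do
    name { "Google" }
    url { "www.google.com" }
  end
end

first = SketchFactoryBot.create(:website)
second = SketchFactoryBot.create(:website)

puts first.name             # Google
puts first.equal?(second)   # false
```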
Takeaways
Factory Bot syntax (and other DSLs) aren’t magic. They’re just Ruby.
Blocks can be a powerful way to make your code more expressive and understandable.
In Ruby, it’s often more useful to talk about “sending messages to objects” rather than “calling methods on objects”.
Every message sent in Ruby has a receiver. When a receiver is not explicitly specified, the receiver is self, which at the top level of a program is a special object called main.
The method_missing method can allow your objects to respond to messages dynamically.
When I teach programming classes or pair program with colleagues, I often encounter people with a whole bunch of browser tabs open at once.
I want to explain to you why having too many tabs open is bad.
The cost of tabs
Tabs cost mental RAM
As we work, we’re always juggling thoughts in our head. I call this our “mental RAM”. It’s a finite resource.
Each tab you have open occupies some space not only on your browser screen but also in your brain. There’s a part of you that wants to remember “don’t forget, I had that one Stack Overflow answer open in a tab in case I need to go back to it”. You might not be consciously aware of it but that open tab is taking up part of your precious mental RAM, RAM that could and should be used to do actual work, but instead is being wasted on thinking about tabs.
Tabs cost wasted moves
I’ve often had the experience where I’m pairing with someone and their open tabs cause them to go from tab to tab and go “whoops, not that one”, “whoops, not that one”, “whoops…” Each time this happens, my blood pressure rises a little bit. Maybe if you didn’t have 36 tabs open you wouldn’t be wasting so much of my time and yours.
Why people keep tabs open
Presumably, the justification for keeping a tab open is that 1) you think you’ll need the tab’s contents later, and 2) you think it will cost more to re-find the contents of that tab than to keep the tab open.
You’ll need it later (probably wrong)
Most of the tabs people keep open are never actually needed later.
It’s too costly to re-find the tab’s contents later (probably wrong)
Even if you do need the tab’s contents later, it’s usually way cheaper just to re-open the website.
So, close your fucking tabs!
If you have any tabs open right now that aren’t directly related to what you’re currently working on, I invite you to close them. They’re probably costing you more than they’re worth.
Influenced by the experiences I’ve had over many years of building and maintaining Rails applications, combined with my experiences using other technologies, I’ve developed some ways of structuring Rails applications that have worked out pretty well for me.
Some of my organizational tactics follow conventional wisdom, like keeping controllers thin. Others are tactics I haven’t really seen in others’ applications but wish I did.
Here’s an overview of the topics I touch on in this post.
Controllers
Namespaces
Models
ViewComponents
The lib folder
Concerns
Background jobs
JavaScript
Tests
Service objects
How I think about Rails code organization in general
Let’s start with controllers.
Controllers
The most common type of controller in most Rails applications is a controller that’s based on an Active Record resource. For example, if there’s a customers database table then there will also be a Customer model class and a controller called CustomersController.
The problem
Controllers can start to get nasty when there get to be too many “custom” actions beyond the seven RESTful actions of index, new, create, edit, update, show and destroy.
Let’s say we have a CustomersController that, among other things, allows the user to send messages about a customer (by creating instances of a Message, let’s say). The relevant actions might be called new_message and create_message. This is maybe not that bad, but it clutters up the controller a little, and if you have enough custom actions on a controller then the controller can get pretty messy and hard to comprehend.
The solution
What I like to do in these scenarios is create a “custom” controller called e.g. CustomerMessagesController. There’s no database table called customer_messages or class called CustomerMessage. The concept of a “customer message” is just something I made up. But now that this idea exists, my CustomersController#new_message and CustomersController#create_message actions can become CustomerMessagesController#new and CustomerMessagesController#create. I find this much tidier.
And as long as I’m at it, I’ll even create a PORO (plain old Ruby object) called CustomerMessage where I can handle the business of creating a new customer message, so as not to clutter up either Customer or Message with this stuff, which is really not all that relevant to either of those classes. I might put a create or create! method on CustomerMessage which creates the appropriate Message for me.
Furthermore, I’ll also often put include ActiveModel::Model into my PORO so that I can bind the PORO to a form as though it were a regular old Active Record model.
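To make the PORO idea a little more concrete, here’s a rough sketch. It’s not code from my actual application. Customer and Message are Structs standing in for the Active Record models so that the example runs without Rails, and the subject line and the valid? rule are invented. In a real app, create would call something like Message.create!, and the class would include ActiveModel::Model instead of hand-rolling valid?.

```ruby
# Stand-ins for the Active Record models (assumptions for this sketch).
Customer = Struct.new(:name, keyword_init: true)
Message = Struct.new(:subject, :body, keyword_init: true)

# A PORO for the made-up "customer message" concept.
class CustomerMessage
  attr_reader :customer, :body

  def initialize(customer:, body:)
    @customer = customer
    @body = body
  end

  # A hand-rolled check; ActiveModel validations would replace this.
  def valid?
    !body.to_s.strip.empty?
  end

  # Builds the underlying Message, or returns nil if invalid.
  def create
    return nil unless valid?

    Message.new(subject: "Regarding #{customer.name}", body: body)
  end
end

customer = Customer.new(name: "Acme Corp")
message = CustomerMessage.new(customer: customer, body: "Invoice sent.").create

puts message.subject # Regarding Acme Corp
```

With something like this in place, CustomerMessagesController#create only has to instantiate a CustomerMessage and call create on it.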
Namespaces
Pieces of code are easier to understand when they don’t require you to also understand other pieces of code as a prerequisite. To use an extreme example to illustrate the point, it would obviously be impossible to understand a program so tangled with dependencies that understanding any of it required understanding all of it.
So, anything we can do to make small chunks of our programs understandable in isolation is typically going to make our program easier to work with.
Namespaces serve as a signal that certain parts of the application are more related to each other than they are to anything else. For example, in the application I work on at work, I have a namespace called Billing. I have another namespace called Schedule. A developer who’s new to the codebase could look at the Billing and Schedule namespaces and rightly assume that when they’re thinking about one, they can mostly ignore the other.
Contexts
Some of my models are sufficiently fundamental that it doesn’t make sense to put them into any particular namespace. I have a model called Appointment that’s like this. An Appointment is obviously a scheduling concern a lot of the time, but just as often it’s a clinical concern or a billing concern. An appointment can’t justifiably be “owned” by any one namespace.
This doesn’t mean I can’t still benefit from namespaces though. I have a controller called Billing::AppointmentsController which views appointments through a billing lens. I have another controller called Chart::AppointmentsController which views appointments through a clinical lens. For scheduling, we have two calendar views, one that shows one day at a time and one that shows one month at a time. So I have two controllers for that: Schedule::ByDayCalendar::AppointmentsController and Schedule::ByMonthCalendar::AppointmentsController. Imagine trying to cram all this stuff into a single AppointmentsController. This idea of having namespaced contexts for broad models has been very useful.
Models
I of course keep my models in app/models just like everybody else. What’s maybe a little less common is the way I conceive of models. I don’t just think of models as classes that inherit from ApplicationRecord. To me, a model is anything that models something.
So a lot of the models I keep in app/models are just POROs. According to a count I did while writing this post, I have 115 models in app/models that inherit from ApplicationRecord and 439 that don’t. So that’s about 20% Active Record models and 80% POROs.
ViewComponents
Thanks to the structural devices that Rails provides natively (controllers, models, views, concerns, etc.) combined with the structural devices I’ve imposed myself (namespaces, a custom model structure), I’ve found that most code in my Rails apps can easily be placed in a fitting home.
One exception to this for a long time for me was view-related logic. View-related logic is often too voluminous and detail-oriented to comfortably live in the view, but too tightly coupled with the DOM or other particulars of the view to comfortably live in a model, or anywhere else. The view-related code created a disturbance wherever it lived.
The solution I ultimately settled on for this problem is ViewComponents. In my experience, ViewComponents can provide a tidy way to package up a piece of non-trivial view-related logic in a way that allows me to maintain a consistent level of abstraction in both my views and my models.
The lib folder
My rough rule of thumb is that if a piece of code could conceivably be extracted into a gem and used in any application, I put it in lib. Things that end up in lib for me include custom form builders, custom API wrappers, custom generators and very general utility classes.
Concerns
In a post of DHH’s regarding concerns, he says “Concerns are also a helpful way of extracting a slice of model that doesn’t seem part of its essence”. I think that’s a great way to put it and that’s how I use concerns as well.
Like any programming device, concerns can be abused or used poorly. I sometimes come across criticisms of concerns, but to me what’s being criticized is not exactly concerns but bad concerns. If you’re interested in what those criticisms are and how I write concerns, I wrote a post about it here.
Background jobs
I keep my background job workers very thin, just like controllers. It’s my belief that workers shouldn’t do things, they should only call things. Background job workers are a mechanical device, not a code organization device.
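Here’s a small sketch of what “workers should only call things” can look like. All the names and the reminder behavior are invented for illustration. In a real app the worker would inherit from a job base class or include a job module from whatever background job library is in use. I’ve left that out so the example runs without any gems.

```ruby
# A plain object that does the actual work.
class OverdueInvoiceReminder
  def initialize(invoice_ids)
    @invoice_ids = invoice_ids
  end

  def deliver
    @invoice_ids.map { |id| "Reminder sent for invoice #{id}" }
  end
end

# The worker doesn't do anything itself. It only calls the object above.
class OverdueInvoiceReminderWorker
  def perform(invoice_ids)
    OverdueInvoiceReminder.new(invoice_ids).deliver
  end
end

results = OverdueInvoiceReminderWorker.new.perform([1, 2])
puts results
```

Because all the behavior lives in OverdueInvoiceReminder, it can be tested and reused without involving the job infrastructure at all.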
JavaScript
I use JavaScript as little as possible. Not because I particularly have anything against JavaScript, but because the less “dynamic” an application is, and the fewer technologies it involves, the easier I find it to understand.
When I do write JavaScript, I use a lot of POJOs (plain old JavaScript objects). I use Stimulus to help keep things organized. To test my JavaScript code, I exercise it using system specs. The way I see it, it’s immaterial from a testing perspective whether I implement my features using JavaScript or Ruby. System specs can exercise it all just fine.
Tests
Being “the Rails testing guy”, I of course write a lot of tests. I use RSpec not because I necessarily think it’s the best testing framework from a technical perspective but rather just to swim with the current. I practice TDD a lot of the time but not all the time. Most of my tests are model specs and system specs.
Service objects
Since service objects are apparently so popular these days, I feel compelled to mention that I don’t use service objects. Instead, I use regular old OOP. What a lot of people might model as procedural service object code, I model as declarative objects. I write more about this here.
How I think about Rails code organization in general
I see Rails as an amazingly powerful tool to save me from repetitive work via its conventions. Rails also provides really nice ways of organizing certain aspects of code: controllers, views, ORM, database connections, migrations, and many other things.
At the same time, the benefits that Rails provides have a limit. Once you find yourself past that limit (which inevitably does happen if you have a non-trivial application) you either need to provide some structure of your own or you’re likely going to end up with a mess. Specifically, once your model layer grows fairly large, Rails is no longer going to help you very much.
The way I’ve chosen to organize my model code is to use OOP. Object-oriented programming is obviously a huge topic and so I won’t try to convey here what I think OOP is all about. But I think if a Rails developer learns good OOP principles, and applies them to their Rails codebase, specifically in the model layer, then it can go a long way toward keeping a Rails app organized, perhaps more than anything else.
Getting a new developer set up with a Rails app (or any app) can be tedious. Part of the tedium is the chore of manually installing all the dependencies: Ruby, RVM, Rails, PostgreSQL, Redis, etc.
It would be really nice if that developer could just run a single command and have all that app’s dependencies installed on their computer rather than having to install dependencies manually.
This ideal is in fact possible if you fully Dockerize your app. If you want to, you can create a Docker setup that will make it so you don’t have to install anything manually: no Ruby, no Rails, no RVM, no PostgreSQL, no Redis, nothing. It’s very nice.
Fully Dockerizing has drawbacks
Unfortunately, fully Dockerizing a Rails app isn’t without trade-offs. When working with a Dockerized app, there’s a performance hit, there are some issues with using binding.pry, and running system specs/system tests in such a way that you can see them run in a browser is next to impossible.
None of these obstacles is insurmountable, but if you don’t want to deal with these issues, you can choose to Dockerize just some of your app’s dependencies instead of all of them.
Partial Dockerization
The Docker setup I use at work is a hybrid approach. I let Docker handle my PostgreSQL and Redis dependencies. I install all my other dependencies manually. This makes it so I don’t have to live with the downsides of full Dockerization but I still get to skip installing some of my dependencies. Any dependency I can skip is a win.
The example I’m going to show you shortly is an even simpler case. Rather than Dockerizing PostgreSQL and Redis, we’re only going to Dockerize PostgreSQL. I’m doing it this way in the interest of showing the simplest possible example.
Dockerizing for development vs. production
I want to add a note for clarity. The Docker setups I’ve been discussing so far are all development setups. There are two ways to Dockerize an app: for a development environment and for a production environment. Development environments and production environments of course have vastly different needs and so a different Docker setup is required for each. In a production environment we wouldn’t run PostgreSQL and a Rails server on the same machine. We’d have a separate database server instead. So I want to be clear that this Docker setup is for development only.
How to Dockerize your database
In order to Dockerize our database we’re going to use Docker Compose. Docker Compose is a tool that a) lets you specify and configure your development environment’s dependencies, b) installs those dependencies for you, and c) runs those dependencies.
Initializing the Rails app
Before we do anything Docker-related, let’s initialize a new Rails app that uses PostgreSQL.
$ rails new my_app -d postgresql
Adding a Docker Compose config file
Here’s the Docker Compose config file. It’s called docker-compose.yml and goes at the project root. This file, again, is what specifies our development environment’s dependencies. I’ve annotated the file to help you understand what’s what.
# docker-compose.yml
---
version: '3.8'
# The "services" directive lists all the services your
# app depends on. In this case there's only one: PostgreSQL.
services:
  # We give each service an arbitrary name. I've called
  # our PostgreSQL service "postgres".
  postgres:
    # Docker Hub hosts images of common services for
    # people to use. The postgres:13.1-alpine is an
    # image that uses the Alpine Linux distribution, a
    # very lightweight Linux distribution that people
    # often use when Dockerizing development environments.
    image: postgres:13.1-alpine
    # PostgreSQL has to put its data somewhere. Here
    # we're saying to put the data in /var/lib/postgresql/data.
    # The "delegated" part specifies the strategy for
    # syncing the container's data with our host machine.
    # (Another option would be "cached".)
    volumes:
      - postgresql:/var/lib/postgresql/data:delegated
    # This says to make our PostgreSQL service available
    # on port 5432.
    ports:
      - "127.0.0.1:5432:5432"
    # This section specifies any environment variables
    # that we want to exist on our Docker container.
    environment:
      # Use "my_app" as our PostgreSQL username.
      POSTGRES_USER: my_app
      # Set POSTGRES_HOST_AUTH_METHOD to "trust" to
      # allow passwordless authentication.
      POSTGRES_HOST_AUTH_METHOD: trust
volumes:
  postgresql:
  storage:
Next we’ll have to change config/database.yml ever so slightly in order to get it to be able to talk to our PostgreSQL container. We need to set the username to my_app and set the host to 127.0.0.1.
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  # This must match what POSTGRES_USER was set to in docker-compose.yml.
  username: my_app
  # This must be 127.0.0.1 and not localhost.
  host: 127.0.0.1
development:
  <<: *default
  database: my_app_development
test:
  <<: *default
  database: my_app_test
production:
  <<: *default
  database: my_app_production
  username: my_app
  password: <%= ENV['DB_PASSWORD'] %>
init.sql
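Docker can execute an init.sql file that lives at the project root, provided the file is mounted into the container’s /docker-entrypoint-initdb.d directory, which is where the official postgres image looks for initialization scripts the first time it sets up an empty data directory. It’s necessary to have an SQL script that creates a user called my_app or else PostgreSQL will give us an error saying (truthfully) that there’s no user called my_app.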
CREATE USER my_app SUPERUSER;
It’s very important that the init.sql file is in place before we proceed. If the init.sql file is not in place or not correct, it can be a difficult error to recover from.
Using the Dockerized database
Run docker-compose up to start the PostgreSQL service.
$ docker-compose up
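The container can take a few seconds before it starts accepting connections. If the next command fails with a connection error right after starting the service, one option is a tiny Ruby script that waits for the port to open. This is just a sketch using Ruby’s standard socket library. The method names port_open? and wait_for_port are my own, and an open port only tells you that something is listening, not that PostgreSQL has finished initializing.

```ruby
require "socket"

# Returns true if something accepts TCP connections on host:port.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

# Polls until the port opens or the attempts run out.
def wait_for_port(host, port, attempts: 30, delay: 1)
  attempts.times do
    return true if port_open?(host, port)

    sleep delay
  end
  false
end

# Usage, assuming the Compose file above, which publishes
# port 5432 on 127.0.0.1:
#   wait_for_port("127.0.0.1", 5432) or abort("PostgreSQL is not reachable")
```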
Now we can create the database.
$ rails db:create
As long as the creation completed successfully, we can connect to the database.
$ rails db
Now we’re connected to a PostgreSQL database without having had to actually install PostgreSQL.
If you’d like to keep your finger on the pulse in the Ruby world, here’s a list of Rubyists you can follow. The list is in alphabetical order by last name.
VCR and WebMock are tools that help deal with challenges related to tests that make network requests. In this post I’ll explain what VCR and WebMock are and then show a “hello world”-level example of using the two tools.
Why VCR and WebMock exist
Determinism
One of the principles of testing is that tests should be deterministic. The passing or failing of a test should be determined by the content of the application code and nothing else. All the tests in the test suite should pass regardless of what order they were run in, what time of day they were run, or any other factor.
For this reason we have to run tests in their own encapsulated world. If we want determinism, we can’t let tests talk to the network, because the network (and things in the network) are susceptible to change and breakage.
Imagine an app that talks to a third-party API. Imagine that the tests hit that API each time they’re run. Now imagine that on one of the test runs, the third-party API happens to go down for a moment, causing our tests to fail. Our test failure is illogical because the tests are telling us our code is broken, but it’s not our code that’s broken, it’s the outside world that’s broken.
If we’re not to talk to the network, we need a way to simulate our interactions with the network so that our application can still behave normally. This is where tools like VCR and WebMock come in.
Production data
We also don’t want tests to alter actual production data. It would obviously be bad if we for example wrote a test for deleting users and then that test deleted real production users. So another benefit of tools like VCR and WebMock is that they save us from having to touch real production data.
The difference between VCR and WebMock
VCR is a tool that will record your application’s HTTP interactions and play them back later. Very little code is necessary. VCR tricks your application into thinking it’s receiving responses from the network when really the application is just receiving prerecorded VCR data.
WebMock, on the other hand, has no feature for recording and replaying HTTP interactions in the way that VCR does, although HTTP interactions can still be faked. Unlike VCR’s record/playback features, WebMock’s network stubbing is more code-based and fine-grained. In this tutorial we’re going to take a very basic look at WebMock and VCR to show a “hello world” level usage.
What we’re going to do
This post will serve as a simple illustration of how to use VCR and WebMock to meet the need that these tools were designed to meet: running tests that hit the network without actually hitting the network.
We’re going to write a small search feature that hits a third-party API. We’re also going to write a test that exercises that search feature and therefore hits the third-party API as well.
WebMock
Once we have our test in place we’ll install and configure WebMock. This will disallow any network requests. As a result, our test will stop working.
VCR
Lastly, we’ll install and configure VCR. VCR knows how to talk with WebMock. Because of this, VCR and WebMock can come to an agreement together that it’s okay for our test to hit the network under certain controlled conditions. VCR will record the HTTP interactions that occur during the test and then, on any subsequent runs of the test, VCR will use the recorded interactions rather than making fresh HTTP interactions for each run.
The feature
The feature we’re going to write for this tutorial is one that searches the NPI registry, a government database of healthcare providers. The user can type a provider’s first and last name, hit Search, and then see any matches.
Below is the controller code.
# app/controllers/npi_searches_controller.rb
ENDPOINT_URL = "https://npiregistry.cms.hhs.gov/api"
TARGET_VERSION = "2.1"
class NPISearchesController < ApplicationController
def new
@results = []
return unless params[:first_name].present? || params[:last_name].present?
query_string = {
first_name: params[:first_name],
last_name: params[:last_name],
version: TARGET_VERSION,
address_purpose: ""
}.to_query
uri = URI("#{ENDPOINT_URL}/?#{query_string}")
response = Net::HTTP.get_response(uri)
@results = JSON.parse(response.body)["results"]
end
end
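As an aside, the URL-building and response-parsing parts of this action don’t need the network at all, so they can be pulled into a plain Ruby object and checked in isolation. Here’s a hedged sketch of that idea. NPIRegistryQuery is a hypothetical class, not part of the feature as written. It uses the standard library’s URI.encode_www_form rather than Rails’ to_query, so the parameters come out in the order they were given rather than sorted alphabetically. It also falls back to an empty array when the response has no "results" key, which the controller above doesn’t do.

```ruby
require "json"
require "uri"

# Hypothetical plain Ruby object for this sketch.
class NPIRegistryQuery
  ENDPOINT_URL = "https://npiregistry.cms.hhs.gov/api"
  TARGET_VERSION = "2.1"

  def initialize(first_name:, last_name:)
    @first_name = first_name
    @last_name = last_name
  end

  # Builds the same kind of URL that the controller builds.
  def uri
    query_string = URI.encode_www_form(
      first_name: @first_name,
      last_name: @last_name,
      version: TARGET_VERSION,
      address_purpose: ""
    )
    URI("#{ENDPOINT_URL}/?#{query_string}")
  end

  # Pulls the "results" array out of a raw JSON response body,
  # falling back to an empty array when the key is missing.
  def self.parse_results(body)
    JSON.parse(body).fetch("results", [])
  end
end

query = NPIRegistryQuery.new(first_name: "joel", last_name: "fuhrman")
puts query.uri
```

The actual HTTP request is still the part that makes a test non-deterministic, and that’s the part the rest of this post deals with.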
Here’s the template that goes along with this controller action.
Our test will be a short system spec that types “joel” and “fuhrman” into the first and last name fields respectively, clicks Search, then asserts that Joel Fuhrman’s NPI code (a unique identifier for a healthcare provider) shows up on the page.
# spec/system/npi_search_spec.rb
require "rails_helper"

RSpec.describe "NPI search", type: :system do
  it "shows the physician's NPI number" do
    visit new_npi_search_path
    fill_in "first_name", with: "joel"
    fill_in "last_name", with: "fuhrman"
    click_on "Search"

    # 1386765287 is the NPI code for Dr. Joel Fuhrman
    expect(page).to have_content("1386765287")
  end
end
If we run this test at this point, it passes.
Installing and Configuring WebMock
We don’t want our tests to be able to just make network requests willy-nilly. We can install WebMock so that no HTTP requests can be made without our noticing.
First we add the webmock gem to our Gemfile.
# Gemfile
group :development, :test do
  gem "webmock"
end
Second, we can create a new file at spec/support/webmock.rb.
# spec/support/webmock.rb
# This line makes it so WebMock and RSpec know how to talk to each other.
require "webmock/rspec"
# This line disables HTTP requests, with the exception of HTTP requests
# to localhost.
WebMock.disable_net_connect!(allow_localhost: true)
Remember that files in spec/support won’t get loaded unless the line in spec/rails_helper.rb that requires them is uncommented.
# spec/rails_helper.rb
# Make sure to uncomment this line
Dir[Rails.root.join('spec', 'support', '**', '*.rb')].sort.each { |f| require f }
Seeing the test fail
If we run our test again now that WebMock is installed, it will fail, saying “Real HTTP connections are disabled”. We also get some instructions on how to stub this request if we like. We’re not going to do that, though, because we’re going to use VCR instead.
Failures:
1) NPI search shows the physician's NPI number
Failure/Error: response = Net::HTTP.get_response(uri)
WebMock::NetConnectNotAllowedError:
Real HTTP connections are disabled. Unregistered request: GET https://npiregistry.cms.hhs.gov/api/?address_purpose=&first_name=joel&last_name=fuhrman&version=2.1 with headers {'Accept'=>'*/*', 'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3', 'Host'=>'npiregistry.cms.hhs.gov', 'User-Agent'=>'Ruby'}
You can stub this request with the following snippet:
stub_request(:get, "https://npiregistry.cms.hhs.gov/api/?address_purpose=&first_name=joel&last_name=fuhrman&version=2.1").
with(
headers: {
'Accept'=>'*/*',
'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3',
'Host'=>'npiregistry.cms.hhs.gov',
'User-Agent'=>'Ruby'
}).
to_return(status: 200, body: "", headers: {})
============================================================
Installing and Configuring VCR
First we’ll add the vcr gem to our Gemfile.
# Gemfile
group :development, :test do
  gem "vcr"
end
Next we’ll add the following config file. I’ve added annotations so you can understand what each line is.
# spec/support/vcr.rb
VCR.configure do |c|
  # This is the directory where VCR will store its "cassettes", i.e. its
  # recorded HTTP interactions.
  c.cassette_library_dir = "spec/cassettes"

  # This line makes it so VCR and WebMock know how to talk to each other.
  c.hook_into :webmock

  # This line makes VCR ignore requests to localhost. This is necessary
  # even if WebMock's allow_localhost is set to true.
  c.ignore_localhost = true

  # ChromeDriver will make requests to chromedriver.storage.googleapis.com
  # to (I believe) check for updates. These requests will just show up as
  # noise in our cassettes unless we tell VCR to ignore these requests.
  c.ignore_hosts "chromedriver.storage.googleapis.com"
end
Adding VCR to our test
Now we can add VCR to our test by wrapping the test body in a VCR.use_cassette "npi_search" do block. The name npi_search is arbitrary; it just tells VCR what to call our cassette.
# spec/system/npi_search_spec.rb
require "rails_helper"

RSpec.describe "NPI search", type: :system do
  it "shows the physician's NPI number" do
    VCR.use_cassette "npi_search" do # <---------------- add this
      visit new_npi_search_path
      fill_in "first_name", with: "joel"
      fill_in "last_name", with: "fuhrman"
      click_on "Search"

      expect(page).to have_content("1386765287")
    end
  end
end
Last time we ran this test it failed because WebMock was blocking its HTTP request. If we run the test now, it will pass, because VCR and WebMock together are allowing the HTTP request to happen.
If we look in the spec/cassettes directory after running this test, we’ll see that there’s a new file there called npi_search.yml. It’s a YAML file containing the request our test made and the response that came back from the NPI registry.
Each time this test is run, VCR will ask, “Is there a cassette called npi_search?” If not, VCR will allow the HTTP request to go out, and a new cassette will be recorded for that HTTP request. If there is an existing cassette called npi_search, VCR will block the HTTP request and just use the recorded cassette in its place.
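The record-or-replay decision described above can be modeled in a few lines of plain Ruby. This is a toy sketch and not how VCR is actually implemented: the ToyCassette class is a hypothetical name, it stores a single response per cassette, and it does none of the request matching that VCR does.

```ruby
require "yaml"
require "tmpdir"

# Toy stand-in for VCR's record-then-replay behavior. Not VCR's real code.
class ToyCassette
  def initialize(dir, name)
    @path = File.join(dir, "#{name}.yml")
  end

  # Runs the block (standing in for the real HTTP request) only when no
  # recording exists yet. Otherwise returns the recorded response.
  def fetch
    return YAML.safe_load(File.read(@path)) if File.exist?(@path)

    response = yield
    File.write(@path, YAML.dump(response))
    response
  end
end

Dir.mktmpdir do |dir|
  real_requests = 0

  # Stand-in for a real network call. It counts how often it gets invoked.
  fake_request = lambda do
    real_requests += 1
    { "status" => 200, "body" => "recorded body" }
  end

  first_run = ToyCassette.new(dir, "npi_search").fetch(&fake_request)
  second_run = ToyCassette.new(dir, "npi_search").fetch(&fake_request)

  puts "real requests made: #{real_requests}"
  puts "same response both times: #{first_run == second_run}"
end
```

In this model, only the first run invokes the "real" request; the second run is served entirely from the file on disk, which is what makes the test's outcome independent of the network.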
Takeaways
Tests should be deterministic. The passing or failing of a test should be determined by the content of the application code and nothing else.
We don’t want tests to be able to alter production data.
WebMock can police our app to stop it from making external network requests.
VCR can record our tests’ network interactions for playback later.