One necessity of automated testing is having some data to test with. There are three ways I know of to generate test data in Rails: manual creation, factories, and fixtures.
All three can be viable solutions depending on the situation. Let’s first explore manual creation.
Manual data creation
Manual data creation can be convenient enough if you only have a few attributes on a model and no dependencies. For example, let’s say I have a PaymentType model with just one attribute, name. If I want to test both valid and invalid states, that’s easy:
valid_payment_type = PaymentType.new(name: 'Visa')
invalid_payment_type = PaymentType.new(name: '')
But now let’s say we have the idea of an Order which is made up of multiple LineItems and multiple Payments, each Payment having a PaymentMethod. Setting that up by hand looks something like this:
order = Order.create!(
  line_items: [LineItem.create!(name: 'Electric dog polisher', price_cents: 40000)],
  payments: [Payment.create!(amount_cents: 40000,
                             payment_method: PaymentMethod.create!(name: 'Visa'))]
)
That’s annoying. We had to expend mental energy to arbitrarily come up with details (“Electric dog polisher” with a price of $400.00, paid by Visa) that aren’t even necessarily relevant.
What if all we wanted to test was that when the payment total equals the line item total, order.balance_due returns zero?
This is where factories come in handy.
Factories are actually not a test-specific concept. “Factory method” is a design pattern that you can find in the “Gang of Four” design patterns book. The pattern just happens to come in handy for the purpose of testing.
The idea with a factory is basically that you have a method/function that generates new class instances for you.
In the Gang of Four book they use an example where a factory called Creator will return either an instance of MyProduct or an instance of YourProduct depending on certain conditions. So rather than having to say “if this case, then instantiate a MyProduct, otherwise, instantiate a YourProduct” all over the place, you can just say Creator.create(relevant_data) and get back the appropriate class instance.
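To make that concrete, here is a minimal plain-Ruby sketch of the idea. The class names come from the example above, but the constructors and the :owner key used to pick a class are assumptions of mine for illustration, not details from the book.

```ruby
# Hypothetical product classes, named after the example above.
class MyProduct
  attr_reader :data

  def initialize(data)
    @data = data
  end
end

class YourProduct
  attr_reader :data

  def initialize(data)
    @data = data
  end
end

class Creator
  # Callers ask for a product; the factory decides which class to instantiate.
  # The :owner key is an assumed condition, chosen only for illustration.
  def self.create(relevant_data)
    if relevant_data[:owner] == :me
      MyProduct.new(relevant_data)
    else
      YourProduct.new(relevant_data)
    end
  end
end

mine  = Creator.create({ owner: :me })
yours = Creator.create({ owner: :you })
puts mine.class
puts yours.class
```

The conditional lives in one place, so calling code never has to repeat it.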
You might be able to imagine how such a factory would be useful. In the case of testing, the kind of factories we want will be a little bit different. We’re not concerned with abstracting away whether class A or class B gets instantiated. We want to abstract away the generation of irrelevant details and tedious-to-set-up model associations.
Here’s an example of how the setup for an Order instance might look if we used a factory, specifically Factory Bot. (By the way, Factory Bot is the most popular Rails testing factory but there’s nothing particularly special about the way it works. The ideas in Factory Bot could have been implemented any number of ways.)
order = FactoryBot.create(
  :order,
  line_items: [FactoryBot.create(:line_item, price_cents: 40000)],
  payments: [FactoryBot.create(:payment, amount_cents: 40000)]
)
In this case we’re specifying only the details that are relevant to the test. We don’t care about the line item name or the payment method. As long as we have a payment total that matches the line item total, that’s all we care about.
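To show that there is no magic involved, here is a rough hand-rolled sketch of the same idea in plain Ruby, with no Factory Bot and no database. Everything in it is an assumption for illustration: TinyFactory is a made-up module, the Structs stand in for the real models, the default values are arbitrary, and balance_due is assumed to be the line item total minus the payment total.

```ruby
# Plain-Ruby stand-ins for the models; these are not ActiveRecord classes.
LineItem = Struct.new(:name, :price_cents, keyword_init: true)
Payment  = Struct.new(:amount_cents, :payment_method, keyword_init: true)

Order = Struct.new(:line_items, :payments, keyword_init: true) do
  # Assumed definition: line item total minus payment total.
  def balance_due
    line_items.sum(&:price_cents) - payments.sum(&:amount_cents)
  end
end

# Hypothetical mini-factory: supplies default attributes so a test only
# has to spell out the details it actually cares about.
module TinyFactory
  DEFAULTS = {
    line_item: -> { { name: 'Any item', price_cents: 1000 } },
    payment:   -> { { amount_cents: 1000, payment_method: 'Visa' } },
    order:     -> { { line_items: [], payments: [] } }
  }.freeze

  CLASSES = { line_item: LineItem, payment: Payment, order: Order }.freeze

  # Merge the caller's overrides on top of the defaults, then instantiate.
  def self.build(name, **overrides)
    attributes = DEFAULTS.fetch(name).call.merge(overrides)
    CLASSES.fetch(name).new(**attributes)
  end
end

order = TinyFactory.build(
  :order,
  line_items: [TinyFactory.build(:line_item, price_cents: 40000)],
  payments: [TinyFactory.build(:payment, amount_cents: 40000)]
)

puts order.balance_due
```

The line item name and payment method are filled in by the defaults, so the test body mentions only the two amounts that matter.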
How would we achieve this same thing using fixtures?
First off let me say that I’m much less experienced with fixtures than I am with factories. If you want to learn more about fixtures I suggest you read something from someone who knows what they’re talking about.
Having said that, if we wanted to set up the same test data that we did above using factories, here’s how I believe we’d do it with a fixture. Typically fixtures are expressed in terms of YAML files.
# no attributes needed
name: 'Electric dog polisher'
Once the fixture data is established, instantiating an object that uses the data is as simple as referring to the key for that piece of data:
order = orders(:payments_equal_line_item_total)
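For a rough feel of what is going on, here is a plain-Ruby sketch that imitates the idea of labeled YAML records without using Rails at all. The heredocs stand in for fixture files, and the labels electric_dog_polisher and visa_payment, the attribute names, and the rows_for helper are all assumptions of mine. Real Rails fixtures load this data into the database and give you accessors like orders(...); this only mimics looking records up by label.

```ruby
require 'yaml'

# Stand-ins for fixture files; labels and attribute names are assumed.
ORDERS_YML = <<~YAML
  payments_equal_line_item_total: {}
YAML

LINE_ITEMS_YML = <<~YAML
  electric_dog_polisher:
    order: payments_equal_line_item_total
    name: 'Electric dog polisher'
    price_cents: 40000
YAML

PAYMENTS_YML = <<~YAML
  visa_payment:
    order: payments_equal_line_item_total
    amount_cents: 40000
YAML

orders     = YAML.safe_load(ORDERS_YML)
line_items = YAML.safe_load(LINE_ITEMS_YML)
payments   = YAML.safe_load(PAYMENTS_YML)

# Hypothetical helper: pick the rows that point at a given order label.
def rows_for(rows, order_label)
  rows.values.select { |row| row['order'] == order_label }
end

label = 'payments_equal_line_item_total'
balance_due = rows_for(line_items, label).sum { |row| row['price_cents'] } -
              rows_for(payments, label).sum { |row| row['amount_cents'] }

puts balance_due
```

Notice that the amounts live in the data files rather than next to the assertion, which is the trade-off discussed below.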
So, what’s best? Factories, fixtures, or manual creation?
Which is best?
I think I’ve demonstrated that manually generating test data can quickly become prohibitively tedious. (You can often mitigate this problem, though, by using loose coupling and dependency injection in your project.) But just because manual data generation doesn’t work out great in all scenarios doesn’t mean it’s never useful. I still use manual data generation when I only need to spin up something small. It has the benefits of clarity and low overhead.
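As a sketch of what I mean by loose coupling, suppose the balance logic lived in its own small object that receives its line items and payments from the outside. BalanceCalculator, Item, and Paid are hypothetical names I made up for this example; the point is only that the object under test needs nothing more than things that respond to price_cents and amount_cents.

```ruby
# Hypothetical calculator: depends only on objects that respond to
# price_cents / amount_cents, not on persisted models.
class BalanceCalculator
  def initialize(line_items:, payments:)
    @line_items = line_items
    @payments = payments
  end

  # Assumed definition: line item total minus payment total.
  def balance_due
    @line_items.sum(&:price_cents) - @payments.sum(&:amount_cents)
  end
end

# Tiny stand-ins created by hand; no database rows or associations needed.
Item = Struct.new(:price_cents)
Paid = Struct.new(:amount_cents)

calculator = BalanceCalculator.new(
  line_items: [Item.new(40000)],
  payments: [Paid.new(40000)]
)

puts calculator.balance_due
```

With the dependencies injected like this, manual data creation stays short because there are no associations to wire up.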
Like I said above, I’ve definitely used factories more than fixtures. My main reason is that when I use factories, the place where I specify the test data and the place where I use the test data are close to each other. With fixtures, on the other hand, the setup is hidden. If I want to see exactly what test data is being generated, I have to go into the YAML files and check. I find this too tedious.
Another issue I have with fixtures in practice is that, in my experience, teams will set up a “world of data” and then use that big and complicated world as the basis for all the tests. I prefer to start each test with a clean slate, to the extent possible, and generate for each test the bare minimum data that’s needed for that test. I find that this makes the tests easier to understand. I want to be clear that I think that’s a usage problem, though, not an inherent flaw in the concept of fixtures.
There’s also no reason why you couldn’t use both factories and fixtures in a project. I’m not sure what the use case might be but I could imagine a case where fixtures provide some minimal baseline of fixed data (the payment types of Visa, MasterCard, cash, etc. come to mind as something that would be often-needed and never changed) and then factories take the baton from there and generate dynamic data on a per-test basis.
My go-to test data generation method is factories, and that’s what I generally recommend to others. But if the use case were right I wouldn’t hesitate to use fixtures instead of or in addition to factories.