Faster Rails fixtures
Have you ever noticed that Rails always reloads test fixtures into the database, even if they haven't changed? You might have also seen the Rails development server compiling browser assets on the fly, only when needed. So why doesn't Rails handle fixtures the same way? If you've worked on a large enough codebase, this question may have crossed your mind more than once, because reloading fixtures is a time-consuming operation.
And if you're still with me, keep reading - that's the exact issue we're tackling here.
Problem
Before running a test (or multiple tests), Rails truncates the test database(s) and inserts fixtures. This happens once before all tests, not between individual tests (at least, that's the behavior with use_transactional_tests = true). Depending on the size of the project, this can take a noticeable amount of time, and when running a single test, it can account for a significant chunk of the entire test time. It gets even worse when running tests in parallel, since all of this happens x times, where x is the number of parallel workers.
Now, in my opinion, this isn't a huge problem when running all tests, but where it gets truly painful is when iterating on a single test.
One thing to note, though: in theory, this is only a problem when tests use all fixtures (fixtures :all) and so there's a lot of data to insert. If, on the other hand, individual tests pick which fixtures they use, then the amount of data to insert should stay relatively small no matter the size of the project. Still, the higher up the test pyramid we go, the more fixtures are involved, and so fixtures :all becomes more likely.
Solution
We're going to patch some Rails code to skip fixture inserts when all of the following conditions are met:
- fixture files haven't changed since the previous test run.
- the database schema hasn't changed since the previous test run.
- a fixtures digest is present (indicating the state of the fixtures from the previous run).
As a result, on a project with 195 fixture files, the time to run an individual test (no parallelization) dropped by over 30% (from 5 to 3.2 seconds on my machine). The time to run a test file (with parallelization) dropped by about 25% (from 11 to 8.2 seconds).
A nice bonus is that test logs are now free from thousands of lines of db inserts and other db setup gubbins.
I've been running this for a long time, and the cache invalidation strategy seems pretty solid. On the off chance you need to force a fixture insert, simply rm tmp/fixtures_digest*.
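To make the invalidation idea concrete, here is a minimal standalone sketch, assuming plain Ruby with no Rails loaded. The helper names (fixtures_digest, cache_hit?, demo_cache_states) and the temporary directory layout are made up for illustration; only the hashing approach mirrors the strategy described above.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: one digest covering every YAML file under `dir`.
def fixtures_digest(dir)
  files = Dir[File.join(dir, '**', '*.yml')].sort
  ids = files.map { |f| "#{File.basename(f)}/#{Digest::SHA1.file(f).hexdigest}" }
  Digest::SHA1.hexdigest(ids.join('/'))
end

# Hypothetical helper: true when the recorded digest matches the current files.
def cache_hit?(dir, digest_file)
  File.exist?(digest_file) && File.read(digest_file) == fixtures_digest(dir)
end

# Walks through the four situations a test run can find itself in.
def demo_cache_states
  Dir.mktmpdir do |root|
    fixtures = File.join(root, 'fixtures')
    FileUtils.mkdir_p(fixtures)
    digest_file = File.join(root, 'fixtures_digest')
    users = File.join(fixtures, 'users.yml')
    File.write(users, "bob:\n  name: Bob\n")

    first_run = cache_hit?(fixtures, digest_file)   # no digest recorded yet
    File.write(digest_file, fixtures_digest(fixtures))
    second_run = cache_hit?(fixtures, digest_file)  # nothing changed

    File.write(users, "bob:\n  name: Robert\n")
    after_edit = cache_hit?(fixtures, digest_file)  # fixture content changed

    FileUtils.rm_f(digest_file)
    after_rm = cache_hit?(fixtures, digest_file)    # digest removed by hand

    [first_run, second_run, after_edit, after_rm]
  end
end

p demo_cache_states # => [false, true, false, false]
```

Only the second state counts as a hit, which is why deleting the digest file is enough to force a reload.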
Implementation
Sadly, I wasn't able to come up with anything elegant, so brace yourself for a bunch of duct tape patchwork. However, it works, and it's saving me a bit of time and a lot of sanity. I hope that some day this makes it into Rails, but in the meantime, you can just paste this code into your project and enjoy faster tests:
# test/support/preloaded_fixtures.rb
require 'digest'

module PreloadedFixtures
  extend ActiveSupport::Concern

  mattr_accessor :parallel_worker_number

  class Cache
    class << self
      def hit?
        last_digest == current_digest
      end

      def record_digest
        digest_path.write(current_digest)
      end

      private

      # digest of all fixture file names and contents
      def current_digest
        @current_digest ||= begin
          files = Dir[Rails.root.join('test', 'fixtures', '**', '*.yml')]
          file_ids = files.sort.map { |f| "#{File.basename(f)}/#{Digest::SHA1.file(f).hexdigest}" }
          Digest::SHA1.hexdigest(file_ids.join('/'))
        end
      end

      # returns nil when no digest has been recorded yet
      def last_digest
        digest_path.read
      rescue Errno::ENOENT
        nil
      end

      # one digest file per parallel worker
      def digest_path
        Rails.root.join('tmp', "fixtures_digest#{PreloadedFixtures.parallel_worker_number}")
      end
    end
  end

  class << self
    # skips fixtures insert into the database
    def patch_active_record_fixture_set
      return if ActiveRecord::FixtureSet.singleton_class.method_defined?(:original_insert)

      ActiveRecord::FixtureSet.singleton_class.class_eval do
        alias_method :original_insert, :insert

        define_method(:insert) do |fixture_sets, connection|
          if PreloadedFixtures.schema_up_to_date? && PreloadedFixtures::Cache.hit? && fixture_sets.first.model_class.count.positive?
            puts 'Using preloaded fixtures' if [nil, 0].include?(PreloadedFixtures.parallel_worker_number)
            # This magic line populates primary keys
            fixture_sets.each(&:table_rows)
          else
            PreloadedFixtures::Cache.record_digest
            original_insert(fixture_sets, connection)
          end
        end
      end
    end

    def schema_up_to_date?
      env_name = ActiveRecord::ConnectionHandling::DEFAULT_ENV.call
      ActiveRecord::Base.configurations.configs_for(env_name:).all? do |db_config|
        ActiveRecord::Tasks::DatabaseTasks.schema_up_to_date?(db_config)
      rescue ActiveRecord::NoDatabaseError
        false
      end
    end
  end

  class_methods do
    def preloaded_fixtures
      ENV['SKIP_TEST_DATABASE_TRUNCATE'] = '1'
      PreloadedFixtures.patch_active_record_fixture_set

      # HACK: ensure `parallel_worker_number` is assigned before rails' setup fixtures `after_hook`.
      # Because `after_hook` calls into our patched code that relies on `parallel_worker_number`.
      ActiveSupport::Testing::Parallelization.class_variable_get('@@after_fork_hooks').unshift(proc do |worker_number|
        PreloadedFixtures.parallel_worker_number = worker_number
      end)
    end
  end
end
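The reason for unshift rather than a plain append in the HACK above is ordering: hooks run in list order, so a hook that reads the worker number has to come after the one that sets it. Here is a tiny sketch of that, using a made-up hook list (run_hooks and its contents are illustrative only, not Rails internals):

```ruby
# Hypothetical hook registry standing in for a list of after-fork callbacks.
def run_hooks(register_first:)
  state = { worker: nil }
  log = []

  # A hook registered earlier (by the framework) that reads the worker number.
  hooks = [proc { |_n| log << "setup sees worker #{state[:worker].inspect}" }]

  # Our hook, which assigns the worker number.
  assign = proc { |n| state[:worker] = n }
  register_first ? hooks.unshift(assign) : hooks.push(assign)

  hooks.each { |hook| hook.call(2) }
  log
end

p run_hooks(register_first: false) # => ["setup sees worker nil"]
p run_hooks(register_first: true)  # => ["setup sees worker 2"]
```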
Then include the module in test/test_helper.rb:
require_relative './support/preloaded_fixtures'

module ActiveSupport
  class TestCase
    ...
    include PreloadedFixtures
    preloaded_fixtures
    fixtures :all
    ...
  end
end
patch_active_record_fixture_set skips the inserts but also, crucially, adds primary keys to the fixtures, so that when you call users(:bob) the underlying select statement contains the correct id to find the user record for Bob.
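If the alias-and-wrap technique used by the patch is unfamiliar, it can be tried in isolation. The sketch below assumes a made-up Loader class and a SkipInsertPatch module with a manually toggled cache_hit flag; none of these names come from Rails, they only stand in for ActiveRecord::FixtureSet.insert and the cache check.

```ruby
# Hypothetical stand-in for a class with an expensive class-level `insert`.
class Loader
  def self.insert(rows)
    "inserted #{rows.size} rows"
  end
end

module SkipInsertPatch
  class << self
    # Stand-in for the real "schema and fixtures unchanged" check.
    attr_accessor :cache_hit
  end

  def self.apply(klass)
    # Guard: applying the patch twice would alias the wrapper to itself.
    return if klass.singleton_class.method_defined?(:original_insert)

    klass.singleton_class.class_eval do
      # Keep the original reachable under a new name, then replace it.
      alias_method :original_insert, :insert
      define_method(:insert) do |rows|
        SkipInsertPatch.cache_hit ? 'skipped' : original_insert(rows)
      end
    end
  end
end

SkipInsertPatch.apply(Loader)
SkipInsertPatch.apply(Loader) # no-op thanks to the guard

SkipInsertPatch.cache_hit = false
p Loader.insert([1, 2, 3]) # => "inserted 3 rows"
SkipInsertPatch.cache_hit = true
p Loader.insert([1, 2, 3]) # => "skipped"
```

The real patch differs in one important way: on a cache hit it still calls table_rows on each fixture set, because that is what populates the primary keys.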
Caveat: the SKIP_TEST_DATABASE_TRUNCATE flag was introduced in Rails 7.2. If you want to use this with earlier versions, either skip preloaded_fixtures for parallel tests, e.g.:
    # only call preloaded_fixtures for individual tests
    preloaded_fixtures if Rails::TestUnit::Runner.compose_filter(self, nil)
or you'll need to add one more patch:
    # skips parallel databases truncation
    def patch_active_record_tasks_database_tasks_reconstruct_from_schema
      return if ActiveRecord::Tasks::DatabaseTasks.singleton_class.method_defined?(:original_reconstruct_from_schema)

      ActiveRecord::Tasks::DatabaseTasks.singleton_class.class_eval do
        alias_method :original_reconstruct_from_schema, :reconstruct_from_schema

        define_method(:reconstruct_from_schema) do |*args|
          unless PreloadedFixtures.schema_up_to_date? && PreloadedFixtures::Cache.hit?
            original_reconstruct_from_schema(*args)
          end
        end
      end
    end
And then swap ENV['SKIP_TEST_DATABASE_TRUNCATE'] = '1' for PreloadedFixtures.patch_active_record_tasks_database_tasks_reconstruct_from_schema.
On a side note, I also use the above trick to turn off parallelization for individual tests. When running an individual test (e.g. rails test test/models/users_test.rb:8), Rails still considers the total number of tests in the file when deciding whether to run tests in parallel. As a result, if there are more than fifty tests in a file, an individual test will run with the overhead of parallel setup. Use this code to address that:
    # skip parallelization for individual tests
    parallelize unless Rails::TestUnit::Runner.compose_filter(self, nil)
Rant
Whilst testing SKIP_TEST_DATABASE_TRUNCATE (added in Rails 7.2), I kept wondering how it was possible to skip the cleanup part of the setup (truncation of database tables) without simultaneously skipping the insert part. How come it doesn't keep inserting more and more of the same fixtures? Well, it turns out that after the tables are truncated in the parallelization setup, the fixtures setup code kicks in and deletes from all tables anyway. In other words, it looks like truncation is just a complete waste of time to begin with (and a lot of it too). Unless I am missing something, fixtures should be implicitly turning on SKIP_TEST_DATABASE_TRUNCATE.