Faster Rails fixtures

6 min readrailstesting

Have you ever noticed that Rails always reloads test fixtures into the database, even if they haven't changed? You might have also seen the Rails development server compiling browser assets on-the-fly only when needed. So, why doesn't Rails handle fixtures the same way? If you've worked on a large enough codebase, this question may have crossed your mind more than once, because reloading fixtures is a time consuming operation.

And if you're still with me, keep reading - that's the exact issue we're tackling here.

Problem

Before running a test (or multiple tests), rails truncates test database(s) and inserts fixtures. This only happens once before all tests, and does not happen between individual tests (at least, that's the behavior with use_transactional_tests = true). Depending on the size of the project, this can take a noticable amount of time, and when running a single test, it can take a significant chunk of the entire test time. It gets even worse when running tests in parallel, since all this is happening x number of times, where x is the number of parallel workers.

Now, in my opinion, this isn't a huge problem when running all tests, but where it gets truly painful is when iterating on a single test.

One thing to note though, in theory, this is only a problem when tests are using all fixtures (fixtures :all) and so there's a lot of data to insert. If, on the other hand, individual tests pick what fixtures they use, then the amount of data to insert should stay relatively small no matter the size of the project. Still, the higher up the test pyramid we go, the more fixtures are involved, and so fixtures :all becomes more likely.

Solution

We're going to patch some Rails code to skip fixture inserts when all of the following conditions are met:

  • fixture files haven't changed since the previous test run.
  • the database schema hasn't changed since the previous test run.
  • a fixtures digest is present (indicating the state of the fixtures from the previous run).

As a result, on a project with 195 fixture files, running an individual test (no parallelization) time dropped by over 30% (from 5 to 3.2 seconds on my machine). Running a test file (with parallelization) dropped by about 25% (from 11 to 8.2 seconds).

A nice bonus is that test logs are now free from thousands of lines of db inserts and other db setup gubbins.

I've been running this for a long time, and the cache invalidation strategy seems pretty solid. On an off chance you need to force insert fixtures, simply rm tmp/fixtures_digest*.

Implementation

Sadly, I wasn't able to come up with anything elegant, so brace yourself for a bunch of duct tape patchwork. However, it works and it's saving me a bit of time and a lot of sanity. I hope that some day this makes it into Rails, but in the mean time, you can just paste this code into your project and enjoy faster tests:

# test/support/preloaded_fixtures.rb module PreloadedFixtures extend ActiveSupport::Concern mattr_accessor :parallel_worker_number class Cache class << self def hit? last_digest&.== current_digest end def record_digest digest_path.write(current_digest) end private def current_digest @current_digest ||= begin files = Dir[Rails.root.join('test', 'fixtures', '**', '*.yml')] file_ids = files.sort.map { |f| "#{File.basename(f)}/#{Digest::SHA1.file(f).hexdigest}" } Digest::SHA1.hexdigest(file_ids.join('/')) end end def last_digest digest_path.read rescue Errno::ENOENT end def digest_path Rails.root.join('tmp', "fixtures_digest#{PreloadedFixtures.parallel_worker_number}") end end end class << self # skips fixtures insert into the database def patch_active_record_fixture_set return if ActiveRecord::FixtureSet.singleton_class.method_defined?(:original_insert) ActiveRecord::FixtureSet.singleton_class.class_eval do alias_method :original_insert, :insert define_method(:insert) do |fixture_sets, connection| if PreloadedFixtures.schema_up_to_date? && PreloadedFixtures::Cache.hit? && fixture_sets.first.model_class.count.positive? puts 'Using preloaded fixtures' if [nil, 0].include?(PreloadedFixtures.parallel_worker_number) # This magic line populates primary keys fixture_sets.each(&:table_rows) else PreloadedFixtures::Cache.record_digest original_insert(fixture_sets, connection) end end end end def schema_up_to_date? env_name = ActiveRecord::ConnectionHandling::DEFAULT_ENV.call ActiveRecord::Base.configurations.configs_for(env_name:).all? do |db_config| ActiveRecord::Tasks::DatabaseTasks.schema_up_to_date?(db_config) rescue ActiveRecord::NoDatabaseError false end end end class_methods do def preloaded_fixtures ENV['SKIP_TEST_DATABASE_TRUNCATE'] = '1' PreloadedFixtures.patch_active_record_fixture_set # HACK: ensure `parallel_worker_number` is assigned before rails' setup fixtures `after_hook`. # Because `after_hook` calls into our patched code that relies on `parallel_worker_number`. ActiveSupport::Testing::Parallelization.class_variable_get('@@after_fork_hooks').unshift(proc do |worker_number| PreloadedFixtures.parallel_worker_number = worker_number end) end end end

And then use it in test/test_helper.rb:

require_relative './support/preloaded_fixtures' module ActiveSupport class TestCase ... include PreloadedFixtures preloaded_fixtures fixtures :all ...

patch_active_record_fixture_set skips the inserts, but also, crucially, adds primary keys to the fixtures. So that when you call users(:bob) the underlying select statement contains the correct id to find user record for Bob.

Caveat: SKIP_TEST_DATABASE_TRUNCATE flag was introduced in Rails 7.2. If you want to use this for earlier versions, either skip preloaded_fixtures for parallel tests - e.g.:

# only call preloaded_fixtures for individual tests preloaded_fixtures if Rails::TestUnit::Runner.compose_filter(self, nil)

or you'll need to add one more patch:

# skips parallel databases truncation def patch_active_record_tasks_database_tasks_reconstruct_from_schema return if ActiveRecord::Tasks::DatabaseTasks.singleton_class.method_defined?(:original_reconstruct_from_schema) ActiveRecord::Tasks::DatabaseTasks.singleton_class.class_eval do alias_method :original_reconstruct_from_schema, :reconstruct_from_schema define_method(:reconstruct_from_schema) do |*args| unless PreloadedFixtures.schema_up_to_date? && PreloadedFixtures::Cache.hit? original_reconstruct_from_schema(*args) end end end end

And then swap ENV[SKIP_TEST_DATABASE_TRUNCATE] = '1' for PreloadedFixtures.patch_active_record_tasks_database_tasks_reconstruct_from_schema.

On a side note, I also use the above trick to turn off parallelization for individual tests. When running an individual test (e.g. rails test test/model/users_test:8) Rails still considers the total number of tests in the file when deciding whether to run tests in parallel or not. As a result, if there are more than fifty tests in a file, individual test will be run with the overhead of parallel setup. Use this code to address that:

# skip parallelization for individual tests parallelize unless Rails::TestUnit::Runner.compose_filter(self, nil)

Rant

Whilst testing SKIP_TEST_DATABASE_TRUNCATE (added in Rails 7.2), I kept wondering how it was possible to skip the cleanup part of the setup - truncation of database table - without simultaneously skipping the insert part. How come it doesn't keep inserting more and more of the same fixtures? Well, turns out, after truncating tables in parallelization setup, fixtures setup code kicks in and deletes from all tables anyway. In other words, it looks like truncation is just a complete waste of time to begin with (and a lot of it too). Unless I am missing something, fixtures should be implicitely turning on SKIP_TEST_DATABASE_TRUNCATE.