Faster Rails fixtures

December 9, 2024 • 6 min read • railstesting

Have you ever noticed that Rails always reloads test fixtures into the database, even if they haven't changed? You might have also seen the Rails development server compiling browser assets on-the-fly only when needed. So, why doesn't Rails handle fixtures the same way? If you've worked on a large enough codebase, this question may have crossed your mind more than once, because reloading fixtures is a time consuming operation.

And if you're still with me, keep reading - that's the exact issue we're tackling here.

Problem

Before running a test (or multiple tests), rails truncates test database(s) and inserts fixtures. This only happens once before all tests, and does not happen between individual tests (at least, that's the behavior with use_transactional_tests = true). Depending on the size of the project, this can take a noticable amount of time, and when running a single test, it can take a significant chunk of the entire test time. It gets even worse when running tests in parallel, since all this is happening x number of times, where x is the number of parallel workers.

Now, in my opinion, this isn't a huge problem when running all tests, but where it gets truly painful is when iterating on a single test.

One thing to note though, in theory, this is only a problem when tests are using all fixtures (fixtures :all) and so there's a lot of data to insert. If, on the other hand, individual tests pick what fixtures they use, then the amount of data to insert should stay relatively small no matter the size of the project. Still, the higher up the test pyramid we go, the more fixtures are involved, and so fixtures :all becomes more likely.

Solution

We're going to patch some Rails code to skip fixture inserts when all of the following conditions are met:

fixture files haven't changed since the previous test run.
the database schema hasn't changed since the previous test run.
a fixtures digest is present (indicating the state of the fixtures from the previous run).

As a result, on a project with 195 fixture files, running an individual test (no parallelization) time dropped by over 30% (from 5 to 3.2 seconds on my machine). Running a test file (with parallelization) dropped by about 25% (from 11 to 8.2 seconds).

A nice bonus is that test logs are now free from thousands of lines of db inserts and other db setup gubbins.

I've been running this for a long time, and the cache invalidation strategy seems pretty solid. On an off chance you need to force insert fixtures, simply rm tmp/fixtures_digest*.

Implementation

Sadly, I wasn't able to come up with anything elegant, so brace yourself for a bunch of duct tape patchwork. However, it works and it's saving me a bit of time and a lot of sanity. I hope that some day this makes it into Rails, but in the mean time, you can just paste this code into your project and enjoy faster tests:

Copied!
# test/support/preloaded_fixtures.rb
module PreloadedFixtures
  extend ActiveSupport::Concern

  mattr_accessor :parallel_worker_number, :enabled

  class Cache
    class << self
      def hit?
        last_digest&.== current_digest
      end

      def record_digest
        digest_path.write(current_digest)
      end

      def clear
        digest_path.delete if digest_path.exist?
      end

      private

      def current_digest
        @current_digest ||= begin
          files = Rails.root.glob('test/fixtures/**/*.yml')
          file_ids = files.sort.map { |f| "#{File.basename(f)}/#{Digest::SHA1.file(f).hexdigest}" }
          Digest::SHA1.hexdigest(file_ids.join('/'))
        end
      end

      def last_digest
        digest_path.read
      rescue StandardError
        Errno::ENOENT
      end

      def digest_path
        Rails.root.join('tmp', "fixtures_digest#{PreloadedFixtures.parallel_worker_number}")
      end
    end
  end

  class << self
    # skips fixtures insert into the database
    def patch_active_record_fixture_set
      return if ActiveRecord::FixtureSet.singleton_class.method_defined?(:original_insert)

      ActiveRecord::FixtureSet.singleton_class.class_eval do
        alias_method :original_insert, :insert

        define_method(:insert) do |fixture_sets, connection|
          if PreloadedFixtures.schema_up_to_date? &&
             PreloadedFixtures::Cache.hit? &&
             fixture_sets.first.model_class.count.positive?

            puts 'Using preloaded fixtures' if [nil, 0].include?(PreloadedFixtures.parallel_worker_number)

            # Clearing cache while the tests are running ensures that in case of program abort
            # (e.g. Kernel.exit from debugger) fixtures will be re-inserted next time we're running tests.
            # Which is what we want since the test db isn't rolled back on abort.
            #
            # In case of normal clean exit (where we can safely assume that rollback took place),
            # cache will be restored in the after_teardown below.
            PreloadedFixtures::Cache.clear

            # This magic line populates primary keys
            fixture_sets.each(&:table_rows)
          else
            original_insert(fixture_sets, connection)
          end
        end
      end
    end

    def schema_up_to_date?
      env_name = ActiveRecord::ConnectionHandling::DEFAULT_ENV.call

      ActiveRecord::Base.configurations.configs_for(env_name:).all? do |db_config|
        ActiveRecord::Tasks::DatabaseTasks.schema_up_to_date?(db_config)
      rescue ActiveRecord::NoDatabaseError
        false
      end
    end
  end

  class_methods do
    def preloaded_fixtures
      PreloadedFixtures.enabled = true

      ENV['SKIP_TEST_DATABASE_TRUNCATE'] = '1'
      PreloadedFixtures.patch_active_record_fixture_set

      # HACK: ensure `parallel_worker_number` is assigned before rails' setup fixtures `after_hook`.
      # Because `after_hook` calls into our patched code that relies on `parallel_worker_number`.
      ActiveSupport::Testing::Parallelization.class_variable_get('@@after_fork_hooks').unshift(proc do |worker_number|
        PreloadedFixtures.parallel_worker_number = worker_number
      end)
    end
  end

  def after_teardown
    PreloadedFixtures::Cache.record_digest if PreloadedFixtures.enabled
    super
  end
end

And then use it in test/test_helper.rb:

Copied!
require_relative './support/preloaded_fixtures'

module ActiveSupport
  class TestCase
    ...
    include PreloadedFixtures

    preloaded_fixtures
    fixtures :all
    ...

patch_active_record_fixture_set skips the inserts, but also, crucially, adds primary keys to the fixtures. So that when you call users(:bob) the underlying select statement contains the correct id to find user record for Bob.

Caveat: SKIP_TEST_DATABASE_TRUNCATE flag was introduced in Rails 7.2. If you want to use this for earlier versions, either skip preloaded_fixtures for parallel tests - e.g.:

Copied!
    # only call preloaded_fixtures for individual tests
    preloaded_fixtures if Rails::TestUnit::Runner.compose_filter(self, nil)

or you'll need to add one more patch:

Copied!
    # skips parallel databases truncation
    def patch_active_record_tasks_database_tasks_reconstruct_from_schema
      return if ActiveRecord::Tasks::DatabaseTasks.singleton_class.method_defined?(:original_reconstruct_from_schema)

      ActiveRecord::Tasks::DatabaseTasks.singleton_class.class_eval do
        alias_method :original_reconstruct_from_schema, :reconstruct_from_schema

        define_method(:reconstruct_from_schema) do |*args|
          unless PreloadedFixtures.schema_up_to_date? && PreloadedFixtures::Cache.hit?
            original_reconstruct_from_schema(*args)
          end
        end
      end
    end

And then swap ENV[SKIP_TEST_DATABASE_TRUNCATE] = '1' for PreloadedFixtures.patch_active_record_tasks_database_tasks_reconstruct_from_schema.

On a side note, I also use the above trick to turn off parallelization for individual tests. When running an individual test (e.g. rails test test/model/users_test:8) Rails still considers the total number of tests in the file when deciding whether to run tests in parallel or not. As a result, if there are more than fifty tests in a file, individual test will be run with the overhead of parallel setup. Use this code to address that:

Copied!
    # skip parallelization for individual tests
    parallelize unless Rails::TestUnit::Runner.compose_filter(self, nil)

Rant

Whilst testing SKIP_TEST_DATABASE_TRUNCATE (added in Rails 7.2), I kept wondering how it was possible to skip the cleanup part of the setup - truncation of database table - without simultaneously skipping the insert part. How come it doesn't keep inserting more and more of the same fixtures? Well, turns out, after truncating tables in parallelization setup, fixtures setup code kicks in and deletes from all tables anyway. In other words, it looks like truncation is just a complete waste of time to begin with (and a lot of it too). Unless I am missing something, fixtures should be implicitely turning on SKIP_TEST_DATABASE_TRUNCATE.