Testing Database Migrations in CI: Catching Broken Schema Changes Before They Hit Production

You wrote a migration. It passed code review. The reviewer checked the SQL syntax, confirmed the column name is not a typo, and approved the PR. CI was green. Then it ran on the staging database and everything looked fine.

Then it hit production and locked a 30-million-row table for four minutes while the ALTER TABLE rewrote every row. The deploy timed out. The rollback took another two minutes because the reverse migration was never tested either.

Database migrations are the highest-risk code you deploy. They touch production data directly, they cannot be rolled back with a simple git revert, and the consequences of a bad one range from “five-minute read-only outage” to “the backup restore from last night.” Despite this, most teams test migrations with nothing more than “did the SQL parse?”

This post is the CI pipeline I wish every team had before their first migration incident. It covers the four things you should validate for every migration: syntax, rollback, data preservation, and performance impact. Each check runs automatically on every PR, and the whole thing takes about an hour to set up.

Why code review is not enough for migrations

A migration PR looks like one file with a few lines of SQL. It is easy to review for typos and logic errors. It is almost impossible to review for the things that actually cause production incidents:

Lock duration: ALTER TABLE on a large table acquires an ACCESS EXCLUSIVE lock. Will it finish before your connection timeout?
Data loss: Does the down migration actually restore the data the up migration transformed? Or does it just drop the column and hope nobody noticed?
Hidden dependencies: Does some other team’s cron job query a column you are dropping? Does a materialized view reference a table you are renaming?
Performance regression: Does adding that index cause a write-throughput drop? Does dropping that index make a query that used to run in 5ms run in 5 seconds?
Partial failure: The migration runs five ALTER TABLE statements. The third one fails. Is the database in a consistent state?

None of these are visible from reading the SQL. A CI pipeline that actually tests migrations catches all of them.

The four-layer migration test pipeline

Every migration PR triggers four stages, each building on the last:

Stage 1: Syntax validation         ---  5 seconds
Stage 2: Rollback verification      --- 30 seconds
Stage 3: Data-preservation audit    ---  2 minutes
Stage 4: Performance impact check   ---  5 minutes

The pipeline spins up a fresh Postgres container for every PR branch, applies the migration, runs the checks, and tears it down. No shared state. No leftover schemas from a previous run.

Stage 1: Syntax validation

This is the floor. It catches typos, missing semicolons, and reference errors. Most teams run this. Some teams skip it and then wonder why the migration failed with ERROR: syntax error at or near "AERT".

The implementation is trivial with any CI runner. Here is a GitHub Actions job that validates every migration file against a fresh Postgres:

migration-validate:
  runs-on: ubuntu-latest
  services:
    postgres:
      image: postgres:16
      env:
        POSTGRES_DB: test_migrations
        POSTGRES_PASSWORD: test
      options: >-
        --health-cmd pg_isready
        --health-interval 5s
        --health-timeout 5s
        --health-retries 5
  steps:
    - uses: actions/checkout@v4
    - name: Validate migrations
      run: |
        for f in migrations/*.sql; do
          echo "Validating $f..."
          psql "$DATABASE_URL" -f "$f" > /dev/null
          if [ $? -ne 0 ]; then
            echo "FAILED: $f"
            exit 1
          fi
        done
      env:
        DATABASE_URL: postgres://postgres:test@localhost:5432/test_migrations

This runs every SQL file in order against a fresh database. If any file has a syntax error, the job fails and the PR cannot merge.

One refinement: run each migration file in a separate transaction and roll it back afterward, so the validation does not leave artifacts that affect the next file. This catches errors in individual migrations without depending on ordering:

for f in migrations/*.sql; do
  echo "Validating $f..."
  psql "$DATABASE_URL" -c "BEGIN;"
  psql "$DATABASE_URL" -f "$f" > /dev/null
  if [ $? -ne 0 ]; then
    psql "$DATABASE_URL" -c "ROLLBACK;"
    echo "FAILED: $f"
    exit 1
  fi
  psql "$DATABASE_URL" -c "ROLLBACK;"
done

Stage 2: Rollback verification

Up migrations get all the attention. Down migrations get a “I will write it later” that never comes. When the incident hits and the team needs to roll back, the down migration either does not exist or was never tested.

Rollback verification applies the up migration, then applies the down migration, then checks that the database schema matches the original state. The exact schema is checked, not just “did the SQL run without errors.”

# Capture the schema before the migration.
pg_dump --schema-only "$DATABASE_URL" > schema_before.sql

# Apply the up migration.
psql "$DATABASE_URL" -f migrations/20260613_add_status_column.up.sql

# Apply the down migration.
psql "$DATABASE_URL" -f migrations/20260613_add_status_column.down.sql

# Capture the schema after rollback.
pg_dump --schema-only "$DATABASE_URL" > schema_after.sql

# Compare. They should be identical.
diff schema_before.sql schema_after.sql || {
  echo "ROLLBACK TEST FAILED: schema does not match original"
  exit 1
}

A diff on pg_dump --schema-only catches the common rollback bugs: the down migration drops a column the up migration created but forgets to restore the original default value, or it drops the index but the up migration created two indexes and the down only drops one.

This test saved my team when a migration added a CHECK constraint with a down migration that only dropped the constraint by name, but the up migration had run after the constraint was renamed by an earlier migration. The schema diff caught the mismatch immediately.

Stage 3: Data-preservation audit

The first two stages validate the schema. They do not validate the data. A migration that transforms data (splitting a column, backfilling values, moving data between tables) can run without errors and silently corrupt or lose data.

Data-preservation testing starts from a seed database containing representative data. This is not a full production restore (that is expensive and may contain PII). It is a purpose-built fixture set that exercises every edge case your migration might encounter.

// test/migrations/data-preservation.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { Client } from 'pg';

const client = new Client({
  connectionString: process.env.DATABASE_URL,
});

// Seed data that exercises edge cases.
const USERS = [
  { id: 1, email: 'alice@example.com', email_address: null },
  { id: 2, email: null, email_address: 'bob@example.com' },
  { id: 3, email: 'carol@example.com', email_address: 'carol-work@example.com' },
  { id: 4, email: '', email_address: null },
];

beforeAll(async () => {
  await client.connect();
  // Apply the migration under test.
  await client.query(`
    INSERT INTO users (id, email, email_address)
    VALUES ${USERS.map((u, i) => `($${i * 3 + 1}, $${i * 3 + 2}, $${i * 3 + 3})`).join(', ')}
  `, USERS.flatMap(u => [u.id, u.email, u.email_address]));
});

afterAll(async () => {
  await client.end();
});

describe('merge_email_addresses migration', () => {
  it('combines email and email_address into a single field', async () => {
    // Apply the migration that merges the two columns.
    // ... apply migration logic ...

    const result = await client.query('SELECT id, email FROM users ORDER BY id');

    expect(result.rows[0].email).toBe('alice@example.com');     // preferred field
    expect(result.rows[1].email).toBe('bob@example.com');       // only had email_address
    expect(result.rows[2].email).toBe('carol-work@example.com');// email_address takes priority
    expect(result.rows[3].email).toBe('');                       // both were empty/null
  });

  it('produces the same result after rollback and re-apply', async () => {
    // Roll back, seed fresh, apply again, verify idempotency.
    const before = await client.query('SELECT count(*) as cnt FROM users');
    // Roll down then up again...
    const after = await client.query('SELECT count(*) as cnt FROM users');
    expect(after.rows[0].cnt).toBe(before.rows[0].cnt);
  });
});

The fixture data covers four cases: only the old column populated, only the new column populated, both columns populated (conflict resolution), and the empty-string edge case. Most migration bugs hide in these edge cases, not in the happy path.

Stage 4: Performance impact check

The most expensive thing you can do in a CI pipeline is guess how long a migration will take on production. The second most expensive thing is not checking at all.

A migration that runs fine on an empty test database can take twenty minutes on a production table with 50 million rows. The CI pipeline cannot replicate your production data volume, but it can flag migrations that will obviously cause problems.

Use EXPLAIN to check for table rewrites before the migration runs:

# For ALTER TABLE statements, check if they require a rewrite.
# Postgres 11+ logs this. For older versions, check the ALTER TABLE docs.
# Use EXPLAIN to estimate the work involved:
psql "$DATABASE_URL" -c "
  SELECT 'Table rewrite required' AS warning
  FROM pg_class
  WHERE relname = 'users'
    AND reltuples > 100000
" | grep 'rewrite'

A more practical approach is to run the migration against a table with a realistic number of rows. Restore a subset of production data (without PII) into the CI database, time the migration, and fail if it exceeds a threshold.

# Insert 500k rows to create a realistic table size for benchmarking.
psql "$DATABASE_URL" -c "
  INSERT INTO users (email, created_at)
  SELECT
    'user' || generate_series(1, 500000) || '@example.com',
    now() - random() * interval '365 days'
"

# Time the migration.
START=$(date +%s%N)
psql "$DATABASE_URL" -f migrations/20260613_add_index.up.sql
END=$(date +%s%N)
DURATION=$(( (END - START) / 1000000 ))

echo "Migration took ${DURATION}ms"

if [ "$DURATION" -gt 10000 ]; then
  echo "WARNING: Migration took >10s on 500k rows. Check production impact."
fi

The 500k-row benchmark does not tell you exactly how long it will take on a 50-million-row production table. But it gives you an order-of-magnitude estimate. If it takes 8 seconds on 500k rows, it will take roughly 800 seconds on 50 million rows. That is enough information to block the merge and ask the author to refactor the migration using the expand-migrate-contract pattern.

Putting it together: a full PR pipeline

Here is the complete GitHub Actions workflow that combines all four stages:

name: Migration CI

on:
  pull_request:
    paths:
      - 'migrations/**'

jobs:
  migrate:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: test_migrations
          POSTGRES_PASSWORD: test
        options: >-
          --health-cmd pg_isready
          --health-interval 5s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4

      - name: Stage 1 - Syntax validation
        run: |
          for f in migrations/*.up.sql; do
            psql "$DATABASE_URL" -c "BEGIN;" > /dev/null
            psql "$DATABASE_URL" -f "$f" > /dev/null
            psql "$DATABASE_URL" -c "ROLLBACK;" > /dev/null
            echo "OK: $f"
          done
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/test_migrations

      - name: Stage 2 - Rollback verification
        run: |
          pg_dump --schema-only "$DATABASE_URL" > schema_before.sql
          for f in migrations/*.up.sql; do
            psql "$DATABASE_URL" -f "$f" > /dev/null
          done
          # Apply down migrations in reverse order.
          for f in $(ls migrations/*.down.sql | sort -r); do
            psql "$DATABASE_URL" -f "$f" > /dev/null
          done
          pg_dump --schema-only "$DATABASE_URL" > schema_after.sql
          diff schema_before.sql schema_after.sql || {
            echo "ROLLBACK FAILED"
            diff schema_before.sql schema_after.sql
            exit 1
          }
          echo "Rollback test passed"
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/test_migrations

      - name: Stage 3 - Data preservation
        run: |
          npm ci
          npm run test:migrations
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/test_migrations

      - name: Stage 4 - Performance check
        run: |
          # Seed 500k rows for a realistic table size.
          psql "$DATABASE_URL" -c "
            INSERT INTO users (email, created_at)
            SELECT 'benchmark' || generate_series(1, 500000) || '@test.com', now();
          "
          START=$(date +%s%N)
          psql "$DATABASE_URL" -f migrations/20260613_add_index.up.sql
          END=$(date +%s%N)
          DURATION=$(( (END - START) / 1000000 ))
          echo "Migration duration: ${DURATION}ms"
          if [ "$DURATION" -gt 10000 ]; then
            echo "FAIL: Migration took >10s on 500k rows. Refactor required."
            exit 1
          fi
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/test_migrations

This pipeline runs in about eight minutes on a fresh Postgres container. The syntax check is nearly instant. The rollback test takes about thirty seconds. The data-preservation tests take a minute or two. The performance check takes five minutes including the seed step.

What this pipeline catches that reviews miss

In the six months since my team adopted this pipeline, it caught:

A migration that dropped a default value the down migration did not restore (stage 2).
A data backfill that used COALESCE(email, email_address) but null coalesces to null when both are null, losing the empty-string distinction (stage 3).
An index migration on a table with 500k synthetic rows that took 12 seconds, extrapolating to a 20-minute lock on production (stage 4).
A migration that added a NOT NULL column without a default on a table that already had rows, which would have failed immediately on a populated table (stage 1 caught this as syntax-valid but the actual error would have been runtime).

Each of these was caught in CI, not in an incident postmortem.

When to skip these tests

Not every migration needs the full pipeline. An ALTER TABLE that adds a nullable column with no default on a table with 100 rows is low risk. Running the performance benchmark on it is noise.

Use a convention-based approach: if the migration touches a table in the critical_tables list (defined in a config file at the repo root), run the full pipeline. Otherwise, run stages 1 and 2 only.

- name: Skip performance check for non-critical tables
  run: |
    table_name=$(grep -oP '(?<=TABLE\s+)(\w+)' migrations/*.up.sql)
    if grep -q "$table_name" critical_tables.txt 2>/dev/null; then
      echo "Running full pipeline for table: $table_name"
    else
      echo "Skipping performance check for: $table_name"
      exit 0
    fi

The investment is small, the payoff is immediate

A CI pipeline for migrations costs about an hour of setup time and eight minutes per PR run. The alternative is a production incident that costs hours of emergency debugging, a rollback that might not work, and a postmortem that says “we should have tested the migration.” Hour for hour, testing migrations in CI is one of the highest-return investments you can make in your deployment pipeline.

The next time someone opens a PR with a single ALTER TABLE ADD COLUMN and says “it’s just one line, it’s fine,” point them at this pipeline. One line of SQL that runs on a table with 50 million rows is the most dangerous change you will merge all week. Test it like it is.

A note from Yojji

The discipline of testing database migrations before they reach production is the kind of engineering rigor that separates teams who ship confidently from teams who treat every deploy as a prayer. Yojji’s teams build CI pipelines that validate schema changes, test rollbacks against real data shapes, and catch performance regressions before they lock a production table. It is the unglamorous, high-return work that keeps production stable and deploys boring.