Stop Trusting Mocks: Integration Testing Node.js with Real Databases in Docker

Every production incident that starts with “it passed all the tests” has the same root cause: the tests were lying about the environment. A mock database does not lock. A mock Redis does not evict keys under memory pressure. A mock HTTP client does not stall because the TCP window shrank. Unit tests are cheap and fast, but they prove your code works in an imaginary world. The only way to know your code works in the real one is to test against the real one.

This post is not an argument against unit tests. It is a guide to the layer above them: integration tests that spin up real Postgres, Redis, and whatever else your service talks to, inside Docker, on every test run, with a setup time under five seconds. The pattern is practical, has been running in production CI for years, and catches bugs that mocks will never catch.

The bug that mocks cannot see

Here is a real bug I shipped once. The handler looked innocent:

export async function transferFunds(
  fromId: string,
  toId: string,
  amount: number,
  db: Pool,
): Promise<void> {
  await db.query('BEGIN');
  try {
    await db.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amount, fromId]);
    await db.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amount, toId]);
    await db.query('COMMIT');
  } catch (err) {
    await db.query('ROLLBACK');
    throw err;
  }
}

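The unit test mocked db.query, asserted the calls happened in order, and passed. In production, under concurrent load, two transfers between the same accounts produced a negative balance. The test never caught it because the mock had no concept of row locking, transaction isolation, or concurrent scheduling. It also hid a second problem: db is a Pool, and each pool.query call may run on a different connection, so the BEGIN and COMMIT above do not reliably wrap the two updates. The fix was to run the transfer on a single client and take a SELECT ... FOR UPDATE lock on both rows before the writes. A real database would have surfaced both problems quickly.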
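To make the blind spot concrete, here is a minimal sketch of that kind of mock-based check. It is not the original test: FakeDb, transferFundsNaive, and callOrderLooksRight are hypothetical names for illustration. The fake records SQL keywords and never executes anything, so locking and interleaving are invisible to it.

```typescript
// Hypothetical stand-in for a mocked pg Pool: it records SQL and executes nothing.
type Queryable = { query(sql: string, params?: unknown[]): Promise<unknown> };

class FakeDb implements Queryable {
  readonly calls: string[] = [];

  async query(sql: string): Promise<unknown> {
    // Keep only the leading keyword (BEGIN, UPDATE, COMMIT, ROLLBACK)
    this.calls.push(sql.replace(/\s[\s\S]*$/, ''));
    return { rows: [] };
  }
}

// Same statements as the handler above, typed against the minimal interface.
async function transferFundsNaive(
  fromId: string,
  toId: string,
  amount: number,
  db: Queryable,
): Promise<void> {
  await db.query('BEGIN');
  try {
    await db.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amount, fromId]);
    await db.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amount, toId]);
    await db.query('COMMIT');
  } catch (err) {
    await db.query('ROLLBACK');
    throw err;
  }
}

// The "unit test": all it can confirm is the order of the calls.
async function callOrderLooksRight(): Promise<boolean> {
  const db = new FakeDb();
  await transferFundsNaive('alice', 'bob', 10, db);
  return db.calls.join(',') === 'BEGIN,UPDATE,UPDATE,COMMIT';
}
```

This check passes for any implementation that issues those four statements in that order, including one that overdraws an account when two callers interleave.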

This is the mock tax: you test your own assumptions about how a dependency behaves, not how it actually behaves. The more complex the dependency (Postgres MVCC, Redis Lua scripts, Kafka partition rebalancing), the more expensive the tax. Integration tests move that tax from 3 AM pages to a five-second Docker startup.

The strategy: one container per test suite, not per test

The common objection is speed. “Docker is too slow for tests.” The objection is only true if you start a fresh container for every single test case. The practical pattern is one container per test file (or per test worker), with database setup (schema, seed data) inside the container before tests run.

For Node.js with Postgres, the pipeline looks like this:

1. Start a Postgres container with an ephemeral data directory.
2. Connect, run migrations, insert test fixtures.
3. Run all tests in the file against that container.
4. After each test, truncate tables (or roll back a transaction).
5. Stop and remove the container.

On a modern laptop with an SSD, steps 1 and 2 take about 3 seconds. Step 4 takes milliseconds if you truncate instead of recreating schemas. Step 5 is nearly instant. A suite of a hundred small tests can finish in roughly 10 seconds.

Docker Compose for the service stack

For a real service that talks to Postgres and Redis, define the dependencies in a Compose file that is only used for tests:

# docker-compose.test.yml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
      POSTGRES_DB: testdb
    ports:
      - '5433:5432'
    tmpfs:
      - /var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - '6380:6379'

The tmpfs mount for Postgres is critical. It keeps the database files in memory, so writes and reads are dramatically faster than a bind-mounted volume on disk. The non-standard host ports (5433, 6380) prevent collisions with any local Postgres or Redis you already have running.

Start the stack before tests and stop it after:

{
  "scripts": {
    "test:integration": "docker compose -f docker-compose.test.yml up -d && sleep 3 && vitest run --config vitest.integration.ts; status=$?; docker compose -f docker-compose.test.yml down -v; exit $status"
  }
}

The sleep 3 is crude: it is usually enough on local and CI machines, but it is a guess, not a guarantee. A real readiness check (pg_isready, or a healthcheck in the Compose file) is more robust. The script saves the Vitest exit code and always runs down -v, so a failing test run does not leave containers behind. This form assumes a POSIX shell.
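If you would rather wait on the database itself than on the clock, a small polling helper is enough. The sketch below is one possible shape, not part of any library: waitUntilReady and its defaults (30 attempts, 200 ms apart) are assumptions for illustration. It retries an async probe until the probe stops throwing.

```typescript
// Poll an async probe until it succeeds or the attempts run out.
// Resolves with the number of attempts it took.
async function waitUntilReady(
  probe: () => Promise<unknown>,
  attempts = 30,
  delayMs = 200,
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await probe();
      return attempt;
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Back off briefly before the next try
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw new Error(`dependency not ready after ${attempts} attempts: ${String(lastError)}`);
}

// Example use in a setup hook, assuming `db` is a pg Pool:
//   await waitUntilReady(() => db.query('SELECT 1'));
```

The probe is deliberately generic, so the same helper can wait on Redis or an HTTP health endpoint.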

The test setup: Node.js + Vitest + pg

Here is the test setup that connects to the Dockerized Postgres and Redis, runs migrations, and cleans up between tests:

// test/integration/setup.ts
import { Pool } from 'pg';
import { createClient } from 'redis';
import { execSync } from 'node:child_process';

// Default to the Compose ports; CI can override them through DATABASE_URL and REDIS_URL.
const DATABASE_URL = process.env.DATABASE_URL ?? 'postgres://test:test@localhost:5433/testdb';
const REDIS_URL = process.env.REDIS_URL ?? 'redis://localhost:6380';

export const db = new Pool({ connectionString: DATABASE_URL, max: 10 });

export const redis = createClient({ url: REDIS_URL });

export async function setupTestDb() {
  // Run migrations once per test file (assumes the knexfile's test env reads DATABASE_URL)
  execSync('npx knex migrate:latest --env test', {
    env: { ...process.env, DATABASE_URL },
    stdio: 'inherit',
  });
  await redis.connect();
}

export async function resetTestDb() {
  await db.query(`
    DO $func$
    BEGIN
      EXECUTE (
        SELECT 'TRUNCATE TABLE ' || string_agg(quote_ident(tablename), ', ') || ' RESTART IDENTITY CASCADE'
        FROM pg_tables
        WHERE schemaname = 'public' AND tablename != 'knex_migrations' AND tablename != 'knex_migrations_lock'
      );
    END;
    $func$;
  `);
  await redis.flushDb();
}

// test/integration/transfer.test.ts
import { describe, beforeAll, beforeEach, it, expect } from 'vitest';
import { db, redis, setupTestDb, resetTestDb } from './setup';
import { transferFunds } from '../../src/transferFunds';

describe('transferFunds', () => {
  beforeAll(setupTestDb, 30000);
  beforeEach(resetTestDb);

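  it('prevents negative balances under concurrency', async () => {
    await db.query(
      "INSERT INTO accounts (id, balance) VALUES ('alice', 100), ('bob', 100)"
    );

    // 20 transfers of 10 from a balance of 100: only 10 can succeed
    const transfers = Array.from({ length: 20 }, () =>
      transferFunds('alice', 'bob', 10, db),
    );

    // Run concurrently to trigger real race conditions; allSettled keeps the
    // expected "insufficient funds" rejections from aborting the test
    const results = await Promise.allSettled(transfers);
    expect(results.filter((r) => r.status === 'fulfilled')).toHaveLength(10);

    const result = await db.query('SELECT balance FROM accounts WHERE id = $1', ['alice']);
    // pg returns NUMERIC columns as strings, so normalize before comparing
    expect(Number(result.rows[0].balance)).toBe(0);
  });
});

Against the original implementation this test fails: nothing stops the balance from going below zero, so all 20 transfers succeed and alice ends at -100. That failure is the point. In a real test suite you write the test first, watch it fail against the buggy code, then fix the code. A mock could never produce that failure, because it never executes the SQL.

Here is the corrected production version of transferFunds:

import type { Pool } from 'pg';

export async function transferFunds(
  fromId: string,
  toId: string,
  amount: number,
  db: Pool,
): Promise<void> {
  // A transaction needs one dedicated client; pool.query may use a different connection per call
  const client = await db.connect();
  try {
    await client.query('BEGIN');
    // Lock both rows in a consistent order to reduce the risk of deadlock between opposing transfers
    const { rows } = await client.query(
      'SELECT id, balance FROM accounts WHERE id = ANY($1) ORDER BY id FOR UPDATE',
      [[fromId, toId]],
    );
    // With the rows locked, the balance cannot change between this check and the update
    const from = rows.find((row) => row.id === fromId);
    if (!from || Number(from.balance) < amount) {
      throw new Error('Insufficient funds');
    }
    await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amount, fromId]);
    await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amount, toId]);
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK').catch(() => {});
    throw err;
  } finally {
    client.release();
  }
}

Now the integration test passes against the corrected code: the row lock serializes concurrent transfers, and the balance check rejects the ones that would overdraw. If someone removes the FOR UPDATE, concurrent transfers can read the same stale balance, all pass the check, and the test is very likely to fail with a negative balance. That is the contract an integration test enforces: behavior against a real system, not a fantasy about it.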

Testcontainers: the even cleaner path

If you prefer not to manage a separate Compose file, the testcontainers library handles container lifecycle from inside your test runner. This keeps the test self-contained (one file, one command) and is especially useful in CI systems where Docker is available but Compose setup is brittle.

// test/integration/setup-testcontainers.ts
import { PostgreSqlContainer, type StartedPostgreSqlContainer } from '@testcontainers/postgresql';
import { GenericContainer, type StartedTestContainer } from 'testcontainers';
import { Pool } from 'pg';
import { createClient } from 'redis';
import { afterAll, beforeAll } from 'vitest';

let postgresContainer: StartedPostgreSqlContainer;
let redisContainer: StartedTestContainer;
export let db: Pool;
export let redis: ReturnType<typeof createClient>;

export async function startContainers() {
  // Start both containers in parallel to keep setup time down
  [postgresContainer, redisContainer] = await Promise.all([
    new PostgreSqlContainer('postgres:16-alpine').start(),
    new GenericContainer('redis:7-alpine').withExposedPorts(6379).start(),
  ]);

  // Host and port are assigned at runtime, so read them from the started containers
  db = new Pool({
    host: postgresContainer.getHost(),
    port: postgresContainer.getMappedPort(5432),
    database: postgresContainer.getDatabase(),
    user: postgresContainer.getUsername(),
    password: postgresContainer.getPassword(),
  });

  redis = createClient({
    url: `redis://${redisContainer.getHost()}:${redisContainer.getMappedPort(6379)}`,
  });
  await redis.connect();
}

export async function stopContainers() {
  // Close client connections before stopping the containers they point at
  await db.end();
  await redis.disconnect();
  await postgresContainer.stop();
  await redisContainer.stop();
}

// Vitest runs setupFiles inside each test file's context, so hooks registered
// here start and stop the containers around that file's tests.
beforeAll(startContainers, 120_000); // generous timeout for the first image pull
afterAll(stopContainers);

// vitest.integration.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
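    // Runs inside each test file's context, so the clients it exports are visible to tests
    setupFiles: ['./test/integration/setup-testcontainers.ts'],
    // Runs once in a separate context (file not shown); live objects cannot be shared from it
    globalSetup: './test/integration/global-setup.ts',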
  },
});

Testcontainers is slower on the first run because it has to pull images, and it adds a little lifecycle overhead on every run. In exchange, it removes an entire file (the Compose manifest) from your repo and makes tests much easier to reproduce across machines. For teams where “works on my machine” is a regular complaint, self-contained tests are worth the small overhead.

Speed tricks that actually matter

Integration tests have a reputation for slowness. Most of that reputation comes from teams that recreate the entire schema and re-seed data before every single test. Here are the tricks that keep a hundred-test suite under 30 seconds:

Truncate, do not drop. Dropping and recreating tables forces Postgres to delete and re-insert catalog entries for every table, index, and constraint. TRUNCATE … RESTART IDENTITY does not scan rows: it swaps in fresh, empty storage for the table, so its cost is nearly independent of row count and is typically a few milliseconds per reset (if your tests have millions of rows, you have a different problem).

Run tests in parallel by file, not by assertion. Vitest’s and Jest’s default worker pool spins up multiple test files concurrently. If each file starts its own container stack, you can saturate CPU and I/O. A better pattern is one shared container stack per worker, not per file. Use pool: 'threads' and a global setup that provides one database per worker.
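One way to get a database per worker is to derive its name from the worker id. The sketch below assumes Vitest's VITEST_POOL_ID environment variable is set inside workers; the testdb_w prefix, the fallback id of 0, and both function names are hypothetical choices for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Derive a per-worker database name from the worker id.
function workerDatabaseName(env: Env = process.env): string {
  const id = env.VITEST_POOL_ID ?? '0'; // fallback when run outside a worker
  if (!/^\d+$/.test(id)) {
    throw new Error(`unexpected worker id: ${id}`);
  }
  return `testdb_w${id}`;
}

// Swap the database segment of a connection URL for the worker's database.
function workerDatabaseUrl(baseUrl: string, env: Env = process.env): string {
  const url = new URL(baseUrl);
  url.pathname = `/${workerDatabaseName(env)}`;
  return url.toString();
}
```

Each worker's database still has to be created and migrated once before its first test file runs; the helper only decides which one a worker connects to.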

Use a transaction rollback for the fast path. If a test only reads and writes within one database connection, wrap it in a transaction and roll back at the end. This is faster than truncation but does not work for connection-pooled code or multi-connection logic. Keep it as an optimization for the subset of tests that can use it.
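A minimal sketch of the rollback pattern follows. The withRollback helper and the TxClient type are hypothetical; TxClient is a structural stand-in for a single dedicated connection such as a pg PoolClient, not a pool.

```typescript
// Minimal structural type: one dedicated connection, not a pool.
type TxClient = { query(sql: string, params?: unknown[]): Promise<unknown> };

// Run `body` inside a transaction that is always rolled back,
// so the test leaves no rows behind whether it passes or throws.
async function withRollback<T>(
  client: TxClient,
  body: (tx: TxClient) => Promise<T>,
): Promise<T> {
  await client.query('BEGIN');
  try {
    return await body(client);
  } finally {
    await client.query('ROLLBACK');
  }
}

// Example use, assuming `client` came from `await pool.connect()`:
//   await withRollback(client, async (tx) => {
//     await tx.query("INSERT INTO accounts (id, balance) VALUES ('carol', 50)");
//     // ...assertions that query through `tx`...
//   });
```

The code under test has to query through the same client. Anything that checks out its own connection from the pool will not see the uncommitted rows, which is why this stays a fast path and not the default.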

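Cache the migration step. Migrations rarely change between test runs. Dump the migrated schema to a SQL file and restore from it instead of running every migration in sequence. pg_dump --schema-only produces that file regardless of which migration tool you use (Knex, Prisma, TypeORM), and psql loads it in one pass. If your tool tracks applied migrations in a table, such as knex_migrations, dump that table's rows as well so the tool does not try to re-run them. CI can cache the dump keyed by a hash of the migration files.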
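The cache key can be as simple as a hash over the migration files. This is a sketch under assumptions: hashMigrations and migrationCacheKey are hypothetical names, and it assumes migrations live as plain files in one directory. It uses only Node built-ins.

```typescript
import { createHash } from 'node:crypto';
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

type MigrationFile = { name: string; contents: string | Uint8Array };

// Hash names and contents in sorted order, so the key changes whenever a
// migration is added, removed, renamed, or edited.
function hashMigrations(files: MigrationFile[]): string {
  const hash = createHash('sha256');
  const sorted = [...files].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  for (const file of sorted) {
    hash.update(file.name);
    hash.update('\0'); // separator so name and contents cannot run together
    hash.update(file.contents);
    hash.update('\0');
  }
  return hash.digest('hex').slice(0, 16);
}

// Read every regular file in the directory and hash the set.
function migrationCacheKey(migrationsDir: string): string {
  const files = readdirSync(migrationsDir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => ({
      name: entry.name,
      contents: readFileSync(join(migrationsDir, entry.name)),
    }));
  return hashMigrations(files);
}
```

A CI job can then restore the cached dump when the key matches and fall back to running migrations and re-dumping when it does not.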

Keep the data small. Integration tests need realistic schema and query plans, not realistic data volume. Ten rows per table is usually enough to hit indexes and uncover query bugs. Anything larger slows inserts and truncates for no gain.

What to test at the integration layer, and what to leave for unit tests

Not everything belongs in Docker. The integration layer is for logic where the behavior depends on a real dependency’s semantics.

Test with real databases: SQL generation (ORMs produce surprising SQL), transaction boundaries, locking behavior, query performance against real query planners, connection pool exhaustion, and retry logic.

Test with real Redis: Lua script execution, key eviction under memory limits, pub/sub delivery guarantees, and TTL behavior.

Test with real message brokers: Partition rebalancing, consumer group behavior, exactly-once semantics, and dead letter queue triggers.

Mock the rest: Pure business logic (calculate tax, validate a form), deterministic transforms (parsing, serialization), and external third-party APIs (use contract tests or recorded HTTP fixtures like Nock or VCR). Do not spin up a Docker container for a stateless math function.

The line is simple: if the bug could only manifest because of how the real dependency behaves under concurrency, failure, or resource pressure, it belongs in integration tests. If the bug is in your own code’s logic, unit tests are faster and clearer.

CI considerations

GitHub Actions (and most CI providers) support Docker-in-Docker or Docker socket forwarding. The standard pattern is to run integration tests in a job with services:

# .github/workflows/test.yml
jobs:
  integration:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
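        ports:
          # Same host ports as docker-compose.test.yml, so the test setup works unchanged
          - 5433:5432
      redis:
        image: redis:7-alpine
        ports:
          - 6380:6379
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      # The services above replace Compose, so call Vitest directly instead of
      # the test:integration script, which would run docker compose up again
      - run: npx vitest run --config vitest.integration.ts
        env:
          DATABASE_URL: postgres://test:test@localhost:5433/testdb
          REDIS_URL: redis://localhost:6380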

GitHub Actions services start before your job steps and are cleaned up after. This avoids the docker compose up shell script entirely and gives you health checks for free. The trade-off is slightly less control over the exact startup timing, but for most projects it is more reliable than manual compose orchestration.

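One critical detail: run the same Postgres and Redis versions in CI as in production. Postgres 14 and Postgres 16 are different major versions, and a query plan that works locally on 14 can change on 16. Pin the image tag in both the Compose file and the CI services block, and update them together. The postgres:16-alpine tag used here pins only the major version; pin the full minor version as well if you need exact parity.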
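Version drift is easy to guard against with a test of its own. The sketch below parses the string Postgres returns for SHOW server_version and compares the major version to an expected value. The function names are hypothetical, and it assumes Postgres 10 or later, where the first number is the major version.

```typescript
// Extract the major version from a server_version string such as "16.3".
// Some builds append distribution details after the number; only the
// leading digits matter here.
function postgresMajor(serverVersion: string): number {
  const match = /^(\d+)/.exec(serverVersion.trim());
  if (!match) {
    throw new Error(`unrecognized server_version: ${serverVersion}`);
  }
  return Number(match[1]);
}

// Throw when the running server is not the major version the team pinned.
function assertPostgresMajor(serverVersion: string, expectedMajor: number): void {
  const actual = postgresMajor(serverVersion);
  if (actual !== expectedMajor) {
    throw new Error(`expected Postgres ${expectedMajor}, but the server reports ${actual}`);
  }
}

// Example use in a test, assuming `db` is a pg Pool:
//   const { rows } = await db.query('SHOW server_version');
//   assertPostgresMajor(rows[0].server_version, 16);
```

Keeping the expected major version in one shared constant means an upgrade touches the Compose file, the CI services block, and this check in the same change.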

Practical takeaway

Unit tests prove your logic is correct in isolation. Integration tests prove your code works with real systems under real conditions. The gap between them is where most production bugs live: a missing index, a race condition between two transactions, a connection pool timeout that only happens under real load, an ORM query that generates a full table scan on a specific Postgres version.

The fix is not to write more unit tests. It is to write fewer mocks and more tests against real dependencies in Docker. The pattern is mature: one container stack per test worker, schema migrations cached, tables truncated between tests, run in CI with pinned versions. Setup time is under five seconds. Bug catch rate is dramatically higher.

Stop trusting mocks to tell you the truth about systems they do not simulate. Let your tests talk to the real thing. The truth is worth the Docker startup.
