Postgres SKIP LOCKED: An 80-Line Job Queue You Can Run Without Redis

You added a feature that needs to send an email after a user signs up. Five months later your codebase has Redis, BullMQ, a separate worker deployment, three retry policies that disagree with each other, and a Sunday-morning Slack thread about why a job got stuck in active for six hours.

The thing nobody tells you when you reach for a “proper” job queue is that Postgres — the database you already pay for, monitor, back up, and have a connection pool to — has had a battle-tested job queue primitive built in since version 9.5. It is called SELECT ... FOR UPDATE SKIP LOCKED, and you can build a multi-worker, retry-safe, visibility-timeout-respecting queue in about 80 lines of code.

Here is exactly what that looks like, why each line is there, and how to convince yourself it actually does not double-process jobs under heavy concurrency.

Why most “use Postgres as a queue” attempts fail

The naive version of this looks fine for about two days:

-- worker pseudo-code
BEGIN;
SELECT * FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1;
UPDATE jobs SET status = 'running' WHERE id = $1;
COMMIT;
-- ... do work ...
UPDATE jobs SET status = 'done' WHERE id = $1;

Run two workers. Both SELECT simultaneously, both see the same row, both UPDATE. One job runs twice. You add FOR UPDATE to the SELECT and now the second worker blocks on the row lock — except it blocks on the same row the first worker is processing, so worker two does nothing while worker one runs. Throughput collapses to single-worker speed. Add five workers and four of them sit idle on the lock.

SKIP LOCKED fixes both problems in one tiny phrase. It tells Postgres: “give me the next row that is not currently locked, and skip past anything that is.” Two workers each get a different row. Five workers get five different rows. The database does the dispatching for you, atomically, with no scheduler, no leader election, no Lua script.
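To make the semantics concrete, here is a toy in-memory model of what a second worker sees with and without SKIP LOCKED. It is not Postgres and every name in it is made up for illustration; the real locking happens inside the database.

```typescript
// Toy model of row selection; all names here are illustrative.
type QueueRow = { id: number; status: 'pending' | 'done' };

// Ids of rows currently locked by some other open transaction.
type HeldLocks = Set<number>;

// Plain FOR UPDATE: the first matching row wins, locked or not.
// If it is locked, the caller has to wait for the lock holder to finish.
function pickForUpdate(
  rows: QueueRow[],
  held: HeldLocks,
): { id: number; wouldBlock: boolean } | null {
  const row = rows.find((r) => r.status === 'pending');
  return row ? { id: row.id, wouldBlock: held.has(row.id) } : null;
}

// FOR UPDATE SKIP LOCKED: rows locked by someone else are passed over.
function pickSkipLocked(rows: QueueRow[], held: HeldLocks): number | null {
  const row = rows.find((r) => r.status === 'pending' && !held.has(r.id));
  return row ? row.id : null;
}

const demoRows: QueueRow[] = [
  { id: 1, status: 'pending' },
  { id: 2, status: 'pending' },
  { id: 3, status: 'pending' },
];
const heldLocks: HeldLocks = new Set([1]); // worker A already holds row 1

// Worker B without SKIP LOCKED queues up behind row 1.
const blockedPick = pickForUpdate(demoRows, heldLocks);
// Worker B with SKIP LOCKED moves straight on to row 2.
const skippedPick = pickSkipLocked(demoRows, heldLocks);
```

The only difference between the two functions is the extra "not locked" condition, which is all the keyword adds to the query.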

This is the same primitive Graphile Worker ships in Node.js and Oban uses in Elixir; Que, in Ruby, gets a similar effect with advisory locks. SKIP LOCKED has been in Postgres since 9.5 and has far fewer rough edges than you would expect from a “use the database as a queue” suggestion. The technique is mature, and still underused outside projects like these.

The schema

Ten columns, five of which carry queue state (status, attempts, max_attempts, run_at, locked_at), plus an index. That is the entire storage layer.

CREATE TABLE jobs (
  id            bigserial PRIMARY KEY,
  kind          text       NOT NULL,
  payload       jsonb      NOT NULL,
  status        text       NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'running', 'done', 'failed')),
  attempts      int        NOT NULL DEFAULT 0,
  max_attempts  int        NOT NULL DEFAULT 5,
  run_at        timestamptz NOT NULL DEFAULT now(),
  locked_at     timestamptz,
  last_error    text,
  created_at    timestamptz NOT NULL DEFAULT now()
);

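-- The one index that makes the dispatcher fast. Its predicate covers both
-- statuses the claim query reads, so that query's WHERE clause implies it.
CREATE INDEX jobs_dispatch_idx
  ON jobs (run_at)
  WHERE status IN ('pending', 'running');

A few decisions in there are deliberate.

run_at lets you schedule jobs in the future (run_at = now() + interval '5 minutes'); the dispatcher already filters on it, so delayed jobs cost nothing extra. locked_at is what makes the visibility timeout work: if a worker dies mid-job, locked_at is the timestamp we compare against to decide “this row is stuck, take it back.” attempts and max_attempts per row let you set retry policy per job kind without a separate config. The partial index keeps only live rows (pending and running) in the index, which is the difference between a short index scan and a query that wades through the entire done history every time. It lists both statuses because Postgres only uses a partial index when the query’s WHERE clause implies the index predicate, and the claim query reads stale running rows as well as pending ones; run EXPLAIN on the claim query to confirm the index is picked up.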

The 80 lines

This is the core. Drop it in a file, point it at any Postgres, and you have a worker.

// queue.ts
import { Pool, PoolClient } from 'pg';

export const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export type Handler = (payload: unknown) => Promise<void>;

const VISIBILITY_TIMEOUT_SEC = 300; // 5 min — must exceed your slowest job
const POLL_INTERVAL_MS       = 500;
const BACKOFF_BASE_SEC       = 5;

export async function enqueue(
  kind: string,
  payload: unknown,
  opts: { runAt?: Date; maxAttempts?: number } = {},
) {
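  await pool.query(
    `INSERT INTO jobs (kind, payload, run_at, max_attempts)
     VALUES ($1, $2, $3, $4)`,
    // Serialize explicitly: node-postgres would turn an array payload into a
    // Postgres array literal rather than JSON.
    [kind, JSON.stringify(payload), opts.runAt ?? new Date(), opts.maxAttempts ?? 5],
  );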
}

export function startWorker(handlers: Record<string, Handler>) {
  let stopped = false;

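  async function loop() {
    while (!stopped) {
      try {
        const ran = await tick(handlers);
        if (!ran) await sleep(POLL_INTERVAL_MS);
      } catch (e) {
        // A failed tick (for example a dropped connection) should not end the
        // worker: log it, pause, and poll again.
        console.error('[worker] tick failed', e);
        await sleep(POLL_INTERVAL_MS);
      }
    }
  }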

  loop().catch((e) => console.error('[worker] crashed', e));
  return () => { stopped = true; };
}

async function tick(handlers: Record<string, Handler>): Promise<boolean> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    const job = await claim(client);
    if (!job) {
      await client.query('COMMIT');
      return false;
    }

    const handler = handlers[job.kind];
    if (!handler) {
      await fail(client, job, `no handler for kind=${job.kind}`);
      await client.query('COMMIT');
      return true;
    }

    try {
      await handler(job.payload);
      await client.query(
        `UPDATE jobs SET status = 'done', locked_at = NULL WHERE id = $1`,
        [job.id],
      );
    } catch (err: any) {
      await fail(client, job, err?.message ?? String(err));
    }

    await client.query('COMMIT');
    return true;
  } catch (e) {
    await client.query('ROLLBACK').catch(() => {});
    throw e;
  } finally {
    client.release();
  }
}

async function claim(client: PoolClient) {
  const { rows } = await client.query(
    `UPDATE jobs SET
        status    = 'running',
        attempts  = attempts + 1,
        locked_at = now()
      WHERE id = (
        SELECT id FROM jobs
         WHERE (status = 'pending' AND run_at <= now())
            OR (status = 'running'
                AND locked_at < now() - interval '${VISIBILITY_TIMEOUT_SEC} seconds')
         ORDER BY run_at
         FOR UPDATE SKIP LOCKED
         LIMIT 1
      )
      RETURNING id, kind, payload, attempts, max_attempts`,
  );
  return rows[0];
}

async function fail(client: PoolClient, job: any, message: string) {
  if (job.attempts >= job.max_attempts) {
    await client.query(
      `UPDATE jobs SET status = 'failed', last_error = $2, locked_at = NULL
        WHERE id = $1`,
      [job.id, message],
    );
    return;
  }
  // Exponential backoff: 5s, 10s, 20s, 40s, ...
  const delaySec = BACKOFF_BASE_SEC * 2 ** (job.attempts - 1);
  await client.query(
    `UPDATE jobs SET
        status     = 'pending',
        last_error = $2,
        locked_at  = NULL,
        run_at     = now() + interval '1 second' * $3
      WHERE id = $1`,
    [job.id, message, delaySec],
  );
}

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

Wire it in any process — your API, a dedicated worker container, a CLI script, anywhere with a Postgres connection.

import { startWorker, enqueue } from './queue';
import { sendWelcomeEmail } from './email';

startWorker({
  'send-welcome-email': async (payload: any) => {
    await sendWelcomeEmail(payload.userId);
  },
});

// Anywhere in your app:
await enqueue('send-welcome-email', { userId: 42 });

That is the whole system. Schema, a handful of small functions, and a polling loop.

Why each piece is there

Six pieces look optional and are not.

UPDATE ... WHERE id = (SELECT ... FOR UPDATE SKIP LOCKED). The single most important construct in the file. The inner SELECT finds a candidate row, locks it for the duration of the transaction, and skips any rows already locked by another transaction. The outer UPDATE then atomically marks it running. Two workers running this query at the same nanosecond each get a different row — Postgres handles the dispatch with zero coordination on your end.

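The visibility-timeout clause: OR (status = 'running' AND locked_at < now() - interval '300 seconds'). This is the safety net for jobs left in running, and it is worth being precise about when it fires. Because tick holds one transaction open from claim to COMMIT, the running status is never visible to other sessions. If a worker process dies (OOM, SIGKILL, kubelet eviction), Postgres rolls the transaction back once it notices the connection is gone, and the row returns to pending with no timeout involved; after a network partition that can take until TCP keepalives expire. The clause starts to matter as soon as you commit the claim before doing the work, as in the long-running jobs variant below: a crashed worker then leaves a committed running row behind, and this clause lets another worker pick it up after 5 minutes. In that mode, set VISIBILITY_TIMEOUT_SEC to comfortably exceed your slowest legitimate job, otherwise a slow job gets re-claimed by a second worker and runs twice.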
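The clause is easier to reason about as a plain predicate. A minimal sketch that mirrors the WHERE clause of the claim query; the type and function names are illustrative and not part of the queue file:

```typescript
type JobState = {
  status: 'pending' | 'running' | 'done' | 'failed';
  runAt: Date;
  lockedAt: Date | null;
};

// Mirrors the claim query's WHERE clause:
//   (status = 'pending' AND run_at <= now())
//   OR (status = 'running' AND locked_at < now() - timeout)
function isClaimable(job: JobState, at: Date, visibilityTimeoutSec: number): boolean {
  if (job.status === 'pending') {
    return job.runAt.getTime() <= at.getTime();
  }
  if (job.status === 'running' && job.lockedAt !== null) {
    const staleBefore = at.getTime() - visibilityTimeoutSec * 1000;
    return job.lockedAt.getTime() < staleBefore;
  }
  return false; // done and failed rows are never picked up again
}

const checkTime = new Date('2024-01-01T12:00:00Z');

// Claimed 2 minutes ago: still inside the 5-minute window, left alone.
const freshRunning: JobState = {
  status: 'running',
  runAt: new Date('2024-01-01T11:58:00Z'),
  lockedAt: new Date('2024-01-01T11:58:00Z'),
};

// Claimed 6 minutes ago: presumed abandoned, eligible to be taken back.
const staleRunning: JobState = {
  status: 'running',
  runAt: new Date('2024-01-01T11:54:00Z'),
  lockedAt: new Date('2024-01-01T11:54:00Z'),
};

const freshIsClaimable = isClaimable(freshRunning, checkTime, 300);
const staleIsClaimable = isClaimable(staleRunning, checkTime, 300);
```

Note that the predicate cannot tell a dead worker from a slow one, which is why the timeout has to sit above the slowest legitimate job.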

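attempts = attempts + 1 inside claim, before the handler runs. The increment belongs with the claim, not with the failure path. If you increment only on failure, a job that kills the worker never reaches fail, so its counter never moves and it retries forever. One caveat with the single-transaction tick: an increment made inside the transaction is rolled back with everything else when the worker dies, so it caps crash loops only when the claim is committed before the handler runs. In the version above it counts the attempts that ended in a handled success or error; in the claim-then-commit variant it also counts the ones that ended between “I did the work” and “I told the database I did the work.”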

Exponential backoff in fail. Without it, a transient downstream outage (your email provider returns 502 for 10 seconds) means every queued job retries immediately, fails, retries immediately, and fails again: the queue burns through its attempts in milliseconds while hammering the dependency that is already struggling, and your error rate spikes. With the default of five attempts, the 5s/10s/20s/40s schedule spreads the retries over about 75 seconds, so an outage shorter than that clears before max_attempts is reached.
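The arithmetic behind that schedule, as a small sketch. The helper names are hypothetical; the formula is the one used in fail, BACKOFF_BASE_SEC * 2 ** (attempts - 1).

```typescript
// Delay applied after the given failed attempt (1-based), matching the
// formula in fail(): base * 2 ** (attempts - 1).
function backoffDelaySec(attempt: number, baseSec = 5): number {
  return baseSec * 2 ** (attempt - 1);
}

// Delays between attempts for a job that fails every time. The final attempt
// marks the job failed instead of rescheduling it, so it adds no delay.
function backoffSchedule(maxAttempts: number, baseSec = 5): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(backoffDelaySec(attempt, baseSec));
  }
  return delays;
}

// With the defaults (base 5s, max_attempts 5):
const retryDelaysSec = backoffSchedule(5); // [5, 10, 20, 40]
const retryWindowSec = retryDelaysSec.reduce((sum, d) => sum + d, 0); // 75
```

If a dependency of yours is known to stay down for minutes rather than seconds, raising max_attempts for that job kind widens the window; each extra attempt doubles the last delay.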

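The partial index. Without it, the dispatch query scans done rows too. With it, the dispatcher reads only the small subset of live work, even when the table has millions of completed rows. One thing to verify with EXPLAIN: Postgres uses a partial index only when the query’s WHERE clause implies the index predicate, and the claim query also reads stale running rows, so the index predicate has to include running as well as pending. Skipping or mis-scoping this index turns the “Postgres queue” into the slow version that gets people to give up and install Redis.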

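One transaction per tick, committed only after the handler returns. The whole tick runs inside one transaction, so the claim and the final status update are atomic: other workers see the job either untouched or in its final state. Database writes made by your handler join that transaction only if they run on the same client, and the Handler type above receives just the payload, so as written a handler that queries through the pool commits independently. Pass the client into the handler if you want the job’s writes and its done marker to commit together. External side effects are never covered: if the worker dies after the email went out but before COMMIT, the job returns to pending and the email is sent again, so handlers that touch the outside world still need to be idempotent.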
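One way to let a handler share the tick’s transaction, sketched under assumptions: widen the handler signature so it receives the transaction’s client (the Handler type in the listing above only receives the payload), and wrap the call in a savepoint so that a failed statement inside the handler does not abort the transaction that still has to record the failure. Tx, TxHandler, and runInSavepoint are hypothetical names; Tx is just the small slice of a client that the sketch needs.

```typescript
// Minimal slice of a database client that this sketch relies on (an
// assumption, kept small so the example has no dependencies).
interface Tx {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Hypothetical wider handler signature: the handler gets the transaction's
// client, so its writes commit or roll back together with the job row.
type TxHandler = (payload: unknown, tx: Tx) => Promise<void>;

// Runs the handler inside a savepoint of an already-open transaction.
// Returns null on success, or the error after undoing the handler's writes.
async function runInSavepoint(
  tx: Tx,
  payload: unknown,
  handler: TxHandler,
): Promise<Error | null> {
  await tx.query('SAVEPOINT job_handler');
  try {
    await handler(payload, tx);
    await tx.query('RELEASE SAVEPOINT job_handler');
    return null;
  } catch (err) {
    // Undo only the handler's statements. The claim made earlier in the same
    // transaction survives, so the caller can still record the failure.
    await tx.query('ROLLBACK TO SAVEPOINT job_handler');
    return err instanceof Error ? err : new Error(String(err));
  }
}

// In tick(), the handler call would then look roughly like:
//   const err = await runInSavepoint(client, job.payload, handler);
//   if (err) await fail(client, job, err.message);
//   else     await client.query(`UPDATE jobs SET status = 'done' ...`, [job.id]);
```

Without the savepoint, a SQL error raised inside the handler would leave the transaction in an aborted state, and the UPDATE in fail would be rejected until a ROLLBACK.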

How to test it (the hammer)

A unit test that says “enqueue then worker tick processes one job” misses every interesting bug. The interesting bugs only happen under concurrency. The right test fires hundreds of jobs into the queue and runs many workers in parallel, then asserts that every job ran exactly once.

import { test, expect } from 'vitest';
import { pool, enqueue, startWorker } from './queue';

test('100 jobs across 10 workers run exactly once each', async () => {
  await pool.query('TRUNCATE jobs');

  const N = 100;
  const seen = new Map<number, number>();

  // Enqueue 100 jobs.
  for (let i = 0; i < N; i++) {
    await enqueue('count', { i });
  }

  // 10 concurrent workers.
  const stops = Array.from({ length: 10 }, () =>
    startWorker({
      'count': async (payload: any) => {
        seen.set(payload.i, (seen.get(payload.i) ?? 0) + 1);
        // Pretend the work takes some time so the SKIP LOCKED race is interesting.
        await new Promise((r) => setTimeout(r, 5 + Math.random() * 20));
      },
    }),
  );

  // Wait for the queue to drain.
  while (true) {
    const { rows } = await pool.query(
      `SELECT count(*)::int AS n FROM jobs WHERE status != 'done'`,
    );
    if (rows[0].n === 0) break;
    await new Promise((r) => setTimeout(r, 50));
  }

  stops.forEach((stop) => stop());

  // The actual assertion that matters.
  expect(seen.size).toBe(N);
  for (let i = 0; i < N; i++) {
    expect(seen.get(i)).toBe(1); // exactly once, not at-least-once
  }
});

Run it ten times in a row. If even one run produces a duplicate, you have a race somewhere, and almost always the cause is dropping FOR UPDATE SKIP LOCKED from the inner SELECT (SKIP LOCKED is only valid as part of a locking clause, so you cannot keep one without the other) or replacing the single UPDATE ... WHERE id = (...) statement with a plain SELECT followed by a separate UPDATE.

For a heavier check, scale N to 10,000 and watch pg_stat_activity while it runs. You should see each worker session busy with its own row, no sessions with wait_event_type = 'Lock', and CPU dominated by your handler rather than by the queue.

What you give up vs. Redis-based queues

The honest list, because there is no free lunch.

Throughput ceiling. A well-tuned SKIP LOCKED queue on a single Postgres primary saturates somewhere around 10,000 jobs/sec. Treat that as a ballpark and measure on your own hardware, because the real number depends on payload size, handler duration, and how long each transaction stays open. If you genuinely need more (high-volume ad-tech, telemetry ingestion), you want a streaming system such as Kafka or NATS rather than a job queue. Below that, which covers most SaaS products, the queue is unlikely to be your bottleneck; the handlers are.

Polling vs. push. The simple version polls every 500ms, so worst-case latency on a freshly enqueued job is half a second. For most “send an email after signup” use cases that is invisible. If you need lower, Postgres has LISTEN/NOTIFY and you can wire enqueue to fire a NOTIFY queue_jobs and have workers LISTEN for it and only poll on the wakeup. That is another 30 lines of code.
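A sketch of the worker half of that wiring, under stated assumptions. Wakeup, JobListener, and listenForJobs are hypothetical names; JobListener is the small slice of a dedicated (non-pooled) client the sketch needs, namely query plus the 'notification' event. The enqueue side would issue NOTIFY queue_jobs in the same transaction as the INSERT, so the notification is delivered when that transaction commits.

```typescript
// A sleep that ends early when wake() is called. A wake() that arrives while
// the worker is busy is remembered, so the next wait() returns immediately.
// Assumes a single waiter, i.e. one Wakeup per worker loop.
class Wakeup {
  private signaled = false;
  private release: (() => void) | null = null;

  get hasPendingSignal(): boolean {
    return this.signaled;
  }

  wake(): void {
    if (this.release) this.release(); // a waiter is parked: resume it now
    else this.signaled = true;        // nobody waiting: remember the signal
  }

  wait(timeoutMs: number): Promise<void> {
    if (this.signaled) {
      this.signaled = false;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      const finish = () => {
        clearTimeout(timer);
        this.release = null;
        resolve();
      };
      const timer = setTimeout(finish, timeoutMs);
      this.release = finish;
    });
  }
}

// Slice of a dedicated client used for LISTEN. Notifications are delivered
// to the session that issued LISTEN, so this connection must stay open and
// must not be returned to the pool.
interface JobListener {
  query(sql: string): Promise<unknown>;
  on(event: 'notification', cb: (msg: { channel: string }) => void): unknown;
}

async function listenForJobs(listener: JobListener, wakeup: Wakeup): Promise<void> {
  listener.on('notification', (msg) => {
    if (msg.channel === 'queue_jobs') wakeup.wake();
  });
  await listener.query('LISTEN queue_jobs');
}

// In the worker loop, the idle branch would become roughly:
//   if (!ran) await wakeup.wait(POLL_INTERVAL_MS);
```

Keeping the timeout means polling remains as a fallback: a notification sent while the listening connection is down is simply lost, and the next poll picks the job up anyway.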

Long-running jobs. A 30-minute video-encoding job inside a database transaction is a bad idea: the transaction pins a pooled connection for 30 minutes, holds back vacuum for as long as it stays open, and will be killed if idle_in_transaction_session_timeout is set lower than the job’s runtime. The fix is to use the Postgres queue only as the dispatch layer: claim the job, commit immediately, then run the work outside the transaction with a heartbeat that updates locked_at periodically. Once the claim is committed, the visibility-timeout clause in claim is what recovers jobs from dead workers. Most queues, BullMQ included, work this way under the hood.
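A minimal sketch of the heartbeat half, assuming the claim has already been committed. withHeartbeat and the beat callback are hypothetical names; in a real worker, beat would run something like UPDATE jobs SET locked_at = now() WHERE id = $1 through the pool, outside any long transaction.

```typescript
// Runs `work` while calling `beat` every `everyMs` milliseconds, and always
// stops the timer afterwards, whether the work succeeds or throws.
async function withHeartbeat<T>(
  work: () => Promise<T>,
  beat: () => Promise<void>,
  everyMs: number,
): Promise<T> {
  const timer = setInterval(() => {
    // A failed beat must not crash the job. If beats keep failing, locked_at
    // goes stale and the visibility timeout hands the job to another worker.
    beat().catch((err) => console.error('[heartbeat] failed', err));
  }, everyMs);
  try {
    return await work();
  } finally {
    clearInterval(timer);
  }
}

// Usage sketch (encodeVideo and touchJob are placeholders):
//   await withHeartbeat(
//     () => encodeVideo(job.payload),
//     () => touchJob(job.id),            // refreshes locked_at
//     (VISIBILITY_TIMEOUT_SEC * 1000) / 3,
//   );
```

Beating at roughly a third of the visibility timeout is an assumption rather than a rule; the point is that a couple of missed beats should not be enough for another worker to reclaim a job that is still running.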

Cross-database side effects. If your handler writes to a different database than the one storing jobs, you do not get the atomic “job done + side effect committed” property. Use idempotency keys on the downstream side and accept at-least-once. (See the idempotency-key post — the same fingerprint trick applies.)

Migrations and noisy neighbors. A misbehaving handler that runs a 60-second, pg_dump-style query against the queue table is going to hurt your application traffic too. Either move the queue to its own database or instance, at the cost of the same-transaction guarantee, or accept that your queue table needs the same care as any hot OLTP table.

The metrics that prove it shipped

Two queries worth running on a schedule, with an alert on each.

-- 1. Are jobs piling up? Should be near zero in steady state.
SELECT count(*) FROM jobs WHERE status = 'pending' AND run_at <= now();

-- 2. Are jobs getting stuck in running? Should also be zero.
SELECT count(*) FROM jobs
 WHERE status = 'running'
   AND locked_at < now() - interval '5 minutes';

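The first query is the alert that catches a worker pool that has crashed without restarting: with nobody claiming, eligible jobs pile up. The second only fires for claims that were committed before the work ran (the long-running jobs variant above); with the single-transaction tick, a dead worker’s claim is rolled back and the job shows up in the first query instead. Pair both with a pg_stat_activity check (“are there any workers connected at all?”) and you will know within minutes when nothing is processing, instead of finding out via an angry customer four hours later.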

For longer-term visibility, record when each job finishes (add a finished_at column and set finished_at = now() in the UPDATE that marks the job done) and build a histogram of finished_at - run_at per kind, which is the time from when a job became eligible to when it completed. Per-kind p99 of that value is the metric that catches a slow-creep regression in a downstream API.
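If you compute the percentile in application code rather than in SQL, a nearest-rank sketch is enough for a digest. The Completion shape and both function names are assumptions; durationMs stands for whatever interval you chose to record.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

type Completion = { kind: string; durationMs: number };

// Groups completions by job kind and reports the p99 duration for each.
function p99ByKind(completions: Completion[]): Map<string, number> {
  const byKind = new Map<string, number[]>();
  for (const c of completions) {
    const bucket = byKind.get(c.kind) ?? [];
    bucket.push(c.durationMs);
    byKind.set(c.kind, bucket);
  }
  const result = new Map<string, number>();
  for (const [kind, durations] of byKind) {
    result.set(kind, percentile(durations, 99));
  }
  return result;
}
```

With only a handful of samples per kind, p99 is simply the slowest job, so the number becomes meaningful once each kind has a few hundred completions in the window.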

The takeaway

The reflex to add Redis the moment a feature needs background work is one of the cleanest examples of premature infrastructure. You inherit a second persistence system, a second backup story, a second monitoring stack, a second deployment pipeline, and a second on-call surface — to run code that Postgres can dispatch with one keyword.

Start with SKIP LOCKED. The 80 lines above will run a real product to a million users. When (if) you outgrow it, the migration to a “real” queue is straightforward because your handlers are already idempotent, your retry policy is already explicit, and your visibility timeout is already a number you have tuned. Skipping the Redis-shaped detour means the next time you spin up a service that needs to send an email, you do not also have to provision a cluster.

The next time someone on your team opens an RFC titled “Choosing a Job Queue,” forward them the table schema. They will get further faster.

A note from Yojji

The work that decides whether a queue silently drops jobs at 3 a.m. — visibility timeouts, retry policies, dead-letter handling, cross-database transactional outboxes — is the kind of unglamorous backend reliability work that adds up to whether a product feels solid in production. It is also the kind of work Yojji has been shipping since 2016.

Yojji is an international custom software development company with offices across Europe, the US, and the UK. Their teams specialize in the JavaScript stack (React, Node.js, TypeScript), cloud platforms (AWS, Azure, Google Cloud), and microservices architecture, and run dedicated outstaffed teams alongside full-cycle product engagements covering discovery, design, development, QA, and DevOps.

If your team is choosing between adding Redis and adding a senior backend engineer who has built three production queues, Yojji is a sensible second option to consider.