The Practical Developer

Cursor Pagination: Why Offset Queries Explode at Scale and How to Fix Them

Offset pagination looks fine on page one and falls apart on page two hundred. Here is the exact SQL and Node.js code to replace it with cursor-based pagination that stays fast, avoids duplicate rows, and survives concurrent writes.

Stack of open books with pages turning — the difference between jumping to a random page and reading in order

The support ticket said users were seeing the same product twice on the catalog page. The frontend team checked the React keys. The backend team checked the API response. Both looked correct. Then somebody noticed the duplication only happened when two users loaded the catalog at the same time — one on page 12, one on page 13. A new product had been inserted between the queries. Every row after the insertion point shifted down by one. The user on page 12 saw a duplicate. The user on page 13 missed an item entirely.

That is not a frontend bug. It is the physics of OFFSET pagination in a database with concurrent writes. And it gets worse: OFFSET 24000 LIMIT 20 does not jump to row 24000. It scans 24020 rows and throws away 24000 of them. The p95 of your product catalog endpoint is not 80ms. It is a linear function of the page number, and by page 200 it is over two seconds.

Cursor pagination — sometimes called keyset pagination — fixes both problems. This post is the exact SQL and Node.js code to replace OFFSET, avoid shifted rows, and keep page 200 as fast as page one.

Why OFFSET is a time bomb

Most pagination starts like this:

SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 24000;

It returns 20 rows, so it looks correct. But the query plan tells the truth. Postgres scans the index in created_at order until it has read 24020 rows, sorts them if needed, then returns the last 20. The cost is O(offset + limit). Page 1 costs 20 rows. Page 1000 costs 20,020 rows. The latency graph is a straight line upward, and your database CPU graph follows it.

The row-shifting problem is harder to see in metrics and easier to see in angry tickets. If a row is inserted or deleted between the query for page 12 and the query for page 13, the window slides. A product that was row 240 on page 12 becomes row 241 on page 13. Page 12 shows it at the bottom. Page 13 shows it at the top. To the user, the same item appears twice. If a row is deleted, a slot disappears entirely. Neither bug is catchable by unit tests unless your test suite simulates concurrent inserts during paginated reads, which almost nobody’s does.

How cursor pagination actually works

Instead of asking the database to skip 24,000 rows, you remember where you stopped and ask for the next 20 rows after that point.

SELECT * FROM products
WHERE (created_at, id) < ('2024-06-01T10:30:00Z', 48291)
ORDER BY created_at DESC, id DESC
LIMIT 21;

The database uses the index to find the starting position, then walks forward exactly 21 rows. The cost is O(limit) regardless of how deep you are in the list. Page 1 and page 10,000 cost the same.

Because the query is anchored to a real value — a timestamp and an ID — inserting a new row at the top does not shift the boundary between page 12 and page 13. The new row falls before the cursor, so it simply does not appear in the current traversal. No duplicates, no missing items. The trade-off is that you can no longer jump directly to page 437. That is usually fine. Users almost never jump to page 437 unless you are building an admin tool.

Choosing the cursor column

The cursor must be a combination of columns that is:

  1. Strictly ordered — the ORDER BY clause you expose to the client.
  2. Indexed — otherwise the WHERE clause is a sequential scan and you have solved nothing.
  3. Unique — or combined with a unique tie-breaker. If you sort by created_at alone and two rows have the same timestamp, the relative order is undefined between queries. The database can return them in different orders on different pages, producing the same duplication bug you were trying to escape.

The safest default is a composite sort: the business column you want to sort by, plus the primary key as a tie-breaker.

CREATE INDEX idx_products_cursor
ON products (created_at DESC, id DESC);

If your primary query is ORDER BY created_at DESC, id DESC, the index covers the sort and the filter. Use EXPLAIN (ANALYZE, BUFFERS) to confirm an index scan, not a heap scan with a filter node. If you see a sort step in the plan, the index does not match the ORDER BY and the database is reordering every page.

Forward pagination: the query builder

The API receives a ?cursor= string. Decode it, run the query, encode the last row of the result set as the next cursor, and return it. Request 21 rows instead of 20. If the 21st row exists, there is another page. Strip it before returning the response.

import { Pool } from 'pg';
import crypto from 'node:crypto';

const ALG = 'aes-256-gcm';
const KEY = Buffer.from(process.env.CURSOR_KEY!, 'hex'); // 32 bytes

function encodeCursor(payload: Record<string, unknown>): string {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv(ALG, KEY, iv);
  const plaintext = JSON.stringify(payload);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag();
  const combined = Buffer.concat([iv, authTag, encrypted]);
  return combined.toString('base64url');
}

function decodeCursor(cursor: string): Record<string, unknown> {
  const combined = Buffer.from(cursor, 'base64url');
  const iv = combined.subarray(0, 12);
  const authTag = combined.subarray(12, 28);
  const encrypted = combined.subarray(28);
  const decipher = crypto.createDecipheriv(ALG, KEY, iv);
  decipher.setAuthTag(authTag);
  const plaintext = Buffer.concat([decipher.update(encrypted), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8'));
}

Do not base64 the raw primary key and expose it to the client. A cursor is an opaque token. Encrypt it so users cannot reverse-engineer IDs, guess the next page by incrementing a counter, or construct arbitrary cursors that bypass access controls. AES-256-GCM gives you both secrecy and tamper detection in one standard call.

The query builder:

type SortDir = 'asc' | 'desc';

interface CursorPageArgs {
  sort: Array<{ field: string; dir: SortDir }>;
  limit: number;
  cursor?: string;
}

async function cursorPage<T extends Record<string, unknown>>(
  db: Pool,
  table: string,
  args: CursorPageArgs,
  additionalWhere?: string,
  additionalParams?: unknown[],
): Promise<{ rows: T[]; nextCursor: string | null }> {
  const limit = Math.min(args.limit, 100); // hard ceiling
  const fetchLimit = limit + 1;
  const sort = args.sort;
  const cursorValues = args.cursor ? decodeCursor(args.cursor) : null;

  // Build ORDER BY
  const orderClauses = sort.map((s) => `${s.field} ${s.dir.toUpperCase()}`).join(', ');

  // Build WHERE for cursor continuation
  const whereParts: string[] = [];
  const params: unknown[] = [];
  let paramIndex = 1;

  if (additionalWhere) {
    whereParts.push(`(${additionalWhere})`);
    params.push(...(additionalParams ?? []));
    paramIndex = params.length + 1;
  }

  if (cursorValues) {
    // (a, b) > (1, 2) means a > 1 OR (a = 1 AND b > 2)
    const comparisons: string[] = [];
    const eqPrefix: string[] = [];

    for (let i = 0; i < sort.length; i++) {
      const s = sort[i];
      const col = s.field;
      const op = s.dir === 'asc' ? '>' : '<';
      comparisons.push(
        `(${eqPrefix.join(' AND ')}${eqPrefix.length ? ' AND ' : ''}${col} ${op} $${paramIndex++})`,
      );
      eqPrefix.push(`${col} = $${paramIndex - 1}`);
    }

    whereParts.push(`(${comparisons.join(' OR ')})`);
    sort.forEach((s) => params.push(cursorValues[s.field]));
  }

  const whereClause = whereParts.length ? `WHERE ${whereParts.join(' AND ')}` : '';

  const sql = `
    SELECT * FROM ${table}
    ${whereClause}
    ORDER BY ${orderClauses}
    LIMIT ${fetchLimit}
  `;

  const result = await db.query<T>(sql, params);
  const hasNext = result.rows.length > limit;
  const rows = hasNext ? result.rows.slice(0, limit) : result.rows;

  let nextCursor: string | null = null;
  if (hasNext && rows.length > 0) {
    const last = rows[rows.length - 1];
    const payload: Record<string, unknown> = {};
    sort.forEach((s) => {
      payload[s.field] = last[s.field];
    });
    nextCursor = encodeCursor(payload);
  }

  return { rows, nextCursor };
}

The comparisons logic builds the standard tuple-comparison pattern: (a, b, c) > (1, 2, 3) expands to a > 1 OR (a = 1 AND b > 2) OR (a = 1 AND b = 2 AND c > 3). Postgres optimizes this correctly when the index column order matches the tuple. Keep the ORDER BY and the index in the same column order.

The Express route

app.get('/api/products', async (req, res) => {
  const cursor = req.query.cursor as string | undefined;
  const limit = Number(req.query.limit ?? 20);

  const { rows, nextCursor } = await cursorPage(
    dbPool,
    'products',
    {
      sort: [
        { field: 'created_at', dir: 'desc' },
        { field: 'id', dir: 'desc' },
      ],
      limit,
      cursor,
    },
    'status = $1',
    ['active'],
  );

  res.json({
    data: rows,
    pagination: {
      next_cursor: nextCursor,
      has_next_page: !!nextCursor,
    },
  });
});

This shape is clean for both mobile infinite scroll and frontend table “load more” buttons. The client passes next_cursor back as the cursor query param for the next request. No page numbers, no math, no drift.

Backward pagination (optional, but expected)

If the client wants a “previous” button or bidirectional infinite scroll, you also need to paginate backward. The cleanest approach is to reverse the sort direction, run the same cursor logic, and reverse the result set.

async function cursorPageBidirectional<T extends Record<string, unknown>>(
  db: Pool,
  table: string,
  args: CursorPageArgs & { direction?: 'next' | 'prev' },
  additionalWhere?: string,
  additionalParams?: unknown[],
): Promise<{ rows: T[]; nextCursor: string | null; prevCursor: string | null }> {
  const direction = args.direction ?? 'next';
  const sort = args.sort.map((s) => ({
    ...s,
    dir: direction === 'prev' ? flipDir(s.dir) : s.dir,
  }));

  const { rows, nextCursor } = await cursorPage<T>(
    db, table, { ...args, sort }, additionalWhere, additionalParams,
  );

  const data = direction === 'prev' ? rows.reverse() : rows;

  // prevCursor: if we fetched in forward direction, the first row of the
  // current page becomes the backward cursor from the previous set.
  // A full implementation also stores the prior cursor in the response.
  // For simplicity, return the nextCursor only; clients that need prev
  // can keep a stack of prior cursors.
  return { rows: data, nextCursor, prevCursor: null };
}

function flipDir(d: SortDir): SortDir {
  return d === 'asc' ? 'desc' : 'asc';
}

A simpler pattern that scales: do not support backward cursors at all. Return only next_cursor. If the client needs a previous page, it keeps its own history of visited cursors. This is what Slack, Twitter, and GitHub do in their timeline APIs. It removes the complexity of symmetric cursors and the edge cases around boundary rows appearing in both directions.

The edge cases that break naive implementations

Timestamps with collisions. If 100 products are imported in a batch and share the same created_at value to the microsecond, the tie-breaker primary key is not optional. Without it, the database can return the same timestamp slice in a different order on the next query. Always include the primary key as the final sort column.

Nulls in the sort column. Postgres sorts nulls as larger than all other values by default (NULLS LAST in ascending, NULLS FIRST in descending — actually the default is NULLS FIRST for DESC and NULLS LAST for ASC). If your cursor column allows nulls, the client-visible behavior changes depending on which side of the list nulls land. Either forbid nulls in cursor columns, or normalize them in the query:

ORDER BY COALESCE(created_at, '1970-01-01') DESC, id DESC

Changing the query shape after cursors have been issued breaks existing cursors. Schema changes that affect ordering must be deployed with a cursor-version bump or a forced reset (ignore the cursor and restart from the top).

Filter changes mid-traversal. If the user switches from “show all” to “in stock only” while holding a cursor from the unfiltered list, the cursor is invalid for the new filter. The correct behavior is to drop the cursor and restart from the first page. Do not try to adapt the cursor to a changed filter; the math is brittle and the benefit is zero.

Non-unique sort without tie-breaker. If someone removes the primary key from the sort tuple, the database is allowed to return the same row on two consecutive pages. Your test suite should include a test that inserts 50 rows with identical timestamps, paginates through them, and asserts no duplicate IDs in the union of all pages.

When OFFSET is still the right tool

Cursor pagination is not strictly better. It is better for linear traversal. OFFSET is still correct when:

  • The dataset is small and static enough that scanning is not expensive and rows will not shift. A 300-row admin table is fine with OFFSET.
  • The user needs direct random access: “jump to page 47.” Search engine result pages sometimes still use OFFSET because users expect numbered pagination. The performance cliff is accepted because the dataset is bounded and cached.
  • You are exporting or reporting, not serving interactive UI. An ETL job that reads 10,000-row chunks by OFFSET is acceptable if it runs once a night and the table is not concurrently modified during the export.

Everything else should use cursors.

Testing the correctness

A pagination implementation is not done when the first page returns 20 rows. It is done when the following tests pass:

  1. Exhaustive traversal. Insert exactly 55 rows. Paginate with limit 10. Assert 6 pages, 10 rows each except the last with 5, and no duplicate IDs across all pages.
  2. Concurrent mutation. Start paginating with limit 5. After page 2, insert a new row that belongs at the top of the sort. Continue paginating. Assert no duplicate rows and the total count of unique rows seen equals the original count plus the new row minus any deleted rows.
  3. Empty cursor restart. Send a garbage cursor. Expect the API to return either a clean 400 or page one. Never return a 500 or an opaque database syntax error.
  4. Deep page latency. Insert 100,000 rows. Measure the latency of the first page, the 1,000th page, and the 5,000th page with OFFSET and with a cursor. The OFFSET latency should grow. The cursor latency should stay flat.

The practical takeaway

OFFSET pagination is the default because ORMs generate it automatically and because it looks correct in a unit test. It is not correct in production. It is slow, resource-heavy, and produces duplicate or missing rows the moment your database has more than one concurrent connection.

Replace it with cursor pagination: an encrypted token that encodes the last sort values, a WHERE clause that uses tuple comparison, a composite index that covers the sort and the filter, and a query that fetches limit + 1 rows to detect the next page. The SQL stays flat. The user stops seeing ghosts. The database does not die on page 500.

A note from Yojji

The difference between an API that works in staging and one that survives production traffic is usually not the framework — it is the query pattern under the endpoint. Cursor pagination, connection pooling, and the index design that makes them fast are the kind of backend fundamentals Yojji’s teams build into the products they deliver.

Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their dedicated teams ship across the JavaScript ecosystem, cloud platforms, and full-cycle product engineering — including the database design decisions that keep APIs fast at production scale.