Stale-While-Revalidate: The Caching Pattern That Makes 0ms Feel Normal

You have a perfectly tuned cache-aside layer. Redis responds in 2ms. Your p50 latency is 8ms. You are proud of the cache stampede protection you added after the last incident, the single-flight lock that keeps Postgres from getting 4,000 identical queries at once when a hot key expires.

But here is the thing you still feel: every cache miss. Even with single-flight, the first request after a key expires waits for the full database query. 380ms if the query is fast. 2 seconds if it is slow. The user staring at a loading spinner does not care about the elegant single-flight logic that saved the other 3,999 requests. They care that this one request took 2 seconds.

Stale-while-revalidate (SWR) is the pattern that removes that wait for every key already in the cache. Instead of letting an expired entry force the next request to block on a refresh, you serve the stale (expired) value immediately and refresh the cache in the background. The user gets a response in cache-hit time. The database is queried asynchronously. For hot keys, the word “miss” effectively disappears from your vocabulary.

This post implements the SWR pattern in Node.js with Redis: the TTL strategy that enables it, the revalidation lock that prevents thundering herds on the async refresh, the stale buffer that keeps data fresh enough, and the metrics that tell you whether your SWR window is right. By the end you will have a ready-to-deploy cache client that makes users wait for the database only on a cold miss.

The problem with cache-expire-and-block

Standard cache-aside works like this:

1. Check cache for key
2. If found and not expired -> return it (hit)
3. If expired or missing -> query database, write cache, return (miss)

Step 3 is the problem. The user cannot get a response until the database finishes. Even with single-flight, where only one request hits the database and the rest wait on a promise, the first user in line waits for the full query.

Here is what that looks like in code you have probably written:

async function getWithTTL<T>(key: string, ttlSec: number, fetch: () => Promise<T>): Promise<T> {
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }

  // Cache miss. User waits.
  const value = await fetch();
  await redis.set(key, JSON.stringify(value), 'EX', ttlSec);
  return value;
}

That await fetch() is the pause. For a 200ms database query, the user waits 200ms. For a 2-second aggregation, they wait 2 seconds. If your TTL is 60 seconds and your key is hot (1,000 req/s), this happens 1,440 times a day for every key. Over a fleet of 200 endpoints, that is a lot of slow responses.

The naive fix is to increase the TTL. But long TTLs mean stale data. The dashboard shows numbers from 15 minutes ago. The inventory count is wrong. The user refreshes and sees a different result. Long TTLs trade freshness for speed, and eventually someone files a bug about “the data is not updating.”

SWR solves this by separating the cache into two layers: the “serve from” value and the “fresh until” timestamp.

How SWR works

The core idea comes from HTTP RFC 5861 (which defines the stale-while-revalidate Cache-Control directive) and has been popularized by client libraries like SWR and React Query. The server-side version works like this:

1. Check cache for key
2. If found and TTL not expired -> return immediately (fresh hit)
3. If found, TTL expired, but still within the stale window -> return immediately (stale hit), schedule async refresh
4. If not found or beyond stale window -> query database, write cache, return (cold miss)

The key insight: step 3 returns the stale data instantly. The user gets a response in cache-read time (2ms). Meanwhile, a background promise refreshes the cache. Requests that arrive after the refresh completes get fresh data.

This changes the cache hit/miss curve from a binary cliff into a soft slope:

Fresh hit: no database wait
Stale hit (SWR): no database wait
Cold miss (key absent or past its stale window): full query wait (rare for hot keys)

In practice, cold misses only happen on first access after a deployment or for long-tail keys that expire completely. For hot keys, the SWR window keeps them perpetually warm.
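
As a rough sketch of that decision logic, here is the four-step read path reduced to a pure function over the two timestamps. The names classify, CacheState, and Timestamps are illustrative and not taken from any library.

```typescript
// Hypothetical helper: names are illustrative, not part of any library.
type CacheState = 'fresh' | 'stale' | 'dead';

interface Timestamps {
  expiresAt: number; // Unix ms: fresh until this instant
  staleAt: number;   // Unix ms: usable as stale until this instant
}

function classify(now: number, ts: Timestamps | null): CacheState {
  if (ts === null) return 'dead';          // never cached, expired, or evicted
  if (now < ts.expiresAt) return 'fresh';  // step 2: fresh hit
  if (now < ts.staleAt) return 'stale';    // step 3: stale hit, refresh in background
  return 'dead';                           // step 4: cold miss, caller must block
}

// A 30 s TTL with a 60 s stale window, written at t = 0.
const entry: Timestamps = { expiresAt: 30000, staleAt: 90000 };
const states = [10000, 45000, 95000].map((t) => classify(t, entry));
console.log(states.join(', ')); // fresh, stale, dead
```

Only the 'dead' state makes a request wait on the database; the other two return whatever is in the cache.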

The Redis SWR implementation

The trick is storing the expiration timestamp alongside the data so you can distinguish “fresh” from “stale but usable.” A Redis hash works perfectly:

interface SwrEntry<T> {
  data: T;
  expiresAt: number;   // Unix ms: when fresh -> stale
  staleAt: number;     // Unix ms: when stale -> dead
}

// Cache module
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

// Track in-flight revalidations to avoid duplicate refreshes
const pendingRefreshes = new Map<string, Promise<void>>();

export async function swrGet<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): Promise<T> {
  const now = Date.now();

  // Read the raw hash
  const raw = await redis.hGetAll(key);

  // Cold start: never cached, or stale window expired
  if (!raw || !raw.data) {
    const value = await fetch();
    const entry: SwrEntry<T> = {
      data: value,
      expiresAt: now + ttlSec * 1000,
      staleAt: now + (ttlSec + staleSec) * 1000,
    };
    await redis.hSet(key, {
      data: JSON.stringify(value),
      expiresAt: String(entry.expiresAt),
      staleAt: String(entry.staleAt),
    });
    await redis.expire(key, ttlSec + staleSec);
    return value;
  }

  const expiresAt = Number(raw.expiresAt);
  const staleAt = Number(raw.staleAt);
  const data: T = JSON.parse(raw.data);

  // Case 1: Still fresh, return immediately
  if (now < expiresAt) {
    return data;
  }

  // Case 2: Stale but within SWR window, return stale + refresh async
  if (now < staleAt) {
    // Fire-and-forget refresh (with dedup)
    scheduleRefresh(key, ttlSec, staleSec, fetch);
    return data;
  }

  // Case 3: Beyond stale window, cold refresh
  const value = await fetch();
  const entry: SwrEntry<T> = {
    data: value,
    expiresAt: now + ttlSec * 1000,
    staleAt: now + (ttlSec + staleSec) * 1000,
  };
  await redis.hSet(key, {
    data: JSON.stringify(value),
    expiresAt: String(entry.expiresAt),
    staleAt: String(entry.staleAt),
  });
  await redis.expire(key, ttlSec + staleSec);
  return value;
}

The scheduleRefresh function handles the async revalidation:

function scheduleRefresh<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): void {
  // Deduplicate: if a refresh is already in flight, skip
  if (pendingRefreshes.has(key)) return;

  const refreshPromise = (async () => {
    try {
      const value = await fetch();
      const now = Date.now();
      const entry: SwrEntry<T> = {
        data: value,
        expiresAt: now + ttlSec * 1000,
        staleAt: now + (ttlSec + staleSec) * 1000,
      };
      await redis.hSet(key, {
        data: JSON.stringify(value),
        expiresAt: String(entry.expiresAt),
        staleAt: String(entry.staleAt),
      });
      await redis.expire(key, ttlSec + staleSec);
    } catch (err) {
      // Refresh failed. The stale data stays in cache.
      // The next request will try again.
      console.error(`SWR refresh failed for key ${key}:`, err);
    } finally {
      pendingRefreshes.delete(key);
    }
  })();

  pendingRefreshes.set(key, refreshPromise);

  // Prevent unhandled rejection by attaching a noop catch
  refreshPromise.catch(() => {});
}

This is deliberately simple and synchronous-looking for the caller. The swrGet function always returns a value in cache-read time (cold start aside). The refresh happens out of band. If the refresh fails, the stale data remains in cache and the next request will schedule another refresh. The entry only disappears if refreshes keep failing until the stale window runs out.
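
One gap worth noting in swrGet as written: the cold path calls fetch() directly, so concurrent cold misses for the same key each run their own query. If you do not already have single-flight in front of it, a minimal in-process sketch could look like this. The names singleFlight, inFlight, and slowQuery are hypothetical, and slowQuery only simulates a database call.

```typescript
// Hypothetical in-process single-flight helper; all names are illustrative.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>; // join the load already running

  // Start the load on the next microtask so the Map entry exists before any
  // cleanup can run, even if load() throws synchronously.
  const p = Promise.resolve()
    .then(load)
    .then(
      (value) => {
        inFlight.delete(key); // clear the slot so the next miss can reload
        return value;
      },
      (err) => {
        inFlight.delete(key); // clear on failure too, so callers can retry
        throw err;
      },
    );
  inFlight.set(key, p);
  return p;
}

// Simulated database call that counts how often it actually runs.
let dbCalls = 0;
async function slowQuery(): Promise<string> {
  dbCalls += 1;
  await new Promise((resolve) => setTimeout(resolve, 20));
  return 'row';
}
```

Wrapping the cold-path call as singleFlight(key, fetch) would collapse concurrent cold misses inside one process into a single query. It does nothing across processes.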

What about the thundering herd on the async refresh?

Single-flight on cache miss is the standard defense against a thundering herd. SWR has a similar problem on the refresh path: when a hot key crosses its TTL, 10,000 requests may all read the stale value within the same second, and each of them calls scheduleRefresh. The pendingRefreshes Map handles this: within one Node.js process, only one refresh per key is ever in flight, regardless of how many requests read stale data. Because the Map lives in process memory, a fleet of N instances can still run up to N concurrent refreshes for the same key.

But there is a subtler problem. What if the SWR window is 30 seconds, the database is slow (800ms per query), and all 10,000 requests arrive within that 800ms window? The first request triggers the async refresh. The other 9,999 skip because pendingRefreshes already has the key. One database query. Good.

But what if the refresh takes 2 seconds and the stale window expires before the refresh completes? Then the next batch of requests after the stale window expires hits a cold miss and blocks on a synchronous refresh. This is the SWR equivalent of a cache stampede.

The fix is to extend the stale window when a refresh is in flight:

async function scheduleRefresh<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): Promise<void> {
  if (pendingRefreshes.has(key)) return;

  const refreshPromise = (async () => {
    try {
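      // Extend staleAt so concurrent readers keep getting stale data.
      // The key's Redis TTL must move too, or Redis deletes the hash at the
      // original deadline and the extended staleAt is never read.
      const now = Date.now();
      await redis.hSet(key, 'staleAt', String(now + staleSec * 1000));
      await redis.expire(key, staleSec);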

      const value = await fetch();
      const freshNow = Date.now();
      const entry: SwrEntry<T> = {
        data: value,
        expiresAt: freshNow + ttlSec * 1000,
        staleAt: freshNow + (ttlSec + staleSec) * 1000,
      };
      await redis.hSet(key, {
        data: JSON.stringify(value),
        expiresAt: String(entry.expiresAt),
        staleAt: String(entry.staleAt),
      });
      await redis.expire(key, ttlSec + staleSec);
    } catch (err) {
      console.error(`SWR refresh failed for key ${key}:`, err);
    } finally {
      pendingRefreshes.delete(key);
    }
  })();

  pendingRefreshes.set(key, refreshPromise);
  refreshPromise.catch(() => {});
}

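The await redis.hSet(key, 'staleAt', ...) at the start of the refresh pushes the stale window forward, so any request that arrives while the refresh is in flight still returns the stale data. The key’s Redis expiry has to move along with it; otherwise Redis deletes the hash at the original deadline no matter what staleAt says. With both extended, a cold miss happens only when the key is absent or past its stale window with no refresh in flight: on initial population, after a Redis eviction, or when a key goes unrequested for longer than the whole TTL plus stale window.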
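
The pendingRefreshes Map only deduplicates within a single Node.js process. If you run several instances, each one can start its own refresh for the same key. One common way to narrow that down is a short-lived Redis lock taken with SET NX PX before refreshing. The sketch below is an illustration under assumptions: LockClient describes only the subset of a client it needs (modeled on how node-redis v4 accepts { NX, PX } options and resolves to null when the key already exists), and refreshWithLock, InMemoryLockClient, and the swr:lock: prefix are made-up names.

```typescript
// Assumed subset of a Redis client: SET with NX and PX, resolving to 'OK'
// when the key was set and null when it already existed.
interface LockClient {
  set(key: string, value: string, opts: { NX: true; PX: number }): Promise<string | null>;
}

// Hypothetical helper. Resolves to true if this caller won the lock and ran refresh().
async function refreshWithLock(
  client: LockClient,
  key: string,
  lockMs: number,
  refresh: () => Promise<void>,
): Promise<boolean> {
  const acquired = await client.set(`swr:lock:${key}`, '1', { NX: true, PX: lockMs });
  if (acquired === null) return false; // someone else is already refreshing

  // The lock is not deleted afterwards. It expires after lockMs, which also
  // rate-limits retries while refresh() keeps failing.
  await refresh();
  return true;
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemoryLockClient implements LockClient {
  private expiry = new Map<string, number>();

  async set(key: string, _value: string, opts: { NX: true; PX: number }): Promise<string | null> {
    const now = Date.now();
    const until = this.expiry.get(key);
    if (until !== undefined && until > now) return null; // lock still held
    this.expiry.set(key, now + opts.PX);
    return 'OK';
  }
}
```

Pick lockMs a little longer than your slowest expected refresh. Too short and two instances overlap; too long and a crashed refresher blocks retries until the lock expires.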

Picking your TTL and stale window

The two numbers that control SWR behavior are the TTL (how long data is “fresh”) and the stale window (how long after the TTL you accept stale data before forcing a refresh).

A good starting point for most endpoints:

TTL: 30 seconds. Short enough that a hot key is refreshed about every 30 seconds. Most dashboards and API responses tolerate this.
Stale window: 60 seconds. A full minute of stale-serve coverage. The async refresh has 60 seconds to complete before any request would block.

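This means a key is cacheable for 90 seconds total (30 fresh + 60 stale). The data age at the user’s eyes is at most 90 seconds, and for a hot key it is usually close to 30 seconds plus the refresh time. The refresh rate is determined by the request rate: every request that reads stale data tries to schedule a refresh, but the dedup ensures only one refresh per key is in flight at a time, so a hot key is refreshed roughly once per TTL and an idle key is not refreshed at all.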

For slower endpoints (5-second database queries), increase the stale window to 120 seconds to give the refresh plenty of time. For endpoints where data freshness matters (inventory counts, available balances), reduce TTL to 5 seconds and stale window to 10 seconds. At 5/10, the data is at most 15 seconds old and the user never waits for a cache miss unless Redis evicts the key or the key goes more than 15 seconds without a request.

// Configuration presets
const swrConfigs = {
  dashboard:    { ttlSec: 30,  staleSec: 60  },  // Typical dashboard
  realtime:     { ttlSec: 5,   staleSec: 10  },  // Near realtime
  reference:    { ttlSec: 300, staleSec: 300 },  // Slow-changing reference data
  coldTolerant: { ttlSec: 10,  staleSec: 120 },  // Slow queries need large stale window
} as const;
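
To sanity-check a preset, it can help to compute the two numbers it implies: the oldest data a reader could see and roughly how many background refreshes a continuously hot key generates per day. This is a back-of-the-envelope sketch that ignores refresh duration and the in-flight stale window extension; maxAgeSec and refreshesPerDay are hypothetical helpers.

```typescript
interface SwrConfig {
  ttlSec: number;
  staleSec: number;
}

// Oldest value a reader can see: written at t = 0, served right before staleAt.
function maxAgeSec(cfg: SwrConfig): number {
  return cfg.ttlSec + cfg.staleSec;
}

// Upper bound for a continuously hot key: one refresh each time the TTL lapses.
function refreshesPerDay(cfg: SwrConfig): number {
  return Math.floor(86400 / cfg.ttlSec);
}

const presets: Record<string, SwrConfig> = {
  dashboard: { ttlSec: 30, staleSec: 60 },
  realtime: { ttlSec: 5, staleSec: 10 },
  reference: { ttlSec: 300, staleSec: 300 },
  coldTolerant: { ttlSec: 10, staleSec: 120 },
};

for (const name of Object.keys(presets)) {
  const cfg = presets[name];
  console.log(`${name}: max age ${maxAgeSec(cfg)}s, up to ${refreshesPerDay(cfg)} refreshes/day`);
}
```

For the dashboard preset this gives a 90-second worst-case age and at most 2,880 refreshes per day per hot key, which is the load your database sees in place of the original request rate.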

Metrics that matter

SWR hides latency, which means it can also hide problems. If your database is slow and every refresh takes 5 seconds but the stale window is 10 seconds, users never see the slowness. They see 2ms responses from stale data. Great for the user. Terrible for your ability to notice the database degrading.

You need four metrics exposed from your SWR client (three counters and one histogram):

// Counters for your metrics system (Prometheus, OpenTelemetry, etc.)
const swrHitsFresh = new Counter({ name: 'swr_hits_fresh_total', help: 'Served from fresh cache' });
const swrHitsStale = new Counter({ name: 'swr_hits_stale_total', help: 'Served from stale cache with async refresh' });
const swrMissesCold = new Counter({ name: 'swr_misses_cold_total', help: 'Cache empty, blocked on sync refresh' });
const swrRefreshDuration = new Histogram({ name: 'swr_refresh_duration_seconds', help: 'Time for background refresh' });

Track these and alert on:

swr_misses_cold_total increasing after warm-up: A key was not in cache at all. The first request for each key always counts once, so alert on the rate rather than on the raw total. If cold misses keep happening, your Redis memory is too small or your stale window is too short.
swr_refresh_duration_seconds p99 approaching the stale window: The refresh is barely making it. Increase the stale window or investigate the database query.
swr_hits_stale_total / (swr_hits_fresh_total + swr_hits_stale_total) > 0.5: More than half of your cache hits are stale. Either your TTL is too short, refreshes are slow or failing, or the data is accessed less frequently than you expected. Consider increasing TTL.

A healthy SWR endpoint in production should show near-zero cold misses after warm-up, < 20% stale hits (most hits should land in the fresh window), and refresh durations safely below the stale window.
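
If you want those alert conditions in one place, a small pure function over a metrics snapshot is enough to express them. The sketch below assumes counts taken over a single scrape window (not lifetime totals) and picks 80% of the stale window as the meaning of “approaching”; both the threshold and the names SwrSnapshot, staleRatio, and swrWarnings are assumptions.

```typescript
// Hypothetical snapshot of the SWR metrics over one scrape window.
interface SwrSnapshot {
  freshHits: number;
  staleHits: number;
  coldMisses: number;
  refreshP99Sec: number; // p99 of swr_refresh_duration_seconds
  staleSec: number;      // configured stale window for this endpoint
}

// Share of cache hits that were served stale (0 when there were no hits).
function staleRatio(s: SwrSnapshot): number {
  const hits = s.freshHits + s.staleHits;
  return hits === 0 ? 0 : s.staleHits / hits;
}

function swrWarnings(s: SwrSnapshot): string[] {
  const warnings: string[] = [];
  if (s.coldMisses > 0) {
    warnings.push('cold misses in window');
  }
  // Assumed threshold: treat 80% of the stale window as "approaching".
  if (s.refreshP99Sec >= 0.8 * s.staleSec) {
    warnings.push('refresh p99 near stale window');
  }
  if (staleRatio(s) > 0.5) {
    warnings.push('more than half of hits are stale');
  }
  return warnings;
}
```

In a real setup you would more likely write these as alert rules in your metrics system; the function form is mainly useful for tests and dashboards.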

Edge cases that will bite you

Redis eviction. If Redis runs out of memory and evicts your SWR key, the next request gets a cold miss. This is the one case where SWR cannot help. Mitigate by setting a sensible maxmemory-policy (prefer allkeys-lru over noeviction for a cache), and monitor eviction rates. Also, the expire set on the key ensures Redis does not hold data past the stale window.

Large payloads. Storing the data field as a JSON string in a Redis hash is fine for payloads under 1MB. For larger payloads, consider splitting: store a Redis key for the data separately and use the hash to store expiry metadata and a pointer. Or switch to a dedicated cache that handles large objects natively.

Serialization cost. JSON.parse on every read and JSON.stringify on every write adds up. For hot keys, consider a binary serialization format like MessagePack. The SWR pattern does not care about the encoding, only about the data/expiresAt/staleAt structure.

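Clock skew. expiresAt and staleAt are written and compared with Date.now() on the application servers, so what matters is skew between instances: one instance writes the timestamps and another reads them. The Redis server’s clock only drives the key’s EXPIRE. If your instances disagree, take the time from Redis via the TIME command (or normalize all timestamps to Redis time at write time) so every reader and writer shares one clock. In practice, sub-second clock skew is fine for a 30-second TTL.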

Stale data avalanche. If your database goes down for 5 minutes and every key reaches the stale window limit, every request becomes a cold miss and every cold miss fails. This is the same failure mode as a normal cache stampede, just delayed by the stale window. The fix is the same as any database outage: use graceful degradation to serve the stale data even past the stale window, with a circuit breaker for the database call. That only works if the data still exists, so the key’s Redis expiry must be longer than the stale window (TTL + stale window + a grace period) rather than equal to it.
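
For the stale data avalanche case, a degraded read path might look like the following sketch. It assumes the last known value is still retrievable (so the key must outlive its stale window) and uses a deliberately tiny circuit breaker; CircuitBreaker, fetchWithStaleFallback, and the threshold and cooldown values are illustrative, not a production design.

```typescript
// Minimal circuit breaker: opens for cooldownMs after `threshold` consecutive failures.
class CircuitBreaker {
  private failures = 0;
  private openUntil = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  isOpen(now: number): boolean {
    return now < this.openUntil;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openUntil = now + this.cooldownMs;
      this.failures = 0;
    }
  }
}

interface Outcome<T> {
  value: T;
  degraded: boolean; // true when the value came from the expired cache entry
}

// Hypothetical cold-path helper: try the database, fall back to the last known value.
async function fetchWithStaleFallback<T>(
  lastKnown: T | undefined,
  load: () => Promise<T>,
  breaker: CircuitBreaker,
  now: number = Date.now(),
): Promise<Outcome<T>> {
  if (breaker.isOpen(now)) {
    // Skip the database entirely while the breaker is open.
    if (lastKnown === undefined) throw new Error('circuit open and no cached value');
    return { value: lastKnown, degraded: true };
  }
  try {
    const value = await load();
    breaker.recordSuccess();
    return { value, degraded: false };
  } catch (err) {
    breaker.recordFailure(now);
    if (lastKnown === undefined) throw err; // nothing to fall back to
    return { value: lastKnown, degraded: true };
  }
}
```

The degraded flag gives the caller a way to count or label responses served past the stale window, so the outage stays visible in your metrics.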

The complete client

Here is the full SWR cache client in about 100 lines:

import { createClient, RedisClientType } from 'redis';
import { Counter, Histogram } from './metrics'; // Your metrics system

interface SwrEntry<T> {
  data: T;
  expiresAt: number;
  staleAt: number;
}

export class SwrCache {
  private redis: RedisClientType;
  private pendingRefreshes = new Map<string, Promise<void>>();
  private metrics = {
    hitsFresh: new Counter({ name: 'swr_hits_fresh_total', help: 'Served from fresh cache' }),
    hitsStale: new Counter({ name: 'swr_hits_stale_total', help: 'Served from stale cache with async refresh' }),
    missesCold: new Counter({ name: 'swr_misses_cold_total', help: 'Cache empty, blocked on sync refresh' }),
    refreshDuration: new Histogram({ name: 'swr_refresh_duration_seconds', help: 'Time for background refresh' }),
  };

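  constructor(redisUrl: string) {
    this.redis = createClient({ url: redisUrl });
    // Without an 'error' listener, a client error is thrown as an uncaught exception.
    this.redis.on('error', (err) => console.error('Redis client error:', err));
    // connect() is async; handle its rejection instead of leaving it floating.
    this.redis.connect().catch((err) => console.error('Redis connect failed:', err));
  }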

  async get<T>(
    key: string,
    ttlSec: number,
    staleSec: number,
    fetch: () => Promise<T>
  ): Promise<T> {
    const now = Date.now();
    const raw = await this.redis.hGetAll(key);

    if (!raw || !raw.data) {
      this.metrics.missesCold.inc(1);
      const value = await fetch();
      await this.writeEntry(key, value, ttlSec, staleSec);
      return value;
    }

    const expiresAt = Number(raw.expiresAt);
    const staleAt = Number(raw.staleAt);
    const data: T = JSON.parse(raw.data);

    if (now < expiresAt) {
      this.metrics.hitsFresh.inc(1);
      return data;
    }

    if (now < staleAt) {
      this.metrics.hitsStale.inc(1);
      this.scheduleRefresh(key, ttlSec, staleSec, fetch);
      return data;
    }

    // Cold miss (beyond stale window or evicted)
    this.metrics.missesCold.inc(1);
    const value = await fetch();
    await this.writeEntry(key, value, ttlSec, staleSec);
    return value;
  }

  private async scheduleRefresh<T>(
    key: string,
    ttlSec: number,
    staleSec: number,
    fetch: () => Promise<T>
  ): Promise<void> {
    if (this.pendingRefreshes.has(key)) return;

    const p = (async () => {
      const start = Date.now();
      try {
        // Extend the stale window and the key's Redis TTL so concurrent readers keep getting stale data
        await this.redis.hSet(key, 'staleAt', String(Date.now() + staleSec * 1000));
        await this.redis.expire(key, staleSec);
        const value = await fetch();
        await this.writeEntry(key, value, ttlSec, staleSec);
      } catch (err) {
        console.error(`SWR refresh failed for ${key}:`, err);
      } finally {
        this.metrics.refreshDuration.observe((Date.now() - start) / 1000);
        this.pendingRefreshes.delete(key);
      }
    })();

    this.pendingRefreshes.set(key, p);
    p.catch(() => {});
  }

  private async writeEntry<T>(key: string, data: T, ttlSec: number, staleSec: number): Promise<void> {
    const now = Date.now();
    await this.redis.hSet(key, {
      data: JSON.stringify(data),
      expiresAt: String(now + ttlSec * 1000),
      staleAt: String(now + (ttlSec + staleSec) * 1000),
    });
    await this.redis.expire(key, ttlSec + staleSec);
  }

  async disconnect(): Promise<void> {
    await this.redis.quit();
  }
}

Usage in a route handler:

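// process.env.REDIS_URL is string | undefined; the local fallback URL is an assumed default
const cache = new SwrCache(process.env.REDIS_URL ?? 'redis://localhost:6379');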

app.get('/api/dashboard/:userId', async (req, res) => {
  const data = await cache.get(
    `dashboard:${req.params.userId}`,
    30,   // TTL: 30 seconds of fresh
    60,   // Stale window: 60 seconds of stale-serve
    () => db.queryDashboard(req.params.userId)
  );
  res.json(data);
});

Every response returns in 2-4ms (Redis read time) except the very first request for that key, which pays the database cost once. After that, the cache is self-sustaining as long as the key keeps getting traffic: reads refresh it, the stale window absorbs latency, and cold misses only happen if Redis evicts the key or the key sits unread past its stale window.

The takeaway

A normal cache-aside layer turns every cache expiration into a latency cliff. Users wait for the database query. Even with single-flight, the first user in line pays the full cost. Stale-while-revalidate eliminates that cliff by serving expired data immediately and refreshing in the background. The user gets a response in cache-read time. The database gets quiet, async queries.

The pattern is not complex. It is a payload and two timestamps in a Redis hash, a dedup Map for pending refreshes, and a conditional read path that serves stale data instead of blocking. It is the same pattern that makes React Query and SWR on the frontend feel instant, adapted for the server side where the data source is a database and the cache is Redis.

Wire it once. Set a 30-second TTL and 60-second stale window. Watch your cold miss counter stay flat after warm-up and your p99 latency drop to your Redis round-trip time. Then forget about it, because the cache is no longer something you page about.

A note from Yojji

The difference between a caching layer that feels fast and one that causes 3 a.m. incidents is often just a handful of deliberate patterns: single-flight, early refresh, and the humble stale-while-revalidate window. Production backend engineering is full of these small, high-leverage decisions that separate teams who chase problems from teams who move past them.

Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their senior engineers specialize in building the Node.js microservices, caching architectures, and database-backed APIs that stay fast and reliable under real-world traffic, whether through dedicated outstaffing or end-to-end product delivery.