JSONB Indexing In Postgres: GIN Vs Expression Indexes, And When Each Is The Right Choice

The team has a JSONB column. Queries against it are slow. Somebody adds CREATE INDEX ... USING GIN (data). Some queries get faster. Some queries are still slow. Some are now slower than before because the index is huge and Postgres is reading both it and the table. Nobody is sure what to do.

JSONB indexing is one of those Postgres areas that looks simple and is actually nuanced. There are at least four query patterns, three index types, and a sweet spot for each combination. This post is the framework: the four patterns, which index each needs, and the production-grade configuration that doesn’t waste disk.

The four query patterns

Almost every JSONB query falls into one of these:

1. Containment check. “Find rows where the JSONB contains this object.”

SELECT * FROM events WHERE payload @> '{"type": "signup"}';

2. Key existence. “Find rows where this key is set.”

SELECT * FROM events WHERE payload ? 'newsletter_opt_in';

3. Specific value lookup by path. “Find rows where payload.user.id = 42.”

SELECT * FROM events WHERE payload -> 'user' ->> 'id' = '42';
SELECT * FROM events WHERE payload->>'type' = 'signup';

4. Range / comparison on a value. “Find rows where the price is between X and Y.”

SELECT * FROM events WHERE (payload->>'price')::int > 100;

Each needs a different index.

Pattern 1: Containment — GIN with default opclass

CREATE INDEX events_payload_gin ON events USING GIN (payload);

This supports @>, ?, ?|, ?& (containment and key-existence operators). Use it when you query with @> and want flexibility.

-- Uses the index.
SELECT * FROM events WHERE payload @> '{"type": "signup"}';
SELECT * FROM events WHERE payload @> '{"user": {"plan": "pro"}}';

The default GIN index supports more operators but is larger and slower to update than jsonb_path_ops.

Pattern 1b: Containment-only — GIN with jsonb_path_ops

CREATE INDEX events_payload_path_gin ON events USING GIN (payload jsonb_path_ops);

Smaller (~30%), faster to update, but supports only @>. Not key-existence, not subkey lookup. If @> is your only access pattern, prefer this.

-- Uses the index.
SELECT * FROM events WHERE payload @> '{"type": "signup"}';

-- Does NOT use the index — fall back to seq scan.
SELECT * FROM events WHERE payload ? 'newsletter';

Most teams should use jsonb_path_ops if they can. The savings on disk and write performance are real.

Pattern 2: Key existence — GIN with default

?, ?|, ?& need the default GIN. jsonb_path_ops does not support them.

-- This needs the default GIN index, NOT jsonb_path_ops.
SELECT * FROM events WHERE payload ? 'newsletter_opt_in';

If you only need to check for one specific key existence, an expression index is smaller:

CREATE INDEX events_has_newsletter ON events ((payload ? 'newsletter_opt_in'))
WHERE payload ? 'newsletter_opt_in';

Partial index — only includes rows where the key exists. Tiny if the key is rare.

Pattern 3: Specific path — expression index

A GIN index doesn’t help with payload->>'type' = 'signup'. It needs an expression index on the specific path:

CREATE INDEX events_type_idx ON events ((payload->>'type'));

-- Uses the expression index.
SELECT * FROM events WHERE payload->>'type' = 'signup';

Drawbacks:

One index per path. Five hot paths = five indexes.
The index is rebuilt on any update to the column, even if the indexed path didn’t change.
Indexes are sized for the column’s full data type, not just the extracted string.

For 2-3 hot paths, this is fine. For more, see “promote to columns” below.

Pattern 4: Numeric / range — typed expression index

(payload->>'price')::int returns text by default. To use it for range queries, cast in the index:

CREATE INDEX events_price_idx ON events (((payload->>'price')::int));

The double parentheses are required by Postgres’ syntax. Now:

-- Uses the index.
SELECT * FROM events WHERE (payload->>'price')::int BETWEEN 100 AND 500;

Same caveat: one path, one index.

When to promote a key to a column

If a JSONB key is in WHERE clauses across the codebase, it has graduated to a column. The signal:

SELECT count(*) FILTER (WHERE payload ? 'plan') AS with_plan,
       count(*) AS total
FROM accounts;

If with_plan / total > 0.95, the key is essentially universal. Move it:

ALTER TABLE accounts ADD COLUMN plan text;
UPDATE accounts SET plan = payload->>'plan' WHERE payload ? 'plan';
ALTER TABLE accounts ALTER COLUMN plan SET NOT NULL;
CREATE INDEX accounts_plan_idx ON accounts (plan);
-- Optionally: UPDATE accounts SET payload = payload - 'plan';

A column has:

Native btree indexes (faster than GIN for equality and range).
Planner statistics (better query plans).
Clean column-level constraints.

See the JSONB-design post for the broader rule.

Compound indexes with JSONB

You can combine a column with a JSONB expression in one index:

CREATE INDEX events_user_type_idx ON events (user_id, (payload->>'type'));

-- Uses the compound index.
SELECT * FROM events
 WHERE user_id = 42 AND payload->>'type' = 'signup';

For queries that always filter by both, this is the right index. For queries that filter by only one, separate indexes are better.

Index size and maintenance

GIN indexes are surprisingly large — sometimes bigger than the table itself. Check periodically:

SELECT
  indexrelname AS index,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;

If a GIN index is 5 GB on a 1 GB table, consider:

Switch to jsonb_path_ops if applicable.
Replace with focused expression indexes.
Promote hot keys to columns.

GIN inserts are also slower than B-tree inserts. The fastupdate setting (default ON) batches inserts; gin_pending_list_limit controls how often the batch is flushed. For write-heavy tables, tune these or accept the cost.

Build indexes concurrently

A CREATE INDEX on a populated table holds an ACCESS EXCLUSIVE lock — writes block. For a hot table, use CREATE INDEX CONCURRENTLY:

CREATE INDEX CONCURRENTLY events_payload_gin ON events USING GIN (payload jsonb_path_ops);

Slower (multiple table scans) but doesn’t block writes. Always use CONCURRENTLY in production.

Verifying the index is being used

EXPLAIN ANALYZE
SELECT * FROM events WHERE payload @> '{"type": "signup"}';

Look for Index Scan on events_payload_gin or Bitmap Index Scan in the plan. If you see Seq Scan instead, either:

The index doesn’t fit the query (wrong opclass, wrong path).
The planner thinks seq scan is faster (small table, low selectivity).
Statistics are stale (ANALYZE events).

For low-selectivity queries (matching most rows), seq scan is actually faster. The planner is right.

Common mistakes

Indexing the whole JSONB when you only query specific paths. The GIN index is huge; queries don’t use it for ->>. Use expression indexes instead.

Forgetting CONCURRENTLY in production. A non-concurrent build on a hot table is a 30-minute write outage.

Using @> when you mean =.

WHERE payload @> '{"id": 42}'   -- containment: matches rows with id=42 plus other keys
WHERE payload @> '{"id": "42"}' -- string vs int — won't match!

JSONB containment respects type. 42 in JSONB is an integer; "42" is a string. They are different.

Indexing arrays with the wrong assumption. payload @> '[1, 2]' matches arrays containing both 1 and 2 in any order. payload[0] = 1 does not work the same way.

The takeaway

JSONB without an index is a sequential scan. The right index depends on the query: GIN with jsonb_path_ops for @>-only queries; default GIN for key-existence and ?; expression indexes for specific paths; typed expression indexes for range queries; columns when a key is universal.

Audit your slow queries. Match the index to the actual query pattern. Switch to jsonb_path_ops when you can. Promote hot keys to columns. The result is queries that go from seconds to milliseconds with minimal extra disk.

A note from Yojji

The kind of database fluency that picks the right index for the actual query — not just “add a GIN index and hope” — is the kind of senior backend skill Yojji’s teams bring to client work.

Yojji is an international custom software development company founded in 2016, with teams across Europe, the US, and the UK. They specialize in the JavaScript ecosystem, cloud platforms (AWS, Azure, GCP), and Postgres-backed application engineering — including the indexing strategy that decides whether your JSONB queries are fast or hidden seq scans.