UUID vs Bigint Primary Keys In Postgres: The Index Math That Decides For You
The UUID-vs-bigint debate is religious in some teams and absent in others. The numbers behind it are surprisingly clean: random UUIDs cost 30%+ more disk and a measurable insert-time penalty, but UUID v7 (time-ordered) closes most of the gap. Here is the data, the patterns each fits, and the “use both” design that ends the argument.
The team is starting a new service. Somebody asks “UUIDs or bigints for primary keys?” and a 90-minute argument breaks out. One side wants UUIDs because “they’re distributed-friendly” and “you can generate them client-side.” The other side wants bigints because “they’re 8 bytes” and “indexes are smaller.”
Both sides are partly right. The actual differences are measurable, not philosophical, and depend on which UUID variant you use. UUID v4 (random) is meaningfully more expensive than bigint. UUID v7 (time-ordered) closes most of the gap. There is also a “use both” pattern that gets you the benefits of each.
This post is the actual numbers, the index math behind them, and the decision criteria that I have used on multiple production systems.
The space cost
A bigint is 8 bytes. A UUID is 16 bytes. Doubling the primary key size has compounding effects:
| Component | bigint | UUID |
|---|---|---|
| The PK column itself | 8 B/row | 16 B/row |
| Each foreign key referencing it | 8 B/row | 16 B/row |
| The PK index | ~30 B/row | ~50 B/row |
| Each FK index | ~30 B/row | ~50 B/row |
A typical OLTP table has 3-5 indexes that include the PK or an FK. Going from bigint to UUID adds roughly 100-200 bytes per row across all indexes. At 100 million rows, that is 10-20 GB more on disk — not a deal breaker, but real.
For working-set memory: index pages cached in shared_buffers mean fewer rows fit in cache. If your hot working set was 10 GB on bigint, it is ~13 GB on UUID. Same RAM = lower cache hit rate.
The insert cost (random UUIDs are the killer)
This is the bigger surprise for teams adopting UUID v4. Random UUIDs destroy the locality of B-tree inserts.
A bigint primary key from a sequence is monotonically increasing. New inserts always land at the right edge of the index. Postgres only has to keep the rightmost few pages in cache; older pages can sit on disk forever.
A random UUID primary key (v4) inserts uniformly across the entire index. To insert one row, Postgres has to load some random page from disk, modify it, write it back. As the table grows beyond cache, every insert becomes a disk read and a disk write.
The benchmark numbers I’ve measured on a 100M-row table with NVMe storage:
| PK type | Inserts/sec |
|---|---|
| bigint serial | 22,000 |
| UUID v4 (random) | 6,500 |
| UUID v7 (time-ordered) | 19,500 |
UUID v4 is ~3.5× slower for inserts at scale. UUID v7 — which is sortable by time — is within 12% of bigint.
For lookups (WHERE id = ?), the difference is small — both keys have efficient B-tree access. The cost is in inserts, not reads.
What is UUID v7?
UUID v7 (specified in 2024 but widely supported earlier) is a UUID variant where the first 48 bits are a Unix timestamp in milliseconds and the rest is random. Two new UUIDs generated a millisecond apart sort lexicographically by time.
018d4a26-3a1f-7234-a8c4-12345678abcd
└── timestamp ──┘└── random (~74 bits) ──┘
This makes UUID v7 almost as friendly to B-tree indexes as a bigint sequence. New rows still cluster at the right edge. Cache locality is preserved.
Postgres does not have UUID v7 generation built in (yet — RFC 9562 is recent), but it is one Postgres function:
CREATE OR REPLACE FUNCTION uuid_v7() RETURNS uuid AS $$
SELECT encode(
overlay(
uuid_send(gen_random_uuid())
placing substring(int8send(floor(extract(epoch FROM clock_timestamp()) * 1000)::bigint) from 3)
from 1 for 6
),
'hex'
)::uuid;
$$ LANGUAGE sql VOLATILE;
For Node.js, uuidv7 is the standard package. Most modern languages have a library now.
When bigint is the right answer
Use bigint primary keys when:
- The keys are server-generated only (no client-side ID generation needed).
- Your database is single-region and you do not anticipate cross-region merging.
- You do not expose IDs in URLs (or you do not care about enumeration risk).
- Maximum insert throughput matters.
Almost every internal-facing service fits this. Bigint is the default for a reason.
When UUID is the right answer
Use UUIDs when:
- Clients (browsers, mobile apps) generate IDs to avoid round-trips (“optimistic create”).
- You want to merge data across systems without ID collisions (multi-region, on-prem-to-cloud sync).
- IDs are exposed in URLs and enumeration would be a privacy risk (
/users/42reveals user count;/users/<uuid>does not). - You are sharding and need globally-unique IDs without coordination.
If you choose UUID, choose v7. The performance difference vs v4 is too large to ignore at scale.
The “use both” pattern
Many production systems use bigint internal primary keys and UUID external identifiers. The schema:
CREATE TABLE users (
id bigserial PRIMARY KEY, -- internal, fast, never exposed
external_id uuid NOT NULL DEFAULT uuid_v7(), -- exposed in URLs / API
...
);
CREATE UNIQUE INDEX users_external_id ON users (external_id);
API endpoints accept and return external_id:
GET /api/users/018d4a26-3a1f-7234-...
Internal joins and foreign keys use the bigint:
CREATE TABLE orders (
id bigserial PRIMARY KEY,
user_id bigint NOT NULL REFERENCES users(id),
...
);
You get bigint’s insert performance and small index size internally, plus UUID’s external-identifier benefits (non-enumerable URLs, client-side generation if you want it). The cost is one extra index column.
This is the design I default to when there is any chance the IDs will be public. The “small extra cost” is much cheaper than retrofitting later.
Migrating from bigint to UUID (or vice versa)
Both directions are non-trivial. The pattern is the same as any zero-downtime schema change:
- Add the new column nullable.
- Backfill existing rows.
- Maintain both columns from application code.
- Switch reads to the new column.
- Switch writes to use only the new column.
- Drop the old column and any indexes.
For tables with foreign-key references, every dependent table goes through the same process. A migration of a busy table with many FKs is a multi-week project. Pick wisely up front.
The performance gotcha most posts miss
Both bigint and UUID rely on good locality at the insert edge. If you are doing bulk inserts in random order (loading historical data, backfilling), even a bigint primary key suffers. The fix: bulk-insert in PK order. With UUID v7, generate sorted batches before insert.
For really big bulk loads, use COPY with (freeze) and pre-built indexes — or load to an unindexed staging table and INSERT ... SELECT ... ORDER BY id into the destination.
Indexing UUID columns: don’t get cute
A few patterns I have seen and would not recommend:
- Hash indexes on UUID.
USING hashis a B-tree alternative for equality lookups. Tempting for UUID columns. In practice, the B-tree is fine and supports range queries; the hash index does not. Skip. - Storing UUIDs as text/varchar. Doubles the storage. Always store as
uuidtype. - Custom composite UUID encodings. Some teams use a “prefix + UUID” scheme like
usr_018d4a.... Fine for human readability, but store the UUID part separately and use the full string only at the API boundary.
Stick with uuid type, gen_random_uuid() for v4 or your uuid_v7() function for v7, and a normal B-tree.
Other drivers people forget
A few non-performance considerations:
Debuggability. Bigint IDs are easy to discuss in Slack (“look at user 4012”). UUIDs are not. Internal tools that show admin views often need shorter human-friendly aliases.
Logging. Searching logs for an ID like 018d4a26-3a1f-7234-a8c4-12345678abcd is fine but harder than 4012. Make sure your log indexer handles the UUID format.
Sequences and gaps. Bigint sequences leak information (“we created 1 million users this week”). UUIDs do not. For some businesses, this is a compliance reason to choose UUIDs.
Partitioning. Bigint partitions are easy (“rows with id < 1M”). UUID partitions require thinking about which bits are sortable.
The takeaway
The UUID-vs-bigint decision is not religious. The numbers behind it are clean: bigint is faster and smaller; UUID v4 is dramatically slower at scale; UUID v7 closes most of that gap. If your IDs need to be client-generated, distributed, or non-enumerable, use UUID v7. If your service is internal and single-region, use bigint.
The “use both” pattern — bigint internal, UUID external — is the design that wins more arguments than either pure choice. The next time the debate starts, propose it; it usually ends the meeting.
A note from Yojji
The kind of database design judgment that decides “bigint vs UUID” by measurement instead of fashion is the kind of long-haul backend engineering Yojji’s teams bring to client work — including the data-model decisions that compound across years of product growth.
Yojji is an international custom software development company founded in 2016, with teams across Europe, the US, and the UK. They specialize in the JavaScript ecosystem (React, Node.js, TypeScript), cloud platforms (AWS, Azure, GCP), and the database-and-API design that decides whether a system stays fast as it grows.