A Code Review Checklist That Ends The Same Three Arguments Every Sprint

The team’s PR cycle takes 3 days on average. Each PR gets 30 comments, half of them style nitpicks, half of them genuine feedback. Two engineers disagree on whether useState should be at the top of a component. Another two disagree on test coverage for trivial code. Authors get frustrated, reviewers get burned out, and every PR turns into a small negotiation.

The cure is not “be nicer in reviews.” It is to write down what you actually expect from a code review and use that document as the bar. Style is automated. Architecture and correctness are the focus. Disagreements without justification are noted but don’t block the PR.

This post is a checklist that has worked across multiple teams, plus the meta-rule that prevents it from becoming another bureaucratic process.

The checklist (1 page)

Reviewers go through this for each PR:

# Code Review Checklist

## Correctness (must)
- [ ] Does the code do what the PR description says?
- [ ] Are edge cases handled (empty inputs, nulls, large inputs, errors)?
- [ ] Are there race conditions, concurrency issues, or shared mutable state?
- [ ] Do tests cover the new behavior, including failure cases?

## Production-readiness (must)
- [ ] Is logging meaningful (not too quiet, not too noisy)?
- [ ] Will this break gracefully or cascade if a dependency fails?
- [ ] Is observability in place (metrics, traces, useful errors)?
- [ ] Migration / rollout plan if needed?

## Maintainability (should)
- [ ] Is the code readable? Names clear, complexity localized?
- [ ] Is there duplication that begs for a refactor (in this PR or next)?
- [ ] Are public APIs (functions, types, exports) named carefully?

## Security (must, when applicable)
- [ ] Is user input validated at the boundary?
- [ ] Are secrets handled correctly (no hardcoding, no logging)?
- [ ] Does this expose any new auth surface?

## Performance (when applicable)
- [ ] Are there N+1 queries, unnecessary loops, or large in-memory accumulations?
- [ ] Is this in a hot path? If so, has it been measured?

## Out of scope for review
- Code style, formatting, import order  →  enforced by linter
- Method length, file length              →  enforced by linter
- "I would have done this differently"    →  not blocking

The “out of scope” section is the meta-rule. Reviewers can comment on style, but they cannot block a PR for it. Linter handles style; humans handle correctness.

Why each section

Correctness is non-negotiable. A PR that doesn’t work as advertised, or that introduces bugs, is not ready. Tests are part of correctness.

Production-readiness is what separates “the feature is done” from “we can ship this without paging the on-call.” Many teams skip this section and pay later.

Maintainability is “should” because perfect maintainability slows things down. Good enough is the bar; refactors can happen in follow-ups.

Security is “must” when applicable. A backend PR touching user input — must. A frontend CSS-only PR — N/A.

Performance is “when applicable” because most code is not hot-path. Don’t optimize early; do optimize what matters.

Out of scope is the section that ends arguments. “I would have used a hook here” is not blocking.

The meta-rule

The checklist is a tool for converging on the same standards. The meta-rule is:

A reviewer can leave any comment they want. They can only block a PR on items in the “must” sections.

“Should” and “out of scope” comments are suggestions. The author may apply them, defer them, or note disagreement. Reviewer who really wants something to change has to argue it onto the “must” list at a team meeting — not block individual PRs over it.

This single rule eliminates the “I disagree on style and so I’m withholding approval” pattern.

Author’s responsibilities

The author has reciprocal duties:

Open PRs as small as possible. A 1500-line PR is not reviewable. Aim for under 400 lines of meaningful code change.

Write a useful PR description.

## What
Adds rate limiting to the /api/auth/login endpoint.

## Why
We've been seeing 1000+ login attempts/sec from a single IP during yesterday's incident.

## How
Token bucket: 5 attempts per IP per minute, with a 30-minute lockout after 10 failures.

## Tests
- Unit test for the bucket math.
- Integration test against the endpoint.

## Rollout
Behind a feature flag (`auth.rate-limit`); enabling at 10% on Monday, 100% by Friday if metrics OK.

## Risks
- IPs behind shared NATs may be over-limited. Monitoring this in metrics.

## Screenshots / logs
[when applicable]

The “Why” and “Risks” sections are the most important. They tell the reviewer where to focus.

Respond to every comment. Either fix, push back, or explicitly defer (“good point, ticket: PROJ-1234”). Never leave a review comment unanswered.

Re-request review after changes. Don’t make a reviewer chase your updates.

The “two reviewer” myth

Many teams require two approvals on every PR. For most teams, one is enough — especially if the team has good test coverage and good monitoring. Two approvals doubles the review load for marginal correctness gain.

Reserve “two approvals” for high-risk PRs:

Database migrations.
Auth / payment / billing code.
Anything touching production data.
Anything across team boundaries.

For everything else, one careful review is enough.

Linter, not human

Move every style decision into the linter. ESLint, Prettier, Black, gofmt, rustfmt — pick the relevant tool. Run on every commit (see pre-commit hooks post). Reject style PR comments politely:

“The linter passes; we agreed style is automated. If you want this rule changed, propose it for the linter config.”

This is the discipline that ends 80% of bikeshedding.

How long should reviews take

Internalized targets:

Acknowledge a PR within 4 working hours. “I’ll review later today” is fine; silence isn’t.
First substantive review within 1 working day. Faster for small PRs.
Second-pass review within 4 hours. Don’t make the author wait days for the change.

A team that hits these has PRs merging in under a day. A team that doesn’t has PRs sitting for a week.

For larger PRs, schedule a 30-minute pairing session instead of trading async comments — often faster.

Tone in reviews

Three patterns that help:

Use “we” instead of “you.” “We could simplify this loop” reads better than “Your loop is too complex.”

Suggest, don’t command. “Consider X because Y” is collaborative. “Do X” is not.

Ask questions when unsure. “Is there a reason you chose X?” gives the author a chance to explain instead of having to defend.

Approve with comments. A PR can have non-blocking comments and an approval at the same time. Use this to unblock the author while still leaving feedback.

Common antipatterns

Reviewing only the diff. A diff doesn’t show context. Open the file, read the surrounding code. Some bugs only show up when you understand the call site.

Letting the test pass-through count as approval. Tests catch some bugs, not all. Read the code.

Skipping security on “frontend-only” PRs. A frontend PR that adds a new API call may add new auth surface. Don’t auto-skip.

Approving without reading. “LGTM” as the only comment means “I didn’t read this carefully.” If you really did, leave a comment that proves it.

When to override the checklist

Two cases:

Hotfix during incident. Skip the checklist; ship the fix; review properly afterward. Document.

Trivial PRs. A typo fix, dependency bump, comment update. Wave through. The checklist is for substantive code changes.

For everything else, the checklist applies.

The takeaway

A short written checklist with clear “must” and “out of scope” sections ends most code review arguments. Linter handles style. Humans handle correctness, production-readiness, security. One reviewer is enough for most PRs; two for high-risk. Author writes a useful description; reviewer responds quickly; everyone uses “we” instead of “you.”

The team that adopts this finds their median PR cycle drops from 3 days to under 1, with fewer arguments and better outcomes. The discipline is the value, not the document.

A note from Yojji

The kind of process discipline that turns “code review takes three days” into “code review takes three hours” — written standards, reviewer accountability, automated style enforcement — is the kind of detail Yojji’s teams put into the engineering culture they bring to client work.

Yojji is an international custom software development company founded in 2016, with teams across Europe, the US, and the UK. They specialize in the JavaScript ecosystem (React, Node.js, TypeScript), cloud platforms, and full-cycle product engineering — including the development practices that decide whether shipping is fast or stuck in review.