May 13, 2025

Two-pass code review for faster, safer team feedback

Two-pass code review helps teams check logic first, then safety and cleanup, so reviews move faster with fewer bugs, less risk, and fewer noisy comments.

Why one review pass breaks down

A single review asks one person to answer two different questions at once. Does the code do the right thing? And is it safe and clean enough to keep?

Those questions need different kinds of attention. Logic review needs a full picture of the feature, the data flow, and the edge cases. Cleanup review is more local. It looks for naming, dead code, missing tests, weak error messages, logging gaps, and small safety problems. When reviewers try to do both at the same time, each kind of comment interrupts the other.

That is why one long review often gets messy. A reviewer starts by checking whether an order total is correct, then notices a weak variable name, then spots a missing null check, then tries to return to the business rule. After a few switches, the original logic is already fuzzy. Bugs slip through because the reviewer keeps rebuilding context instead of staying with one question.

Small comments also crowd out important ones. A serious bug can end up buried between notes about formatting or test names. The author fixes surface issues first, while the deeper problem stays open longer than it should.

Generated code makes this worse. AI can produce a lot of code very quickly, and much of it looks plausible on first read. That creates a trap. Reviewers spend energy scanning volume instead of checking intent. They comment on style because style is easy to see, while missing the real problem, like a permission check in the wrong place or a retry loop that can charge a customer twice.

A two-pass code review works better because it separates "does it work" from "is it safe and tidy enough to merge." Teams usually review generated code faster when they stop mixing those jobs.

You can see this in small teams that use AI heavily. A developer asks a model for a new endpoint, gets 300 lines back, and opens one big review. The reviewer catches three naming issues, one duplicate helper, and a missing comment. Later, someone finds that the endpoint accepts requests without checking account ownership. The bug was there the whole time. The review process made it easy to miss.

Pass one: check whether the code solves the right problem

Pass one should answer one question: does the code solve the right problem?

Teams skip this more often than they think. They see working tests, clean formatting, and tidy function names, then miss the bigger issue. The code handles the wrong rule, or it only covers the happy path. In a two-pass code review, pass one protects product logic.

Start with the main flow. Read the code from input to result as if you were the user or the business owner. What goes in? What decisions does the code make? What comes out at the end? If you cannot explain that flow in a few plain sentences, the code needs another look.

This matters even more with AI code. Generated code often looks complete, but small logic mistakes hide inside neat structure. Starting with style comments wastes time if the feature itself is wrong.

A practical pass one usually comes down to four checks:

The change matches the ticket, request, or business rule.
The main flow works from start to finish without odd detours.
Edge cases change behavior in the right way.
Cleanup issues stay out of scope unless they block understanding.

Edge cases matter most when they affect money, access, status, or timing. If a discount applies only above a certain total, test amounts just below and just above that line. If a user can cancel a booking only before a deadline, check what happens one minute before and one minute after. Those cases tell you far more than a quick skim.
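The boundary checks described above can be sketched in a few lines. This is a minimal sketch with assumed rules: a 10% discount for orders strictly above 100.00, and cancellation allowed strictly before the deadline. The names `order_total` and `can_cancel`, and all the numbers, are hypothetical and only show the shape of the check.

```python
from datetime import datetime, timedelta
from decimal import Decimal

# Hypothetical rule: orders strictly above 100.00 get 10% off.
DISCOUNT_THRESHOLD = Decimal("100.00")
DISCOUNT_RATE = Decimal("0.10")


def order_total(subtotal: Decimal) -> Decimal:
    """Apply the discount only when the subtotal is above the threshold."""
    if subtotal > DISCOUNT_THRESHOLD:
        return (subtotal * (1 - DISCOUNT_RATE)).quantize(Decimal("0.01"))
    return subtotal


def can_cancel(now: datetime, deadline: datetime) -> bool:
    """Hypothetical rule: cancellation is allowed strictly before the deadline."""
    return now < deadline


# Pass-one checks: probe just below, exactly at, and just above each line.
assert order_total(Decimal("99.99")) == Decimal("99.99")
assert order_total(Decimal("100.00")) == Decimal("100.00")
assert order_total(Decimal("100.01")) == Decimal("90.01")

deadline = datetime(2025, 5, 13, 12, 0)
assert can_cancel(deadline - timedelta(minutes=1), deadline)
assert not can_cancel(deadline + timedelta(minutes=1), deadline)
```

Whether the boundary itself (exactly 100.00, exactly at the deadline) counts as in or out is a product decision. The value of writing the check is that it forces someone to answer that question.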

Ignore naming, formatting, and tiny refactors in this pass unless they make the code hard to understand. A variable called tmpData can wait. A refund rule that charges the wrong fee cannot.

Good pass-one comments are direct. "This rejects repeat customers, but the rule says only first-time users need approval." That gives the author something real to fix. Save polish for later, after the logic is right.

Pass two: check safety, failures, and cleanup

Pass two starts only after the logic looks sound. The feature works, the happy path makes sense, and now the job changes. Stop asking, "Does it do the right thing?" and ask, "What breaks when real users, messy data, or odd timing hit this code?"

That shift matters. When teams mix both jobs into one read, they miss small safety issues or spend too much time arguing about names before they catch a bad default.

Start with risk. Generated code often accepts more than it should, trusts input too early, or leaves settings too open. Look for broad permissions, missing input checks, exposed secrets, debug modes left on, and retry loops that never stop. Code can work on day one and still create a security problem on day ten.
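As one illustration of the "retry loops that never stop" item, here is a minimal sketch of a bounded retry helper. `TransientError` and `call_with_retries` are hypothetical names, and the default attempt count and delays are assumptions, not recommendations.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient failures a bounded number of times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Give up and surface the failure instead of looping forever.
            sleep(base_delay * 2 ** (attempt - 1))  # Exponential backoff.
    raise AssertionError("unreachable")
```

A bounded loop only limits how long the code keeps trying. If the operation has side effects, such as charging a card, the reviewer still has to ask whether repeating the call is safe.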

Then read the failure paths on purpose. Check what users see when an API times out, a record is missing, or a file upload fails. Good code handles empty states, partial data, and slow responses without confusing people. If the app only works when every dependency behaves, the review is not done.

Tests matter here too, but test count is not the point. Check whether tests cover bad input, permission failures, and risky edge cases. If the code touches production systems, ask whether logs would help someone find the problem quickly. Ask the same question about monitoring. If this breaks at 2 a.m., would the team know before users complain?

A short pass-two checklist helps:

Try to spot one security issue, even in a small change.
Read one failure path from start to finish.
Check whether tests prove the code fails safely.
Check whether logs and alerts would point to the cause.

Leave naming, repeated blocks, and dead code for the end. They still matter, but they should not steal attention from safety and supportability. Once the risky parts look clean, tidy the names, remove leftovers, and cut duplicate code. That order keeps reviews faster and catches the bugs that actually hurt.
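The checklist item about tests that prove the code fails safely might look like the sketch below. It assumes a hypothetical `Invoice` record and a `get_invoice` lookup that must reject requests from other accounts. None of these names come from a real codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: int
    account_id: int
    total_cents: int


def get_invoice(invoices: dict[int, Invoice], invoice_id: int, account_id: int) -> Invoice:
    """Return the invoice only if it exists and belongs to the requesting account."""
    invoice = invoices.get(invoice_id)
    if invoice is None:
        raise LookupError(f"invoice {invoice_id} not found")
    if invoice.account_id != account_id:
        raise PermissionError("invoice belongs to another account")
    return invoice


def test_rejects_other_accounts() -> None:
    """A pass-two test: the permission failure is exercised, not just the happy path."""
    invoices = {1: Invoice(id=1, account_id=10, total_cents=5000)}
    try:
        get_invoice(invoices, invoice_id=1, account_id=99)
    except PermissionError:
        return  # Expected: the request fails safely.
    raise AssertionError("expected PermissionError for a non-owner")


test_rejects_other_accounts()
```

A suite can have a high test count and still lack a test like this one, which is why pass two reads tests for what they cover rather than how many there are.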

How to run the process

Teams move faster when they treat review as two short jobs instead of one long debate. That matters even more when they review generated code, because polished output can hide weak logic.

Start before anyone opens the diff. The author should add a short summary at the top of the pull request: what changed, why it changed, and where the risky parts are. Three to five lines is enough. If a reviewer has to guess the goal, the review slows down right away.

Then make the current stage obvious. Add a simple tag in the pull request title or status, such as "pass one" or "pass two." That small label keeps everyone in the same mode and stops the usual mix of deep logic comments and tiny style notes.

A simple flow looks like this:

The author opens the pull request with a brief summary and marks it "pass one."
The reviewer checks behavior, assumptions, edge cases, and whether the code solves the actual problem.
The author fixes logic issues first.
After that, the pull request switches to "pass two."
Reviewers check safety, tests, logs, comments, naming, and small cleanup items.

That order matters. If a reviewer starts with naming or formatting while the logic is still shaky, the author often rewrites the same area twice. That creates noise and wastes time.

It also helps to keep the thread quiet during pass two. Batch small fixes into one short note or one follow-up commit instead of opening ten separate threads for import order, log wording, and minor renames. Reviews feel much calmer when the nits travel together.

One simple rule works well: if a comment would still matter after a rewrite, post it in pass one. If it would disappear after a rewrite, save it for pass two.
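If a team wants to automate the stage label, a small helper could read it from the pull request title. This is only a sketch under assumed conventions: a bracketed tag such as "[pass one]" in the title, and a hypothetical set of comment kinds that the team agrees count as logic-level.

```python
import re
from typing import Optional

# Assumed convention: the title carries a bracketed tag such as "[pass one]".
STAGE_TAG = re.compile(r"\[(pass one|pass two)\]", re.IGNORECASE)

# Hypothetical comment kinds a team might agree on for pass one.
PASS_ONE_KINDS = {"logic", "edge-case", "requirement"}


def review_stage(title: str) -> Optional[str]:
    """Return "pass one" or "pass two" if the title carries a stage tag."""
    match = STAGE_TAG.search(title)
    return match.group(1).lower() if match else None


def belongs_in_stage(comment_kind: str, stage: str) -> bool:
    """Logic-level comments go in pass one; everything else waits for pass two."""
    if stage == "pass one":
        return comment_kind in PASS_ONE_KINDS
    return True


assert review_stage("[Pass One] Add CSV export") == "pass one"
assert not belongs_in_stage("naming", "pass one")
```

Most teams will not need code for this. A label and a shared habit do the same job.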

A simple example from a team under deadline

A small SaaS team needed to ship a new signup form before a product demo. Most of the code came from AI tools, so the pull request looked finished at first glance. It had the page, API handler, email step, and tests that seemed fine on a quick read.

The team had a familiar problem. When they reviewed generated code in one pass, comments mixed everything together. One reviewer talked about password rules, another asked about log redaction, and someone else noticed missing tests. The thread got noisy fast, and simple fixes sat next to bigger risks.

So they split the review.

In pass one, the reviewer looked only at behavior. They checked whether the form rejected bad emails, whether weak passwords failed for the right reason, whether an existing user saw a clear message, and whether a successful signup moved the user to the next step. They also checked one boring but important thing: did the code match the product rule, or did the model invent its own version?

That first pass found three issues. The form accepted spaces at the end of an email, the API returned the same message for two different errors, and the success flow skipped email confirmation for one edge case. The developer addressed all three in one round because every note pointed at logic and expected outcomes.

Pass two started after the flow worked. A different reviewer checked the parts that often get rushed near merge time. They asked for rate limits on the signup endpoint, the removal of a log line that printed raw email addresses, and tests for retry behavior when the mail service timed out.
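When a log line cannot simply be deleted, one option is to scrub it before it is written. The sketch below is an illustration, not the team's actual fix. The `redact_emails` helper and its placeholder text are hypothetical, and the pattern is deliberately loose.

```python
import re

# Deliberately loose pattern: good enough for scrubbing logs, not for validation.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(message: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[redacted-email]", message)


assert redact_emails("signup failed for ann@example.com") == "signup failed for [redacted-email]"
```

A pass-two reviewer would still check that every log call on the signup path goes through a helper like this, because a single missed call leaks the address.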

They also caught a common AI cleanup problem: an unused helper and a config flag nobody needed.

The team merged the change the same day. Not because the code was perfect, but because each pass had one job. Comments stayed focused, nobody argued across topics, and the developer could fix one kind of problem at a time.

Mistakes teams make

The most common mistake is polishing code before anyone agrees that the logic is right. A reviewer spots awkward names, spacing, or a repeated helper and starts fixing small things right away. Ten comments later, nobody has answered the bigger question: does this code solve the right problem with the right rules?

That order wastes time. If the team later changes the logic, many cleanup comments become dead work. A better habit is simple. Pass one decides whether the code should exist in this form at all. Pass two makes it safe, clean, and easy to keep.

Teams also slow themselves down by mixing style comments into every thread. One comment asks about an edge case, the next argues about variable names, then someone points out import order. The review turns into noise, and serious issues get lost.

This gets worse with generated code. AI often produces code that looks neat on the surface, so reviewers get pulled into cosmetic feedback. Clean formatting can hide weak error handling, missing permission checks, or code that works only for the happy path.

Another mistake is skipping pass two because pass one felt fine. Logic can be correct and still leave a mess behind. The code may pass a quick read and still expose secrets in logs, miss tests, duplicate old functions, or leave unused branches that confuse the next developer.

One more problem shows up in small teams. The same reviewer handles both passes every time. That person gets tired, starts skimming, and becomes the bottleneck. Blind spots grow when one set of eyes always checks logic and cleanup in the same order.

A few simple rules avoid most of this:

Hold style and naming comments until logic is approved.
Require a real second pass, even for short changes.
Rotate reviewers, or let a second person handle pass two.
Use tools for formatting so people spend time on judgment.

This kind of split review is especially useful when a team starts using more AI in development. Fast output only helps if people check the right things in the right order.

Who should do each pass

Assigning the same person to both passes sounds efficient, but it usually creates blind spots. One reviewer starts with product behavior, then gets pulled into naming, logging, secrets, and deployment details. The review gets longer, and the useful comments get mixed together.

For generated code, split the work by mindset. The first reviewer checks whether the change does the right thing for the user and the business. The second reviewer checks whether the change is safe to run, easy to maintain, and unlikely to cause trouble later.

A simple split works well:

Pass one: a product-minded engineer, tech lead, or feature owner checks logic, edge cases, and expected behavior.
Pass two: someone closer to security, infrastructure, platform work, or senior backend review checks safety, performance, logging, config, and cleanup.
If only one reviewer is available, do the passes at different times, not in one sitting.

Time limits help. Pass one should stay short, often 10 to 15 minutes for a normal pull request. Pass two can be even shorter, around 5 to 10 minutes, because the reviewer is not rebuilding the business context from scratch.

Separate approval rules help more than most teams expect. If one approval covers everything, people skip the second pass when the queue gets busy. Even if your tool only shows one green check, your team can still treat logic approval and safety approval as two distinct steps.
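The idea of two distinct approvals can be expressed as a simple merge gate. This is a sketch under assumptions: `ReviewState` and `ready_to_merge` are hypothetical names, and how approvals are recorded depends on the team's tooling.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewState:
    """Hypothetical record of who approved which pass on a pull request."""
    logic_approvers: set[str] = field(default_factory=set)
    safety_approvers: set[str] = field(default_factory=set)


def ready_to_merge(state: ReviewState, author: str) -> bool:
    """Require at least one non-author approval for each pass."""
    logic_ok = bool(state.logic_approvers - {author})
    safety_ok = bool(state.safety_approvers - {author})
    return logic_ok and safety_ok


state = ReviewState(logic_approvers={"dana"})
assert not ready_to_merge(state, author="sam")  # Pass two is still missing.
state.safety_approvers.add("lee")
assert ready_to_merge(state, author="sam")
```

The gate deliberately allows the same person to approve both passes, which matches the single-reviewer case above where the passes happen at different times.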

Teams that work with a fractional CTO or a small platform group often use this split well. One person owns product correctness, while another owns operational risk across the stack. That setup fits AI code review especially well, because generated code often looks fine on the surface while hiding weak error handling, noisy queries, or risky defaults.

Quick checks before merge

A merge should feel boring. If the branch still raises basic questions, the review is not done.

Right before merge, stop looking for polish and ask whether the code is safe to live with next week. This is where a two-pass code review helps most. The first pass checks whether the change works. The second makes sure it will not cause avoidable trouble.

Use a short final check:

The branch matches the request. If the task was "add CSV export," the code should add CSV export, not slip in extra refactors.
Someone tested the failure paths. Check what happens when the API times out, the file is missing, the user has no access, or the input is empty.
Someone reviewed secrets, auth, and input handling. No hardcoded tokens, no loose permission checks, and no raw user input flowing into queries, file paths, or shell commands.
The author removed noise. Delete debug prints, commented-out blocks, unused helpers, and generated code that nobody calls.
Tests cover the risky parts. You do not need tests for every line, but you do need them where mistakes would hurt users, money, or data.

The first item matters more than teams admit. Generated code often looks busy and complete, but it can still miss the actual task. A reviewer should compare the branch to the ticket in plain language and answer one simple question: "Would I call this done if I were the requester?"

Cleanup matters because it piles up fast. One stray secret, one skipped auth check, or one dead helper can turn a small merge into a long cleanup next week. Five extra minutes now can save an hour later.

If even one box stays unchecked, do not merge yet. Ask for the missing test, trim the noise, or fix the unsafe path first.
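For the input-handling item in the checklist above, the usual fix for raw user input in queries is a parameterized query. The sketch below uses Python's standard `sqlite3` module with an in-memory database. The `users` table and the `find_user_by_email` helper are hypothetical.

```python
import sqlite3


def find_user_by_email(conn: sqlite3.Connection, email: str):
    """Look up a user with a parameterized query instead of string formatting."""
    # The "?" placeholder passes the input as a value, so text such as
    # "x' OR '1'='1" is compared literally and never parsed as SQL.
    cursor = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cursor.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES (?)", ("ann@example.com",))

assert find_user_by_email(conn, "ann@example.com") == (1, "ann@example.com")
assert find_user_by_email(conn, "x' OR '1'='1") is None
```

The same habit applies to file paths and shell commands: pass user input as data, not as part of the command text.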

Start small and measure it

Start with one team, one repository, and one pull request template. A small trial gives you useful feedback quickly, and it keeps people from arguing about a process they have not tried.

For the first week or two, use the same split on every review. Pass one checks whether the code does the right job. Pass two checks safety, cleanup, and anything that makes future changes harder.

You do not need a new tool stack or a long policy document. Most teams can begin with four simple changes:

Add two short sections to the pull request template: "Pass one" and "Pass two."
Ask reviewers to leave comments in the matching section.
Track how long reviews take from open to approval.
Count bugs found after merge for a few weeks.

Those numbers matter more than opinions. If review time drops and post-merge bugs do not rise, the process is working. If bugs fall and review time stays flat, that is still a good trade.
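Tracking the two measurements does not need special tooling. Here is a minimal sketch, assuming each review is recorded by hand with an open time, an approval time, and a count of bugs found after merge. The `ReviewRecord` fields and the `summarize` helper are hypothetical.

```python
from datetime import datetime
from statistics import median
from typing import TypedDict


class ReviewRecord(TypedDict):
    opened_at: datetime
    approved_at: datetime
    post_merge_bugs: int


def summarize(reviews: list[ReviewRecord]) -> dict[str, float]:
    """Return the median open-to-approval time in hours and the total bug count."""
    if not reviews:
        raise ValueError("need at least one review to summarize")
    hours = [
        (r["approved_at"] - r["opened_at"]).total_seconds() / 3600
        for r in reviews
    ]
    return {
        "median_review_hours": median(hours),
        "post_merge_bugs": sum(r["post_merge_bugs"] for r in reviews),
    }
```

The median is used here instead of the mean so that one pull request that sat open over a weekend does not distort the picture. That choice is an assumption about what the team wants to see.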

Do not freeze the process too early. A frontend-heavy codebase may need more attention on edge cases and state changes in pass one. A backend or infrastructure service may need a stricter second pass for permissions, secrets, retries, logs, and failure handling.

Keep the code review checklist short. If it grows every week, people will stop using it. Cut anything that does not catch real problems.

A realistic next step is to run the trial on five to ten pull requests, then review the results in one short team meeting. Look at review time, reopened comments, bugs after merge, and whether reviewers felt less mentally overloaded. That last part matters. Good review habits should make work calmer, not heavier.

If a team wants outside help setting up practical AI review workflows, Oleg Sotnikov at oleg.is does that kind of work as a Fractional CTO and startup advisor. His experience spans software engineering, CTO work, infrastructure, and AI-driven development, which makes this sort of process design more grounded and less theoretical.

If the trial works, keep the template, write down the few checks that catch real issues, and use the same pattern on the next team.