Aug 17, 2024·8 min read

Pull request queue time and the real cause of slow delivery

Pull request queue time often hides the real cause of slow delivery. Learn what to measure in reviews, test reruns, and release approvals.

Why teams ship slowly even when coding is fast

A team can write code quickly and still ship slowly. The drag usually starts after the work leaves the developer's screen. A ticket moves to review on Tuesday, waits until Thursday, hits a flaky test on Friday, and misses the release until next week.

That is why delivery feels slow even when nobody is typing slowly. People finish their part, but the work sits in line. Founders and managers see effort, standups, and merged branches, yet users still wait too long for a change.

Review is often the first hidden queue. An engineer opens a pull request in 20 minutes, then waits a day for someone to read it. If the reviewer is in meetings, covering support, or buried in other work, the code stops moving. At that point, pull request queue time matters more than coding speed.

Testing creates another queue. A failed check does not always mean the code is bad. Sometimes one flaky test fails, someone reruns the suite, and the team loses two more hours. Nothing new gets built during that time. The work just waits for green checks.

Then comes the release gate. Many teams treat work as done once the code is merged. Users do not care about that moment. They care when the change reaches production, and release approval delays can push that moment far past the real finish.

These slowdowns are easy to miss because they do not look dramatic. No system is down. No one is obviously blocked. The team simply spends the day waiting for a reviewer, waiting for tests to rerun, waiting for a release window, or waiting for one last approval.

A small product team can lose days this way without noticing. Two developers might finish five tickets in a week, but if each ticket waits six hours in review and half a day for release, the software delivery flow still feels heavy.

That is why blaming the team is usually the wrong first move. Hiring more engineers often makes the queue bigger if the real problem starts after coding. Before you call the team slow, look at the waits between "done in code" and "live for users."

Where time gets lost after a pull request opens

A pull request can look almost done for hours or days while nobody touches it. If you count only coding time, you miss the part that slows the whole team down.

The first delay usually starts before the first review. A developer finishes the work, opens the pull request, and waits. If reviewers sit in meetings, work in another time zone, or already have a long queue, the change goes nowhere. On paper, the team looks busy. In practice, the work is parked.

Then a second queue appears. The author fixes comments quickly, but the updated pull request still needs another look. Teams often miss this delay because the code has already changed once, so it feels like progress is happening. Sometimes it is. Sometimes the fix sits untouched until the next day.

Tests add another layer of waiting. Many teams lose hours to flaky checks, slow pipelines, or reruns that people trigger just in case. One unstable test can keep a small change open far longer than the change deserves.

The final wait often sits with release authority. A pull request can be approved and still miss the release because one person owns the final sign-off. If that person is busy, asleep, or unsure, approved work enters another queue.

This pattern shows up in small startups all the time. Two developers may finish their work by lunch, but if review, reruns, and release approval all stall, nothing ships that day.

What to measure before you blame the team

Most teams blame coding speed because that part is visible. The slower part is usually waiting. If you want a fair picture of delivery, measure the gaps between actions, not just the time people spend writing code.

Start with five numbers:

Time from pull request open to first human review
Time from requested changes to the next review
Number of test reruns per pull request
Time from merge to release approval
Share of pull requests that sit idle for more than one day at any stage

These numbers tell a clearer story than story points or lines of code. A developer can finish a change in two hours, then lose a full day waiting for review, another day waiting for re-review, and more time after merge while release approval stalls.
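As a rough sketch, the gaps can be computed from timestamps the team already has. The record below is hypothetical, and the field names are assumptions rather than the export format of any particular tool.

```python
from datetime import datetime

def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600

# Hypothetical record for one pull request; field names are illustrative.
pr = {
    "opened": "2024-08-06T10:00",
    "first_review": "2024-08-07T09:00",
    "changes_requested": "2024-08-07T09:30",
    "next_review": "2024-08-08T11:30",
    "merged": "2024-08-08T14:00",
    "release_approved": "2024-08-09T16:00",
    "test_reruns": 2,
}

# Three of the five numbers are plain gaps between two timestamps.
gaps = {
    "open_to_first_review": hours_between(pr["opened"], pr["first_review"]),
    "changes_to_next_review": hours_between(pr["changes_requested"], pr["next_review"]),
    "merge_to_release_approval": hours_between(pr["merged"], pr["release_approved"]),
}

# Flag the pull request if any single stage sat for more than one day.
idle_over_one_day = any(hours > 24 for hours in gaps.values())

for name, hours in gaps.items():
    print(f"{name}: {hours:.1f} h")
print("reruns:", pr["test_reruns"], "| idle over one day:", idle_over_one_day)
```

This uses wall-clock hours, so nights and weekends count as waiting. That is usually what users experience, but a team may prefer business hours for review targets.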

Use median time, not just average time. One pull request that sat for a week can hide the everyday pattern. It also helps to split the numbers by team, repository, or change size. A tiny bug fix should not wait like a risky infrastructure change.

A simple example makes the point. If first review takes 14 hours, re-review takes 11 more, tests rerun three times, and release approval adds another 9 hours, the team does not have a coding speed problem. It has a queue problem.
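Adding up the waits from that example takes a few lines. The reruns are left out because the example gives a count for them, not a duration.

```python
# Waits from the example above, in hours.
waits = {"first_review": 14, "re_review": 11, "release_approval": 9}

total_wait = sum(waits.values())
print(f"Total waiting: {total_wait} hours")  # 14 + 11 + 9 = 34

# Share of the total wait that each stage accounts for.
for stage, hours in waits.items():
    print(f"{stage}: {hours / total_wait:.0%}")
```

That is 34 hours of waiting before any rerun time is counted.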

That distinction matters. More engineers will not fix slow reviews, flaky tests, or an approval step that only one person can handle on Fridays. These numbers make the delay visible, and once you can see it, you can fix the right part.

How to track the numbers in one week

Pick one team and one recent sprint. Do not sample the whole company. One team gives you a clean picture, and one sprint gives you enough data to spot a pattern without turning this into a month-long project.

Start with timestamps you already have. Export pull request open time, first review time, approval time, merge time, pipeline start and finish times, failed runs, reruns, and release approval timestamps. If the team works in GitLab or a similar setup, most of this data is already in the repo and CI logs.

Put every pull request on one line in a sheet. Track a few basic timestamps: when the pull request opened, when the first human review happened, when the author pushed changes after review, when the pull request was approved and merged, and what happened in the pipeline before release.

That simple layout makes the waits visible. If a pull request sat untouched for 19 hours, that is queue time. If a developer pushed fixes twice after review comments, that is rework. If tests failed three times before passing, that is not coding time either.

Keep the labels simple. Mark each gap as waiting, rework, or approval. You do not need perfect taxonomy. You need enough structure to stop mixing all elapsed time into one fuzzy number.
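One way to apply those three labels is to tag each gap by the event that ends it. This is a minimal sketch: the event names and the mapping are assumptions, and the timeline is invented for illustration.

```python
from datetime import datetime

# Assumed mapping: the label describes what the pull request was waiting for
# when the gap ended. Adjust it to match how your team talks about the work.
LABEL_BY_EVENT = {
    "first_review": "waiting",
    "author_push": "rework",
    "re_review": "waiting",
    "release_approved": "approval",
}

def label_gaps(events):
    """Return a (label, hours) pair for each gap between consecutive events."""
    gaps = []
    for (prev_time, _), (time, kind) in zip(events, events[1:]):
        delta = datetime.fromisoformat(time) - datetime.fromisoformat(prev_time)
        gaps.append((LABEL_BY_EVENT[kind], delta.total_seconds() / 3600))
    return gaps

# Hypothetical timeline for one pull request, in time order.
events = [
    ("2024-08-06T14:00", "opened"),
    ("2024-08-07T09:00", "first_review"),      # 19 hours untouched
    ("2024-08-07T12:00", "author_push"),       # 3 hours of fixes
    ("2024-08-08T10:00", "re_review"),         # 22 hours until the second look
    ("2024-08-09T10:00", "release_approved"),  # 24 hours at the release gate
]

totals = {}
for label, hours in label_gaps(events):
    totals[label] = totals.get(label, 0) + hours

print(totals)
```

Summing by label gives one row per pull request that separates queue time from rework instead of one elapsed-time number.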

Then review the sheet with the team that did the work. Do not turn it into a scorecard for leadership before you talk to the people involved. Ask plain questions. Which waits were normal? Which ones felt avoidable? Why did tests rerun so often? Why did release approval sit until Friday afternoon?

That conversation usually matters more than the spreadsheet. Numbers show where time went. The team tells you why.

A simple example from a small product team

Picture a team with three engineers building a SaaS product. They ship code every day, and nobody looks idle. On paper, output seems fine. In practice, releases keep slipping by a few days, then a week, then more.

The team opens about six pull requests a day. Most sit for hours before anyone reviews them because one senior engineer handles nearly every review. That same engineer also fixes production issues and answers product questions, so reviews pile up fast. The code itself may take two hours to write, but the pull request queue time stretches to a full day.

The next delay comes from tests. A pull request fails, someone reruns the pipeline, and then waits again. Sometimes the tests pass on the second try. Sometimes they fail somewhere else. After two or three reruns, the engineer switches to another task, loses context, and comes back later. The coding time did not change, but the day got chopped into smaller pieces.

Releases add one more queue. The manager approves production releases only twice a week because that feels safer and easier to control. So even after a pull request gets reviewed, fixed, merged, and retested, it can still sit and wait for the next approval window. A change finished on Tuesday afternoon may not reach users until Friday.

By the end of the week, the team feels slow and overloaded. That feeling is real, but the cause is easy to miss. No one writes less code. No one suddenly got worse at the job. The delay comes from waiting.

This is why code review bottlenecks often look like an engineering speed problem when they are really a flow problem. If this team measured review wait time, rerun count, and approval delay for one week, they would see where delivery actually stalls.

What test reruns actually tell you

When pull request queue time grows, test reruns often explain more than the code diff. A rerun does not always mean someone wrote poor code. Many reruns happen because the test suite, runner, or environment behaves differently from one run to the next.

Flaky tests are the clearest example. The code stays the same, but one run fails and the next passes. Teams learn a bad habit quickly: rerun first, inspect later. That may save five minutes once, but over a sprint it teaches everyone to ignore red builds.

Long test suites create a quieter problem. A tiny fix can wait 20 or 30 minutes for a full pipeline. If one check fails near the end, that small change now sits in review even longer. People switch tasks, lose context, and return slower.

Manual reruns often point to unstable environments. A shared test database may keep old data between jobs. A runner may run low on disk space or memory. A network call to another service may time out for no product reason at all. If the same branch passes on retry, check the setup before you blame the developer.

A few patterns show up again and again. The same check needing a rerun across many unrelated branches usually means a flaky test. Failures in the same stage often point to one weak part of the pipeline. Green results after manual retries usually mean the process hides noise instead of giving a clean signal. And when teams see repeated fail-pass-fail cycles, they stop trusting the test result.

Track how often people rerun jobs, how much time each retry adds, and which checks fail most. Those numbers usually lead you to the real fix: shorten the suite, isolate unstable tests, or clean up the environment so the first result means something.
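Those three counts can come from a flat list of CI job runs. The sketch below assumes you can export one row per job attempt. The tuple layout, check names, and pull request labels are made up for illustration.

```python
from collections import Counter

# Hypothetical CI job records:
# (pull request, check name, attempt number, passed, minutes)
runs = [
    ("PR-1", "unit", 1, True, 6),
    ("PR-1", "integration", 1, False, 22),
    ("PR-1", "integration", 2, True, 21),
    ("PR-2", "unit", 1, True, 6),
    ("PR-2", "integration", 1, False, 23),
    ("PR-2", "integration", 2, False, 22),
    ("PR-2", "integration", 3, True, 20),
]

reruns_per_check = Counter()    # how often people reran each check
retry_minutes = Counter()       # pipeline time spent on attempts after the first
failures_per_check = Counter()  # which checks fail most

for pr, check, attempt, passed, minutes in runs:
    if attempt > 1:
        reruns_per_check[check] += 1
        retry_minutes[check] += minutes
    if not passed:
        failures_per_check[check] += 1

print("reruns:", dict(reruns_per_check))
print("minutes added by retries:", dict(retry_minutes))
print("failures:", dict(failures_per_check))
```

The retry minutes only cover pipeline run time. The time a person spends noticing the failure and clicking rerun is extra and usually larger.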

Why release approvals create a second queue

A pull request can be merged by noon and still miss the release by two days. The code is done, tests passed, review is closed, yet nothing reaches users. That gap is a second queue.

This queue gets worse when only one or two people can approve a release. If they are in meetings, traveling, or fixing another issue, everyone waits. A change that took one day to build can take three days to ship, even when nobody touched the code after merge. A Friday merge that waits for Monday approval is a common example.

Manual release checks also waste time when they repeat work the team already did. Someone reads the ticket again, rechecks the same test results, asks for the same screenshots, or requests a fresh summary in chat. That may feel safe, but it is often duplicate review with a new label.

You can usually spot this quickly. Merged pull requests pile up before a release window. Approvers spend their time gathering context instead of making a yes-or-no call. Teams start rushing "urgent" changes around the normal release path.

When that happens often, people stop trusting the process. Developers hold changes longer because they assume release will be slow anyway. Product people stop believing dates. Approvers feel pressure to act as a safety net for everything, so they ask for even more manual proof.

A better approval step is short and specific. Decide what needs human approval and what does not. If review, automated tests, and rollout checks already cover a risk, do not check it again at the release gate. Measure the wait from merge to approval, not just pull request queue time.

Mistakes that make the numbers useless

Bad metrics lead to bad decisions. A team can look slow on paper while the coding itself moves fine, because the real delay sits in review queues, flaky tests, or a release gate that nobody tracks.

Story points and lines of code are common traps. Story points tell you how someone estimated the work. Lines of code tell you how much text changed. Neither tells you how long a pull request sat in queue, how many times CI ran again, or how long approval stayed with one person.

Another mistake is mixing urgent hotfixes with normal feature work. Hotfixes often get reviewed in minutes because everyone drops what they are doing. Feature work does not get that treatment. If you blend both into one report, the numbers look healthier than the daily experience.

Keep normal feature pull requests, urgent production fixes, and release-only changes in separate buckets. Otherwise the fast edge cases hide the slow routine work.
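A short sketch shows how a blended number hides the routine work. The waits below are invented, and the bucket names are assumptions that mirror the three buckets above.

```python
from collections import defaultdict
from statistics import median

# Hypothetical first-review waits in hours, tagged by work type.
prs = [
    ("feature", 20), ("feature", 26), ("feature", 18),
    ("hotfix", 0.25), ("hotfix", 0.75),
    ("release_only", 3),
]

by_bucket = defaultdict(list)
for bucket, hours in prs:
    by_bucket[bucket].append(hours)

# One blended median versus one median per bucket.
blended = median(hours for _, hours in prs)
per_bucket = {bucket: median(values) for bucket, values in by_bucket.items()}

print("blended median:", blended)
print("per bucket:", per_bucket)
```

With these sample numbers the blended median is 10.5 hours, while feature work on its own sits at 20 hours.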

Averages hide pain too. If five pull requests move in two hours and one waits three days, the average can still look fine. The team that waited three days will disagree. Track the median, but also track the slow end of the range, such as the 90th percentile or the longest waits each week.
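Here is that example in numbers, taking three days as 72 hours. The percentile helper uses the nearest-rank definition, which is one of several common ones, so a spreadsheet or statistics library may give a slightly different value.

```python
import math
from statistics import mean, median

def percentile_nearest_rank(values, pct):
    """Smallest value with at least pct percent of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Five pull requests that moved in two hours and one that waited three days.
waits_hours = [2, 2, 2, 2, 2, 72]

summary = {
    "mean": round(mean(waits_hours), 1),
    "median": median(waits_hours),
    "p90": percentile_nearest_rank(waits_hours, 90),
    "longest": max(waits_hours),
}
print(summary)
```

The mean comes out near 13.7 hours and the median at 2 hours. Neither shows the three-day wait. The 90th percentile and the longest wait do, which is why the slow end needs its own column.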

Test reruns need the same care. Leaders often see repeated CI runs and assume the developer wrote bad code. Sometimes that is true. Often it is not. A rerun can come from a flaky test, a slow shared runner, a timing issue, or an environment that fails once and passes on the next try. If you treat every rerun as developer error, you punish the wrong problem and miss the fix.

The most expensive mistake is hiring more people before you fix the queue. More engineers create more pull requests. They do not create faster reviews, cleaner test infrastructure, or quicker release approval. In a small product team, one overloaded reviewer or one manual approval step can block everyone.

Good data stays split by work type, shows long waits clearly, and separates human delay from system delay. If your numbers cannot do that, they will push you toward the wrong fix.

A quick check for your next sprint

Use one sprint as your sample. Do not change the process yet. Just track where work sits still.

Start with the pull requests that reached merge last sprint. For each one, note five moments: when the pull request opened, when the first review started, when tests passed, when it merged, and when someone approved it for release. That gives you a real pull request queue time number instead of a guess.

Then count a few simple things:

How many pull requests waited more than 24 hours for the first review
How many went through a full test rerun more than once
How long merge-to-release approval took

Do not jump to averages too fast. One pull request that sat untouched for 40 hours can tell you more than five that moved in a few minutes. Look for the longest idle step across the whole path. If coding took 6 hours, review waited 28, and release approval waited 18, you already know where the slowdown lives.
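Finding the longest idle step is a one-line comparison once the hours sit side by side. The stage names below are illustrative, and the hours are the ones from the example.

```python
# Hours from the example above. Coding is active work; the other two are waits.
stages = {"coding": 6, "review_wait": 28, "release_approval_wait": 18}

idle = {name: hours for name, hours in stages.items() if name != "coding"}
slowest = max(idle, key=idle.get)
idle_share = sum(idle.values()) / sum(stages.values())

print(f"Longest idle step: {slowest} ({idle[slowest]} h)")
print(f"Idle share of total elapsed time: {idle_share:.0%}")
```

In this case 46 of the 52 elapsed hours are waiting, and review holds the largest block.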

A small team can do this in a spreadsheet in under an hour. If you want one simple rule for the next sprint, pick the rule that cuts the biggest idle block, not the noisiest complaint. For example, every pull request gets a first review the same workday, any second full test rerun needs a short reason in the pull request, or low-risk changes ship without a separate approval meeting.

Pick one rule, not five. Run it for one sprint, then check the same numbers again. If the wait drops, keep the rule. If it does not, change the rule and test the next bottleneck.

What to do next

Pick one product, one repo, and one release path. A week of clean data from one workflow tells you more than a month of opinions from the whole company.

Then fix the slowest handoff first. If coding takes two hours but a pull request sits half a day before anyone reviews it, headcount is not your first problem. If tests fail, rerun, and fail again for routine reasons, that delay is often cheaper to fix than hiring another engineer.

Simple service rules help because they remove guesswork. Set a clear time limit for first review on normal pull requests. Set one for release approval on low-risk changes. Decide who owns flaky test reruns and how fast they respond. Review the same numbers every week using the same definitions.

Keep the rules small and realistic. A first review within four business hours is better than a vague promise to review "soon." Same-day release approval for routine changes is better than waiting for a manager who may not even be needed.
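Checking a four-business-hour target needs a definition of business hours. The sketch below assumes a Monday to Friday working day from 09:00 to 17:00, and it ignores holidays and time zones. Both the schedule and the helper name are assumptions for illustration.

```python
from datetime import datetime, time, timedelta

WORK_START, WORK_END = time(9), time(17)  # assumed working day
FIRST_REVIEW_TARGET_HOURS = 4

def business_hours_between(start: datetime, end: datetime) -> float:
    """Hours between start and end that fall inside Mon-Fri working hours."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            window_start = max(start, datetime.combine(day, WORK_START))
            window_end = min(end, datetime.combine(day, WORK_END))
            if window_end > window_start:
                total += window_end - window_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600

# Opened Friday afternoon, reviewed Monday morning: 2 h Friday + 1 h Monday.
weekend_wait = business_hours_between(
    datetime(2024, 8, 16, 15, 0), datetime(2024, 8, 19, 10, 0)
)
# Opened Monday at 09:00, reviewed the same day at 16:00.
same_day_wait = business_hours_between(
    datetime(2024, 8, 19, 9, 0), datetime(2024, 8, 19, 16, 0)
)

for label, hours in [("weekend", weekend_wait), ("same day", same_day_wait)]:
    status = "within target" if hours <= FIRST_REVIEW_TARGET_HOURS else "missed target"
    print(f"{label}: {hours:.1f} business hours, {status}")
```

Counting only business hours keeps a Friday afternoon pull request from looking like a three-day failure, while a request that sits all Monday still shows up as a miss.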

If you measure pull request queue time, test rerun metrics, and release approval delays together, patterns show up quickly. One team may need a review rota. Another may need fewer required approvers. A third may need to fix one unstable test suite that blocks every release.

Do not roll out a company-wide process change after a few bad days. Run one experiment, watch the numbers for two sprints, and keep only what cuts waiting time.

If you want an outside view, Oleg Sotnikov at oleg.is works as a Fractional CTO and startup advisor for startups and small businesses. He helps teams improve delivery flow, infrastructure, and practical AI adoption without jumping straight to more hiring or more tools.

Frequently Asked Questions

Why does delivery feel slow if engineers code fast?

Usually because code stops moving after the author opens the pull request. Review queues, flaky test reruns, and release approval create idle blocks that make a busy team look slow.

What is pull request queue time?

It is the time a pull request spends waiting for someone to act on it. Count the wait from pull request open to first human review, then count the wait after the author answers comments. That shows how long work sits instead of moving.

What should we measure first?

Start with open to first review, requested changes to next review, reruns per pull request, merge to release approval, and the share of pull requests that sit idle for more than a day. Those numbers show flow better than story points or lines of code.

Why should I use median time instead of average time?

Use the median because one three-day wait can disappear inside a decent average. Also watch the slow end of the range so you see the delays people feel every week.

Do repeated test reruns mean the developer wrote bad code?

Not by themselves. Repeated reruns often point to flaky tests, slow runners, shared test data, or environment problems. If the same branch passes on retry, inspect the pipeline before you blame the author.

Why do releases still stall after the code gets merged?

Merge only ends coding. One approver, fixed release windows, or duplicate manual checks can hold finished work for days. Track merge to approval as its own delay or you will miss that second queue.

Will hiring more engineers solve slow delivery?

Usually not. More engineers create more pull requests, but they do not create faster reviews, cleaner CI, or quicker approvals. Fix the queue first, then decide if you need more people.

How can a small team check this in one sprint?

Pick one team and one recent sprint. Put each pull request on one row with open time, first review, author update, approval, merge, test runs, and release approval. You can spot the biggest idle block in under an hour.

What is one simple rule we can try next sprint?

Start with the biggest idle block, not the loudest complaint. If reviews lag, set a first review target within four business hours. If reruns drag, ask for a short reason after the second full rerun.

When does it make sense to ask a fractional CTO for help?

Bring in outside help when the team keeps arguing about whether the problem sits in coding, CI, or release, or when founders want to hire before they have timing data. A fractional CTO like Oleg Sotnikov can map the flow, find the real bottleneck, and keep the fix small.