Feb 07, 2025·8 min read

Software rewrite decision: test the urge in 2 weeks

Make a better software rewrite decision with two weeks of evidence. Track bug rate, lead time, and workaround patterns before you commit.

Why teams reach for a rewrite too fast

Teams usually ask for a rewrite after a rough stretch, not after a calm review. A painful release, a week of surprise bugs, or one feature that takes far too long can make the whole codebase look broken. When people feel stuck every day, "start over" sounds cleaner than dealing with another patch.

That reaction is human, but it is still a reaction. Pain tells you the team is struggling. Proof tells you why. Those are not the same thing.

A team can feel miserable for reasons that have little to do with the code itself. Product changes may keep shifting. One senior engineer may have left. Testing may be weak. The backlog may hide old decisions no one remembers. If you call all of that "bad architecture," you can end up making an expensive software rewrite decision on emotion alone.

One bad sprint is especially misleading. Sprints go off the rails for all sorts of reasons:

a rushed deadline forced shortcuts
a risky feature touched too many areas at once
support issues pulled engineers away from planned work
unclear requirements caused rework

None of those automatically mean the system needs a full rewrite. They may point to process problems, missing tests, or one messy part of the product that needs focused repair.

Rewrite talk also feels productive because it turns frustration into a bold plan. It gives the team a fresh story: new stack, cleaner design, fewer headaches. The trouble is that a rewrite resets years of small lessons. Teams often throw away known workarounds, hidden business rules, and edge cases they only remember when users start complaining.

A better goal is simple: collect two weeks of evidence before choosing rewrite vs refactor. Track bug rate, measure lead time, and note the workarounds engineers use to get things done. That short window will not answer everything, but it will separate a bad mood from a real pattern. Once you can see the pattern, the next step gets much easier to defend.

What to capture in the next two weeks

A good software rewrite decision starts with evidence, not irritation. Two weeks is enough to spot patterns, but short enough that the product, team, and priorities do not shift under you.

Use real work, not guesses. If a bug did not become a ticket, do not count it. Memory turns one painful outage into "the system is always broken," and that is how teams talk themselves into a rewrite they cannot justify.

For the same reason, measure lead time with clear start and end points. Start when the work enters active development. End when that change reaches production. Do not switch definitions halfway through the window, or the numbers lose meaning.

You also need a log of workarounds. These are the small moves people make to get around the current code: manual data fixes, extra review steps, "just ask Sam," copy-pasting between services, skipping tests to hit a date, or delaying a release until one person is online. A single workaround is normal. The same workaround every few days usually points to a design problem.

Keep the capture simple:

Count bugs from the ticket system only
Record lead time for each shipped request
Write down every workaround the team used to ship
Use one fixed 10-working-day window for all three
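
If it helps to pin the window down before anyone starts logging, the dates can be listed up front. The sketch below is only an illustration: the function name `working_days` and the start date are made up, and it assumes a Monday-to-Friday week with no public holidays.

```python
from datetime import date, timedelta


def working_days(start, count=10):
    """Return `count` working days starting at `start`, skipping weekends.

    Assumption: Monday to Friday are working days and public holidays
    are ignored. Adjust by hand if your window contains a holiday.
    """
    days = []
    current = start
    while len(days) < count:
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            days.append(current)
        current += timedelta(days=1)
    return days


# Hypothetical start date, chosen only for the example.
window = working_days(date(2025, 2, 10))
print(window[0], "to", window[-1])
```

Share the first and last date with the team on day one so bugs, lead time, and workarounds are all counted against the same ten days.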

A fixed window matters more than most teams think. If you compare one bad month, one holiday week, and one major launch, you are not comparing the codebase. You are comparing chaos.

If you want one extra note, add impact beside each item. Was the bug customer-facing? Did the workaround cost 20 minutes or half a day? That small detail helps you separate annoying friction from expensive friction.

By the end of two weeks, you want a small, boring set of facts. Boring is good here. It gives the team something firmer than frustration.

How to track bug rate without adding busywork

If bugs live in three places, the count will lie on day one. Put every new bug into one board or one tracker, even if the team still chats about it elsewhere. For a software rewrite decision, messy bug data can push people toward a rewrite when the real problem is just poor visibility.

Use the tracker the team already touches every day. If they already work in GitLab issues, keep it there. If they use another board, stay with that. The tool matters less than one rule: every bug goes through the same place, with the same few fields.

Keep those fields plain:

severity: low, medium, or high
status: open, fixed, or reopened
area: login, billing, admin, search, API, or another product area
date opened and date closed

That is enough for two weeks. Do not add long templates, root cause essays, or extra forms. If reporting a bug takes more than a minute, people will skip it or write junk.

Severity should match language the team already uses. If people naturally say "small annoyance" or "blocks release," map that to three simple labels and move on. Fancy scoring systems look neat, but most teams stop using them by day three.

Count reopened bugs on their own. A bug that comes back after a fix tells you more than a fresh report. It often points to rushed testing, hidden dependencies, or code that nobody feels safe changing. Ten new bugs and six reopened bugs paint a very different picture from sixteen new bugs.

Track where bugs cluster by area, not by person. If billing has twelve issues in two weeks, that says something useful. If you turn this into a scoreboard by engineer, people will avoid reporting problems and the numbers will get softer, not better.
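
Most trackers can export these fields, and the count itself takes a few lines. This is a minimal sketch, not a recommended tool: the `Bug` record and `summarize_bugs` function are hypothetical names, the sample bugs are invented, and the fields simply mirror the plain ones listed earlier.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Bug:
    # Illustrative record; field names are assumptions, not a tracker schema.
    severity: str                  # "low", "medium", or "high"
    status: str                    # "open", "fixed", or "reopened"
    area: str                      # product area such as "billing" or "login"
    opened: date
    closed: Optional[date] = None


def summarize_bugs(bugs):
    """Count bugs per area and report reopened bugs as their own number."""
    reopened = sum(1 for b in bugs if b.status == "reopened")
    return {
        "total": len(bugs),
        "reopened": reopened,
        "not_reopened": len(bugs) - reopened,
        # Areas sorted from most bugs to fewest; no per-person breakdown.
        "by_area": Counter(b.area for b in bugs).most_common(),
    }


# Invented sample data for the example.
bugs = [
    Bug("high", "reopened", "billing", date(2025, 2, 3)),
    Bug("medium", "fixed", "billing", date(2025, 2, 4), date(2025, 2, 5)),
    Bug("low", "open", "search", date(2025, 2, 6)),
]
summary = summarize_bugs(bugs)
print(summary)
```

The record deliberately has no field for who wrote or reported the bug, which keeps the output about areas rather than people.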

Review the count at the same time every day. A ten-minute check after standup is enough. You are not hunting for perfect data. You are looking for a stable pattern the team can trust when frustration starts sounding like strategy.

How to measure lead time in a way the team trusts

Lead time gets messy when every person uses a different start and end point. If one engineer starts the clock at ticket creation and another stops it at code merge, the number means very little. For a software rewrite decision, that kind of fuzzy data can push a team toward a bigger move than it needs.

Use one rule for the whole team: start timing when work enters active development. That usually means the task leaves a ready state and someone begins building, testing, or changing code. Do not count days when the work sits in a backlog, waits for a meeting, or has no owner yet.

Stop the clock when users can actually get the change. If the code merges on Tuesday but the release goes out on Friday, Friday is the end. Teams often miss this gap, even though release delay is part of delivery.

Planned work and urgent fixes need separate buckets. A hotfix for a broken checkout flow moves fast because the team drops everything else. A planned feature goes through normal review, test, and release steps. If you mix them together, the average tells a false story.

Trust also depends on writing down pauses. When work stops, note why. Keep it simple: waiting for design, missing access, blocked by another service, unclear requirement, failed test, release freeze. After two weeks, these notes usually show whether the problem is code complexity, team process, or both.

Comparison matters just as much as timing. Compare similar tasks with similar tasks. A one-line text fix should not sit next to a payment flow change or a database migration. Split the data into rough groups such as small bug fix, medium product change, and larger backend task.

A small product team can do this in a spreadsheet or issue tracker without extra ceremony. One row per task is enough: start date, live date, work type, blocker notes. After ten to twenty items, patterns start to look real. If most delays come from approvals, release batching, or missing specs, a rewrite will not fix them.
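
The one-row-per-task idea translates directly into a small script if the sheet is exported. The sketch below is one possible shape under stated assumptions: the `Task` record and `lead_times_by_group` function are hypothetical, the sample rows are invented, lead time is counted in calendar days, and the median is used per group as a design choice (swap in a mean if your team prefers it).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class Task:
    # Illustrative row; field names are assumptions.
    start: date          # work entered active development
    live: date           # users could actually get the change
    work_type: str       # e.g. "small bug fix", "medium product change"
    urgent: bool = False  # hotfixes go in their own bucket
    blocker: str = ""     # short pause note such as "release freeze"


def lead_times_by_group(tasks):
    """Median lead time in calendar days per (work_type, urgent) group."""
    groups = defaultdict(list)
    for t in tasks:
        groups[(t.work_type, t.urgent)].append((t.live - t.start).days)
    return {key: median(days) for key, days in groups.items()}


# Invented sample rows for the example.
tasks = [
    Task(date(2025, 2, 3), date(2025, 2, 4), "small bug fix"),
    Task(date(2025, 2, 4), date(2025, 2, 7), "small bug fix", blocker="release freeze"),
    Task(date(2025, 2, 3), date(2025, 2, 12), "medium product change"),
    Task(date(2025, 2, 5), date(2025, 2, 5), "small bug fix", urgent=True),
]
result = lead_times_by_group(tasks)
for (work_type, urgent), days in result.items():
    print(work_type, "urgent" if urgent else "planned", days)
```

Grouping by both work type and urgency keeps hotfixes from dragging the planned-work number down, and keeps one-line fixes away from migrations.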

What workaround patterns usually reveal

When people keep inventing side routes to finish normal work, the product is telling you something. One odd trick can be harmless. The same trick repeated by several people usually means the normal path is slow, brittle, or unclear.

Start by writing down every extra step the team repeats. That includes little things people stop noticing, like rerunning a script by hand, editing data in the database, or checking the same setting before each release. These actions feel small, but together they show where the system makes routine work harder than it should be.

Pay close attention when people bypass tests, code review, or internal tools. Teams do this for a reason. Sometimes they do not trust the tooling. Sometimes the process is too slow for the pressure they feel. Either way, the workaround points to a problem in the system, not a personal failure.

Patterns that usually matter

A short list helps when you review the last two weeks:

The same extra step shows up in many tickets or releases.
Developers copy the same fix into several features.
Someone skips part of testing because the full path takes too long.
One person knows the

A simple two-week process the team can follow

Most teams do not need a new tool for this. A shared sheet works better because everyone sees the same facts, and each update takes a few minutes at most.

Set up one sheet with three tabs: bugs, lead time, and workarounds. Give the whole team access. If one person owns all updates, the log goes stale fast and people stop trusting it.

A clean routine helps more than a perfect format:

On day one, agree on what counts as a bug, when lead time starts and ends, and what you mean by a workaround.
Every day, each developer adds short notes to the sheet. One line per item is enough.
Keep entries factual. Write "payment bug, found Monday, fixed Wednesday, patch in two files" instead of "payments are a disaster."
Meet twice a week for 20 minutes. Look for repeats, not opinions.
At the end of week two, write a short summary and hold one decision meeting.

The bugs tab should show what broke, where it happened, how serious it was, and how long the fix took. The lead time tab should track small changes from start to release. The workaround tab should capture the odd stuff people do to get work done, like restarting a service, skipping a flaky test, or making the same manual edit in three places.

Keep the notes dry and boring. That is a good sign. When people add emotion, they usually blur the pattern. A note like "needed three extra approvals because deploy scripts are fragile" tells you more than "releases feel painful."

The twice-weekly review matters because memory gets edited fast. In those meetings, ask simple questions. Do the same areas keep breaking? Do small changes sit too long before release? Do people rely on the same workaround again and again?

Then close the two weeks with a one-page summary. Count the bugs, note the typical lead time for each kind of work, and list the repeat workarounds. That gives the team a software rewrite decision based on evidence instead of frustration.

Quick checks before you say yes to a rewrite

A software rewrite decision needs evidence, not one rough sprint. Teams often confuse irritation with proof. That is how they trade a messy codebase for a long, risky project that solves less than expected.

Use a few plain checks before you approve anything big.

Ask where bugs land. If they keep coming from the same module, checkout flow, or background job, you may have one damaged area rather than a whole product that needs replacement.
Look at small changes, not big projects. If a text tweak, pricing rule, or minor API edit still takes days, the problem may be weak boundaries, poor tests, or unclear ownership.
Count workarounds. When the team adds another manual step, feature flag, exception, or one-off script every week, the code is starting to resist normal work.
Check who holds the map in their head. If one engineer knows the risky files, deploy order, and old edge cases, the team has a knowledge problem as well as a code problem.
Watch what happens after release. If each fix creates new side effects, longer QA, or rollback fear, the system is getting harder to change.

One "yes" does not justify a rewrite. Four or five strong "yes" answers usually mean the pain is structural.

A small example makes this easier to judge. Say a team changes one pricing rule. The change looks tiny, but it waits three days, breaks a report, and needs one senior engineer to explain a hidden dependency before release. That is more than normal friction. It shows repeated drag in code, process, and team knowledge.

If the same pain returns on small changes, in the same areas, week after week, then a rewrite may be worth serious discussion. If the pain stays local, fix the local mess first.

A realistic example from a small product team

A seven-person SaaS team had one problem area: an old billing module that handled plans, discounts, proration, and invoice timing. Most of the product felt stable. Billing did not.

Routine pricing updates started to hurt. A simple change to annual discounts led to two invoice mistakes, one failed upgrade path, and a support spike the same week. Two weeks later, a small plan rename broke an internal report and sent the wrong totals to finance.

The team stopped arguing about whether the code "felt bad" and tracked what happened for 14 days. They found that five of the eight bugs reported in that period came from billing work, even though billing changes were a small share of total engineering time.

Lead time told the same story. A pricing change used to take about three days from request to production. Now it took 11 to 14 days. The coding itself was not the main delay. Engineers spent time reading old condition blocks, checking edge cases by hand, and waiting for someone who still remembered the weird parts of the module.

Workarounds piled up fast. One engineer kept a private checklist before every billing release. Another ran manual database queries to confirm proration values. Product managers avoided small pricing experiments because "billing always turns into a week-long job." That is usually a stronger signal than complaints in a planning meeting.

By the end of the two weeks, the software rewrite decision looked much narrower. The whole product did not need a rewrite. The team chose to replace only the billing rules engine, where most bugs and delays started, and keep the customer portal, invoices, and admin screens in place for now.

That choice cut risk. They could test the new billing core behind the old interface, move one pricing path at a time, and stop dragging the rest of the product into a problem it did not create.

Mistakes that lead to the wrong call

A single bad release can push smart teams into a bad software rewrite decision. If Friday night turned into a fire drill, people want distance from the code that hurt them. That reaction is human, but it is still just one data point. One painful release may point to weak testing, rushed approvals, or poor rollback habits as much as old code.

Teams also blur product delays and engineering lead time far too often. If a feature sat for four days because the spec changed twice, that is not the same as code taking four extra days to move. Count the time engineers control. If you mix design churn, shifting priorities, and waiting on stakeholder feedback into the same bucket, the codebase looks worse than it is.

Workarounds need weighting. A developer who runs one manual script at the end of the month is dealing with a nuisance. A team that copies data between two systems every day, checks it by hand, and fixes broken edge cases all week has a real pattern. Put those in separate groups, or you will make every annoyance sound like proof that the whole system needs replacement.
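
One way to do that weighting is to log each occurrence with a rough time cost and let the totals sort themselves. This is a sketch under assumptions: the `rank_workarounds` function, the minutes estimates, and the sample entries are all invented for illustration.

```python
from collections import defaultdict


def rank_workarounds(entries):
    """Rank workarounds by total minutes lost, then by frequency.

    `entries` is an iterable of (description, minutes_lost) tuples,
    one tuple per time the workaround was used.
    """
    totals = defaultdict(lambda: {"count": 0, "minutes": 0})
    for description, minutes in entries:
        totals[description]["count"] += 1
        totals[description]["minutes"] += minutes
    rows = [(d, v["count"], v["minutes"]) for d, v in totals.items()]
    # Highest total cost first; frequency breaks ties.
    return sorted(rows, key=lambda row: (row[2], row[1]), reverse=True)


# Invented log entries for the example.
log = [
    ("manual proration query", 30),
    ("month-end script", 20),
    ("manual proration query", 45),
    ("manual proration query", 30),
]
ranked = rank_workarounds(log)
for description, count, minutes in ranked:
    print(f"{description}: {count} times, {minutes} min")
```

A nuisance that shows up once stays at the bottom of the list, while a daily manual check climbs to the top without anyone having to argue for it.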

It is also a mistake to ignore the parts that still change safely. Most systems are uneven. One area may feel brittle, while another ships clean updates every week. That matters. If payments are fragile but account settings and reporting still move without drama, a full rewrite may be too broad.

A simple check helps:

Look at the last two weeks, not the worst day.
Separate waiting time from coding time.
Rank workarounds by frequency and cost.
Mark modules where the team still ships changes with low stress.
Ask which problems come from habits, not code.

That last point matters more than teams like to admit. A rewrite will not fix vague tickets, weak code review, missing ownership, or a habit of merging late on Friday. Those problems move into the new codebase almost immediately.

If the team cannot name what will change in its day-to-day work, a rewrite is probably a story people tell themselves to escape frustration.

What to do after the two weeks end

Put the notes, bugs, and delivery data in one place and force the choice down to three options: keep, refactor, or replace. If the team cannot name the option in one sentence, the evidence is still too fuzzy.

A simple split works well:

Keep it if the pain is real but limited, and the team can still ship without daily drama.
Refactor if the same parts slow people down again and again, but the product still has a usable core.
Replace only if the data shows the current code blocks normal work, creates repeat bugs, and resists small fixes.

Most teams should start with the smallest change that removes the biggest drag. If one service causes half the delays, fix that service first. If one ugly module creates most of the workarounds, replace that module instead of throwing away the whole product.

This is where many bad rewrite calls happen. People feel relief when they imagine starting fresh, so they treat that feeling like proof. The better test is boring: which option cuts bug repeats, shortens lead time, and removes the workaround people complain about most?

Write down the decision and the reason in plain language. A good note is short: "We will refactor the billing flow because it caused 40% of our fix time and three of the five recurring workarounds." That gives the team something concrete to check later.
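
A percentage like the one in that note is easy to back with the logged numbers. The sketch below shows the arithmetic only: the `share_by_area` function and the hours are hypothetical, picked so the example lands on a round figure.

```python
def share_by_area(hours_by_area):
    """Return each area's share of total fix time as a whole percentage."""
    total = sum(hours_by_area.values())
    if total == 0:
        return {}
    return {area: round(100 * hours / total) for area, hours in hours_by_area.items()}


# Hypothetical fix hours logged over the two weeks.
fix_hours = {"billing": 16, "search": 10, "admin": 14}
shares = share_by_area(fix_hours)
print(shares)
```

Keeping the raw hours next to the percentage lets the team rerun the same calculation at the review date and see whether the share actually moved.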

Set a review date about a month out. That date matters because the first decision is not final. You want to see if the chosen path changed the numbers or just changed the mood.

If the team stays split, bring in an outside CTO or senior advisor for a neutral read. Fresh eyes help when internal debates turn personal. Oleg Sotnikov offers practical Fractional CTO support for rewrite, architecture, and delivery decisions, which can help if you want an opinion grounded in shipping and operations rather than team frustration.

A good software rewrite decision usually feels smaller and less dramatic than people expect. That is often a good sign.