Lean test pyramid for products with changing requirements
Build a lean test pyramid that covers checkout, login, and core user flows, so your team can handle weekly changes without slow releases.

Why weekly changes break normal test plans
When product requirements change every week, a detailed test plan goes stale almost immediately. A full regression suite sounds safe, but it often turns into a queue of slow checks, brittle scripts, and constant rework. By the time the team updates every test case, the product has changed again.
That creates a bad habit. Testing starts to feel like a gate to pass before release instead of a way to protect the release itself. Long suites get skipped. Manual checklists shrink. Nobody feels sure the latest change did not break something two screens away.
Checkout and login bugs still slip through for a simple reason: many teams spread their effort too evenly. They spend time on low-risk pages, tiny UI details, or old edge cases, while the paths that matter most get one quick pass. A green build can still hide a broken coupon flow, a failed payment callback, or a sign-in issue that appears only after a session refresh.
Those bugs are common because they cross several moving parts. Checkout touches pricing rules, taxes, payment providers, emails, and order creation. Login touches sessions, redirects, rate limits, password reset, and user roles. A weekly change in one area often breaks something nearby.
The fix is not to test everything harder. It is to protect the parts that keep the business running and keep users inside the product.
For most teams, that means starting with a short list:
- money flows such as checkout, renewals, and order confirmation
- access flows such as signup, login, logout, and password reset
- the main workflow users came for
- anything that can lose data or create wrong records
Other areas still matter, but they do not need the same depth every release. A settings page, a new label, or a rarely used admin filter can wait behind the paths that bring in revenue and let users do the main job.
When scope changes every week, a smaller plan with the right priorities beats a bigger plan that nobody can keep current.
What to protect first
Start with the paths that hurt most when they fail. When requirements shift every week, you cannot protect everything at the same depth. You need to cover the parts that lose money, lock users out, or stop the main job your product exists to do.
Money flows usually come first. For many products, that means checkout, plan upgrades, renewals, and refunds. If any of those break, the damage is immediate. Revenue drops, support gets flooded, and trust falls fast.
Auth comes next, and sometimes it belongs at the top. If people cannot sign up, log in, reset a password, or stay signed in after a session expires, the rest of the product barely matters. Auth bugs often look small in code and huge in real life.
After that, pick two to four workflows people use every day. Keep the list short. Choose the actions that make users come back, not the nice extras. In a SaaS product, that might be creating a project, inviting a teammate, sending a report, or publishing an update. If a workflow is common and tied to retention, protect it early.
Rank by damage and frequency
A simple rule works well: rank each path by cost of failure and frequency of use. High cost and high frequency go first. Low cost and low frequency can wait.
Ask four questions:
- Does this failure stop revenue or create refunds?
- How often do users hit this path?
- Will support need manual work to recover it?
- Will one failure make users doubt the product?
That cuts through a lot of debate. A broken checkout matters more than a broken profile picture upload. A broken login matters more than a rare admin setting. A renewal bug that quietly cancels paid accounts can be worse than a visible UI issue because teams often notice it late.
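The ranking rule can be sketched as a tiny scoring script. This is an illustration only: the 1-to-5 scales, the choice to multiply the two scores, and the example values are all assumptions, not figures from a real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    cost_of_failure: int  # 1 (annoying) to 5 (stops revenue); hypothetical scale
    frequency: int        # 1 (rare) to 5 (most sessions); hypothetical scale

    @property
    def risk(self) -> int:
        # Multiplying keeps "high cost and high frequency" at the top.
        return self.cost_of_failure * self.frequency


def rank(paths: list[Path]) -> list[Path]:
    """Return paths ordered from highest to lowest risk score."""
    return sorted(paths, key=lambda p: p.risk, reverse=True)


# Example scores are made up for illustration.
paths = [
    Path("profile picture upload", cost_of_failure=1, frequency=2),
    Path("checkout", cost_of_failure=5, frequency=4),
    Path("login", cost_of_failure=5, frequency=5),
    Path("rare admin setting", cost_of_failure=2, frequency=1),
]

for p in rank(paths):
    print(f"{p.risk:>2}  {p.name}")
```

The exact scores matter less than the order they produce. If the team disagrees with the order, that is a useful conversation to have before writing tests.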
Protect complete paths, not isolated screens. Test the full signup flow, not just the form. Test a real payment success and failure path, not only the button click. Test session expiry all the way through re-login, because that is where small mistakes tend to hide.
If your team is small, this is enough to start. Protect money, auth, and a few daily workflows first. Everything else can get lighter coverage until it proves it needs more.
Set the shape of the pyramid
When requirements move every week, test speed matters as much as coverage. A lean pyramid keeps most checks low in the stack, where they run fast, fail clearly, and cost less to update.
Put unit tests at the base. These should cover business rules that can break money or trust: price calculations, discount logic, tax rules, refund rules, access checks, and status changes. If a product manager changes how a free trial works on Tuesday, a good unit test suite should tell you that same day whether the new rule still works.
Use the middle layer for API and service tests. This is where you check that real parts of the system still talk to each other: the app creates an order, the payment service receives the right amount, the user record updates, and the event log keeps the right status. These tests catch the messy problems unit tests miss, but they still run much faster than browser tests.
Keep the top layer small. Use UI tests only for complete journeys that matter most to the business, such as signing up, logging in, paying successfully, recovering a password, or finishing the main task in the product.
For many teams, a rough split works well: out of every 100 automated checks, about 70 are unit tests, 20 to 25 are API or service tests, and 5 to 10 are end-to-end UI tests. The exact numbers can change, but the top layer should stay small on purpose.
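A quick way to see whether a suite still has this shape is to compute each layer's share from the test counts. The sketch below assumes three layer names ("unit", "api", "ui") and a hypothetical 10% ceiling for UI tests; both are placeholders to adjust.

```python
def layer_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each layer's share of the suite as a percentage."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("suite has no tests")
    return {layer: 100 * n / total for layer, n in counts.items()}


def top_heavy(counts: dict[str, int], max_ui_percent: float = 10.0) -> bool:
    """True when UI tests exceed the chosen share of the suite."""
    return layer_shares(counts)["ui"] > max_ui_percent


# Hypothetical counts for a 200-test suite.
suite = {"unit": 140, "api": 45, "ui": 15}
print(layer_shares(suite))
print(top_heavy(suite))
```

Running this occasionally is enough. The point is to notice drift toward the top layer, not to enforce an exact ratio.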
That is the pyramid in practice. You get fast feedback on business rules, enough integration coverage to catch real breakage, and a few full-flow checks for checkout, auth, and the workflow your product depends on.
If a browser test fails, treat it like a smoke alarm. Useful, but not your main source of truth. The heavy lifting should happen lower down, where failures are easier to read and fixes are quicker.
Build the first version step by step
Pick three journeys before you write a single test: how money moves, how a user signs in, and what a user does most days. Keep them narrow. One checkout flow, one login or password reset flow, and one everyday task such as creating an order, sending a message, or updating a record.
That gives the pyramid a clear shape. You are not trying to prove that the whole product is perfect. You are protecting the parts that lose revenue, lock users out, or break the main reason people use the product.
Write unit tests first. They are cheap, fast, and easy to fix when product rules change late in the week. Put price rules here: taxes, discounts, rounding, trial limits, coupon expiry, and refund math. Put permission checks here too: who can view, edit, approve, or delete. Edge cases belong here because they pile up fast and quietly cause expensive bugs.
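As a minimal sketch of what these unit tests can look like, here is one coupon rule and one permission rule with their edge cases. The function names, the 15% coupon, and the roles are hypothetical; the tests are written as plain functions so they run under pytest or on their own.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def discounted_total(subtotal: Decimal, percent_off: int,
                     coupon_expires: date, today: date) -> Decimal:
    """Apply a percentage coupon unless it has expired; round to cents."""
    if today > coupon_expires:
        return subtotal.quantize(CENT, rounding=ROUND_HALF_UP)
    total = subtotal * (Decimal(100) - Decimal(percent_off)) / Decimal(100)
    return total.quantize(CENT, rounding=ROUND_HALF_UP)


def can_delete(role: str) -> bool:
    """Hypothetical rule: only admins and owners may delete records."""
    return role in {"admin", "owner"}


def test_coupon_still_applies_on_expiry_day():
    total = discounted_total(Decimal("19.99"), 15, date(2024, 6, 30), date(2024, 6, 30))
    assert total == Decimal("16.99")


def test_expired_coupon_is_ignored():
    total = discounted_total(Decimal("19.99"), 15, date(2024, 6, 30), date(2024, 7, 1))
    assert total == Decimal("19.99")


def test_half_cent_rounds_up():
    # 0.30 with 15% off is 0.255, which should round to 0.26.
    total = discounted_total(Decimal("0.30"), 15, date(2024, 6, 30), date(2024, 6, 1))
    assert total == Decimal("0.26")


def test_editor_cannot_delete():
    assert can_delete("admin")
    assert not can_delete("editor")


if __name__ == "__main__":
    for test in (test_coupon_still_applies_on_expiry_day, test_expired_coupon_is_ignored,
                 test_half_cent_rounds_up, test_editor_cannot_delete):
        test()
    print("all unit checks passed")
```

Passing `today` in as an argument keeps the date edge cases testable without touching the system clock.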
Then add API tests around the business flow. Test order creation, payment status changes, session creation, logout, token refresh, and the state changes that matter after each action. If a cart becomes an order, or a draft becomes a paid subscription, the API should prove that change happened and stored the right data.
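The sketch below shows the kind of assertion a middle-layer test makes: the state change happened and the stored data is right. To stay self-contained it uses an in-memory service and a fake payment gateway; `OrderService`, `FakePaymentGateway`, and the status names are hypothetical stand-ins. A real API test would call your actual endpoints instead.

```python
from dataclasses import dataclass, field


@dataclass
class FakePaymentGateway:
    """Stand-in for a payment provider; records what it was asked to charge."""
    charges: list[int] = field(default_factory=list)
    should_decline: bool = False

    def charge(self, amount_cents: int) -> bool:
        self.charges.append(amount_cents)
        return not self.should_decline


@dataclass
class OrderService:
    gateway: FakePaymentGateway
    orders: dict[int, dict] = field(default_factory=dict)

    def checkout(self, cart: dict[str, int]) -> int:
        """Turn a cart (item name -> price in cents) into a stored order."""
        order_id = len(self.orders) + 1
        total = sum(cart.values())
        paid = self.gateway.charge(total)
        self.orders[order_id] = {
            "total_cents": total,
            "status": "paid" if paid else "payment_failed",
        }
        return order_id


def test_successful_checkout_stores_paid_order():
    gateway = FakePaymentGateway()
    service = OrderService(gateway)
    order_id = service.checkout({"pro_plan": 4900, "extra_seat": 900})
    # The gateway saw the right amount, and the stored order matches it.
    assert gateway.charges == [5800]
    assert service.orders[order_id] == {"total_cents": 5800, "status": "paid"}


def test_declined_payment_is_recorded_not_lost():
    service = OrderService(FakePaymentGateway(should_decline=True))
    order_id = service.checkout({"pro_plan": 4900})
    assert service.orders[order_id]["status"] == "payment_failed"
```

Each test checks both sides of the exchange: what was sent to the payment service and what ended up stored. That pairing is what unit tests alone tend to miss.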
Add only one UI test for each path that really must work from start to finish. In many products, that means one successful checkout, one auth flow that reaches the app, and one daily workflow that completes without errors.
Keep those UI tests boring on purpose. Do not use them to cover button color, tooltip text, or every corner of the screen. UI tests are slower and break for small reasons, so each one needs a clear job.
Run the fastest tests on every code change. That usually means unit tests and a small set of API checks. Run the broader API suite and the few end-to-end UI tests on merge or before release.
A simple rule helps: if a test takes seconds, run it all the time. If it takes minutes, save it for the points where risk is higher. That rhythm gives teams enough safety without turning every release into a waiting game.
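The seconds-versus-minutes rule can be sketched as a simple partition over recorded test durations. The 5-second cutoff and the test names below are assumptions for illustration; most test runners can export timings that would feed a script like this.

```python
def split_by_speed(durations: dict[str, float], fast_limit_s: float = 5.0):
    """Partition test names into (every_change, merge_or_release) by runtime."""
    every_change = sorted(n for n, s in durations.items() if s <= fast_limit_s)
    merge_or_release = sorted(n for n, s in durations.items() if s > fast_limit_s)
    return every_change, merge_or_release


# Hypothetical durations in seconds.
durations = {
    "unit/test_pricing": 0.4,
    "unit/test_permissions": 0.2,
    "api/test_order_smoke": 3.1,
    "api/test_full_billing": 48.0,
    "ui/test_checkout_journey": 95.0,
}

fast, slow = split_by_speed(durations)
print("every change:", fast)
print("merge or release:", slow)
```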
Keep each layer small and useful
This approach gets bloated fast when teams add one more browser test after every bug. A few weeks later, the suite is slow, noisy, and hard to trust. Smaller layers work better because each test has one clear job.
Most rules belong in unit and service tests. If a browser test repeats the same price rule, tax rule, or permission rule already covered below, delete the duplicate. Keep UI tests for what only a real user flow can prove: the user logs in, the cart keeps its items, the payment goes through, and the order reaches the next step.
Full user flow tests should ignore cosmetic details. A checkout test does not need to verify button spacing, font size, or the exact order of every label. When design details matter, check them with a focused component test or a visual check, not the most expensive test in the suite.
Stable data matters more than many teams expect. If a payment test depends on yesterday's seed script, random coupon codes, or a shared account that other tests change, it will fail for the wrong reason. Small fixtures with fixed inputs work better: one active user, one expired card, and one plan with a known discount.
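A minimal sketch of fixed fixtures, assuming plain dictionaries and made-up values: every input is named and constant, and the current date is passed in rather than read from the clock, so the same test gives the same result next month.

```python
from datetime import date

# Fixed, named fixtures: hypothetical values, never generated at random.
ACTIVE_USER = {"id": 1, "email": "active@example.test", "status": "active"}
EXPIRED_CARD = {"last4": "0002", "exp_month": 1, "exp_year": 2020}
PLAN_WITH_DISCOUNT = {"code": "pro_annual", "price_cents": 12000, "discount_percent": 20}


def card_is_expired(card: dict, today: date) -> bool:
    """A card is treated as valid through the end of its expiry month."""
    return (today.year, today.month) > (card["exp_year"], card["exp_month"])


def discounted_price_cents(plan: dict) -> int:
    """Price after the plan's percentage discount, in whole cents."""
    return plan["price_cents"] * (100 - plan["discount_percent"]) // 100


def test_expired_card_is_rejected():
    # "today" is passed in so the test does not depend on the real clock.
    assert card_is_expired(EXPIRED_CARD, today=date(2024, 3, 15))


def test_card_is_valid_during_its_expiry_month():
    assert not card_is_expired(EXPIRED_CARD, today=date(2020, 1, 31))


def test_known_discount_gives_known_total():
    assert discounted_price_cents(PLAN_WITH_DISCOUNT) == 9600
```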
Test names should read like user intent. "User can upgrade from basic to pro with saved card" is much clearer than "billing_upgrade_v2_spec". When requirements change late in the week, clear names make it easier to decide what still matters and what should change.
Once a month, review the slowest and flakiest tests. Remove tests that repeat lower layers. Split tests that try to cover too many steps. Fix or drop tests that fail because of the environment. Merge near-identical cases into one cleaner scenario.
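For the monthly review, a short script over CI history can produce the candidate list. This sketch assumes run records of the form (test name, commit, passed, seconds) and uses one possible definition of flaky: a test that both passed and failed on the same commit. The 60-second cutoff is also an assumption.

```python
from collections import defaultdict
from statistics import mean


def review_candidates(runs, slow_seconds=60.0):
    """Return (flaky, slow) test names from (name, commit, passed, seconds) records.

    A test counts as flaky here if it both passed and failed on the same commit.
    A test counts as slow if its average duration exceeds slow_seconds.
    """
    outcomes = defaultdict(set)    # (name, commit) -> set of observed results
    durations = defaultdict(list)  # name -> list of durations in seconds
    for name, commit, passed, seconds in runs:
        outcomes[(name, commit)].add(passed)
        durations[name].append(seconds)
    flaky = sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})
    slow = sorted(n for n, secs in durations.items() if mean(secs) > slow_seconds)
    return flaky, slow


# Hypothetical CI history.
runs = [
    ("ui/checkout_journey", "abc123", True, 92.0),
    ("ui/checkout_journey", "abc123", False, 118.0),
    ("api/order_created", "abc123", True, 2.5),
    ("api/order_created", "def456", False, 2.4),  # failed on a different commit
    ("unit/pricing", "abc123", True, 0.3),
]

flaky, slow = review_candidates(runs)
print("flaky:", flaky)
print("slow:", slow)
```

A failure on a different commit is not counted as flaky, because it may be a real regression. Only mixed results on identical code are treated as noise.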
If a test does not catch a real risk, it adds cost every week. Keep the suite lean and it will protect checkout, auth, and core workflows without slowing delivery.
A simple example: pricing change before launch
Tuesday morning, the product team changes two things at once. The Pro plan now includes more seats, and a 20% launch discount applies only to new annual customers. Release is on Friday, so nobody has time for a full regression pass.
This is where a lean pyramid pays off. The team does not retest every settings page, every email, and every old checkout variation. They focus on the places where a pricing mistake turns into lost revenue, bad renewals, or support tickets.
At the unit level, developers test the pricing rules directly. Those tests are fast, and they catch the mistakes people make most often with last-minute pricing edits: monthly versus annual billing, new customer versus existing customer, seat limits at the plan boundary, and discounts applied once instead of leaking into renewals.
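The launch-discount rule from this example could be pinned down with a handful of unit tests like the sketch below. The function names and the seat limit of 10 are hypothetical (the example does not state the real seat count); the 20% figure and the "new annual customers only" condition come from the scenario.

```python
PRO_SEAT_LIMIT = 10  # hypothetical value; the scenario does not give the real limit


def launch_discount_percent(is_new_customer: bool, billing: str, is_renewal: bool) -> int:
    """Return 20 only for a new customer's first annual charge, otherwise 0."""
    if is_new_customer and billing == "annual" and not is_renewal:
        return 20
    return 0


def within_seat_limit(seats: int) -> bool:
    """True when the seat count fits the Pro plan."""
    return 1 <= seats <= PRO_SEAT_LIMIT


def test_new_annual_customer_gets_launch_discount():
    assert launch_discount_percent(True, "annual", is_renewal=False) == 20


def test_monthly_billing_gets_no_discount():
    assert launch_discount_percent(True, "monthly", is_renewal=False) == 0


def test_existing_customer_gets_no_discount():
    assert launch_discount_percent(False, "annual", is_renewal=False) == 0


def test_discount_does_not_leak_into_renewal():
    assert launch_discount_percent(True, "annual", is_renewal=True) == 0


def test_seat_limit_at_the_boundary():
    assert within_seat_limit(PRO_SEAT_LIMIT)
    assert not within_seat_limit(PRO_SEAT_LIMIT + 1)
```

Each test maps to one of the mistakes listed above, so a failure names the broken rule directly.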
A few API tests sit above that. They call the real pricing and billing endpoints and verify the numbers other systems will actually use. One test checks that a valid coupon changes the total. Another confirms tax is added in the right order. A third makes sure the renewal response drops the launch discount when it should.
Then the team keeps a single UI test for the money path. It signs in as a normal user, picks the upgraded plan, enters payment details in the test environment, and waits for the success screen. If that flow works, the team knows the main purchase path still holds together from browser to backend.
They do not spend hours clicking through profile screens, old reports, or low-traffic admin pages the pricing change did not touch. That would slow the release without reducing much risk.
This small stack of tests gives the team enough confidence to ship. Wrong totals get caught early, API contract mistakes show up before release, and one browser check proves a customer can still pay and upgrade. For weekly changes, that is usually the right trade-off.
Mistakes that waste time
Teams usually lose time because they protect the wrong things first. When requirements change every week, a huge pile of UI tests feels safe, but it often does the opposite. Screens change, labels move, flows get reordered, and suddenly half the suite fails even though the product still works.
That is why writing lots of browser tests before the business rules settle is such a bad trade. You end up fixing selectors and screenshots while the real risks sit elsewhere. Start lower. Put more checks around pricing logic, permissions, totals, and the rules that break money paths.
Another common mistake is treating every screen as if it carries the same risk. It does not. A broken help page is annoying. A broken checkout, signup, or invoice flow costs real money. Good prioritization means ranking flows by damage, not by how visible they look in a demo.
Teams also skip awkward paths too often. Password reset, session timeout, expired links, forced logout after role changes, and failed payment retries are less exciting than the happy path. They are also the cases that create the support tickets that swallow a whole afternoon. If users cannot get back into their account, the rest of the app barely matters.
Flaky tests waste time in a quieter way. One unstable test blocks a release, people rerun pipelines, and trust drops fast. After a couple of weeks, the team stops believing failures. Then a real bug slips through because everyone assumes the suite is crying wolf again.
A lean suite stays lean only if you remove dead tests. Many teams add new checks after every bug, then keep old checks for flows that no longer exist. The suite grows, runtime grows, and nobody wants to touch it.
A cleanup is overdue when:
- UI failures outnumber logic failures
- old flows still appear in test names
- one flaky test gets retried every release
- password reset or session expiry has no coverage
- small copy changes break dozens of tests
If a test no longer protects a real risk, delete it. That often helps more than adding another one.
Quick checks before every release
A release check should take minutes, not half a day. When requirements move every week, teams need a short habit that catches expensive mistakes without freezing delivery. That is where this pyramid helps: most checks stay cheap and fast, and a few release checks confirm that the product still works where money and trust are on the line.
Before you ship, answer four questions:
- Did checkout finish successfully today, from cart to confirmation?
- Did login work, including the most common permission level and one blocked case?
- Did one daily task complete from start to finish using the current rules?
- Did prices, access rules, and user-facing error messages match the latest change?
These checks sound basic. That is why they work. If checkout passes, login works, and one common task finishes, you have covered the paths that usually cost real money and create real support tickets.
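The four questions can be wired into a small go/no-go script. This is a sketch under assumptions: the check names mirror the questions above, and the lambdas are placeholders where a real setup would call its existing smoke tests.

```python
from typing import Callable


def run_release_checks(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run each named check and return the names that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failed.append(name)
    return failed


# Placeholder checks: in practice each would call an existing smoke test.
checks = {
    "checkout completes": lambda: True,
    "login works and blocked user is refused": lambda: True,
    "daily task finishes": lambda: True,
    "prices and access rules match latest change": lambda: False,
}

failed = run_release_checks(checks)
print("SHIP" if not failed else f"HOLD: {failed}")
```

Returning the failed names, rather than a single pass or fail flag, gives the person inspecting the result a starting point.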
When a test fails, one person should inspect it and decide what happened. Do not rerun the whole suite in a panic. Many failures come from stale test data, an old assertion, or a rule that changed yesterday. A quick human read often saves 20 minutes and stops the team from chasing noise.
Keep a short note after each release. If the team finds a hole, such as a missing permission case or a pricing rule nobody checked, add one new test the next day. Do not try to patch every gap before shipping. Fix the riskiest one while the change is still fresh.
This works especially well for small teams that ship often. A tiny team can move fast and still protect uptime if it checks a few release-critical flows every day. That is much better than owning a huge suite nobody trusts.
Do it consistently and the suite gets better in a practical way. Each release leaves behind one small improvement, and over a month those improvements add up to a test pack that matches how the product actually changes.
Next steps for teams that need more structure
Start with one shared page. If product, engineering, and QA do not agree on what must never break, the test suite will grow in random directions.
Keep that page simple. Split it into three parts: money flows, auth, and daily workflows. Then list the few actions that matter most, such as starting checkout, finishing payment, signing in, resetting a password, inviting a user, or completing the task people do every day.
This approach works better when you set hard limits early. Small teams do not need a giant suite. They need enough coverage to catch expensive problems before users do.
For many teams, a good starting point is a small set of end-to-end checks for checkout, login, and one or two common user journeys, backed by more service tests around pricing rules, permissions, and state changes, plus unit tests for calculations, validation, and the edge cases that change often.
Match those limits to your release pace. If you ship several times a week, keep the top of the pyramid very small. If one browser test breaks every other day for no user-facing reason, remove it or move that coverage down a layer.
You should also decide when to review the pyramid. Pricing updates, permission changes, and onboarding changes are good triggers because those areas create hidden failures and old test choices stop making sense fast.
Keep the review light. Update the one-page map when requirements change. Ask which layer should cover the new risk. Remove tests that no longer protect a real business path. Add release checks only for flows people actually use.
That cuts a lot of wasted debate. Instead of arguing about every test, the team can ask one direct question: does this protect revenue, identity, or a common workflow?
If your team keeps circling the same release problems, getting an outside view can help. Oleg Sotnikov at oleg.is works as a Fractional CTO and startup advisor, helping teams set practical technical boundaries around architecture, infrastructure, and AI-first development without adding process for the sake of process.
That is usually enough structure to move faster while still protecting the parts that cost real money when they fail.
Frequently Asked Questions
Do I need a full regression suite every week?
No. Start with checkout, login, and the main task users do most. That gives you better protection than a huge suite that nobody updates in time.
What should I test first when requirements keep changing?
Protect money flow, auth, and any workflow that creates or changes real records. If one failure loses revenue, locks users out, or creates cleanup work, cover that before low-risk pages.
How many UI tests should I keep?
Keep only a few full-journey UI tests. One for checkout, one for auth, and one or two for the main workflow usually cover the highest risk. Put most rules in unit and API tests, where changes cost less.
Where should pricing and permission rules live?
Put them in unit tests first, then add a small number of API checks around totals and access changes. Browser tests should only confirm that the full flow still works for a real user.
When should each test layer run?
Run unit tests and a small API smoke set on every commit. Run the broader API suite and the small UI set on merge or before release. Fast checks should run often, while slow checks should guard the moments with more risk.
How do I choose the main workflow to protect?
Look at what users do most often and what makes them return. In a SaaS app, that might be creating a project, sending a report, or inviting a teammate. Choose the flow that hurts retention when it breaks.
Why do browser tests get flaky so fast?
They often rely on unstable data, changing selectors, or long flows with too many steps. Use fixed test data, shorten each scenario, and stop checking cosmetic details in full user journeys.
Should I test edge cases in the UI?
Usually no. Put edge cases like coupon expiry, session refresh, tax math, or role checks in lower layers. Use UI tests for the normal path and one or two failure paths that users hit often.
What should I check right before a release?
Confirm that checkout completes, login works for a normal user and one blocked case, and one common task finishes under the latest rules. If those pass, you have covered the paths that usually create the most damage.
When should I delete a test?
Delete it when it repeats coverage from a lower layer, protects a flow nobody uses, or fails because of environment noise more than real bugs. A smaller suite that people trust beats a bigger one that slows every release.