Exception taxonomy for business workflows before automation
Use an exception taxonomy for business workflows to sort edge cases, set clear automation limits, and stop weekly scope drift before build work starts.

Why automation scope keeps drifting
Most workflow projects start with the happy path. A team maps the normal order, approval, refund, or handoff, then folds in a few odd cases at the same time. That feels efficient for about a week. After that, the exceptions start pulling the design in different directions.
A lot of the trouble starts with naming. Sales calls it "urgent approval." Finance calls the same thing "manual override." Support says "special handling." People mean the same exception, but they describe it three different ways. The build team hears three problems instead of one, so the workflow grows extra branches before anyone notices.
One-off requests add even more noise. A manager remembers a customer who needed a late invoice change, or a supplier who skipped a document once, and asks for that case to be built from day one. Teams often treat those stories like normal demand even when they happened once in six months. Scope drifts because nobody asks two simple questions: how often does this happen, and what does it cost if a person handles it?
Human judgment makes things worse when teams leave it unnamed. Some cases do not need more rules. They need a person to review the context, call a vendor, approve a policy exception, or reject a risky request. If nobody marks those cases early, developers keep trying to turn judgment into logic. The result is usually brittle automation, more testing, and longer delivery.
The build team then finds even more branches during implementation. A missing field changes one path. A duplicate request creates another. A customer with special terms creates a third. Each branch looks small on its own, but together they change the workflow, the timeline, and the budget.
That is why an exception taxonomy helps before any build work starts. It gives odd cases simple names, shows which ones repeat, and makes it clear where automation should stop and a human should decide.
What an exception actually is
Not every small mess in a workflow is an exception. An exception is a case that cannot follow the usual path because the situation is genuinely different. The team still knows how to handle it, but it needs a different rule or a different owner.
You can usually spot an exception by looking at the next action. Maybe a manager approves instead of a team lead. Maybe finance answers in one day instead of three. Maybe legal takes over. If the next step, owner, or deadline changes, you probably have an exception.
Frequency matters too. A strange case from eight months ago does not need a label yet. But if people bring up the same odd case every week, give it a name. Once it has a label, you can count it, review it, and decide later whether it belongs in automation.
It also helps to separate real exceptions from lookalikes. Bad data, such as a missing price or wrong customer ID, is not an exception by itself. A bug is not an exception either. Neither is a duplicate request. Those problems only become exceptions if the business has an approved alternate path for handling them.
A simple test works well: when this case appears, does the team do something different on purpose? If the answer is yes, label it. If the team only fixes a mistake and returns to the usual path, it is not an exception.
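The test above can be written down as a tiny check. This is a minimal sketch, not a prescribed implementation: the `ObservedCase` class and all of its field names are illustrative assumptions, and the two sample cases are made up to mirror the examples in this section.

```python
from dataclasses import dataclass


@dataclass
class ObservedCase:
    """One odd case as the team described it (field names are illustrative)."""
    summary: str
    changes_next_step: bool = False
    changes_owner: bool = False
    changes_deadline: bool = False
    done_on_purpose: bool = False  # the business has an approved alternate path


def is_exception(case: ObservedCase) -> bool:
    """True when the team deliberately does something different for this case."""
    something_changes = (
        case.changes_next_step or case.changes_owner or case.changes_deadline
    )
    return something_changes and case.done_on_purpose


legal_takeover = ObservedCase(
    summary="Legal takes over the review instead of the team lead",
    changes_owner=True,
    done_on_purpose=True,
)
wrong_customer_id = ObservedCase(
    summary="Wrong customer ID is corrected, then the order continues as usual",
)

print(is_exception(legal_takeover))     # True
print(is_exception(wrong_customer_id))  # False
```

The second case stays off the exception list because the team only fixes a mistake and returns to the usual path.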
That sounds small, but it changes planning fast. Teams waste a lot of time when they mix real exceptions with broken inputs and system errors. Clear labels make later automation decisions much easier.
Put weekly odd cases into simple buckets
Most teams talk about "odd cases" as if they are one messy pile. That is where scope starts to drift. Sort them into a few plain buckets and the picture gets clearer. You can see which cases need rules, which need better input, and which should stay manual.
Keep the taxonomy boring. Five buckets are usually enough for a first pass:
- Missing input
- Policy or money risk
- Urgent but allowed
- Single-person approval
- External proof or loop-back
The split between missing input and policy risk matters most. Missing data is not the same as a policy exception. If a purchase request has no cost center, the system can ask for it and pause. If the same request is above the allowed spend, that is a rule problem. You should not solve both with the same workflow.
Urgent cases need their own label too. Teams often mix rush work with risky work, and that creates bad shortcuts. A same-day laptop replacement for a new hire may be urgent, but it is still normal if it follows policy. A payment to an unapproved vendor is different. Speed does not make it safe.
Name cases that depend on one person. If the CFO, legal lead, or warehouse manager must say yes, mark that clearly. Those cases do not need clever automation first. They need a clean handoff, reminders, and visible status.
Also track items that stall because someone outside the company must send a form, invoice, or signed document. Keep another tag for work that keeps bouncing back to the queue. Repeat returns often point to a bad form, a missing rule, or unclear ownership. That is process repair, not an automation problem.
How to build the taxonomy step by step
Start with evidence, not memory. Teams usually remember the strangest case from last month and miss the small repeats that eat time every week. A good taxonomy comes from real work your team already handled.
Use one shared sheet or doc. Keep it plain. If people need training to fill it in, the system is already too heavy.
- Pull one week of odd cases from email, chat, and tickets. One week is usually enough to show patterns without turning this into a research project.
- Rewrite each case as one plain sentence: "When X happens, the team does Y because Z." For example, "When a customer leaves out the contract number, support asks sales to confirm it before billing continues."
- Merge duplicates early. People use different words for the same problem, so clean that up before debate starts.
- Give each case a bucket, an owner, and a rough frequency. Reuse the first-pass buckets instead of inventing new labels: missing input, policy or money risk, urgent but allowed, single-person approval, or external proof or loop-back.
- Record the next action the system should take. Be specific. Ask for missing data, route to a person, pause the task, reject it, or create a follow-up ticket.
That last step matters more than most teams expect. If you only label cases and stop there, you get a neat list that does not help build anything. Automation needs a decision path, not just a category name.
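One row of the shared sheet could be sketched like this. The bucket names and next actions come from the lists above; the class names, the field names, and the "weekly" value in the sample row are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    MISSING_INPUT = "missing input"
    POLICY_OR_MONEY_RISK = "policy or money risk"
    URGENT_BUT_ALLOWED = "urgent but allowed"
    SINGLE_PERSON_APPROVAL = "single-person approval"
    EXTERNAL_PROOF_OR_LOOP_BACK = "external proof or loop-back"


class NextAction(Enum):
    ASK_FOR_MISSING_DATA = "ask for missing data"
    ROUTE_TO_PERSON = "route to a person"
    PAUSE_TASK = "pause the task"
    REJECT = "reject"
    CREATE_FOLLOW_UP_TICKET = "create a follow-up ticket"


@dataclass
class TaxonomyRow:
    sentence: str            # "When X happens, the team does Y because Z."
    bucket: Bucket
    owner: str               # who handles the case today
    frequency: str           # "daily", "weekly", "monthly", or "rare"
    next_action: NextAction  # what the system should do next


row = TaxonomyRow(
    sentence=(
        "When a customer leaves out the contract number, support asks "
        "sales to confirm it before billing continues."
    ),
    bucket=Bucket.MISSING_INPUT,
    owner="support",
    frequency="weekly",  # placeholder value, not a measured count
    next_action=NextAction.ASK_FOR_MISSING_DATA,
)

print(row.bucket.value, "->", row.next_action.value)
```

Making `next_action` a required field is the point of the sketch: a row cannot be saved as a category name alone.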
Keep the wording boring
Write every case in the same style, even if the original message was messy. Short, flat sentences make duplicates easier to spot. They also stop arguments later when two people think they found different exceptions but actually found the same one.
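If different teams use different words for one exception, a small alias map can fold them together before anyone counts cases. The sketch below assumes a hand-maintained `ALIASES` dictionary; the mapping itself and the sample sentences are hypothetical.

```python
from collections import defaultdict

# Hypothetical alias map: each team's wording points to one agreed name.
ALIASES = {
    "urgent approval": "urgent approval",
    "manual override": "urgent approval",
    "special handling": "urgent approval",
}


def canonical_name(label: str) -> str:
    """Lowercase, collapse whitespace, then map known aliases to the agreed name."""
    cleaned = " ".join(label.lower().split())
    return ALIASES.get(cleaned, cleaned)


def merge_duplicates(cases):
    """Group (label, sentence) pairs under their canonical name."""
    merged = defaultdict(list)
    for label, sentence in cases:
        merged[canonical_name(label)].append(sentence)
    return dict(merged)


cases = [
    ("Urgent approval", "Sales asks for same-day sign-off."),
    ("manual  override", "Finance signs off outside the normal queue."),
    ("Special handling", "Support flags the order for same-day sign-off."),
]
merged = merge_duplicates(cases)

print(len(merged))   # 1
print(list(merged))  # ['urgent approval']
```

Labels that are not in the map pass through unchanged, so unknown wording shows up as a new group instead of being silently merged.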
Add frequency before you debate automation
Some odd cases feel urgent because they are annoying, not because they happen often. Mark each one as daily, weekly, monthly, or rare. That tiny note keeps scope from drifting. A case that shows up twice a year may deserve a clear manual rule, not custom build work.
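A rough frequency label can come straight from counts. In this sketch the thresholds (five or more per week for "daily", one per week for "weekly", one per four weeks for "monthly") are assumptions to adjust, and the sample counts are invented.

```python
def frequency_label(count: int, weeks_observed: int) -> str:
    """Turn a raw count over an observation window into a rough label."""
    if weeks_observed <= 0:
        raise ValueError("weeks_observed must be positive")
    per_week = count / weeks_observed
    if per_week >= 5:
        return "daily"
    if per_week >= 1:
        return "weekly"
    if per_week >= 0.25:  # about once every four weeks
        return "monthly"
    return "rare"


# Invented counts: (times seen, weeks observed)
observed = {
    "missing contract number": (24, 4),
    "late invoice change": (1, 26),
}
labels = {name: frequency_label(n, weeks) for name, (n, weeks) in observed.items()}

print(labels)
# {'missing contract number': 'daily', 'late invoice change': 'rare'}
```

A single week of data cannot tell a weekly case from a rare one that happened to land in that week, so treat first-pass labels as provisional.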
By the end, you should have a short list of repeated exceptions, who handles them now, and what the system should do next in each case. That is enough to estimate the work honestly.
Decide what to automate now and what to leave manual
Start with the cases your team sees every week. If the same odd case shows up on Monday and again on Wednesday, move it near the top of the list. Repetition is the clearest sign that automation will pay off soon.
The next test is agreement. When two people look at the same case, do they reach the same answer without a long chat? If they do, you probably have a rule you can automate. If one person approves it, another rejects it, and a manager breaks the tie, keep that step manual for now.
A practical first release usually fits these rules:
- The case happens often enough to matter.
- The team follows one rule most of the time.
- The system already has the data needed to decide.
- A wrong decision is easy to catch and fix.
- The case does not depend on shifting human judgment.
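The five rules above can double as a screening function that says why a case does not fit yet. This is only a sketch: the `Candidate` fields are illustrative names, and the yes/no answers for the two sample cases are assumed, not measured.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    happens_often: bool
    follows_one_rule: bool
    data_in_system: bool
    mistakes_easy_to_fix: bool
    needs_shifting_judgment: bool


def blockers(c: Candidate) -> list[str]:
    """List the first-release rules this case fails; an empty list means it fits."""
    checks = [
        (c.happens_often, "does not happen often enough to matter"),
        (c.follows_one_rule, "team does not follow one rule most of the time"),
        (c.data_in_system, "system lacks the data needed to decide"),
        (c.mistakes_easy_to_fix, "a wrong decision is hard to catch and fix"),
        (not c.needs_shifting_judgment, "depends on shifting human judgment"),
    ]
    return [reason for passed, reason in checks if not passed]


missing_fields = Candidate(
    name="Missing invoice fields",
    happens_often=True,
    follows_one_rule=True,
    data_in_system=True,
    mistakes_easy_to_fix=True,
    needs_shifting_judgment=False,
)
discount = Candidate(
    name="Discount for a long-term customer",
    happens_often=True,
    follows_one_rule=False,
    data_in_system=False,
    mistakes_easy_to_fix=True,
    needs_shifting_judgment=True,
)

print(blockers(missing_fields))  # []
print(len(blockers(discount)))   # 3
```

Returning reasons instead of a bare yes or no gives the team something concrete to fix before the case is reconsidered.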
Judgment-heavy work should stay with people. Discounts for a long-term customer, approvals for a rushed order, or exceptions for a supplier problem often depend on context that lives in someone's head. Trying to force those into version one usually creates more rework than speed.
Rare outliers should wait too. Teams often spend hours discussing a strange case that happened once last quarter, then treat it like a major requirement. That is how scope drifts. Put rare cases in the taxonomy, label them out of scope, and move on.
Your fallback path needs to be clear before build work starts. When the workflow hits something outside scope, the system should hand it to a named person or queue, log why it stopped, and keep the case moving. Do not leave people guessing whether they should retry, override, or open a chat thread.
A purchase approval flow makes this easy to see. If the request is under budget, uses an approved vendor, and includes the right fields, automate it. If the request is split across cost centers or breaks a policy that people interpret differently, send it to review.
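As a sketch, that routing might look like the function below. Everything named here is an assumption for illustration: the request fields, the `APPROVED_VENDORS` set, the `finance-review` queue, and the choice to send incomplete requests back to the requester before any policy check runs.

```python
from dataclasses import dataclass, field

APPROVED_VENDORS = {"Acme Supplies"}  # hypothetical approved-vendor list
REQUIRED_FIELDS = ("requester", "vendor", "amount", "cost_centers")
REVIEW_QUEUE = "finance-review"       # hypothetical named queue


@dataclass
class Decision:
    route: str  # "auto_approve", "ask_requester", or a review queue name
    reasons: list[str] = field(default_factory=list)  # logged so nobody has to guess


def route_purchase(request: dict, budget_remaining: float) -> Decision:
    """Automate the clean case; send everything else somewhere predictable."""
    missing = [f for f in REQUIRED_FIELDS if request.get(f) in (None, "", [])]
    if missing:
        # Missing input: ask for it and pause instead of escalating.
        return Decision("ask_requester", ["missing fields: " + ", ".join(missing)])

    reasons = []
    if request["vendor"] not in APPROVED_VENDORS:
        reasons.append("vendor not on approved list")
    if request["amount"] > budget_remaining:
        reasons.append("over budget")
    if len(request["cost_centers"]) > 1:
        reasons.append("split across cost centers")

    if reasons:
        return Decision(REVIEW_QUEUE, reasons)
    return Decision("auto_approve")


clean = {"requester": "Sam", "vendor": "Acme Supplies",
         "amount": 300.0, "cost_centers": ["IT"]}
split = dict(clean, cost_centers=["IT", "Sales"])

print(route_purchase(clean, budget_remaining=1000.0).route)    # auto_approve
print(route_purchase(split, budget_remaining=1000.0).reasons)  # ['split across cost centers']
```

Every non-automated outcome carries a route and a reason, which is what makes the fallback path predictable.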
That is the whole point: draw a clean line around what the first release will handle, and make everything outside that line predictable instead of messy.
Example: purchase approvals with odd cases
A purchase approval flow looks clean on paper. An employee submits a request, the manager approves it, and finance pays or reimburses it. That is the normal path.
The mess starts when the team handles unusual cases by habit instead of rule. People keep adding one more branch, one more exception, one more special note. The flow gets bigger while clarity gets worse.
Take a simple example. Sam buys a monitor for a new hire, uploads the receipt, gets manager approval, and finance closes the request. That path should stay simple. The real work is naming the odd cases that appear every week.
Useful buckets for this workflow
A practical version for purchase requests might look like this:
- Missing proof of purchase
- Urgent request with no manager reply
- Purchase split across departments or budgets
- Suspected fraud or policy abuse
Notice what happens with missing receipts. Teams often describe five versions of the same problem, then build five mini-rules around them. That usually wastes time. If finance takes the same action every time, keep every version in one bucket and ask for acceptable proof.
Urgent orders deserve their own bucket because the next step is different. If a laptop fails and the replacement must go out now, waiting three days for a manager reply is not a normal delay. The workflow needs a fallback, such as a backup approver or a finance review after a short timeout.
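A timeout fallback of that kind could be sketched as follows. The four-hour `URGENT_TIMEOUT`, the request fields, and the `finance-review` backup are assumptions; the right values depend on your own policy.

```python
from datetime import datetime, timedelta
from typing import Optional

URGENT_TIMEOUT = timedelta(hours=4)  # assumption: tune to your own policy


def escalation_target(request: dict, now: datetime) -> Optional[str]:
    """Return the backup approver once an urgent request has waited too long.

    Returns None while the normal path still applies.
    """
    if not request["urgent"] or request["manager_replied"]:
        return None
    waited = now - request["submitted_at"]
    if waited >= URGENT_TIMEOUT:
        return request["backup_approver"]
    return None


laptop = {
    "summary": "Replacement laptop after a hardware failure",
    "urgent": True,
    "manager_replied": False,
    "submitted_at": datetime(2024, 3, 4, 9, 0),
    "backup_approver": "finance-review",
}

print(escalation_target(laptop, now=datetime(2024, 3, 4, 10, 0)))  # None
print(escalation_target(laptop, now=datetime(2024, 3, 4, 14, 0)))  # finance-review
```

Passing `now` in as an argument keeps the rule easy to test with fixed timestamps.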
Split purchases need their own rule as well. One vendor invoice may cover software for two departments, or one equipment order may need to hit two budgets. If you force that into the normal path, people work around the system with emails and side notes.
Finance teams often keep fraud cases manual at first, and that is usually the right call. Suspicious claims are rare, expensive, and hard to judge with simple rules. Leave them outside the first automation pass.
If a team can name these buckets before build work starts, the workflow stays smaller and the estimate stays honest.
Mistakes that waste time
When the taxonomy is vague, the team loses time before anyone builds a rule. The trouble often starts with categories that feel convenient but explain nothing. "Other" is the usual one. Once people drop cases there, they stop learning from them.
A simple test helps: can two people read the same case and put it in the same bucket? If they cannot, the bucket is too loose. Labels like "other," "special case," or "needs review" hide patterns that later show up as surprise work.
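One way to run that test is to have two people label the same cases independently and compare the results. The function below is a minimal sketch; the case IDs and labels are invented.

```python
def bucket_agreement(labels_a: dict, labels_b: dict):
    """Compare two people's bucket choices for the same cases.

    Returns (agreement_rate, disagreements) over the cases both people labeled.
    """
    shared = sorted(labels_a.keys() & labels_b.keys())
    disagreements = [
        (case, labels_a[case], labels_b[case])
        for case in shared
        if labels_a[case] != labels_b[case]
    ]
    if not shared:
        return 0.0, disagreements
    return 1 - len(disagreements) / len(shared), disagreements


reviewer_a = {
    "case-1": "missing input",
    "case-2": "policy or money risk",
    "case-3": "needs review",
    "case-4": "urgent but allowed",
}
reviewer_b = {
    "case-1": "missing input",
    "case-2": "policy or money risk",
    "case-3": "single-person approval",
    "case-4": "urgent but allowed",
}

rate, disagreements = bucket_agreement(reviewer_a, reviewer_b)
print(rate)           # 0.75
print(disagreements)  # [('case-3', 'needs review', 'single-person approval')]
```

The disagreement list is more useful than the rate: it points at the specific labels that are too loose.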
Teams also waste time when they mix policy disputes with simple input mistakes. Those are different jobs. If a customer typed the wrong address, someone can correct it quickly. If a manager argues that a purchase should bypass a spending limit, the team needs a decision, not a data fix.
When both issues sit in one bucket, estimates fall apart. People think they are covering a few edge cases, but they are really dragging business policy into the build.
Another common mistake is letting one loud incident change the whole plan. A single missed order or angry customer can push the team to redesign the workflow around a rare event. That may feel sensible in the moment. It is usually a bad trade if that case appears once or twice a year.
Frequency matters more than how loud a story is in the room. If nobody tracks how often each exception appears, scope turns into opinion. A case that happens 30 times a week deserves more attention than a dramatic story from last quarter.
The last big time sink shows up after manual review. Someone marks a case "approved" or "rejected," but nobody defines what happens next. Who updates the record? Who tells the customer? Does the item return to the normal flow, or does a separate path start? If that handoff stays fuzzy, people fill the gap with side chats and spreadsheets.
Watch for a few warning signs: people keep adding labels without writing rules for them, staff sort the same case into different buckets, rare incidents get more meeting time than common ones, and reviewers make decisions but nobody owns the follow-up.
A clean taxonomy does not need fancy labels. It needs clear boundaries, counts, and a defined handoff after review. That is what keeps scope from drifting.
A short checklist before build work starts
Build work goes sideways when the team starts coding before it agrees on the messy cases. A short pre-build check keeps the scope small, the handoffs clear, and the first release honest.
Use one working list and make sure every item answers five questions. Does it have a short plain name? How often does it happen? How risky is it? Who owns it today? Is it in scope for version one or clearly manual? Then check one more thing for the list as a whole: can the team turn the top weekly exceptions into test cases before build starts?
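If the working list lives in a sheet export or a small script, the per-item check can be automated. This is a sketch under assumptions: the key names and the two allowed scope values are illustrative, and the sample items are made up.

```python
REQUIRED_ANSWERS = ("name", "frequency", "risk", "owner", "scope")
ALLOWED_SCOPES = {"version one", "manual"}


def open_questions(item: dict) -> list[str]:
    """Return what is still unanswered for one item on the working list."""
    problems = [key for key in REQUIRED_ANSWERS if not item.get(key)]
    if item.get("scope") and item["scope"] not in ALLOWED_SCOPES:
        problems.append("scope must be 'version one' or 'manual'")
    return problems


ready = {
    "name": "Missing proof of purchase",
    "frequency": "weekly",
    "risk": "low",
    "owner": "finance",
    "scope": "version one",
}
fuzzy = {
    "name": "Suspected fraud or policy abuse",
    "frequency": "rare",
    "risk": "high",
    "owner": "",
    "scope": "maybe later",
}

print(open_questions(ready))  # []
print(open_questions(fuzzy))  # ['owner', "scope must be 'version one' or 'manual'"]
```

An item with open questions is a prompt for a short conversation, not a reason to drop the case.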
This does not need a big workshop. One operations lead, one person who does the work every day, and one builder can usually finish it in under an hour.
If that group cannot agree on names, frequency, risk, owners, and manual boundaries, the exception map is still too fuzzy for automation. Waiting a week is cheaper than shipping a flow that breaks on the same problems your team already knows by name.
What to do next
A taxonomy only matters if it changes the build plan. Turn the cases you collected into version one scope, not a giant backlog. If a bucket shows up often, costs real time, and follows a clear rule, include it in the first build.
Keep the first release narrow. Three to five buckets is enough for most teams. Once you go past that, the work usually fills up with rare cases, extra approval rules, and arguments that slow everything down.
A good first bucket is easy to spot, shows up every week, takes real time to handle manually, and follows a clear action path. If the risk stays low even when the rule fires incorrectly, it is a strong candidate for automation.
Leave the messy leftovers manual for now. That is not a weakness. It gives the team a clear edge for the first release and keeps the project from swelling around rare cases.
A simple example makes the point. If missing invoice fields show up every day and staff always send the record back for correction, automate that route first. If one supplier once asked for a special six-person approval chain, log it, but do not build around it yet.
Set a review point after two weeks. Look at every new odd case the team logged. Some deserve a new bucket because they repeat. Others are one-off problems caused by bad data, a broken handoff, or a rule nobody actually follows.
Retire buckets that never appear again. Old buckets make the workflow look more complex than it is, and they tempt teams to add controls nobody needs.
If you want a second opinion before build work starts, oleg.is is Oleg Sotnikov's site. He works as a fractional CTO and startup advisor, helping small and medium teams turn messy process rules into practical software and AI-first operations without bloating the first release.
Frequently Asked Questions
What counts as an exception in a workflow?
An exception is a case that cannot follow the usual path because the next step, owner, or deadline changes on purpose. If the team handles it differently by rule, it counts as an exception.
How do I tell an exception from bad data or a bug?
Look at what the team does next. If people just fix missing data, remove a duplicate, or correct a system issue and return to the normal path, that is not an exception. If they switch owners, rules, or deadlines, it is.
Why does automation scope keep growing?
Scope grows when teams mix normal work, rare stories, data issues, and judgment calls into one design. Different teams also name the same odd case in different ways, so builders create extra branches for what is really one problem.
How many exception buckets should I start with?
Start with three to five buckets. That gives you enough structure to spot patterns without turning the taxonomy into another project.
What buckets make sense for a first pass?
Use plain buckets that match what people actually do. Missing input, policy or money risk, urgent but allowed, single-person approval, and external proof or loop-back work well for many teams.
Should we automate rare odd cases?
Usually not. If a case shows up a few times a year, give staff a manual rule and move on. Save build time for repeated cases that cost real time every week.
When should a case stay manual?
Leave it manual when people need context, judgment, or a policy call to decide. If two reviewers often reach different answers, the workflow should route the case to a named person instead of forcing a rule.
What should happen when the workflow hits something out of scope?
Send it to a named person or queue, record why the flow stopped, and make the next owner obvious. That keeps work moving and stops people from guessing whether to retry, override, or start a side chat.
How do we build an exception taxonomy quickly?
Pull one week of odd cases from tickets, chat, and email, then rewrite each one as a plain sentence. Merge duplicates, add a bucket, owner, frequency, and the next action the system should take.
What mistakes should we avoid before build starts?
Teams lose time when they use vague labels like "other," let one loud incident drive the plan, or stop at naming cases without defining the next step. Trouble also starts when nobody tracks frequency or owns the handoff after review.