Mar 19, 2026·7 min read

Business rules before AI: what to put in code first

Business rules before AI keep pricing, approvals, and legal limits out of prompts, so your automations act the same way every time.

Why rules fail when they live inside prompts

A prompt can sound clear and still produce mixed answers. Ask an AI assistant, "Approve a discount if the deal looks reasonable and stays within policy," and you'll likely see inconsistent results. One reply approves 12%. Another asks for manager review. A third rejects the request because the margin looks too low.

That happens because the model is predicting the next likely answer. It is not enforcing policy. When a rule sits inside a sentence, the model treats it like guidance. Small changes in wording, chat history, or attached notes can push the response in a different direction.

People read vague rules differently too. One manager thinks "reasonable" means up to 10%. Another allows 15% for a long-term customer. Legal cares about limits that sales never mentions. Everyone assumes the AI has one rulebook, but the real rulebook is scattered across memory, chats, and old documents.

Prompts also age badly. Someone updates policy in a meeting, someone else adds an exception in email, and nobody updates the prompt to match. Before long, the AI follows a blurry mix of old rules and recent habits. Code has its own flaws, but at least it gives the team one place to check.

Once people cannot predict the outcome, trust drops fast. Sales reps keep rewriting prompts until they get the answer they want. Support agents send routine cases to a person because the AI feels random. Finance and legal cannot explain why one case passed and another failed. Customers usually notice that inconsistency first.

The fix is simple: put hard limits in software first. Price floors, approval thresholds, blocked actions, refund windows, and contract limits should live in code or in a rule table the system enforces. Then let AI work inside those boundaries, where judgment and language actually matter.

What should go in code first

Anything with a fixed answer belongs in code. If the same input should always produce the same result, the model should not decide it.

That usually includes list prices, discount caps, tax handling, refund windows, contract limits, region checks, and any rule tied to money, geography, risk, or compliance. This is what business rules before AI means in practice.

Pricing is often the first place teams get burned. A model can describe an offer, but it should not decide whether a customer gets 5% off or 25% off. Store pricing rules in code: price tables, discount bands, bundle logic, and exception paths. Then AI can explain the quote in plain language without changing the math.

Approval thresholds need the same treatment. Decide who can approve what, when a manager must step in, and when finance or legal has to review a request. Keep that flow outside the model. AI can write the approval note, summarize the exception, or turn a messy sales thread into a clean request. It should not decide who has authority.

Legal and policy limits also belong in software. If your company cannot sell in certain places, accept certain contract terms, retain some data, or go past a refund deadline, write that logic directly into the system. Do not bury it in a long prompt and hope the model remembers every line.

AI does better on work where wording, tone, or rough judgment matters more than exact limits. Drafting emails, summarizing meetings, suggesting next steps, and rewriting internal notes are good examples. A person can review that output quickly, while the system still enforces the rules.

When companies start adding AI to daily operations, this split is usually where trust begins. The model can sound smart. The software still has to stay consistent on price, approvals, and policy every time.

Rules that need hard limits

Some rules should never depend on how a prompt was written that day. If a wrong answer can cost money, break a contract, or create a compliance problem, lock that rule down.

Pricing sits near the top of the list. A sales assistant can draft a quote, but it should never invent a discount or bend a pricing formula. The system should calculate list price, apply allowed discount ranges, and reject anything outside policy. If a rep can offer 5% without approval and 10% with manager approval, the software should know that exactly.

Approval limits need the same precision. Teams often write prompts like "ask for approval when the amount is high," which sounds fine until nobody agrees on what "high" means. Hard limits fix that. A team lead may approve up to $2,000, a department head up to $10,000, and anything above that goes to finance. The AI can route the request or explain the result, but the rule itself should be fixed.
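As a minimal sketch of that approval ladder, the function below maps an amount to the lowest role allowed to approve it. The dollar limits come from the example above; the role names, `APPROVAL_LADDER`, and `required_approver` are hypothetical and only illustrate the idea.

```python
from decimal import Decimal

# Hypothetical approval ladder using the example amounts from the text.
# Each entry is (upper limit in USD, role allowed to approve up to that limit).
APPROVAL_LADDER = [
    (Decimal("2000"), "team_lead"),
    (Decimal("10000"), "department_head"),
]
FALLBACK_APPROVER = "finance"  # anything above the last limit


def required_approver(amount: Decimal) -> str:
    """Return the lowest role allowed to approve this amount."""
    if amount < 0:
        raise ValueError("amount must not be negative")
    for limit, role in APPROVAL_LADDER:
        if amount <= limit:
            return role
    return FALLBACK_APPROVER
```

Because the limits are data, "high" stops being a matter of opinion: an amount one cent over a limit always lands with the next approver.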

Legal review needs strict triggers too. Auto-renewal clauses, liability caps, payment windows, data handling terms, and non-standard indemnity language should push a document into a legal queue. The AI can flag the clause, summarize the risk, and draft a response. It should not decide that a risky term is "probably fine."
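One way to keep that trigger strict is to let the model (or a person) tag the clauses, then let code decide the routing. The sketch below assumes clause tags arrive as plain strings; the tag names and `needs_legal_review` are made up for illustration.

```python
# Hypothetical trigger tags, one per clause type named in the text.
LEGAL_TRIGGERS = frozenset({
    "auto_renewal",
    "liability_cap",
    "payment_window",
    "data_handling",
    "non_standard_indemnity",
})


def needs_legal_review(clause_tags: set[str]) -> bool:
    """True if any tagged clause is on the trigger list."""
    # isdisjoint is True when the two sets share no elements.
    return not LEGAL_TRIGGERS.isdisjoint(clause_tags)
```

The model can still be wrong about a tag, but it can no longer wave a flagged clause through: any match sends the document to the legal queue.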

The same logic applies to refund caps, tax rules, shipping restrictions, export controls, privacy checks, and licensing limits. These are classic examples of legal limits in software. They need exact checks, not flexible interpretation.

A simple case makes the point. If a customer in Germany asks for a refund on a discounted annual plan, the system should check the refund window, local tax treatment, and approval path before AI writes a reply. The model handles language. The software handles limits.

How to tell a rule from a judgment call

A quick test works well. If the answer should stay the same every time, you have a rule. If two informed people can reasonably choose different answers, you are closer to a judgment call.

That difference matters. Rules belong in code. Judgment calls can sit with people, with AI assistance, or with both.

When you review a workflow, ask a few plain questions:

Would the same input need the same answer every time?
Would a mistake cost money, break a contract, or cross a legal limit?
Do you need an audit trail that shows why the system allowed or blocked an action?
Does the task depend more on tone, wording, or context than on a fixed pass-or-fail test?

Take a discount request. If deals above 20% always need manager approval, that is a rule. Put the percentage limit, approval path, and exception logic in code.

Now look at the email that explains the decision to the customer. That is a judgment task. The message may need empathy, a softer tone, or a shorter version for a busy buyer. AI is useful there because wording matters more than strict consistency.

Money and legal risk should push you toward code quickly. Tax treatment, refund windows, contract caps, age checks, export limits, and compliance steps should not drift from one answer to the next. If finance or legal may need to review a past decision, you also want a record that shows which rule fired and when.

A good split is straightforward. Code decides what is allowed. AI helps with explanation, drafting, summarizing, and edge cases where a person still makes the final call.

How to move rules out of prompts

When a workflow mixes fixed rules with AI instructions, small wording changes can change the result. That is why stable rules should live in code or in a rule table, where they stay clear and predictable.

Start by mapping the decisions the workflow actually makes. Most teams skip this and jump straight to prompt writing. A better approach is to list each choice the system makes: approve or send to manager, show price or block quote, send draft or hold for review.

Then rewrite each one in plain if-then language. Keep it boring and exact:

If discount is above 12%, require manager approval.
If the customer is in a blocked country, do not generate the contract.
If the quote expires in less than 7 days, require a refresh.

From there, the process is fairly simple. Write down every decision point. Turn each decision into a short rule. Store stable rules in code, and use a rule table for values that change more often, such as discount caps, approval thresholds, or country restrictions. Run those checks before any AI step.
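The three if-then rules above could be sketched like this. The values sit in a small rule table so they can change without touching the logic. `RULE_TABLE`, `QuoteRequest`, `check_rules`, and the placeholder country code `"XX"` are assumptions for the example, not part of any real policy.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical rule table. In a real system these values would probably be
# loaded from a database or config file rather than defined inline.
RULE_TABLE = {
    "max_auto_discount_pct": 12,
    "blocked_countries": {"XX"},  # placeholder code, not a real restriction
    "min_quote_days_left": 7,
}


@dataclass(frozen=True)
class QuoteRequest:
    discount_pct: float
    country: str
    quote_expires: date


def check_rules(req: QuoteRequest, today: date, rules: dict = RULE_TABLE) -> list[str]:
    """Return the name of every rule that fired, in a fixed order."""
    fired = []
    if req.discount_pct > rules["max_auto_discount_pct"]:
        fired.append("require_manager_approval")
    if req.country in rules["blocked_countries"]:
        fired.append("block_contract")
    if (req.quote_expires - today).days < rules["min_quote_days_left"]:
        fired.append("require_quote_refresh")
    return fired
```

An empty result means no rule fired, and only then does the workflow move on to an AI step.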

After the rule check passes, call the model for the work that actually benefits from it: drafting an email, summarizing a case, classifying a request, or suggesting next steps. This is where AI prompt guardrails help most. The model gets a narrower job, so it has less room to improvise where it should not.
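A rough sketch of that ordering: code makes the decision, then the model receives it as a fact and is asked only to write. `run_workflow` and `fake_model` are hypothetical names, and `fake_model` is a stand-in so the example runs without calling any real model API.

```python
from typing import Callable


def run_workflow(discount_pct: float, cap_pct: float,
                 draft: Callable[[str], str]) -> dict:
    """Decide in code first, then hand the model a narrow writing task."""
    if discount_pct > cap_pct:
        decision = "needs_manager_approval"
    else:
        decision = "auto_approved"
    # The model is told the decision. It explains it; it does not make it.
    prompt = (
        f"Write a short note to the sales rep. The discount request of "
        f"{discount_pct}% was marked '{decision}'. Do not change the decision."
    )
    return {"decision": decision, "draft": draft(prompt)}


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return f"[draft based on: {prompt}]"
```

Whatever text the model returns, the `decision` field comes from the comparison in code, so a creative draft cannot turn a review into an approval.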

Logging closes the loop. Save the inputs, the matched rule, the rule version, and the action the system took. If a rep asks why a quote was blocked, your team should be able to answer in seconds. Without logs, every strange result turns into guesswork.
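A decision record with those four fields might look like the sketch below. The field names, the `RULE_VERSION` label, and `decision_record` are assumptions; a real system would write the line to whatever log store the team already uses.

```python
import json
from datetime import datetime, timezone

RULE_VERSION = "2026-03-01"  # hypothetical version label for the rule set


def decision_record(inputs: dict, matched_rule: str, action: str,
                    rule_version: str = RULE_VERSION, now=None) -> str:
    """Build one JSON log line describing what the system decided and why."""
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),
        "inputs": inputs,
        "matched_rule": matched_rule,
        "rule_version": rule_version,
        "action": action,
    }
    # sort_keys keeps the output stable, which makes log lines easy to diff.
    return json.dumps(record, sort_keys=True)
```

With one line like this per decision, answering "why was this quote blocked?" becomes a lookup rather than a reconstruction.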

This approach feels less magical. It also causes fewer surprises, which is what most teams want once real money and legal exposure are involved.

Example: discount approval for a sales team

A sales rep wants to close a deal before month end. The customer asks for 22% off on an annual plan, and the rep thinks the account could grow later. This is exactly where teams get into trouble if they leave the decision inside a prompt.

If the AI works from loose instructions like "be flexible with strategic customers," one rep may get a fast yes while another gets blocked on a similar deal. The safer move is to put pricing rules in code first and let the AI handle the writing, not the policy.

A simple setup can check a handful of facts before anyone sees an approval screen: deal size, projected margin after discount, customer type, contract term, payment terms, and whether the contract includes custom legal language.

The rules might say that standard deals can get up to 10% automatically, existing customers on annual prepay can get up to 15%, and anything that drops margin below a fixed floor needs manager review. If the customer is in a regulated segment or the contract includes non-standard terms, the system can also require legal or finance review.

Now the routing becomes predictable. A small renewal at 12% with healthy margin can pass automatically if policy allows it. A large new deal at 22% goes to a sales manager. A thin-margin enterprise deal with custom terms goes to both the manager and finance. Nobody has to guess.
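Here is one way the routing in this example could be written down. The 10% and 15% limits come from the rules above, and the 18% margin floor is borrowed from the sample approval note in this example. The `Deal` fields, the reviewer names, and the choice to send regulated or custom-term deals to finance (the text allows legal or finance) are assumptions.

```python
from dataclasses import dataclass

MARGIN_FLOOR_PCT = 18.0  # illustrative floor, not a recommended value


@dataclass(frozen=True)
class Deal:
    discount_pct: float
    margin_after_discount_pct: float
    existing_customer: bool
    annual_prepay: bool
    regulated_segment: bool = False
    custom_terms: bool = False


def auto_limit_pct(deal: Deal) -> float:
    """Existing customers on annual prepay get the higher automatic limit."""
    return 15.0 if deal.existing_customer and deal.annual_prepay else 10.0


def route(deal: Deal) -> list[str]:
    """Return the reviewers a deal needs. An empty list means auto-approve."""
    reviewers = []
    over_limit = deal.discount_pct > auto_limit_pct(deal)
    thin_margin = deal.margin_after_discount_pct < MARGIN_FLOOR_PCT
    if over_limit or thin_margin:
        reviewers.append("sales_manager")
    if deal.regulated_segment or deal.custom_terms:
        reviewers.append("finance")  # assumption: could equally be legal
    return reviewers
```

Running the three deals from the paragraph above through `route` gives an empty list for the small renewal, `["sales_manager"]` for the 22% new deal, and both reviewers for the thin-margin deal with custom terms.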

After the code decides, approval workflow automation gets much easier. The AI can draft the approval note, summarize the reason, and fill in the numbers from the deal record. It can write: "Requested discount exceeds the auto-approval limit, and projected margin falls below the 18% floor. Manager review required." That saves time without changing the outcome.

This is what the split should look like in practice. The team gets the same answer for the same deal, approvals move faster, and customers hear a consistent message from every rep.

Mistakes that break trust

Trust usually collapses for one reason: the team cannot explain why the system said yes, no, or maybe.

That tends to happen when the real rule is not in the software. It sits inside a long prompt, mixed with examples, exceptions, and polite instructions. Prices, discount caps, refund limits, and similar numbers should not live there. A prompt is too easy to rewrite, shorten, or misread.

Approval paths fail for the same reason. An AI model can summarize a request or classify it, but it should not decide who needs approval based on loose wording. A larger discount, a finance review, and a manager sign-off are business rules. They are not style choices.

Legal text creates even more trouble when teams mix it with free-form generation. If the system must block certain claims, require exact wording, or respect country-specific limits, hard boundaries need to sit outside the prompt.

A few mistakes show up over and over:

hiding pricing and policy limits inside giant prompts
letting the model choose the approver instead of using a fixed rule
asking AI to write legal language without locked templates or checks
skipping logs because the workflow looks small
editing prompts in place with no version history tied to rule changes

The logging problem looks minor until something goes wrong. Then a customer asks why they got one price while another customer got a different one. If you do not store the inputs, decision path, model output, and rule version, you cannot answer with confidence.

Prompt edits create the same mess. A quiet wording change on Tuesday can cause bad approvals on Wednesday. By Friday, nobody remembers what changed.

Small teams feel this first because they move fast and rely on memory. That works for a while. After that, stable automation needs rule files, audit logs, and clear ownership.

Checks to run before launch

Before you switch on automation, test the plain cases first. If two people give the same input, they should get the same answer. If they do not, some rule still lives in a prompt instead of code.

Start with repeatability. AI can write a message or explain a result, but it should not decide a discount cap, legal threshold, or approval path differently from one run to the next.

Use a short checklist:

Replay the same request twice. If the result changes, the rules are still too loose.
Follow every stop and approval step. A manager should be able to explain why the request paused and what clears it.
Change one rule without touching prompt text. If you cannot do that, the rule is in the wrong place.
Check the history. You need to see who changed a rule, what changed, and when.
Push the AI past the limit on purpose. Ask for an exception outside the allowed range and make sure the system refuses every time.

One practical test works especially well. Take a batch of real requests from the last month and replay them in a safe environment. Mix in normal cases, edge cases, and a few bad inputs. If the system stays consistent and gives clear reasons for each decision, you are in much better shape.
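The replay idea can be sketched as a small harness: run each stored request more than once and collect any whose outcome changes. `find_unstable` and `decide_discount` are hypothetical, and the 12% cap is only an example value.

```python
from typing import Callable, Iterable


def find_unstable(requests: Iterable[dict], decide: Callable[[dict], str],
                  runs: int = 2) -> list[dict]:
    """Replay each request several times; return those with differing outcomes."""
    unstable = []
    for request in requests:
        outcomes = {decide(request) for _ in range(runs)}
        if len(outcomes) > 1:
            unstable.append(request)
    return unstable


def decide_discount(request: dict) -> str:
    """Example rule-based decision: a fixed cap, so the outcome cannot drift."""
    if request["discount_pct"] > 12:
        return "needs_manager_approval"
    return "auto_approved"
```

A rule-based `decide` should always produce an empty list here. If a decision step that calls a model shows up in the result, that step still carries a rule that belongs in code.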

That is how stable automation usually works. The AI handles language, summaries, and messy input. The software handles boundaries.

What to do next

Start with one process that affects money, risk, or customer response time every day. Discount approvals, refund exceptions, contract changes, and spend requests are good places to start because small mistakes there are expensive.

Keep the first pass narrow. If your team can state the rule in one clear sentence, and a developer can enforce it with a few checks, move that rule into code now.

A practical plan for this week is enough:

pick one workflow with real business impact
move fixed limits into code, such as price floors, approval levels, legal blocks, or required fields
test edge cases before wider rollout
add AI only after the boundaries are stable

The testing step matters more than most teams expect. Try awkward cases on purpose: missing customer data, expired pricing, a request just above the approval threshold, or a country your company cannot sell into. If the system fails there, people stop trusting it fast.

Once the rules hold up, let AI do the messy text work. It can draft responses, summarize requests, classify tickets, and route unusual cases to a person. That is usually where the time savings appear.

If the line between a rule and a judgment call still feels fuzzy, an outside review can save a lot of rework. Oleg Sotnikov focuses on this kind of AI-first engineering and Fractional CTO work, and oleg.is is a useful reference if you're sorting out where software should enforce policy and where AI should assist.

A good first result is modest: one live workflow, a tested rule set, and AI handling drafts and triage instead of policy decisions. That is enough to prove the idea without betting the whole operation on prompts.

Frequently Asked Questions

Why can't I keep business rules inside prompts?

Because a prompt guides the model; it does not enforce a rule. Small wording changes, chat history, or stray notes can change the answer, so the same case may get different outcomes.

What rules belong in code first?

Put any rule with one fixed answer into code first. That usually means pricing, discount caps, approval thresholds, refund windows, tax checks, region blocks, contract limits, and anything tied to money, legal risk, or compliance.

Where should discount and pricing logic live?

Keep pricing logic in software, not in the model. Let the system calculate list price, apply allowed discounts, check margin floors, and block anything outside policy, then let AI explain the quote in plain language.

Should AI decide who needs approval?

No. Your system should route requests by clear limits, such as amount, margin, contract terms, or customer type. AI can write the approval note or summarize the case, but it should not choose who has authority.

How do I tell a fixed rule from a judgment call?

Use a simple test: if the same input should always get the same answer, you have a rule. If two smart people could make different reasonable calls, you have judgment, and AI can help more there with drafting, summaries, or suggested next steps.

What should AI handle after the rules run?

Let AI handle language-heavy work after your checks pass. It can draft emails, summarize threads, classify requests, explain decisions, and prepare notes for a person without changing the actual limits.

Do I need code for every rule, or can I use a rule table?

You do not need to hardcode every value. Put logic that rarely changes in code, and keep changing values like discount caps, country blocks, or approval amounts in a rule table your app reads before it calls the model.

Why do audit logs matter in an AI workflow?

Logs let your team answer basic questions fast: what input came in, which rule matched, which version ran, and why the system allowed or blocked the action. Without that record, every odd result turns into guesswork.

How should I test this before launch?

Replay real requests in a safe environment and make sure the system gives the same result every time for the same input. Then push edge cases on purpose, like expired pricing, blocked countries, or discounts just over the limit, and confirm the system refuses them every time.

What's the best workflow to start with?

Start with a workflow that touches money, risk, or customer response time every day. Discount approvals, refund exceptions, contract changes, and spend requests usually work well because one small rule change there can save real time and prevent messy mistakes.