Feb 16, 2025·8 min read

AI feature pricing before discounts eat your margin

AI feature pricing should include model spend, review time, retries, and support work so a popular feature still earns money at higher volume.

An AI feature can look like a win in the first month and still lose money later. The trap is simple. Sales sees strong demand, cuts the price to close deals, and usage grows faster than margin.

With normal SaaS, extra use often costs very little. AI works differently. Every prompt, retry, moderation pass, fallback step, and manual check adds a real expense. If customers use the feature every day, your costs rise with their success.

Those costs usually rise together. A customer who sends more requests also creates more strange cases. That means more failed runs to inspect, more support tickets, and more staff time spent explaining bad outputs or rerunning tasks. Even a small review step, like five minutes from support, can eat profit when it happens hundreds of times a month.

Discounting makes this worse fast. A sales rep cuts 20 percent to win a customer, then that customer becomes your heaviest user. The team celebrates adoption, but finance sees the opposite. The feature looks popular, yet each extra use makes the account less healthy.

This is where cost-based SaaS pricing matters. You need a price that covers model calls, retries, review labor, and failure handling before anyone starts offering discounts. If you wait until customers love the feature, changing the price gets much harder. People compare the new number to what they already pay, not to what it costs you to deliver.

That is how a feature with real demand turns into a quiet loss. Most teams notice it only after customers depend on it.

What to count before you set a price

A price breaks when it covers the happy path only. AI pricing needs a full cost per use, including the messy cases that show up every week.

Start with the model bill, but do not stop there. Count input tokens, output tokens, and the average length of a real request. A feature that looks cheap in a demo can get expensive fast when users paste long documents or ask for retries.

Most teams also miss the extra runs. If your product retries on timeout, sends a second prompt when the first answer is weak, or calls a fallback model when the first one fails, those runs belong in the price. They are not edge cases if they happen every day.

A simple way to think about it is to count five cost buckets for every 100 uses: model input and output spend, retries and fallback runs, human review time, refunds or credits plus support handling, and payment fees with basic overhead.

Human review changes the math more than people expect. If one in ten outputs needs a person to check it for two minutes, that labor adds up fast. Use actual wage cost, not a hopeful guess. If senior staff step in for hard cases, include that time too.

Failure handling has a price even when the customer never sees an invoice line for it. Bad outputs create tickets, credits, refunds, and follow-up work. If support spends five minutes calming down one unhappy customer, that cost belongs to the feature.

Basic overhead matters too. Payment processing takes a cut. Hosting, logs, monitoring, and storage cost money. None of these items looks dramatic on its own, but together they decide whether a feature makes money or quietly burns it.
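The five buckets above can be totalled in a few lines. This is a minimal sketch: every dollar figure is a made-up placeholder per 100 uses, and the bucket names are only illustrative labels, so swap in your own billing and payroll numbers.

```python
# Hypothetical figures: dollars spent per 100 uses, one entry per cost bucket.
cost_per_100_uses = {
    "model_input_output": 4.00,
    "retries_and_fallbacks": 0.60,
    "human_review": 8.00,
    "refunds_credits_support": 2.50,
    "payment_fees_overhead": 1.40,
}

total_per_100 = sum(cost_per_100_uses.values())
cost_per_use = total_per_100 / 100

# Show each bucket's share so the biggest cost driver is obvious.
for bucket, dollars in cost_per_100_uses.items():
    share = dollars / total_per_100
    print(f"{bucket:<26} ${dollars:>6.2f}  {share:.0%}")

print(f"total per 100 uses: ${total_per_100:.2f}")
print(f"cost per use:       ${cost_per_use:.3f}")
```

With these placeholder numbers, human review is the largest bucket, not the model bill. Your own split may differ, but the share column is usually the most useful thing to look at.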

Do this work before sales starts asking for custom deals. Then you have a price that can survive real usage.

Build a cost per use, not just a monthly guess

Monthly averages hide the part that hurts. One clean prompt and one messy PDF can cost very different amounts. If you price an AI feature on a rough monthly total, heavy users and bad inputs will eat margin first.

Start with one unit that matches how people actually use the feature. That unit might be a request, a document, a call transcript, or a seat with a clear usage cap. Pick one and stick to it. Pricing gets fuzzy fast when product, finance, and sales all measure usage in different ways.

Then write down the cost of a normal run. Keep it plain. Count model tokens or API calls, storage, any retrieval step, and the few seconds of app work around the model. If a support reply usually uses one prompt, one model response, and a quick database lookup, price that exact path first.

After that, add a second case for ugly real-world inputs. A scanned contract, a long thread, or a request that needs a retry often costs far more than the clean demo case. Many teams skip this and end up pricing for the happy path only. That is usually where the mistake starts.

A simple worksheet is enough. Write down one unit of use, the cost of a normal run, the cost of a messy run, the expected mix between the two, and the average usage cost per unit.
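The worksheet boils down to one weighted average. Here is a small sketch of it, assuming only two cases (normal and messy). The function name and the sample costs are hypothetical, not measured values.

```python
def blended_cost_per_unit(normal_cost: float, messy_cost: float, messy_share: float) -> float:
    """Average usage cost per unit, weighted by how often messy inputs show up."""
    if not 0.0 <= messy_share <= 1.0:
        raise ValueError("messy_share must be between 0 and 1")
    return (1.0 - messy_share) * normal_cost + messy_share * messy_cost


# Hypothetical inputs: a clean run costs $0.02, a messy run costs $0.15,
# and one in five real requests is messy.
blended = blended_cost_per_unit(normal_cost=0.02, messy_cost=0.15, messy_share=0.20)
print(f"blended cost per unit: ${blended:.3f}")
```

In this made-up case the blended cost is more than double the clean-run cost, even though messy inputs are only 20 percent of traffic.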

Keep fixed costs separate from usage costs. Base platform spend, monitoring, and shared engineering time should not sit inside every request unless usage truly drives them up. Fixed costs belong in your plan fee or margin target. Usage costs belong in the per-use math.

This is often the point where pricing gets clearer. Once a team sees the cost of one real action, discount rules stop being guesswork.

Build your price in five steps

AI pricing works better when you treat it like a unit economics problem, not a monthly guess. If usage grows fast, a bad price gets worse every week.

Start with one customer profile that looks normal for your product. Not the lightest user, and not your biggest account.

  1. Estimate how many times that customer will use the feature in a month. Use real behavior if you have it. If you do not, write down a low case and a likely case.
  2. Multiply that usage by your cost per use. Include model calls, storage, retries, logging, and any other cost that shows up every time the feature runs.
  3. Add people time. Count manual review, support tickets, and the time someone spends checking bad outputs or odd cases.
  4. Add margin before anyone asks for a discount. If sales often gives 15 percent off, your list price needs room for that and still has to cover cost.
  5. Test the price at double the expected usage. Popular features rarely stay at the neat number from the first spreadsheet.

A small example makes the risk obvious. Say one customer uses the feature 3,000 times a month, your technical cost is $0.03 per use, and review plus support adds another $60 a month. Your direct cost is $150: $90 for usage plus $60 for people time. If you price it at $180, the margin is $30, and a routine 15 percent discount cuts the price to $153 and leaves only $3.

Then run the same math at 6,000 uses. If the feature turns thin or negative at that level, fix the offer before launch. Raise the per-use fee, cap included volume, or split human review into a separate charge.
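Here is a rough sketch of that check using the numbers from the example and the 15 percent discount from step 4. It rests on two assumptions that are mine, not facts about your product: the $180 is a flat monthly fee, and review plus support time grows in line with usage ($60 at 3,000 uses, so $0.02 per use).

```python
def monthly_margin(uses, tech_cost_per_use, people_cost_per_use, monthly_price, discount=0.0):
    """Monthly revenue minus direct cost for one account on a flat fee."""
    direct_cost = uses * (tech_cost_per_use + people_cost_per_use)
    revenue = monthly_price * (1.0 - discount)
    return revenue - direct_cost


TECH_COST_PER_USE = 0.03          # from the example
PEOPLE_COST_PER_USE = 60 / 3000   # assumption: people time scales with usage
MONTHLY_PRICE = 180.0             # from the example, assumed to be a flat fee
ROUTINE_DISCOUNT = 0.15           # the discount level mentioned in step 4

for uses in (3000, 6000):
    at_list = monthly_margin(uses, TECH_COST_PER_USE, PEOPLE_COST_PER_USE, MONTHLY_PRICE)
    discounted = monthly_margin(
        uses, TECH_COST_PER_USE, PEOPLE_COST_PER_USE, MONTHLY_PRICE, ROUTINE_DISCOUNT
    )
    print(f"{uses} uses: margin at list {at_list:+.2f}, after discount {discounted:+.2f}")
```

Under those assumptions the account goes from a thin profit at 3,000 uses to a clear loss at 6,000, with or without the discount. That is the signal to add a per-use fee or a volume cap.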

That last check saves a lot of pain. It is much easier to set a sane price now than to explain later why growth made the feature unprofitable.

Add review labor without fooling yourself

Most teams underprice review work because they treat it like background effort. If a person checks AI output, that time is part of the feature cost.

Use a stopwatch, not a guess. Time real reviews from the moment someone opens the case to the moment they approve it, edit it, reject it, or send it back with a note. A review that feels like about two minutes often takes much longer once you include reading context, checking source data, and writing a short explanation.

The first pass is only part of the bill. Count the work that comes after the first answer too: rechecks after edits, customer follow-up when the output is unclear or wrong, internal questions between reviewer and support, and manager escalation for hard cases.

That last item matters more than teams expect. If one out of twenty requests needs a manager, you still need to spread that time across every request. Escalations are rare, but averaged over all requests they still cut into margin.

A simple example shows why. Say a reviewer spends 4 minutes on the first pass, 1 minute on rechecks on average, and 30 seconds on customer follow-up spread across all cases. Then a manager spends 10 minutes on 5 percent of requests. That adds another 30 seconds per request. Your quick review is now 6 minutes, not 4.
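The same arithmetic as a short sketch. The minutes come from the example above. The $30 hourly rate is a hypothetical loaded wage used only to turn minutes into dollars, and it ignores that manager time usually costs more.

```python
def review_minutes_per_request(first_pass, rechecks, follow_up, manager_minutes, manager_share):
    """Average review minutes per request, with rare escalations spread over all requests."""
    return first_pass + rechecks + follow_up + manager_minutes * manager_share


minutes = review_minutes_per_request(
    first_pass=4.0,        # reviewer's first pass
    rechecks=1.0,          # average recheck time
    follow_up=0.5,         # customer follow-up, averaged over all cases
    manager_minutes=10.0,  # time per escalation
    manager_share=0.05,    # 5 percent of requests escalate
)

HOURLY_RATE = 30.0  # hypothetical loaded wage cost, dollars per hour
labor_cost_per_request = minutes / 60 * HOURLY_RATE

print(f"review minutes per request: {minutes:.1f}")
print(f"labor cost per request:     ${labor_cost_per_request:.2f}")
```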

This is where pricing often goes wrong. Teams budget for model cost and forget that review labor grows with volume too. If usage doubles, review work usually doubles with it.

Update your number after the first month. Real traffic changes everything. Reviewers get faster in some workflows, slower in others, and new edge cases appear fast. If you want pricing that holds up under sales pressure, replace rough guesses with live review data as early as you can.

Price failure handling before it hurts

Most teams price the happy path and forget the messy one. That is how a plan looks fine in a spreadsheet and then starts leaking money once real users hit bad inputs, rate limits, and wrong answers.

Failures cost more than the model call. Count every retry after a timeout, every user rerun, and every manual fix your team makes after a poor result. If one bad output triggers three more calls and a support ticket, the true cost is much higher than the original request.

Decide who handles wrong or risky output before launch. A support rep, an operations person, or a specialist reviewer each adds a different cost. You also need a clear rule for when the team steps in, such as bad financial data, unsafe text, or an answer that could push a customer toward the wrong action.

A basic failure budget should include extra model calls from retries and reruns, staff time for manual checks or corrections, account credits or refunds for unusable results, and the time spent explaining the issue to the customer.

Credits and refunds belong in the pricing model like any other cost. If your team often gives a $10 credit after a broken run, that is part of the feature margin. Ignore it, and a popular feature can lose money even when usage looks healthy.

Free reruns need a hard limit. One included rerun is reasonable in many products. Unlimited reruns are rarely safe, because a small group of frustrated users can multiply your costs in a day.

A simple rule works well: the first rerun is free, the second rerun triggers a check, and anything beyond that gets charged or held for review. That protects margin and gives support a clean answer when customers ask for more.
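That rule is simple enough to write down as code, which also makes it easy for support to quote. A minimal sketch follows. The names `RerunAction` and `rerun_action` are hypothetical, and what "check" or "charge or hold" means in practice depends on your product.

```python
from enum import Enum


class RerunAction(Enum):
    FREE = "free"                      # included in the plan
    CHECK = "check"                    # allowed, but someone looks at why it keeps failing
    CHARGE_OR_HOLD = "charge_or_hold"  # billed as usage or held for review


def rerun_action(rerun_number: int) -> RerunAction:
    """Map the nth rerun of a single request to the policy described above."""
    if rerun_number < 1:
        raise ValueError("rerun_number starts at 1")
    if rerun_number == 1:
        return RerunAction.FREE
    if rerun_number == 2:
        return RerunAction.CHECK
    return RerunAction.CHARGE_OR_HOLD


for n in (1, 2, 3, 4):
    print(n, rerun_action(n).value)
```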

If 5 percent of requests need another pass and 1 percent need a human fix, put those numbers into the offer now. Volume makes small failure costs grow fast.
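To see how those two rates scale, here is a rough sketch. The 5 percent and 1 percent rates come from the paragraph above. The cost of one extra pass ($0.06) and of one human fix ($4.00, roughly ten minutes at $24 per hour) are hypothetical stand-ins.

```python
def expected_failure_cost(rerun_rate, rerun_cost, human_fix_rate, human_fix_cost):
    """Expected failure-handling cost added to every request."""
    return rerun_rate * rerun_cost + human_fix_rate * human_fix_cost


RERUN_RATE = 0.05       # from the text: 5 percent need another pass
HUMAN_FIX_RATE = 0.01   # from the text: 1 percent need a human fix
RERUN_COST = 0.06       # hypothetical cost of one extra model pass
HUMAN_FIX_COST = 4.00   # hypothetical: about ten minutes at $24 per hour

per_request = expected_failure_cost(RERUN_RATE, RERUN_COST, HUMAN_FIX_RATE, HUMAN_FIX_COST)

for volume in (1_000, 10_000, 100_000):
    print(f"{volume:>7} requests: ${per_request * volume:,.2f} in failure handling")
```

With these stand-in costs, the rare human fix accounts for almost all of the failure budget, not the extra model calls.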

Set discount rules before sales starts asking

Discounts get messy when sales treats an AI feature like ordinary software. A 20 percent cut can erase margin fast if each customer also triggers model usage, manual review, retries, and support time. Good pricing needs a written floor before the first quote goes out.

Put that floor in plain language. Sales should know the minimum monthly fee, the minimum price per use, and which parts of the offer never get discounted. If a deal goes below the floor, one person should approve it. No side deals and no verbal exceptions.

Larger discounts should mean less included work, not the same work for less money. If a buyer wants a lower price, change the package. Reduce included usage, move review to a slower queue, or remove custom prompt changes. That keeps the offer tied to the real work behind the feature.

Be strict with review costs. Blanket discounts should cover the software portion only. If your team has to inspect outputs, fix odd cases, or rerun failed jobs, price that labor on its own line. Custom review work is where many teams lose control of margin without noticing.

Three package types are usually enough:

  • Standard: full price, normal usage limits, standard review.
  • Volume: lower price per use with a usage commitment and less manual help.
  • Custom: separate pricing for review labor, failure handling, and special workflows.

This also protects you when failure handling gets expensive. If outputs fail and your team steps in, that is not a free extra. Write down what discounting can touch and what stays fixed.
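One way to make "what discounting can touch" concrete is to keep the software line and the review line separate in every quote. The sketch below assumes a monthly quote with those two lines. The `Quote` class, the field names, and all dollar amounts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    software_list_price: float  # monthly software fee, the only discountable line
    review_labor_price: float   # monthly review labor, never discounted
    discount: float             # fraction off the software line, e.g. 0.20

    def total(self) -> float:
        """Monthly total with the discount applied to the software line only."""
        return self.software_list_price * (1.0 - self.discount) + self.review_labor_price


def needs_approval(quote: Quote, floor_total: float) -> bool:
    """True when the quote falls below the written floor and needs sign-off."""
    return quote.total() < floor_total


FLOOR = 250.0  # hypothetical minimum monthly total

standard = Quote(software_list_price=200.0, review_labor_price=100.0, discount=0.20)
aggressive = Quote(software_list_price=200.0, review_labor_price=100.0, discount=0.30)

for name, quote in (("standard", standard), ("aggressive", aggressive)):
    print(f"{name}: total ${quote.total():.2f}, needs approval: {needs_approval(quote, FLOOR)}")
```

Because the review line sits outside the discount, a bigger percentage off never reduces what you charge for human work.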

Lean teams learn this quickly. Even with careful cost control, a few heavily discounted deals with manual work attached can turn a popular feature into a loss. Sales can still move fast, but it should do it with approved packages, not improvised promises.

A simple example with real numbers

A SaaS team sells AI lead summaries to busy sales reps. The offer sounds simple: each rep pays $27 per month and can generate summaries for new leads before calls.

The first model run costs only $0.06, so the feature looks cheap. That first impression is misleading.

The real cost per summary is higher. The first model run is still $0.06. Reruns add another $0.009 on average because 15 percent of summaries need one more run. Human review adds $0.16 on average because 1 in 5 summaries needs 2 minutes from an ops reviewer paid $24 per hour. Failure handling, including retries, credits, and support time, adds another $0.03.

That puts the true cost at about $0.259 per summary.

With 80 summaries per rep, monthly cost is 80 x $0.259 = $20.72. At the full $27 price, the feature still makes money. Gross profit per active rep is $6.28, a margin of about 23 percent. That is workable, but it is not generous.

Now sales asks for a 30 percent discount to win a larger account. The price drops from $27 to $18.90 per rep.

Nothing else changed. Usage is still 80 summaries. Review work is still there. Reruns still happen. Support still handles failures. Monthly cost stays at $20.72, so the team now loses $1.82 per active rep.

If that account has 500 active reps, the feature loses $910 every month. If usage rises because reps like the summaries, the loss gets worse.
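The whole example fits in a short script, which makes it easy to rerun with your own inputs. Every constant below comes from the example above. The break-even discount at the end is an extra figure derived from the same numbers, not something the example states.

```python
FIRST_RUN = 0.06           # dollars per first model run
RERUN_RATE = 0.15          # share of summaries needing one more run
REVIEW_RATE = 1 / 5        # share of summaries a person reviews
REVIEW_MINUTES = 2         # minutes per reviewed summary
REVIEWER_HOURLY = 24.0     # ops reviewer wage, dollars per hour
FAILURE_HANDLING = 0.03    # retries, credits, support time per summary
SUMMARIES_PER_REP = 80
LIST_PRICE = 27.0          # dollars per rep per month
DISCOUNT = 0.30
ACTIVE_REPS = 500

rerun_cost = RERUN_RATE * FIRST_RUN
review_cost = REVIEW_RATE * (REVIEW_MINUTES / 60) * REVIEWER_HOURLY
cost_per_summary = FIRST_RUN + rerun_cost + review_cost + FAILURE_HANDLING

monthly_cost = SUMMARIES_PER_REP * cost_per_summary
profit_at_list = LIST_PRICE - monthly_cost
profit_discounted = LIST_PRICE * (1 - DISCOUNT) - monthly_cost
account_result = profit_discounted * ACTIVE_REPS

# Largest discount that still covers cost at this usage level.
break_even_discount = 1 - monthly_cost / LIST_PRICE

print(f"cost per summary:      ${cost_per_summary:.3f}")
print(f"monthly cost per rep:  ${monthly_cost:.2f}")
print(f"profit at list price:  ${profit_at_list:+.2f}")
print(f"profit after discount: ${profit_discounted:+.2f}")
print(f"account per month:     ${account_result:+.2f}")
print(f"break-even discount:   {break_even_discount:.1%}")
```

At 80 summaries per rep, any discount above roughly 23 percent puts the account under water, so a 30 percent deal was never going to work without a usage cap or a separate review charge.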

That is why cost-based SaaS pricing has to include reruns, review labor, and failure handling before anyone approves discount rules for sales. Cheap first-run model costs can hide an unprofitable feature for months.

Mistakes teams make in the first pricing pass

The first pricing pass often fails for a simple reason: teams price the demo, not the messy day-to-day use. A feature looks cheap in a test account, then real customers send longer prompts, strange files, repeat requests, and support tickets. That gap is where margin disappears.

One common mistake is copying a competitor's number and treating it like proof. That works only if your costs match theirs, and they usually do not. Another company may have lower model costs, tighter limits, or a product that avoids manual review. If your feature needs people to step in, your price has to cover that.

Support work gets missed all the time. A bad output can create three extra costs at once: a chat with support, a manual retry, and sometimes a refund or account credit. None of that appears on the model bill, but it still costs money. Teams that ignore this end up losing margin one exception at a time.

Review labor also changes with volume. At 100 uses a week, one person can skim the odd mistake. At 10,000 uses, queues form, edge cases pile up, and someone has to own the process. Review time rarely stays flat as usage grows.

The warning signs are usually obvious:

  • pricing came from a competitor screenshot
  • support handles AI issues for now
  • manual review has no owner or time budget
  • the plan says unlimited before normal usage is clear
  • refunds happen by habit, not by policy

Unlimited is often the most expensive promise a team makes. Heavy users find the edge of a feature fast, and a few accounts can eat most of the budget. Put caps, fair-use limits, or review thresholds in place first. You can always loosen them later. Taking them back after customers depend on them is much harder.

Quick checks before you launch

A price can look fine in a spreadsheet and still break the week customers start using the feature hard. Good pricing survives normal usage, heavy usage, discount pressure, and the messy cases where output fails.

Run four checks before launch.

  1. Stress one account. If a single customer doubles normal usage, does your margin stay positive after model calls, retries, logging, and human review? If not, add usage caps, overage fees, or tighter limits per plan.
  2. Set a discount floor. Sales will ask for a bigger discount sooner than you think. Write down the lowest allowed price, the smallest margin you will accept, and who can approve exceptions.
  3. Spell out failed output. Every plan should say what happens if the AI gives a bad answer, times out, or needs a second pass. Decide whether you retry automatically, send it for review, refund credits, or count it as usage anyway.
  4. Use one rulebook. Sales, support, and finance should quote the same terms for pricing, credits, discounts, and review turnaround. If each team explains it differently, customers will find the gap.

Keep this document short. One page is enough if it includes plan limits, discount floors, overage rules, and failure handling.

A small test helps. Take your biggest likely customer, double their expected usage for a month, then add a rough failure rate and review time. If that account slips from profit into loss, fix the offer now. It is much easier to tighten a plan before launch than after customers get used to loose terms.
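A related check is to turn the stress test around and ask how much usage a flat price can absorb. That gives you a starting point for a usage cap. The sketch below borrows the figures from the lead-summary example earlier ($27 per rep, about $0.259 per summary). The function name and the 20 percent minimum margin are hypothetical.

```python
import math


def max_included_uses(price: float, cost_per_use: float, min_margin: float = 0.0) -> int:
    """Largest whole number of uses a flat price covers while keeping min_margin of revenue."""
    if cost_per_use <= 0:
        raise ValueError("cost_per_use must be positive")
    if not 0.0 <= min_margin < 1.0:
        raise ValueError("min_margin must be in [0, 1)")
    budget = price * (1.0 - min_margin)  # revenue left to spend on delivery
    return math.floor(budget / cost_per_use)


COST_PER_SUMMARY = 0.259

print("break-even cap at $27.00:", max_included_uses(27.00, COST_PER_SUMMARY))
print("break-even cap at $18.90:", max_included_uses(18.90, COST_PER_SUMMARY))
print("cap at $27.00, 20% margin:", max_included_uses(27.00, COST_PER_SUMMARY, min_margin=0.20))
```

If the cap that comes out is below what a normal customer already uses, the price is too low for that plan, whatever the spreadsheet said at launch.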

What to do next

Put the pricing math in one place and make one person own it. A shared sheet beats a slide deck full of guesses. It should show model cost per use, review time, retry cost, refunds or credits, and the rule for when a failed run turns into manual work.

If that sheet is missing one of those lines, your price is still a draft. A feature can look profitable in a demo and still lose money once customers start using it every day.

Keep the sheet simple. Track cost per request or task, average review minutes and hourly rate, failure rate and retry rate, credit policy, planned margin at list price and at approved discount levels, and the trigger points for repricing or usage limits.

Review those numbers every month after launch. Do not wait for the quarter to end. If usage doubles, or review takes five minutes instead of two, margin can disappear fast.

A common miss is letting sales discount first and asking finance or product to explain the margin later. Fix weak pricing while volume is still small. It is much easier to tighten usage caps, raise minimums, or narrow discount bands before a sales habit forms.

If you want an outside review, Oleg Sotnikov at oleg.is works with startups and smaller companies as a fractional CTO and advisor. He can pressure-test the offer against the real AI workload, the support burden, and the discount rules before launch, which is often enough to catch a feature that looks popular but loses money.

Frequently Asked Questions

Why can a popular AI feature still lose money?

Because cost rises with usage. Every prompt, retry, support ticket, review step, and credit adds expense. If sales discounts too early, heavy use can turn a popular feature into a loss.

What should I count before I set the price?

Count model input and output, retries, fallback runs, storage, logging, human review, support time, refunds or credits, payment fees, and any overhead that rises with usage. Use real traffic estimates, not demo numbers.

Should I charge by seat or by use?

Price around one clear unit of work, like a request, document, or transcript. A seat can work only if you set a clear usage cap. Without that cap, heavy users can wipe out your margin.

How do I price human review time?

Time real cases from start to finish. Include rechecks, customer follow-up, and manager escalations. Then multiply average minutes per request by the full hourly wage cost and add that number to each use.

Do retries and reruns really change the math that much?

Yes, they do. One bad output can trigger more model calls, more support work, and sometimes a credit or refund. Put average retry and rerun rates into your cost per use, and set a hard limit on free reruns.

How much discount can sales safely offer?

Set the floor before sales starts quoting. Your list price needs room for normal discounts and still has to cover model cost, labor, and failure handling. If a buyer wants a lower price, cut included usage or manual help instead of giving the same package for less.

Is unlimited usage a bad idea for AI features?

Usually yes, at least early on. A small group of heavy users can drive most of the cost, especially when they send messy inputs or ask for reruns. Start with caps, fair use rules, or overage fees, then loosen them after you see real behavior.

How can I tell if my current price is too low?

Stress-test one normal account at double expected usage. If margin turns thin or negative after retries, review, support, and credits, the price is too low or the included volume is too high. Fix it before customers get used to the old plan.

How often should I update the pricing model?

Check it every month after launch. Real traffic changes prompt length, failure rate, review time, and support load fast. Replace guesses with live data early, then tighten limits or reprice as soon as margin starts slipping.

Should I get an outside review before launch?

If your team cannot show cost per use, review minutes, refund policy, and discount floor in one simple sheet, get outside help. A fractional CTO or advisor can pressure-test the offer before volume hides the problem.