Apr 21, 2025·8 min read

How to explain technical risk to a board after you find it

Learn how to explain technical risk to a board through customer impact, cash burn, repair options, and a clear ask for the first meeting.

What the problem looks like now

Most hidden technical risk looks normal from the outside. The product still works. Customers still log in. The team still ships. Then someone looks closer and finds one weak area that can fail under ordinary growth or during a routine release.

The board does not need a tour of the codebase. It needs one plain opening sentence that says what you found, where it sits, and why it matters. For example: "We found that our billing service depends on an old component that fails under load, and that puts invoice creation and payment retries at risk."

That kind of sentence works because it keeps the discussion tied to the product instead of engineering jargon. Be specific about the area it touches. Say whether the issue affects checkout, onboarding, customer login, reporting, mobile sync, or another part of the product the board already understands. A director may not care about database internals, but they will understand "this affects renewals" or "this can block new accounts."

You also need to say what happens if nothing changes this quarter. Keep that part direct. If you wait, the risk stays in production through another release cycle. The team spends more time patching. Small incidents pile up. One bad week can force engineers to stop planned work and fix the same area again.

Facts and guesses

Do not mix evidence with prediction. Boards respond better when you separate what you know from what you suspect.

Facts are measurable. Error rates rose during peak traffic. Two recent incidents came from the same service. A component is no longer supported. Only one or two engineers fully understand the fix.

Guesses should sound like guesses. You can say the problem will likely appear more often as usage grows, or that the next major release may raise the chance of failure. That is fair. Saying "customers will leave in large numbers" without proof is not.

A simple rule helps: if you measured it, call it a fact. If you inferred it, label it clearly. That keeps the rest of the discussion grounded because everyone starts from the same picture of the problem.

Start with customer impact

Boards move faster when the risk feels real for customers instead of abstract for engineers. Skip stack names, error logs, and internal labels. Say what breaks, who feels it, and how often it happens.

Start with a count. Estimate the users, accounts, orders, or transactions the weak spot can affect in a normal week or month. A rough range is fine if you say it plainly. "About 12,000 customer accounts depend on this workflow each month" is much easier to absorb than "the event pipeline has a state consistency issue."

Then describe the failure in everyday language. New customers may get stuck during signup at busy times. Existing customers may see late or incorrect account data. Orders may go through twice, or not at all. Support staff may need to fix accounts by hand.

That gives the board something concrete. They do not need to know whether the problem sits in a queue, a database, or an aging service unless they ask. They need to know what a customer sees on the screen, on the invoice, or in the delivery timeline.

Trust usually drops before revenue does. If customers see wrong balances, missing updates, or failed renewals, they start double-checking everything your product tells them. Some stop using self-service actions and contact support instead. Some delay expansion. A few leave after one bad week if the issue hits a core workflow.

Use simple cause and effect. "If this fails during peak traffic, around 8% of new signups may not finish. That means more support tickets, slower activation, and more drop-off during the trial period." That lands better than a summary full of code terms.

Start with pain people can picture. Once the board sees the effect on trust, churn, and support load, the rest of the conversation gets easier.

Put numbers on cash burn

Boards react better to dollars than to vague risk. If a system issue can slow releases, cause outages, or break billing, turn that into a monthly cost range. Do not chase perfect precision. A clear estimate with stated assumptions is more useful than a dramatic claim with no math behind it.

Start with money that leaves the business first: lost revenue from failed signups or missed renewals, refunds and service credits after incidents, support time spent on complaints and manual fixes, and engineering time lost to emergency work instead of planned delivery.

Use simple numbers. If the team saw 40 failed payments last month and average monthly revenue per account is $500, that is a direct starting point. If support handled 120 extra tickets and each ticket takes 20 minutes, convert that into payroll cost. If three engineers spent two days patching an old service, price those hours too. The board does not need perfect accounting. It needs a fair picture of what the problem costs now.
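
The arithmetic above can be sketched in a few lines of Python. The counts come from the example; the hourly rates and the eight-hour day are placeholder assumptions, to be replaced with figures finance agrees with.

```python
# Rough monthly cost of the problem, using the counts from the example above.
# The hourly rates and the 8-hour day are ASSUMPTIONS for illustration only.

FAILED_PAYMENTS = 40             # from the example
REVENUE_PER_ACCOUNT_USD = 500    # average monthly revenue per account, from the example
EXTRA_TICKETS = 120              # from the example
MINUTES_PER_TICKET = 20          # from the example
ENGINEERS_PATCHING = 3           # from the example
DAYS_PATCHING = 2                # from the example

SUPPORT_HOURLY_COST_USD = 40     # assumed loaded cost
ENGINEER_HOURLY_COST_USD = 100   # assumed loaded cost
HOURS_PER_DAY = 8                # assumed


def monthly_cost_breakdown() -> dict[str, float]:
    """Return each cost line and the total, in USD per month."""
    lost_revenue = FAILED_PAYMENTS * REVENUE_PER_ACCOUNT_USD
    support_hours = EXTRA_TICKETS * MINUTES_PER_TICKET / 60
    support_cost = support_hours * SUPPORT_HOURLY_COST_USD
    engineering_hours = ENGINEERS_PATCHING * DAYS_PATCHING * HOURS_PER_DAY
    engineering_cost = engineering_hours * ENGINEER_HOURLY_COST_USD
    return {
        "lost_revenue": lost_revenue,
        "support_cost": support_cost,
        "engineering_cost": engineering_cost,
        "total": lost_revenue + support_cost + engineering_cost,
    }


if __name__ == "__main__":
    for item, usd in monthly_cost_breakdown().items():
        print(f"{item:>16}: ${usd:,.0f}")
```

With these placeholder rates the total comes to $26,400 a month, most of it lost revenue. Keeping the assumptions as named constants makes it easy to show the board which inputs are measured and which are estimates.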

Show two cases, not one. A best case might be: "If the issue stays at current levels, we lose about $18,000 a month." A likely case might be: "If volume grows and failure rates stay the same, the loss is closer to $35,000." That range makes the discussion more honest and keeps you from sounding theatrical.
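
One way to get from the first case to the second is to hold the failure rate constant and scale the loss with volume. The sketch below assumes loss grows in direct proportion to volume, which is a simplifying assumption rather than something the example states. The function names are hypothetical.

```python
def projected_monthly_loss(current_loss_usd: float, volume_multiplier: float) -> float:
    """Scale today's loss by expected volume growth, failure rate held constant."""
    if volume_multiplier <= 0:
        raise ValueError("volume_multiplier must be positive")
    return current_loss_usd * volume_multiplier


def implied_volume_multiplier(current_loss_usd: float, projected_loss_usd: float) -> float:
    """Work backwards: how much volume growth does a projected figure assume?"""
    if current_loss_usd <= 0:
        raise ValueError("current_loss_usd must be positive")
    return projected_loss_usd / current_loss_usd


if __name__ == "__main__":
    # Growth implied by moving from $18,000 to $35,000 a month.
    print(f"{implied_volume_multiplier(18_000, 35_000):.2f}x")
    # Loss if volume grows by half instead.
    print(f"${projected_monthly_loss(18_000, 1.5):,.0f}")
```

Under this linear assumption, the $35,000 case implies volume nearly doubling. That is the kind of assumption worth writing next to the number, so the board can challenge the growth figure rather than the math.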

Every number should have a source in your notes or slide comments. Use billing reports for refunds and credits. Use CRM or finance data for lost deals and churn. Use support system data for ticket volume. Use sprint records, time logs, or manager estimates for engineering time. If one number is an estimate, say so.

Do not forget opportunity cost. If senior engineers spend 30% of their week keeping an old system alive, they are not building the next release. That delay burns cash too, even if it never appears on an invoice.
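
A rough way to price that time is shown below. The 30% share comes from the paragraph above; the team size and the loaded annual cost per engineer are hypothetical inputs, and the result covers payroll only, not the revenue delayed by a later release.

```python
def monthly_opportunity_cost(
    engineers: int,
    loaded_annual_cost_usd: int,
    percent_on_upkeep: int,
) -> float:
    """Monthly payroll spent keeping the old system alive instead of building."""
    if not 0 <= percent_on_upkeep <= 100:
        raise ValueError("percent_on_upkeep must be between 0 and 100")
    return engineers * loaded_annual_cost_usd * percent_on_upkeep / (12 * 100)


if __name__ == "__main__":
    # Hypothetical team: 4 senior engineers at $180,000 loaded cost each.
    cost = monthly_opportunity_cost(4, 180_000, 30)
    print(f"${cost:,.0f} per month")
```

For this hypothetical team the upkeep costs $18,000 a month in payroll alone.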

Lay out the repair options

Boards handle bad news better when they can compare real choices. Put three paths side by side: patch, partial rebuild, and full replacement. Use the same measures for each one so the trade-off is easy to scan: cost, time, risk, and what delivery work must pause.

A patch works when the danger sits in one service, one vendor dependency, or one weak workflow. It costs the least, and teams can often finish it in weeks instead of quarters. The downside is simple: you reduce the chance of failure, but the deeper flaw stays in place. To do it, the team may need to pause one or two planned features while senior engineers focus on the weak area.

A partial rebuild fits cases where one layer causes most of the outages, delays, or security problems. It costs more than a patch but much less than replacing everything. It takes longer, yet the scope is usually easier to control. It also gives you a better long-term result because you remove the broken section instead of covering it up. Expect a larger pause in roadmap work, often for one release cycle.

A full replacement sounds clean, but it is often the hardest option to manage. It makes sense when the whole system blocks growth, or when every fix reveals another hidden problem. It also costs the most and takes the longest. The risk is not only technical. Long replacements drain focus, delay product work, and wear out board patience. In practice, the team may need to freeze most new feature work for months.

If you want a quick summary, keep it short:

Patch: low cost, short timeline, highest chance the same problem returns.
Partial rebuild: medium cost, medium timeline, lower long-term risk.
Full replacement: highest cost, longest timeline, biggest delivery risk during the project.

Then make a recommendation. Do not put three equal choices in front of the board and wait for them to decide. Pick one path and explain why it protects customers and cash better than the others.

Most teams should not jump straight to a full replacement. A partial rebuild is often the best first move because it removes the riskiest area without stopping the whole company. Say that plainly: "I recommend a partial rebuild of the billing system. It will take 12 weeks, delay two lower-priority features, and cut incident risk much more than a patch."

Build the first board update step by step

A board update should read like a business decision, not a tour of the codebase. Keep it to one page if you can. Twenty slides usually hide the point.

A simple order works well:

Open with the business problem in plain language and say what happens if nothing changes: slower releases, more outages, failed customer commitments, rising support load, or delayed sales.
Put numbers on the impact. Estimate cash burn, lost team time, refund risk, missed revenue, or extra infrastructure cost.
Explain the cause in one short paragraph. Name the system, the weak spot, and why the team found it now.
Show the repair options side by side.
End with one decision: approval on budget, headcount, timeline, or scope.

That order matters. If you start with architecture diagrams, directors either stop listening or jump straight to cost. Put the numbers early, then use technical detail only to support them.

Rehearsal matters more than polish. Practice short answers to the questions you will almost always get: Why did we not see this sooner? What happens if we wait one quarter? How much will the fix cost? What slips while the team works on it?

This is the sort of framing Oleg Sotnikov often uses in Fractional CTO work at oleg.is: reduce a messy engineering issue to one decision memo with a clear trade-off. If the board can repeat your message in one sentence after the meeting, the update worked.

A simple example from a growing software company

A software company grows from 300 customers to 6,000 in 18 months. Its billing service still runs the same month-end process it used in the early days. Now the system slows down so much that invoice jobs spill into the next morning.

Customers feel it fast. Some do not get invoices on time, so they cannot close their own books. A smaller group gets charged twice because the billing job times out, retries, and creates a second payment. Those customers do not care that the bug lives in old code. They see a company that cannot bill them correctly.

Inside the business, people patch the gap by hand. Support spends two days answering angry tickets. Finance checks payment records line by line and issues refunds. Engineers stop planned work every month to watch the billing run and clean up after it. If four engineers lose three days each month, and support and finance add another 70 hours, the company can waste tens of thousands of dollars each quarter on a problem that keeps coming back.
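
The staff-time figures above can be checked with a short calculation. The headcounts, days, and the 70 hours come from the example; the eight-hour day and the blended hourly cost are assumptions chosen only to show the order of magnitude.

```python
HOURS_PER_DAY = 8               # assumed working day
BLENDED_HOURLY_COST_USD = 75    # assumed average loaded cost across the three teams


def monthly_hours_lost(
    engineers: int = 4,
    days_each: int = 3,
    support_and_finance_hours: int = 70,
) -> int:
    """Staff hours spent each month on manual billing repair."""
    return engineers * days_each * HOURS_PER_DAY + support_and_finance_hours


def quarterly_cost_usd(hours_per_month: int, hourly_cost_usd: int = BLENDED_HOURLY_COST_USD) -> int:
    """Payroll cost of those hours over three months."""
    return hours_per_month * 3 * hourly_cost_usd


if __name__ == "__main__":
    hours = monthly_hours_lost()
    print(hours, "hours per month")
    print(f"${quarterly_cost_usd(hours):,} per quarter")
```

At the assumed rate, 166 hours a month works out to $37,350 a quarter, which is consistent with "tens of thousands of dollars" before counting refunds or delayed cash.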

The board discussion gets much clearer when the issue is framed in numbers people can use. About 12% of invoices go out late at month end. Duplicate charges hit 0.4% of payments. Cash collection slips by three days on roughly one quarter of monthly billings. Support, finance, and engineering lose about 165 hours a month fixing the same issue.

That changes the tone of the meeting. The board is no longer hearing "our billing code is messy." It is hearing that customers are losing trust, cash arrives later than expected, and staff time keeps disappearing into manual repair work.

In this example, a full rewrite would take six months and pause other product work. A staged repair is easier to approve. The company first fixes the retry logic so one payment request can charge only once. Next, the team breaks the heavy month-end job into smaller runs. After that, they add alerts and a daily billing check so finance sees errors before customers do.
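
For readers who want to see what "charge only once" can look like, here is a minimal sketch of one common approach, an idempotency key per invoice. It is not the company's actual fix. `FakeGateway`, `BillingRun`, and the key format are hypothetical names, and the in-memory record stands in for durable storage.

```python
class FakeGateway:
    """Hypothetical stand-in for a payment provider. Records every real charge."""

    def __init__(self) -> None:
        self.charges: list[tuple[str, int]] = []

    def charge(self, account_id: str, amount_cents: int) -> str:
        self.charges.append((account_id, amount_cents))
        return f"charge-{len(self.charges)}"


class BillingRun:
    """Charges each invoice at most once, even if the job is retried."""

    def __init__(self, gateway: FakeGateway) -> None:
        self.gateway = gateway
        # Idempotency key -> charge id. In-memory here; a real system would
        # need durable storage so the record survives a crashed or timed-out job.
        self._completed: dict[str, str] = {}

    def charge_invoice(self, invoice_id: str, account_id: str, amount_cents: int) -> str:
        key = f"invoice:{invoice_id}"
        if key in self._completed:
            # A retry of a request that already succeeded: return the earlier result.
            return self._completed[key]
        charge_id = self.gateway.charge(account_id, amount_cents)
        self._completed[key] = charge_id
        return charge_id


if __name__ == "__main__":
    gateway = FakeGateway()
    run = BillingRun(gateway)
    first = run.charge_invoice("INV-1001", "acct-42", 50_000)
    retry = run.charge_invoice("INV-1001", "acct-42", 50_000)  # simulated retry
    print(first == retry, len(gateway.charges))
```

The sketch does not cover the harder case where the provider takes the payment but the job times out before recording it. Handling that usually means storing the key before the call, or passing the same key to the provider where the provider supports it.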

The board funds the staged repair because it lowers customer harm now, reduces cash burn, and avoids putting the whole product on hold.

Mistakes that weaken the discussion

Boards lose confidence fast when a real engineering problem turns into a lecture on services, frameworks, and internal code names. Most directors do not need the stack diagram. They need to know who gets hurt, how much money leaks, and what choices the company has.

Use business language. "Our event pipeline drops orders during peak traffic" is clear. "We have coupling issues in the ingestion layer" is not. The first version tells the board why the issue matters.

Another common mistake is asking for budget as if there is only one answer. The discussion gets weak when a team says, "We need $600,000 to rebuild this," and stops there. Give options instead. Show the small repair, the safer middle path, and the larger reset. Include cost, likely effect, and what each option does not fix.

Bad date promises do real damage. Teams often feel pressure to sound certain, so they offer a deadline that depends on unknowns they have not tested yet. If the work touches old code, external systems, or missing documentation, say that plainly. A range is better than a fake promise.

Teams also muddy the conversation when they bundle urgent work with optional work. If payment failures are rising, do not mix that repair with a dashboard redesign, a tooling refresh, and a migration the team simply prefers. Separate work that protects revenue now from work that improves life later.

Uncertainty is not the enemy. Hidden uncertainty is. Boards usually react better when leaders put edges around what they do not know. You can say: we know the failure rate, we know the customer group affected, we do not yet know whether the root cause is one service or three, and we need two weeks to confirm that. That kind of honesty is stronger than false precision.

Quick checks before the meeting

A weak board discussion often starts with bad inputs. Before you talk about systems or code, make sure the numbers and the customer stories match reality. Clean prep matters more than polished slides.

Start with finance. If you plan to say the risk costs $60,000 a month, finance should agree with the math and the source. Check revenue at risk, refunds, contractor spend, overtime, and any launch delay that pushes cash out. One shaky number can make the whole case look exaggerated.

Then talk to support. Ask what customers actually report, in their own words. You are looking for repeated pain, not internal labels. "Imports fail after large uploads" is stronger than "queue instability." "Our team has to rerun reports every Monday" is stronger than "batch job degradation."

Next, get a real 30-day view from engineering. Do not ask for a perfect rescue plan. Ask what the current team can ship in the next month if they stop lower-priority work. They may be able to add better monitoring, remove one risky dependency, patch the most common failure path, or pause a feature release to fix the part that breaks most often.

Write the board decision you need in one sentence. Keep it plain. For example: approve six weeks of repair work and move one launch date. If you need budget, say how much. If you need time, say how much. If you need both, say which one matters more.

Also walk in with a fallback. If the board says no to the full repair plan, you still need a smaller move that cuts risk. That could mean a narrower scope, temporary controls around the risky area, or a slower release schedule until the team repairs the weak spot.

Put all of this on one page before the meeting. If finance, support, and engineering all recognize the picture, the discussion stays grounded.

What to do after the meeting

A good board meeting ends with less ambiguity, not more. Within a day, send a short written summary that says what the board chose, what it did not choose, and what the team will do first.

Keep that note plain. Name the risk, the repair path, the expected cost, and the first checkpoint date. If someone reads only that message a week later, they should still know what was approved.

Then turn the decision into named owners and dates. A board decision without clear ownership turns into drift, and drift is expensive when the problem already touches customers.

A short follow-up list is enough:

one owner for the repair plan
one date for the first visible milestone
one weekly measure for customer impact
one weekly measure for cash burn
one trigger for when the plan needs escalation

Track those numbers every week, even if they are rough at first. Customer impact can mean support tickets, failed jobs, churn risk, lost deals, or time customers spend waiting on workarounds. Cash burn can mean extra engineering hours, emergency contractor spend, cloud waste, refunds, or delays to roadmap items that bring revenue.

This weekly view matters more than a polished monthly update. If the number of affected customers drops but burn keeps rising, you may have a staffing problem. If burn looks small but support pain keeps growing, your repair scope may be too narrow.
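
The two readings above can be written down as a simple check so the weekly review applies them the same way each time. This is a sketch under stated assumptions: it compares only two consecutive weeks instead of a longer trend, uses support tickets as the proxy for support pain, and the `small_burn_usd` threshold is a hypothetical value a team would set for itself.

```python
from dataclasses import dataclass


@dataclass
class WeeklySnapshot:
    affected_customers: int
    burn_usd: float
    support_tickets: int


def read_trend(
    previous: WeeklySnapshot,
    current: WeeklySnapshot,
    small_burn_usd: float = 5_000,  # hypothetical cutoff for "burn looks small"
) -> list[str]:
    """Return the warnings suggested by a week-over-week comparison."""
    notes: list[str] = []
    fewer_affected = current.affected_customers < previous.affected_customers
    burn_rising = current.burn_usd > previous.burn_usd
    tickets_rising = current.support_tickets > previous.support_tickets

    if fewer_affected and burn_rising:
        notes.append("possible staffing problem")
    if current.burn_usd <= small_burn_usd and tickets_rising:
        notes.append("repair scope may be too narrow")
    return notes


if __name__ == "__main__":
    last_week = WeeklySnapshot(affected_customers=200, burn_usd=10_000, support_tickets=80)
    this_week = WeeklySnapshot(affected_customers=150, burn_usd=12_000, support_tickets=70)
    print(read_trend(last_week, this_week))
```

A check like this does not replace judgment. It only makes sure the same two questions get asked every week.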

If the plan slips, bring the bad news early. Do not wait for the next board meeting and hope the team catches up. A short note that says "milestone missed, cause found, new date proposed" is much easier to handle than a surprise after four lost weeks.

Many teams struggle here because they confuse progress with effort. "We worked hard on it" is not a useful update. "Incidents fell from 14 to 5, but migration work is two weeks late because test coverage was weaker than expected" is useful.

If the choices still feel messy, an outside CTO can help pressure test the repair options, check whether the cost and timing are realistic, and frame the next board discussion in plain business terms. That is the kind of work Oleg Sotnikov does at oleg.is, and it is often cheaper than staying on the wrong path for one more quarter.

Frequently Asked Questions

What should I say first when I tell the board about technical risk?

Open with one plain sentence that names the weak area, the product part it touches, and the business risk. Say something like: "We found a failure point in billing that can delay invoices and payment retries during normal growth."

How do I separate facts from guesses in the meeting?

Keep facts and guesses apart. Put measured items first, like incident counts, error rates, unsupported components, or how many engineers understand the fix. Then label the rest clearly as expected risk if traffic or release volume grows.

What numbers matter most in a board discussion like this?

Bring numbers tied to customers and money. Use counts like affected accounts, failed payments, support tickets, engineer hours lost, refunds, and delayed revenue. A rough range works if you show where the numbers came from.

How do I explain customer impact without using engineering jargon?

Describe what customers see, not what engineers call the system. Say "new signups fail at busy times" or "some customers get charged twice" instead of naming services, queues, or internal labels.

Should I show more than one repair option?

Yes. Show patch, partial rebuild, and full replacement side by side with the same measures: cost, time, risk, and what product work must pause. Then make one recommendation so the board reacts to a decision, not a menu with no guidance.

How exact do my cost estimates need to be?

You do not need perfect precision. A fair monthly range with clear assumptions usually works better than a big number with weak math. If one figure is an estimate, say that out loud and keep the source in your notes.

What mistakes make the board lose confidence?

Boards lose trust when teams bury the issue in code terms, ask for one big budget with no options, or promise dates they cannot support. They also pull back when teams mix urgent repair work with nice-to-have cleanup.

Who should I talk to before the board meeting?

Check the math with finance, get real customer wording from support, and ask engineering what they can ship in the next 30 days if they pause lower-priority work. Those three views keep the discussion grounded and stop you from walking in with a story that falls apart under questions.

What exactly should I ask the board to approve?

Ask for one clear decision in one sentence. That might mean budget, six weeks of repair work, a moved launch date, or approval to pause lower-priority features. If you need both time and money, say which one matters more.

What should happen right after the meeting?

Send a short note within a day that states the risk, the chosen repair path, expected cost, first checkpoint date, and named owner. After that, track one weekly customer measure and one weekly cost measure so you can show progress in business terms, not effort alone.