AI compensation conversations on scope, ownership, and risk
AI compensation conversations work better when teams discuss scope, ownership, and risk handled, not raw commit counts distorted by AI tools.

Why commit counts stop telling the story
AI changed output before most pay systems caught up. A developer can now draft an endpoint, a test file, and a migration in minutes with a coding assistant. The speed is real. It still does not mean every extra commit reflects harder work or a bigger contribution.
Commit charts look neat, so managers lean on them. The problem is simple: they mix easy work, hard work, and noisy work into one number. If one engineer asks an AI tool for pages of boilerplate, they might push far more code than the person who spots a design flaw early and prevents a bad release.
Big commits can look impressive and still be shallow. They often bundle generated code, copied patterns, and light edits into a single burst of activity. The hard part may have happened elsewhere. Sometimes it never happened at all.
Small commits can carry far more weight. A two-line change might stop a memory leak, fix a race condition, or close a security gap that would have caused days of pain later. The commit count stays small. The judgment behind it does not.
This gap gets wider on teams that run live products. Reviews, incident response, rollback calls, cleanup work, and careful refactors rarely create charts people admire in a meeting. Still, that work protects uptime, lowers support load, and keeps the team from slowing down next month. On a product that runs around the clock, the person who keeps releases calm often does more for the business than the person with the busiest commit history.
That is where pay discussions go off track. If one chart drives the whole conversation, managers end up rewarding visible volume and missing scope, ownership, and risk handled. Engineers notice quickly when output gets more attention than judgment, and their trust in the process drops.
A better read of performance asks different questions. Who took responsibility for the messy area? Who made tradeoffs that saved the team from rework? Who handled the bug nobody wanted, the review that caught a bad assumption, or the cleanup that made future work safer?
Commits still matter a little. They can show cadence, collaboration habits, and long stretches of low engagement. They just cannot carry the full weight of engineering pay anymore. Once AI helps everyone type faster, counting typing stops being a fair stand-in for contribution.
What scope, ownership, and risk mean
Pay gets easier to discuss when you stop measuring visible output and start looking at the weight of the work. AI can help someone write more code, close more tickets, and ship more drafts in a week. That says very little about how much of the product they steer, what decisions they own, or what happens if they get the call wrong.
Scope is the area someone steers
Scope is not about how busy a person looks. It is about how much of the system or product they can shape.
A developer who fixes five small bugs may produce more commits than the engineer who decides how billing, auth, and deployment fit together. The second person may touch less code, but they influence a larger part of the business. Their choices affect more teams, more customers, and more future work.
Senior roles usually grow in this direction. They deal with fuzzy problems, partial information, and tradeoffs that do not have neat answers. They spend less time clearing a queue and more time deciding what belongs in the queue in the first place.
Ownership and risk carry the real weight
Ownership means someone makes the call and lives with the result. If a launch slips, a customer complains, or a bad choice creates months of cleanup, that person cannot shrug and point elsewhere. They own the outcome.
Risk is the cost of being wrong. On a product team, that can mean outages that block customers, security mistakes that expose data, missed deadlines that delay revenue, architecture choices that raise costs for months, or poor priorities that waste a quarter.
This is why raw output misleads managers so often. AI may let two engineers finish the same feature faster. If one engineer still owns uptime, security review, rollback plans, and the final go or no-go call, the responsibility did not shrink just because the typing got faster.
A simple test helps. Ask how wide the area is that this person influences, who decides when the answer is unclear, and who carries the downside if the decision fails. Those answers usually separate level and pay more clearly than activity metrics do.
In a small team, one person may ship only a few pull requests in a month and still carry more scope than the teammate who merged twenty. If they decide the architecture, approve release risk, and answer for outages or budget overruns, their role has more weight. Pay should reflect that weight, not just the visible output.
What evidence belongs in the room
Memory is a bad pay tool. People remember the loud week, the urgent launch, and the person who talks most in meetings. If you want a fair discussion, bring evidence that shows what someone owned, what calls they made, and what risk they carried across a full quarter.
Start with decisions. Write down the choices the person made and the result of each one. Maybe they cut a feature to protect a deadline, slowed a rollout to avoid a customer issue, or pushed for extra testing before a risky release. Those choices often matter more than raw output.
Build a quarter-long record
Ownership needs a trail. If someone took a project from a rough idea to a shipped result, note the parts they handled: shaping scope, working with product or design, keeping the work moving, and staying responsible after release. People who close the loop usually carry more weight than people who touch only one slice.
Incidents and escalations belong in the same record. When systems fail, deadlines slip, or teams disagree, someone has to make the hard call. If they handled a production issue, made a tradeoff under pressure, or stopped the team from taking a risky shortcut, write it down in plain language.
Good evidence is specific. Note the decision, the context, and the result. Note the projects the person owned from start to finish. Note the incidents, escalations, and tradeoffs they handled. Note how they reviewed code, mentored others, or helped a teammate avoid a repeat mistake. Most of all, look at the pattern over several months, not one busy week.
A short example makes the point. One engineer may have fewer commits because AI tools helped everyone write code faster. But if that engineer owned a migration, caught a bad data issue before release, reviewed code that prevented repeat bugs, and coached two teammates through a messy handoff, the evidence points to bigger scope and heavier responsibility.
Use artifacts, not memory
Bring sprint review notes, release records, incident write-ups, code review comments, and brief feedback from partner teams. You do not need a giant packet. Five solid examples beat a long list of vague praise.
Look for movement over time. Did this person handle larger work than they did three months ago? Do they need less direction now? Do other teams trust them with harder problems? That trend says far more than a single heroic week.
If the evidence still feels thin, pause the pay decision and gather better examples next cycle. Thin evidence leads to loud arguments and weak calls. Clear evidence makes the discussion calmer and harder to dismiss.
How to run the conversation
When AI can turn one prompt into twenty files, people need a steadier way to talk about pay. Start before anyone mentions commits, ticket counts, or how quickly code appeared. Begin with the role, the level, and the outcomes that role owns.
If a staff engineer owns a release path, mentors others, and takes on production risk, say that first. If a product engineer mostly ships scoped tasks with close guidance, say that too. This keeps the conversation tied to the job instead of whatever looked busy in the last sprint.
A simple order helps. Define the role and what success looks like. Review the scope handled in the last few months. Discuss the ownership the person took without being pushed. Then look at the risk they carried, reduced, or ignored. Only after that should you decide pay, name the gaps, and set the next review date.
Scope comes first because it is easiest to see. Did the person work on one feature, a whole system, or a messy cross-team problem with unclear edges? Wider scope usually means more context, more tradeoffs, and more pressure.
Then talk about ownership. Did they notice problems early, make decisions, and carry the work over the line? Or did they need heavy steering at each step? AI can help someone produce a lot of code, but it cannot fake steady ownership for long.
Risk changes how you read the first two. Someone who touched billing, security, migrations, or live incidents carried more downside than someone working in a safer corner of the product. That difference should show up in pay.
Use recent examples, not general impressions. Pick two or three moments from the last quarter. One strong example beats a vague line like "they stepped up a lot." A manager can say, "You took over the failing release, cut the incident count, and coordinated support, QA, and engineering for two weeks." That gives the person something real to respond to.
Keep market pay and performance separate, even if they happen in the same meeting. First decide whether the person performs at, above, or below the role. Then discuss whether their pay matches the market for that level and location. Mix those questions together and the conversation gets muddy fast.
End with a plain answer. Say what you decided now, what still blocks the next pay step, and when you will review it again. If the answer is "not yet," name the missing evidence in one or two lines and put the follow-up date on the calendar before the meeting ends.
A simple example from a product team
Picture a small product team with two engineers, Maya and Ben. Both use AI tools every day. Over one quarter, they ship almost the same visible volume: a similar number of pull requests, bug fixes, tests, and small feature updates.
If a manager looks only at output, they seem close. That is where many teams get this wrong. AI can help both people produce more code, so raw volume stops saying much about the level of the job.
Maya owns the release for a billing update. She decides what can ship now, what must wait, and when to roll back. She joins the on-call rotation for that area, reviews alert noise, and makes the call when a late bug might affect revenue. When product and support ask hard questions, she answers them.
Ben does good work too. He delivers the tasks assigned to him, asks for help when requirements change, and fixes issues quickly. But most of his work sits inside a frame that someone else sets. He is not carrying the release risk, and he is not making the final tradeoffs when things get messy.
That difference matters. If Maya earns more, the reason is not that she typed more. It is that she handled broader scope, made decisions with real cost, and took responsibility when the outcome was uncertain.
A fair manager says that plainly: "Both of you produced strong output, but the pay gap reflects ownership. One role includes release judgment, incident calls, and business risk. The other role is still mostly execution, even if the execution is fast and solid."
If Ben wants to grow into the next pay band, the path should be concrete. He can own one feature from planning to release, take primary on-call for a limited area with support nearby, make tradeoff calls and document why, or run one release checklist without hand-holding. That works better than vague praise or a flat "do more." It gives Ben a real target, and it keeps Maya from feeling that invisible work does not count.
Teams that use AI heavily need this kind of clarity even more. In lean groups, two people can ship similar code volume while carrying very different levels of risk. Pay should follow that difference.
Mistakes that derail pay talks
Pay talks break down when managers grab the easiest numbers and call them proof. In AI-heavy teams, that usually means commit count, ticket count, or lines changed. Those numbers are easy to pull, but they miss the harder part of the job: who took on broader scope, who owned the result, and who handled real risk after the tools sped everyone up.
Commit volume is the most common trap. One engineer may use AI to close ten small tasks in a week. Another may ship fewer commits because they cleaned up a shaky release process, set up better checks, or fixed a bug that could have taken the product down. If you treat raw output as value, you reward visible motion and ignore the work that protects the business.
Another mistake is cramming three different debates into one meeting. Role level asks what job this person is really doing. Bonus asks what happened in this period. Promotion asks whether they can keep operating at the next level. When managers mix all three, the discussion turns messy and people leave feeling judged by shifting rules.
Vague language makes it worse. "Strong leadership" or "high impact" sounds polished, but it tells nobody what actually happened. Specific examples work better. Say they owned the rollout and handled incidents without pushing blame. Say they cut model costs by changing the workflow instead of lowering quality. Say they caught a security or compliance risk before launch. Say they made the team faster by building checks, docs, or guardrails other people now use.
Invisible work often gets ignored until something breaks. That is a bad habit. The person who keeps evals current, watches spend, adds rollback paths, and reduces failure risk may look quiet on a dashboard. They still carry serious weight. A small AI-first product team often stays stable only because someone does that work every week.
The last mistake is waiting until review season to explain expectations. Surprises poison pay discussions. If scope, ownership, and risk matter, managers need to say that early and repeat it often. A short monthly note beats a dramatic year-end debate.
Good pay talks rarely depend on one big speech. They depend on a steady record of what someone owned, what changed because of their work, and what problems they kept off everyone else's desk.
A quick check before you finalize anything
Before a pay decision turns into real money, pressure test it against work that still matters after the sprint ends. This keeps the discussion tied to judgment and responsibility instead of activity logs.
Start with one plain sentence about the person's scope. If you cannot describe it without jargon, the role is still fuzzy. "Owns the billing flow for web and mobile" is clear. "Supports product delivery across customer payments" is not.
Then check ownership. You should be able to point to decisions this person makes without waiting for approval every hour. That might mean choosing the design for a release, deciding how to handle a risky migration, or setting the support plan after launch.
Risk is where many pay calls get weak. Ask what lands on this person when things break. Do they get pulled into incidents at 2 a.m.? Do they make tradeoffs that can hurt revenue, security, or customer trust if they get them wrong? Higher pay usually follows higher downside, not higher commit count.
A short checklist helps:
- Write the scope in one sentence a new hire would understand.
- Name two or three decisions the person owns alone.
- Describe one failure they are expected to handle.
- Review their work across a full cycle, not one busy month.
- Note the next bigger scope they need before the next pay jump.
That full cycle matters more now because AI can compress the middle of the work. Someone may ship ten times more code in a week and still avoid the hard parts. Someone else may ship less and still carry planning, review, release, cleanup, and incident follow-through. The second case often has more weight.
Take a checkout team as an example. One engineer uses AI tools to close many small tickets quickly. Another owns checkout reliability from planning through post-release support, approves risky changes, and handles rollback calls when payment errors spike. If pay is close today, the second person has the stronger case for the next step.
Write down the next scope change before you finalize anything. That turns a vague promise into a fair path: own one more system, lead a release alone, take incident command, or carry a customer-facing metric for a quarter. People can work with a target they can see.
If you cannot answer all five checks with plain facts, wait. A delayed decision is better than one built on noise.
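For teams that keep review notes in a script or a shared tool, the five checks above can be sketched as a small record with a "wait" rule. This is a minimal illustration, not a prescribed process: the class name, the field names, and the threshold of two owned decisions are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PayCheck:
    """Answers to the five checks. All names here are hypothetical."""
    scope_sentence: str = ""                                   # scope in one plain sentence
    owned_decisions: list[str] = field(default_factory=list)   # decisions owned alone
    expected_failure: str = ""                                 # one failure they must handle
    reviewed_full_cycle: bool = False                          # full cycle, not one busy month
    next_scope: str = ""                                       # next bigger scope before a pay jump


def missing_checks(check: PayCheck) -> list[str]:
    """Return the checks that still lack a plain-fact answer."""
    missing = []
    if not check.scope_sentence.strip():
        missing.append("scope sentence")
    if len(check.owned_decisions) < 2:  # assumed floor: "two or three decisions"
        missing.append("two or three owned decisions")
    if not check.expected_failure.strip():
        missing.append("failure they are expected to handle")
    if not check.reviewed_full_cycle:
        missing.append("full-cycle review")
    if not check.next_scope.strip():
        missing.append("next bigger scope")
    return missing


def ready_to_finalize(check: PayCheck) -> bool:
    """True only when all five checks have an answer; otherwise wait."""
    return not missing_checks(check)


# Illustrative input based on the billing example in the text.
check = PayCheck(
    scope_sentence="Owns the billing flow for web and mobile",
    owned_decisions=["rollback calls", "release go or no-go"],
    expected_failure="payment errors spiking after a release",
    reviewed_full_cycle=True,
)
print(missing_checks(check))     # ['next bigger scope']
print(ready_to_finalize(check))  # False
```

The code only flags blanks. Whether a scope sentence is actually free of jargon, or a decision is truly owned alone, still takes a human read.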
Next steps for managers and founders
Most pay systems still reward visible activity. That breaks fast when AI tools let one person ship in a day what used to take a week. Managers need role notes that describe decisions made, systems owned, and risk carried, not just output volume.
A good role note sounds plain. It says someone owns release quality for a product area, makes architecture calls without constant approval, or handles failure risk for a customer-facing workflow. That is far more useful than counting pull requests, tickets closed, or hours online.
If your team is growing, rewrite role notes before the next review cycle. Keep them short and specific so managers will actually use them. For each role, define the decisions this person can make alone, the business or technical area they own, the risks they are expected to prevent or absorb, and the outcomes that show good judgment.
This works best when engineering, product, and operations use the same logic. A product manager who removes ambiguity from specs, an engineer who owns an AI-assisted delivery flow, and an operations lead who cuts failure points all carry scope in different ways. One framework makes those differences easier to compare.
Make evidence a monthly habit too. Reviews go sideways when managers try to reconstruct six or twelve months of work from memory. One short paragraph per month is usually enough if it answers four questions: what did the person own, which decision did they make, what risk did they reduce or accept, and what changed after that decision in cost, uptime, speed, customer pain, or team load?
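One way to keep that monthly paragraph consistent is to store it as a small record with one field per question. The sketch below assumes a plain Python script with no particular HR tool behind it. `MonthlyNote`, `review_packet`, and the sample entries are hypothetical names and data made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyNote:
    """One month of evidence for one person. Field names are illustrative."""
    month: str     # "YYYY-MM", so plain string sorting is chronological
    owned: str     # what did the person own?
    decision: str  # which decision did they make?
    risk: str      # what risk did they reduce or accept?
    change: str    # what changed afterwards (cost, uptime, speed, ...)?

    def unanswered(self) -> list[str]:
        """Names of the four questions that are still blank."""
        answers = {
            "owned": self.owned,
            "decision": self.decision,
            "risk": self.risk,
            "change": self.change,
        }
        return [name for name, text in answers.items() if not text.strip()]


def review_packet(notes: list[MonthlyNote]) -> str:
    """Join the complete notes, oldest first, into a short text for the review."""
    complete = sorted(
        (n for n in notes if not n.unanswered()), key=lambda n: n.month
    )
    return "\n".join(
        f"{n.month}: owned {n.owned}; decided {n.decision}; "
        f"risk: {n.risk}; result: {n.change}"
        for n in complete
    )


# Made-up sample entries, for illustration only.
notes = [
    MonthlyNote("2024-02", "billing migration", "slowed the rollout",
                "accepted a one-week delay", "no customer-facing errors"),
    MonthlyNote("2024-01", "checkout alerts", "", "reduced alert noise", ""),
]
print(notes[1].unanswered())  # ['decision', 'change']
print(review_packet(notes))
```

Incomplete notes are left out of the packet rather than padded, which matches the advice to gather better examples when the evidence is thin.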
That habit makes compensation conversations less emotional. Instead of debating whether someone "looks productive," you can point to the release they safeguarded, the fragile workflow they stabilized, or the messy process they turned into something repeatable.
If titles and pay bands no longer match the real work, an outside view can help. Oleg Sotnikov at oleg.is works with founders as a Fractional CTO and startup advisor, and his background spans engineer, CTO, CEO, and founder roles. That perspective is useful when AI changes how a team works faster than your compensation system can keep up.
Done well, this does not make pay talks softer. It makes them clearer, because everyone can see what the role actually is.
Frequently Asked Questions
Should commit count matter in pay reviews anymore?
A little, but only as a small signal. Commit counts can show cadence or long gaps in engagement, yet they do not show judgment, scope, or the cost of a bad decision. AI makes that gap even bigger because people can produce far more code without taking on more responsibility.
What should managers look at instead of raw output?
Start with scope, ownership, and risk. Ask what part of the product the person steers, which decisions they make when the answer is unclear, and what happens if they get the call wrong. That tells you far more than raw output.
How do I define scope on a small product team?
In a small team, scope means the area this person can shape without constant direction. One engineer may own a whole release path or system, while another handles tasks inside that frame. The first role usually carries more weight even if both people ship similar code.
Why does ownership matter more than code volume?
Because ownership includes the result, not just the task. The person who makes the call, answers for the outcome, and stays with the work after release carries more weight than someone who only finishes assigned items. AI can speed up execution, but it does not remove that responsibility.
How should I talk about risk in a compensation discussion?
Tie risk to real downside. Look at outages, security issues, revenue impact, customer trust, release rollbacks, and long cleanup costs. If someone owns decisions in those areas, their role carries more weight than a safer stream of work.
What evidence makes a pay conversation fair?
Bring a few concrete artifacts from the last quarter. Use release notes, incident write-ups, code review comments, sprint notes, and short feedback from partner teams. Pick examples that show what the person decided, what they owned, and what changed after their work.
Should I separate pay, promotion, and bonus talks?
Keep them separate even if you discuss them in one meeting. First decide how the person performs in the current role. Then check whether market pay fits that level. After that, discuss whether they already operate at the next level often enough to justify a promotion. Treat the bonus as its own question about what happened in this period.
How do I explain a pay gap when two engineers ship about the same amount?
Say it plainly: the gap comes from ownership, not typing speed. If one engineer owns release judgment, incident calls, or a risky product area, that role carries more downside and more business impact. Clear wording helps both people understand the decision.
What invisible work should count in pay reviews?
Count the work that keeps the product stable and the team out of trouble. That includes incident response, rollback planning, review work that catches bad assumptions, cleanup that prevents repeat bugs, and cost control on tools or infrastructure. Teams often ignore that work until it fails.
What should an engineer do to earn the next pay step in an AI-heavy team?
Ask for a bigger slice of ownership, not just more tickets. Own one feature from planning through release, make tradeoff calls in writing, take on-call for a limited area, or run a release with support nearby. Those steps give managers real evidence for the next pay band.