Feb 23, 2026·7 min read

Aggregate invariants that stop double writes in apps

Aggregate invariants help teams stop double writes when money, inventory, or permissions change. Learn boundaries, steps, mistakes, and checks.

Why double writes happen

Most double-write bugs start with an ordinary action. A customer places an order, a support agent issues a refund, or an admin changes a user's access. The app then updates two records that need to stay in sync.

That sounds simple until those updates happen in different places. One path charges a card and another marks the invoice as paid. One path reduces stock and another creates a reservation. One path gives a user a role and another writes the permission log.

Teams usually scatter these checks by accident. A controller gets one guard because it was the fastest place to add it. Later, a background job gets a second guard for retries. Then a support, billing, or import script adds its own version. Each piece looks reasonable on its own, but they don't all follow the same rule.

The problem gets worse when people model around tables instead of rules. They see an orders table, an inventory table, and a payments table, so they update each one where it seems convenient. When a bug appears, they patch that path only. Another path still skips the check, runs twice, or saves one change without the other.

The cost is real. A duplicate charge means refunds, support work, and lost trust. Oversold stock means apologizing to buyers and fixing orders by hand. A bad permission update can block the right person or give access to the wrong one.

Aggregate invariants help because they keep scattered checks from becoming the system's real logic. If one user action changes money, inventory, or permissions, the rule for that change needs one home. Every controller, job, and script should call that same part of the model.

That is why aggregate boundaries matter. They are not there to make diagrams look neat. They are there so one action leads to one protected change instead of two writes that drift apart when the system is under pressure.

What an aggregate really protects

An invariant is a rule that must stay true every time data changes. If that rule breaks even for a moment, users get bad results. Money disappears, stock drops below zero, or someone keeps access after their role changed.

An aggregate protects that rule by giving it one owner. That owner is not a table, and it is not just a group of records with the same ID. It is the part of the model that decides, "this change is allowed" or "reject it."

Take a balance update. The rule might be simple: an account cannot spend more than it has, and every accepted payment must leave the balance correct. If one part of the app checks the balance, another writes a payment row, and a third updates the account total, those writes can drift apart. One aggregate should decide the payment and apply the change as one business action.

Stock counts work the same way. "Available stock cannot go below zero" is an invariant. An order, product, or reservation model may own that rule, but only one should. If checkout subtracts stock, a warehouse sync subtracts it again, and an admin tool edits the count on its own, the bug comes back fast.

Permissions are often worse because teams spread checks across screens, APIs, and jobs. Suppose a company removes a manager role. If the UI hides the approval button but the API still accepts approval requests, the rule is already broken. One aggregate should own the role change and the permission change together.
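
Here is a minimal Python sketch of that idea. It assumes a hypothetical `Membership` aggregate and a made-up role table, and none of these names come from a real library. Permissions are derived from roles inside the aggregate, so removing the role removes the permission in the same step.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from roles to the permissions they carry.
ROLE_PERMISSIONS = {
    "manager": {"approve_expense", "view_reports"},
    "member": {"view_reports"},
}


@dataclass
class Membership:
    """One user's access in one company; owns roles and permissions together."""
    user_id: str
    roles: set[str] = field(default_factory=set)

    def permissions(self) -> set[str]:
        # Permissions are computed from roles, so the two cannot drift apart.
        granted: set[str] = set()
        for role in self.roles:
            granted |= ROLE_PERMISSIONS.get(role, set())
        return granted

    def remove_role(self, role: str) -> None:
        self.roles.discard(role)

    def can(self, permission: str) -> bool:
        # The UI, the API, and background jobs would all ask this same method.
        return permission in self.permissions()


membership = Membership("user-1", roles={"manager", "member"})
before = membership.can("approve_expense")
membership.remove_role("manager")
after = membership.can("approve_expense")
```

If the API asks `can("approve_expense")` instead of keeping its own check, hiding the button and rejecting the request can no longer disagree.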

This is why aggregate boundaries should follow business rules, not database shape. One rule can touch several tables, and one table can support different rules. Tables store facts. Aggregate invariants keep those facts true when two users click at the same time, a retry fires, or a background job runs late.

When a rule has one owner, double writes have fewer places to hide.

Start with the rule, not the schema

Many double writes start when a team models tables before it names the rule that must never break. Rows and foreign keys tell you where data lives. They do not tell you which change must stay atomic.

Write the rule as one short sentence. Keep it plain enough that a product manager, developer, and tester would all read it the same way. Good rules sound like this: "An invoice can be paid only once" or "A user cannot grant a permission they do not have."

That sentence gives you a target. Without it, teams spread checks across handlers, jobs, and database triggers, and nobody owns the final yes or no.

A simple way to pin down aggregate invariants is to ask four blunt questions:

What single rule must never break?
Which state changes could break it?
Which action should approve or reject the change?
What is the smallest group of data that action must control?

The second question is where bugs usually show up. If the rule is "an invoice can be paid only once," the risky changes are not just the status field. The payment record, paid amount, receipt creation, and retry logic can all push the system into a bad state if they happen in pieces.

Then give one action the authority to decide. Not three services. Not one API check plus one background fix. One action. In code, that often becomes a method such as applyPayment() or grantRole(). It checks the rule against current state and either changes everything inside its boundary or changes nothing.
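
As a rough sketch, here is what such a method could look like in Python for the rule "an invoice can be paid only once." The `Invoice` class, its fields, the reason codes, and the exact-amount check are all assumptions made for illustration. `apply_payment` is the Python-style spelling of `applyPayment()`.

```python
from dataclasses import dataclass
from typing import Optional


class PaymentRejected(Exception):
    """Carries a short reason code such as 'already_paid'."""


@dataclass
class Invoice:
    invoice_id: str
    total_cents: int
    status: str = "open"
    payment_id: Optional[str] = None
    receipt_number: Optional[str] = None

    def apply_payment(self, payment_id: str, amount_cents: int) -> None:
        # Run every check first, so a rejection leaves the invoice untouched.
        if self.status == "paid":
            raise PaymentRejected("already_paid")
        if amount_cents != self.total_cents:
            raise PaymentRejected("amount_mismatch")
        # Status, payment reference, and receipt change together.
        self.status = "paid"
        self.payment_id = payment_id
        self.receipt_number = f"R-{self.invoice_id}"


invoice = Invoice("inv-1", total_cents=12_000)
invoice.apply_payment("pay-1", 12_000)

try:
    invoice.apply_payment("pay-2", 12_000)
    second_attempt = "accepted"
except PaymentRejected as rejection:
    second_attempt = str(rejection)
```

The second call fails with `already_paid` and changes nothing, which is the "everything or nothing" behavior described above.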

Keep that boundary small. If you need half the database to answer a simple yes or no, the model is too wide. If the action can protect the rule with one invoice, one order, or one permission set, keep the decision there. You can publish events after the decision, but the decision itself should live in one place.

This rule-first approach is usually less glamorous than adding more infrastructure, but it works: teams move faster when one clear action owns the rule. That is also the kind of architecture problem Oleg Sotnikov works on at oleg.is in his fractional CTO advisory: finding the business rule first, then shaping one safe write path around it.

How to choose the boundary

Draw the boundary around the rule that must stay true after every write. If an order must never reserve more stock than exists, the stock count and the reservation change belong in one place. If a payment must never charge twice, the check for a prior charge and the new charge record should live in the same write path.

That is where aggregate invariants help. They show which facts cannot drift apart, even for a moment. Start with that rule, not with tables, screens, or which fields happen to sit near each other in the database.

Many values sit near each other but do not need the same boundary. A product's stock number and a dashboard counter for "items sold today" may appear on the same admin page, yet they carry very different risk. Stock must stay exact. The dashboard can catch up a few seconds later.

Keep reports, search indexes, analytics counters, caches, and other read models outside the write boundary when a short delay does no harm. Teams often pull them inside because it feels tidy. Then one small write touches too much data, and the app slows down under load.
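
One way to keep a read model outside the write boundary is to let the aggregate record what happened and update the counter later. The sketch below assumes a hypothetical `StockItem` aggregate, a `SoldTodayCounter` read model, and a plain list standing in for an event queue.

```python
from dataclasses import dataclass, field


@dataclass
class StockItem:
    sku: str
    available: int
    pending_events: list[dict] = field(default_factory=list)

    def sell(self, quantity: int) -> bool:
        if quantity <= 0 or quantity > self.available:
            return False
        self.available -= quantity
        # Record the fact; read models pick it up outside the write path.
        self.pending_events.append(
            {"type": "sold", "sku": self.sku, "quantity": quantity}
        )
        return True


class SoldTodayCounter:
    """Read model: allowed to lag behind the aggregate."""

    def __init__(self) -> None:
        self.total = 0

    def handle(self, event: dict) -> None:
        if event["type"] == "sold":
            self.total += event["quantity"]


item = StockItem("sku-1", available=10)
counter = SoldTodayCounter()

item.sell(3)
stock_after_sale = item.available        # exact right away
counter_before_catch_up = counter.total  # still behind

# Later, a hypothetical dispatcher drains the events.
for event in item.pending_events:
    counter.handle(event)
item.pending_events.clear()
```

The stock number is correct the moment `sell` returns. The dashboard counter catches up when the events are processed, and nothing breaks in between.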

A simple test helps. Ask what happens if one part updates now and another updates later. Can you lose money? Can you sell stock you do not have? Can someone keep a permission they should lose right away? If the answer to all three is no, those pieces probably do not need to share one aggregate.

The opposite mistake hurts too. Teams build huge aggregates because "everything about the customer" sounds clean on paper. Then a profile edit, a billing change, and a role update all fight over the same boundary even though they do not protect the same rule.

Split data that can move later without causing damage. A shipping address can often update on its own. An invoice total and the payment that settles it usually should not.

Good aggregate boundaries are often smaller than the first draft. They protect one risky decision at a time. If a boundary keeps growing, ask which data must change together every single time, and move the rest out.

Model one safe change

Pick one action that can break things fast if two requests hit at the same time. Reserving stock is a clean example. If you have 3 units left, the app must never end up promising 4.

A good boundary for this rule is often one SKU in one warehouse. That aggregate owns the numbers that matter: available units, reserved units, and any rule about duplicate holds for the same order. You load that one aggregate, ask it to handle the request, and let it decide.

The flow should stay simple. The app receives a request to reserve 2 units for order 847. It loads the inventory aggregate for that SKU and warehouse. It calls one method such as reserve(orderId, quantity). That method either changes state and returns success, or rejects the request with a clear reason.

The method should check the business rule where the state lives. It can reject zero or negative quantities, block the same order from reserving twice, and stop any request that would push available stock below zero. That is the point of aggregate invariants: one place decides, and every caller gets the same answer.

If the aggregate accepts the change, save that new state once. Do not split the rule across a controller check, a stock update, and a second write somewhere else. That is how double writes creep back in. One decision, one state change, one save.

If the aggregate says no, keep the reason. Return a code like "out_of_stock" or "duplicate_reservation" and attach the request ID or order ID. Support can see why the action failed. Developers can search logs without guessing. Users get a clear message instead of a vague error.
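
The flow above can be sketched in a few lines of Python. Treat this as an illustration under assumptions: the `InventoryItem` and `ReserveResult` names, the field layout, and the `invalid_quantity` code are invented here. The other reason codes are the ones mentioned in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReserveResult:
    ok: bool
    code: str      # "reserved" or a rejection reason
    order_id: str  # kept so support and logs can trace the decision


@dataclass
class InventoryItem:
    """Hypothetical aggregate for one SKU in one warehouse."""
    sku: str
    warehouse: str
    available: int
    reservations: dict[str, int] = field(default_factory=dict)  # order_id -> qty

    @property
    def reserved(self) -> int:
        return sum(self.reservations.values())

    def reserve(self, order_id: str, quantity: int) -> ReserveResult:
        if quantity <= 0:
            return ReserveResult(False, "invalid_quantity", order_id)
        if order_id in self.reservations:
            return ReserveResult(False, "duplicate_reservation", order_id)
        if quantity > self.available:
            return ReserveResult(False, "out_of_stock", order_id)
        # All checks passed: change both numbers in one step.
        self.available -= quantity
        self.reservations[order_id] = quantity
        return ReserveResult(True, "reserved", order_id)


item = InventoryItem(sku="sku-42", warehouse="wh-1", available=3)
first = item.reserve("847", 2)     # accepted, 1 unit left
repeat = item.reserve("847", 1)    # same order again
too_many = item.reserve("848", 2)  # only 1 unit left
```

Every caller gets a result object with a code and the order ID, so a rejection is a normal answer rather than a vague error.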

One more detail matters in real apps. If two requests race, one save may fail because someone else updated the same aggregate first. That is normal. Reload it, run the same method again, and let the aggregate decide with the latest state.
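
A common way to detect that kind of race is a version number on the aggregate, often called optimistic concurrency. The sketch below uses an in-memory repository as a stand-in for a real store. `Stock`, `InMemoryStockRepo`, `reserve_with_retry`, and the `before_save` hook are hypothetical names, and the hook exists only to simulate a competing writer.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Optional


class VersionConflict(Exception):
    """Someone else saved the aggregate after we loaded it."""


@dataclass
class Stock:
    sku: str
    available: int
    version: int = 0

    def reserve(self, quantity: int) -> bool:
        if quantity <= 0 or quantity > self.available:
            return False
        self.available -= quantity
        return True


class InMemoryStockRepo:
    """Stand-in for a database that compares versions on save."""

    def __init__(self) -> None:
        self._rows: dict[str, Stock] = {}

    def add(self, stock: Stock) -> None:
        self._rows[stock.sku] = copy.copy(stock)

    def load(self, sku: str) -> Stock:
        return copy.copy(self._rows[sku])

    def save(self, stock: Stock) -> None:
        if self._rows[stock.sku].version != stock.version:
            raise VersionConflict(stock.sku)
        saved = copy.copy(stock)
        saved.version += 1
        self._rows[stock.sku] = saved


def reserve_with_retry(
    repo: InMemoryStockRepo,
    sku: str,
    quantity: int,
    attempts: int = 3,
    before_save: Optional[Callable[[], None]] = None,
) -> str:
    for _ in range(attempts):
        stock = repo.load(sku)
        if not stock.reserve(quantity):
            return "out_of_stock"
        if before_save is not None:  # simulation hook, runs once
            before_save()
            before_save = None
        try:
            repo.save(stock)
            return "reserved"
        except VersionConflict:
            continue  # reload and let the aggregate decide again
    return "conflict"


repo = InMemoryStockRepo()
repo.add(Stock("sku-1", available=3))


def competing_writer() -> None:
    other = repo.load("sku-1")
    other.reserve(2)
    repo.save(other)


outcome = reserve_with_retry(repo, "sku-1", 2, before_save=competing_writer)
```

The first save fails because the competing writer got there first. The retry reloads, sees 1 unit left, and rejects the request instead of overselling.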

What happens when two buyers race

Picture a store with 2 units left of one product. Two shoppers each put 2 units in their cart and click "Buy" within the same second. If the app checks stock in one place and writes the order in another, both requests can see "2 available" and both can create an order. Now the store has sold 4 units when it had only 2.
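
To make the failure concrete, here is a small Python simulation of the unsafe flow. It assumes each buyer wants 2 units, and it interleaves the two requests by hand so the result is the same on every run. The `stock_row` dictionary and both helper functions are illustrative only.

```python
# Shared row, as both requests would see it in the database.
stock_row = {"sku": "sku-1", "available": 2}
orders: list[dict] = []


def read_available() -> int:
    return stock_row["available"]


def write_order(buyer: str, quantity: int, seen_available: int) -> None:
    # The check uses the number read earlier, not the current one.
    if quantity <= seen_available:
        orders.append({"buyer": buyer, "quantity": quantity})
        stock_row["available"] -= quantity


# Both requests read before either one writes.
seen_by_a = read_available()
seen_by_b = read_available()
write_order("buyer-a", 2, seen_by_a)
write_order("buyer-b", 2, seen_by_b)

units_sold = sum(order["quantity"] for order in orders)
```

Both orders go through, `units_sold` is 4, and the stock count ends at -2. Nothing crashed, which is why this bug tends to surface in support tickets rather than in error logs.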

A safer model puts the rule inside the write path that accepts the sale. The aggregate should not trust a stock number that the page loaded five seconds ago. It should check the latest available quantity and reserve the items as part of the same decision.

Now the race is easier to reason about. Request one arrives and asks for 2 units. The aggregate sees 2 available, reserves all 2, and accepts the order. Request two arrives a moment later and asks for the same 2 units. The aggregate now sees 0 available and rejects the order with a clear "out_of_stock" result.

That second response matters. The app does not create a half-valid order. It does not charge first and apologize later. It says no before bad data gets into the system.

This is where aggregate invariants earn their keep. The rule is not scattered across the cart, checkout page, payment callback, and admin panel. One place decides whether the order can exist. Every path gets the same answer.

Support feels the difference fast. They do not have to explain why a customer paid for something that vanished. Finance sees it too. There is one source of truth for what got reserved, what got sold, and what never became an order.

A lot of teams try to fix this with extra checks after the fact. That usually leads to refunds, manual edits, and messy reports. A single rule at the boundary is less exciting, but it saves real pain when money and inventory move at the same time.

Mistakes that bring the bug back

Most teams fix a double-write bug once, then quietly add it back through a side path. The code looks clean in one place, but another path changes the same data without the same guard.

A common mistake is splitting one rule across request code and background jobs. The API checks whether an order can reserve stock, but a worker later adjusts the same stock count with different logic. Now the rule lives in two places, and they drift. One small mismatch is enough to oversell, refund twice, or grant access after it should be removed.

Another bug starts when two aggregates can change the same number. If one aggregate updates an account balance and another can also edit that balance, nobody owns the rule anymore. The same problem shows up with inventory counts and permission sets. Pick one place that decides, and make every write go through it.

Direct SQL updates cause more damage than teams expect. Someone needs a quick fix, runs an update in production, and skips the checks that the app normally applies. The data looks right for a moment, but the next write runs into state the model never expected. A one-off fix often creates a long cleanup.

Retries need rules too. If a payment request times out and the client sends it again, the system must know whether it already applied that change. Without idempotency, a harmless retry becomes a second charge or a second stock deduction.
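
A minimal sketch of idempotency, assuming the client sends the same key with every retry of one request. The `PaymentLedger` aggregate, the `idempotency_key` parameter, and the charge ID format are hypothetical. A real system would persist the stored results alongside the charge.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChargeResult:
    charge_id: str
    amount_cents: int


@dataclass
class PaymentLedger:
    """Hypothetical aggregate that remembers which requests it already applied."""
    charged_cents: int = 0
    results_by_key: dict[str, ChargeResult] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> ChargeResult:
        earlier = self.results_by_key.get(idempotency_key)
        if earlier is not None:
            # A retry: return the first result and change nothing.
            return earlier
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        result = ChargeResult(f"ch-{len(self.results_by_key) + 1}", amount_cents)
        # The charge and the memory of it are recorded together.
        self.charged_cents += amount_cents
        self.results_by_key[idempotency_key] = result
        return result


ledger = PaymentLedger()
first = ledger.charge("req-123", 4_500)
retry = ledger.charge("req-123", 4_500)  # client timed out and sent it again
```

The retry gets the same `ChargeResult` back, and the total charged stays at 4,500 cents.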

Stale reads are another trap. A dashboard might show "5 items left," but that number may already be old by the time a user clicks buy. If the write must stay exact, do not trust a read from a different moment and hope nothing changed.

A fast gut check helps. Can two code paths change the same balance, count, or permission? Can a retry repeat the same write? Can anyone bypass the model with direct SQL? Does the write depend on data read earlier from a stale view? When the answer is yes, the rule is still too easy to break.

Checks before you ship

A model can look safe in tests and still fail in production when one path slips past the rule. Before release, review every change that touches money, stock, or access with one concrete action in mind, such as "reserve 3 items" or "remove admin access."

Vague reviews miss bugs. A good rule fits in one plain sentence, and the code has one clear place where that rule can reject the change.

Start by writing the rule in one line. For example: "An order can reserve stock only if enough units remain." If two developers explain it differently, the model is still loose.

Then find the one path that can say no. The API, background job, import script, and admin tool should all hit the same rule. If one tool updates rows directly, it can bypass the guard.

After that, force the app into the situations that usually expose double writes. Run two requests at the same time from the same starting state. Retry the same command on purpose. Check whether any record can commit before the rule has finished deciding. If only one request should win but both succeed, or if a retry performs the action again, the boundary is still wrong.
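
The "two requests at the same time" check can be scripted. The sketch below uses Python threads and a barrier to release both requests together. The `Stock` class and `run_concurrently` helper are assumptions, and the lock stands in for whatever serializes writes in your real store, such as a row lock or a version check.

```python
import threading


class Stock:
    def __init__(self, available: int) -> None:
        self.available = available
        self._lock = threading.Lock()

    def reserve(self, quantity: int) -> str:
        # Check and change happen under one lock, so they cannot interleave.
        with self._lock:
            if quantity <= 0:
                return "invalid_quantity"
            if quantity > self.available:
                return "out_of_stock"
            self.available -= quantity
            return "reserved"


def run_concurrently(stock: Stock, quantity: int, workers: int) -> list[str]:
    barrier = threading.Barrier(workers)
    results: list[str] = []
    results_lock = threading.Lock()

    def worker() -> None:
        barrier.wait()  # release all threads together
        outcome = stock.reserve(quantity)
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


stock = Stock(available=2)
results = run_concurrently(stock, quantity=2, workers=2)
```

Exactly one request should come back as `reserved` and the other as `out_of_stock`, whichever thread gets there first. If both succeed in your own version of this test, the check and the write are still separate steps somewhere.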

This review does not need a long meeting. In many teams, 15 minutes with one engineer and one product person is enough to spot the weak point.

A blunt test works well: can someone on the team explain the rule and the failure case in one sentence? If not, the rule probably lives in scattered checks, comments, or team memory. That is where double writes come from.

Aggregate invariants should feel a little strict. If a change seems too easy to slip in through a side door, it probably is. Catch that before users do.

What to do next

Pick the flow that can hurt you this week. Start with the one that can lose money, oversell stock, or give the wrong person access. If a refund, stock reservation, or role change still touches two tables or two services with separate checks, start there.

Do not begin by moving classes around. Write down three things: the rule, the owner, and the change. For example: "stock can never go below zero," "the inventory aggregate owns that rule," and "every reservation must pass through that one write path." That small exercise clears up a lot of team confusion before anyone opens an editor.

Then fix one path all the way through. Teams often patch five entry points halfway, and the bug comes back through the sixth. A safer approach is boring on purpose: choose one command, put the invariant in one place, remove duplicate checks around it, and make every caller use that path.

Track what happens afterward. You should see fewer manual fixes, fewer "how did this happen" moments, and fewer cases where support has to untangle money, inventory, or permissions by hand. If those numbers do not drop, the boundary is probably still wrong.

If your team keeps getting stuck on aggregate boundaries, get a design review before you recode more paths. This is the kind of architecture work Oleg Sotnikov does through oleg.is as a fractional CTO and startup advisor. A few hours of clear modeling can save days of patching the same double writes again.

That is usually the first real win with aggregate invariants: fewer surprises, and one change path your team can trust.

Frequently Asked Questions

What is an aggregate invariant?

An aggregate invariant is a business rule that must stay true every time data changes. Think of rules like "an invoice gets paid only once" or "stock never drops below zero." You put that rule in one write path so every API call, job, and script follows the same decision.

How do aggregate invariants stop double writes?

Double writes happen when different parts of the app change the same data under different versions of the rule. An aggregate stops that by making one method check the rule and apply the change together. That cuts down the spots where two writes can drift apart.

How do I choose the right aggregate boundary?

Start with the rule, not the tables. Write one short sentence about what must never break, then ask which data must change together to keep that rule true. If you need more data than that, your boundary probably grew too wide.

Should reports and caches live inside the aggregate?

No. Keep read models like reports, search indexes, analytics counters, and caches outside the write boundary when a short delay causes no harm. Put only the data that must stay exact in the same write path.

What should happen when two buyers try to buy the last item?

The aggregate should check the latest stock when the write happens, not when the page loaded. If buyer one reserves the last units first, buyer two should get a clear rejection such as out_of_stock. That prevents half-finished orders and manual cleanup later.

Do I still need idempotency if I use aggregates?

Yes. Retries often create the same bug again if the app cannot tell whether it already applied the change. Add idempotency so the second attempt returns the earlier result instead of charging twice or deducting stock again.

Can controller checks or database triggers solve this on their own?

No. A controller check helps, but another path can skip it. Database triggers can catch some problems, but they often hide business rules in a place few people read. Put the rule in the model that owns the write, then make every caller use it.

How can I tell that my aggregate is too big?

You usually made it too big when unrelated changes fight over the same boundary. If a profile edit, a billing change, and a role update all block each other, split them unless they protect the same rule. Keep the boundary tight around one risky decision.

How should I test this before I ship?

Run two requests at the same time from the same starting state and see whether only one wins when the rule says only one should. Then retry the same command on purpose. If both writes succeed or the retry repeats the action, the rule still leaks through another path.

Where should I start in an app that already has double-write bugs?

Start with the flow that can hurt you fastest, like payments, stock reservations, refunds, or role changes. Write the rule in one sentence, name the owner, and move one command to a single safe write path. Fix one path fully before you touch the next one.