Small software architecture you can still reason about
Small software architecture helps teams spot risk, cut service sprawl, and get better AI support because one person can still follow the whole system.

Why this gets hard fast
A system can feel simple on Monday and messy by Friday. The change often looks harmless: one new service for billing, another for email, a worker for background jobs, a separate store for search. Each piece solves a local problem. Together, they make the product harder to understand as a whole.
Every new service gives you one more place to check when something breaks. You read one log, then another. You compare timestamps. You try to work out whether the request failed, retried, or landed in a queue nobody watches closely. The bug may be small, but the search area gets wide very quickly.
One user action can also cross too many boundaries. A step like "create account" might touch the API, user database, auth, email worker, analytics, and billing setup. If one handoff fails, the user only sees "something went wrong." The team then has to trace the entire chain to find the real fault.
Small bugs hide in those handoffs. One service sends a field with the wrong name. Another accepts it but drops part of the data. A queue delivers events late, so the UI shows old status for a few minutes. No single part looks badly broken on its own, which makes the problem slower to spot.
After a while, nobody knows the full path anymore. One developer knows the API. Another knows the worker. Someone else set up the queue six months ago and has since left. Now nobody can explain, in plain language, what happens between a user's click and the final database write.
That is why architecture matters earlier than most teams expect. Complexity rarely arrives through one dramatic rewrite. It creeps in through extra boundaries, one tidy decision at a time.
What "small enough" looks like
"Small enough" does not mean tiny. It means one person can still hold the full shape of the system in their head without guessing. If a request comes in, they can trace it from the browser or API to the app, the database, and any background job without opening twenty diagrams.
In a healthy setup, request paths stay short. A user action should pass through a few obvious steps, not bounce across five services because the team wanted perfect separation too early. If a customer updates a subscription, you should know where that starts, where it gets validated, where it gets stored, and what side effects happen after.
Each module should have one plain job. Auth handles sign-in. Billing handles plans and invoices. Notifications send email or messages. Once a module starts reaching into three other areas, the design is already getting muddy.
Data also needs one home. If the same customer status lives in the app database, a cache, a reporting store, and a second service with its own copy, people stop trusting the system. Small systems keep the source of truth boring and obvious.
A quick paper test usually tells the story. Can you draw the full request flow in under a minute? Can you name the owner of each module in one short sentence? Can you point to the single place where each important fact is stored? Can you explain what fails first if one part goes down?
Failure should look plain on paper. If the database fails, writes stop. If the email worker fails, messages wait but the main product still works. If you cannot explain those paths in simple words, the system is already too spread out.
That is one reason Oleg Sotnikov often pushes teams toward leaner systems. When the shape stays compact, people see risk earlier and fix it before it turns into a late-night debugging session.
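The two failure paths described above can be sketched in a few lines. This is a minimal illustration, not a production design: the in-memory `users` dict stands in for the database, and `send_email`, `EmailDown`, and `pending_emails` are hypothetical names invented for the example.

```python
from collections import deque


class EmailDown(Exception):
    """Raised by the hypothetical email sender when delivery is unavailable."""


users = {}                # stands in for the user table
pending_emails = deque()  # messages wait here while the email worker is down


def send_email(address, email_up):
    # Simulated sender: fails when the email worker is marked as down.
    if not email_up:
        raise EmailDown(address)
    return f"sent welcome to {address}"


def signup(address, email_up=True):
    # The database write is the part that must succeed.
    users[address] = {"status": "active"}
    # The email is a side effect: on failure it waits, and signup still succeeds.
    try:
        send_email(address, email_up)
    except EmailDown:
        pending_emails.append(address)
    return users[address]
```

If you can point at the one line where the main write happens and the one place where delayed work waits, the failure path is still plain enough to explain.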
Signs you split too early
A split is too early when the new boundary creates more coordination than clarity. The diagram may look neat, but daily work gets slower. That gap shows up fast.
If two services change together almost every week, they are probably one thing with two deploys. A pricing rule, permission update, or signup tweak should not force the same pair of repos to move in lockstep. When it does, the boundary adds ceremony instead of reducing risk.
The same problem appears when one feature needs edits in several repos. A developer starts in the API, jumps to a worker, updates shared types, then fixes the admin app. Now most of the risk sits between repos, not inside the feature. A compact system keeps most changes close to one codebase or one tight module.
Queues are another giveaway. They make sense for long jobs like video processing, email delivery, or large imports. They do not fix a slow query or too many internal calls. If a team keeps adding queues to cover basic request delays, the system is getting more complicated than the problem.
Bad names also give the game away. Teams end up with folders and services called "core," "shared," "platform," or "common," then spend meetings arguing about where code belongs. Clear boundaries usually produce plain names because the job is obvious.
Local setup is a blunt test, and it rarely lies. If someone needs hours to run the product on a laptop, the split costs too much. The same goes for AI coding tools. A model can follow one repo and one test suite fairly well. Once a feature path jumps across many services, the tool spends more effort rebuilding context than helping.
You can usually spot the pattern without a long audit. The same change lands in multiple repos every week. One user feature triggers several deploys. Shared libraries keep growing because ownership is fuzzy. Developers stop running the whole product locally. Service names need long explanations in meetings. When those signs show up together, the system is wider than it needs to be.
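One rough way to check the "same change lands in multiple repos" sign is to count how often repos move together. The sketch below assumes you can export, for each feature or release, the set of repos it touched; the repo names and the sample data are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_rates(changes):
    """changes: list of sets, each holding the repos touched by one change.

    Returns {(repo_a, repo_b): share of changes in which both moved together},
    measured against the less frequently changed repo of the pair.
    """
    pair_counts = Counter()
    repo_counts = Counter()
    for repos in changes:
        repo_counts.update(repos)
        pair_counts.update(combinations(sorted(repos), 2))
    rates = {}
    for (a, b), together in pair_counts.items():
        rates[(a, b)] = together / min(repo_counts[a], repo_counts[b])
    return rates


# Hypothetical history: six changes and the repos each one touched.
weekly_changes = [
    {"api", "billing"},
    {"api", "billing", "admin"},
    {"api"},
    {"api", "billing"},
    {"admin"},
    {"admin"},
]
rates = co_change_rates(weekly_changes)
```

In this sample, every change to `billing` also touched `api`, so that pair scores 1.0. A pair that stays near 1.0 over many weeks is a candidate for "one thing with two deploys."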
For many SaaS products, one API, one worker, and clear modules inside each are enough for a long time. If the split makes the system harder to explain, harder to test, and harder to change, you probably did it too soon.
How to shrink the system
Start with the paths users take every day: signup, payment, report export, settings changes. Write each flow in plain language, then mark what data it reads and what data it changes. Very often, the same tables, jobs, and rules show up across several parts of the code.
That map tells you where the system got wider than it needed to be. If two services touch the same data, ship on the same schedule, and fail for the same reasons, they are often one thing pretending to be two. Merge them. Separate repos and separate deploys add mental load quickly.
A short pass is usually enough. Pick a few user flows that drive revenue or support tickets. Mark every read, write, and side effect. Look for rules copied in more than one place, and mark the parts that always release together.
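The map does not need a tool, but even a tiny script makes overlaps hard to miss. The sketch below assumes you record each flow as a list of (service, action, table) steps; the flows, services, and table names are hypothetical.

```python
from collections import defaultdict

# Hypothetical flow map: flow name -> (service, action, table) steps.
flows = {
    "signup": [
        ("users", "write", "accounts"),
        ("billing", "write", "subscriptions"),
        ("notifications", "read", "accounts"),
    ],
    "change_plan": [
        ("billing", "write", "subscriptions"),
        ("users", "write", "subscriptions"),
    ],
    "export_report": [
        ("reports", "read", "accounts"),
        ("reports", "read", "subscriptions"),
    ],
}


def writers_by_table(flow_map):
    # Collect every service that writes to each table, across all flows.
    writers = defaultdict(set)
    for steps in flow_map.values():
        for service, action, table in steps:
            if action == "write":
                writers[table].add(service)
    return writers


def contested_tables(flow_map):
    # Tables written by more than one service are the first places to look.
    return {
        table: sorted(services)
        for table, services in writers_by_table(flow_map).items()
        if len(services) > 1
    }
```

Here `contested_tables(flows)` reports that both `billing` and `users` write to `subscriptions`, which is the kind of overlap worth a closer look before deciding what to merge.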
Copied rules are a common source of service sprawl. Permission checks, plan limits, status changes, and pricing logic often end up repeated in handlers, workers, and admin tools. Pull those rules into one module and keep that module close to the data it owns. One clear place beats five almost-matching versions.
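As a sketch of what "one module" can mean for a plan limit, the example below keeps the rule in one place and lets an API handler and a worker both call it. The plan names, limits, and function names are assumptions made up for illustration.

```python
# The single home for the plan rule (hypothetical plans and limits).
PLAN_LIMITS = {"free": 3, "pro": 50}


class PlanLimitExceeded(Exception):
    pass


def can_create_project(plan, current_projects):
    return current_projects < PLAN_LIMITS[plan]


def require_project_slot(plan, current_projects):
    if not can_create_project(plan, current_projects):
        raise PlanLimitExceeded(f"{plan} plan allows {PLAN_LIMITS[plan]} projects")


# Callers share the rule instead of copying it.
def api_create_project(account):
    require_project_slot(account["plan"], account["projects"])
    account["projects"] += 1
    return account


def worker_import_projects(account, incoming):
    # Imports as many projects as the plan allows and reports the count.
    imported = 0
    for _ in range(incoming):
        if not can_create_project(account["plan"], account["projects"]):
            break
        account["projects"] += 1
        imported += 1
    return imported
```

When the limit changes, only `PLAN_LIMITS` changes, and the handler and the worker cannot drift apart.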
For most teams, one deployable unit is enough for a long time. Keep separate modules inside the codebase, but ship them together unless load or security forces a split. A real reason looks like this: one area needs very different scale, handles sensitive data with stricter isolation, or must meet a hard compliance boundary. "It might be useful later" is not a reason.
A small architecture often looks less impressive on a diagram. It works better in production. If billing, webhooks, and account changes always move together, keep them in one app with tight modules. You cut network calls, config drift, and release mistakes. More importantly, one person can still hold the full shape in their head, and that is still one of the best tests for sane design.
A simple example from a growing product
A SaaS startup reached its first few thousand users and did what many teams do when growth picked up: it split the app into separate services for billing, users, notifications, and reports. The diagram looked clean. Daily work got harder.
A new signup seemed simple from the outside. A person created an account, picked a plan, got a welcome email, and expected to see the first report later. Under the hood, that one flow now depended on four deployments, several logs, and a queue or two. If one part slowed down, the failure showed up far away from the actual cause.
The billing service could charge the card while the user service failed to finish account creation. Or the account could exist, but the notification service could miss the welcome email after a bad deploy. Nothing here was dramatic. These were small, messy failures that kept on-call shifts noisy and incidents hard to reason about.
The team rolled back some of the split. They merged notifications back into the main app because email and in-app messages did not need their own deploy cycle, worker pool, and alerting. That change alone removed a surprising amount of overhead.
They still kept reports out of the request path, but they stopped treating reporting as a separate service. Report jobs ran behind an internal boundary inside the main system. The app could enqueue work, a single job runner could process it, and engineers had one place to retry failures and inspect what happened.
After that, signup depended on fewer moving parts. The code looked less "pure" on paper, but the system was easier to understand. On-call work dropped because the failure path got shorter.
That trade-off makes sense earlier than many teams expect. If one customer action crosses four services, you usually do not have a scaling win yet. You just have more places for a normal bug to hide.
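The report setup in the example, where the app enqueues work and a single job runner processes it, might look roughly like this. It is an in-memory sketch under stated assumptions: `JobRunner`, `flaky_report`, and the retry count are invented for illustration, and a real runner would persist jobs and wait between retries.

```python
from collections import deque


class JobRunner:
    """A single in-process runner: one queue, one retry policy, one failure list."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.queue = deque()
        self.failed = []  # one place to inspect what went wrong

    def enqueue(self, name, func, *args):
        self.queue.append({"name": name, "func": func, "args": args, "attempts": 0})

    def run_pending(self):
        results = []
        while self.queue:
            job = self.queue.popleft()
            job["attempts"] += 1
            try:
                results.append(job["func"](*job["args"]))
            except Exception as exc:
                job["last_error"] = repr(exc)
                if job["attempts"] < self.max_attempts:
                    self.queue.append(job)   # retry (no backoff in this sketch)
                else:
                    self.failed.append(job)  # give up and keep it for inspection
        return results


calls = {"n": 0}


def flaky_report(account_id):
    # Fails on the first call, succeeds on the second.
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("report store busy")
    return f"report for {account_id}"


runner = JobRunner()
runner.enqueue("monthly_report", flaky_report, 42)
done = runner.run_pending()
```

The point is not the runner itself but that retries and failures live in one object an engineer can open, instead of being spread across a separate service and its own alerting.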
Why AI tools work better on smaller systems
AI coding tools work best when the code fits into one mental picture. A single codebase with a few clear modules gives the model enough context to make changes that match the rest of the app. Split the same product across many repos and services, and the tool starts filling gaps with guesses.
That is where trouble starts. The model updates an API handler but misses the worker that uses the same payload. It changes one schema and leaves a second copy untouched in another repo. It writes a neat test for the happy path, but it cannot see the queue, cache, and background job that also shape the result.
Short call chains help more than people expect. If a request goes from controller to business logic to database, AI can usually follow it. If that same request jumps across a gateway, auth service, billing service, message bus, and notification worker, useful context gets thin fast.
You can feel the difference in review. When the system stays compact, the AI reads one repo instead of hunting across many. Module names point to real business actions. Shared types live in one place. Tests map cleanly to the code that changed. A reviewer can trace the edit in a few minutes.
Names matter a lot here. A module called billing/invoices gives both humans and AI a solid clue. A folder called common2 or helpers does not. Clear boundaries also make test generation simpler, because the tool can see what belongs inside the module and what should be mocked.
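To show what "inside the module" versus "should be mocked" can look like, here is a small sketch using Python's standard `unittest.mock`. The `create_invoice` function, its fields, and the `gateway.charge` method are hypothetical; the only assumption is that the payment gateway is the outside boundary.

```python
from unittest.mock import Mock


def create_invoice(customer_id, amount_cents, gateway):
    """Billing logic stays inside the module; the gateway is the outside boundary."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    charge_id = gateway.charge(customer_id, amount_cents)
    return {
        "customer_id": customer_id,
        "amount_cents": amount_cents,
        "charge_id": charge_id,
        "status": "paid",
    }


# In a test, only the boundary is replaced; the module's own rules run for real.
gateway = Mock()
gateway.charge.return_value = "ch_test_1"
invoice = create_invoice("cus_1", 1999, gateway)
gateway.charge.assert_called_once_with("cus_1", 1999)
```

With a boundary this explicit, a person or a tool writing tests has one obvious thing to mock and no reason to guess at the rest.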
Human review gets better too. When the shape of the system is tight, reviewers catch wrong edits sooner because they can still hold the flow in their head. The goal is not elegance. The goal is to make changes easier to verify before they reach production.
How smaller architecture exposes production risk
When a request moves through three parts, you can usually trace failure in a few minutes. When it jumps across twelve services, queues, caches, and webhooks, the same bug turns into a hunt. You stop asking "what broke" and start asking "which hop broke first."
That is why smaller architecture makes production risk easier to see. A simpler request path gives you a short list of places to inspect. Logs make more sense, retries are easier to spot, and one bad timeout does not disappear behind five other moving parts.
Checkout and signup make this obvious. In a compact system, the app writes the order, charges the card, and sends a confirmation. If something fails, you can follow the request from the app to the database and back. In a sprawling setup, the same flow might touch an API gateway, auth, cart, billing, an event bus, an email worker, and analytics. One slow dependency can make the whole thing look random.
Clear ownership matters just as much. Every important fact needs one home. If billing owns subscription status, the rest of the system should read it there instead of keeping half copies elsewhere. Teams get into trouble when several modules each think they own the same customer state. Then support sees one answer, finance sees another, and engineers waste hours deciding which record to trust.
Rollbacks stay simpler too. If one deploy changes a single app and one database schema, you can undo it quickly. If five services must stay on matching versions, rollback stops being a safety net and becomes another incident.
Alert noise drops for the same reason. Fewer services talking to each other means fewer connection errors, fewer retry storms, and fewer duplicate alerts for one underlying problem. A small system still fails, but it fails in ways people can understand.
During an incident, you want fast answers to four questions: which request failed first, which module owns the broken data, whether the last deploy caused it, and whether one rollback can stop the damage. If the architecture lets one person answer those without opening ten dashboards, the risk is probably visible instead of buried.
Mistakes that keep the system too big
A lot of teams do not grow into complexity. They import it. A founder sees how a huge company runs dozens of services, queues, and side systems, then copies that shape on day one. The result looks serious, but it gives a small product more moving parts than it can watch.
One common mistake is splitting by tech stack instead of by user flow. "The Python service does AI, the Node service does the API, the Go service does background jobs" sounds neat on a diagram. In practice, one feature now crosses three repos, three deploys, and three places to fail. Smaller systems work better when one user action maps to a small, clear slice of code.
Another trap is keeping old services after the reason for them disappears. Teams add a worker, cache layer, or reporting service during a rush, then never ask whether it still earns its place six months later. If traffic changed, the product changed, or the team shrank, that old split may now cost more than it saves.
Slow queries cause the same kind of overreaction. A slow page usually points to a missing index, a bad join, too much data loaded at once, or a report that should run in the background. Spinning out a new service before fixing the query is like buying a second kitchen because dinner took too long once.
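Checking for a missing index takes minutes, which is why it should come before any talk of a new service. The sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`; the `orders` table and index name are made up, and other databases have their own equivalent of this command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, QUERY, (7,))

# With an index on the filtered column, it can search instead of scan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (7,))
```

The exact plan text varies by SQLite version, but `plan_before` describes a table scan and `plan_after` names the new index. If the plan changes like this, the slow page was a query problem, not an architecture problem.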
There is a simple test. Ask, "If we removed this service next week, what problem would come back right away?" If nobody can answer in one sentence, the service is probably dead weight.
Shared libraries create hidden bloat too. They look tidy because code lives in one place, but they often blur ownership. Then one small change forces several people to coordinate, and nobody wants to touch the code. A tighter module inside the main app is often easier to change, test, and understand.
Teams usually notice the cost late. AI coding help gets worse when logic is spread across too many repos, and production risk gets harder to see when failures hop between systems. Smaller systems are not childish. They are easier to run well.
A quick check before you add another service
A new service looks neat on a diagram. In production, it means one more deploy, one more set of logs, and one more place where data can drift or fail. A split should solve a real problem now, not a vague fear about later.
Start with the boring option first. If you can keep the code in one app and draw a clean boundary inside it, that is often the better choice. A module with a strict interface is easier to test, easier to change, and much easier for one person to hold in their head.
Before you split something out, ask a few blunt questions. Is this a problem that a module inside the current app could not solve cleanly, even with a tighter boundary? Will the new service own data that other parts of the system should not read or write directly? Can one person explain the full failure path in under a minute, from user action to broken outcome? Can the team test and deploy it with the same pipeline they already use, without extra manual steps? Does the split remove a measured bottleneck such as scaling, security, or uptime trouble, instead of a guessed one?
If the answer is "no" to most of that, keep the code where it is. Extra services often hide weak boundaries instead of fixing them. They also create quiet costs: more alerts, more config, more secrets, and more ways for staging and production to drift apart.
A simple example helps. If a growing product starts doing heavy video processing and that work blocks web requests, a separate worker may make sense. It has its own runtime, queue, and scaling needs. If the team only wants "cleaner code," a well-shaped module inside the main app is usually enough.
That bias toward fewer moving parts saves time later. When something breaks at 2 a.m., the team can trace the path quickly and fix the right thing first.
What to do next
Pick one real user flow and trace it end to end this week. Keep it ordinary, not a rare admin path. Signup, first project creation, or checkout are good choices. Write down every service, queue, and database that flow touches.
That count tells you more than a diagram. If one simple action jumps across six or seven moving parts, you have more production risk than you need. It also shows where AI coding tools will struggle, because the logic lives in too many places at once.
Keep the review simple. Name the flow. List each service, queue, and database it touches. Note where ownership is unclear. Mark any step that can fail without an obvious alert.
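A plain text file is enough for this review, but if you prefer something you can keep next to the code, a small record like the one below works. It is only a sketch: the `FlowReview` class, its field names, and the sample values are hypothetical, and the default threshold of six simply mirrors the "six or seven moving parts" rule of thumb above.

```python
from dataclasses import dataclass, field


@dataclass
class FlowReview:
    name: str
    services: list = field(default_factory=list)
    queues: list = field(default_factory=list)
    databases: list = field(default_factory=list)
    unclear_ownership: list = field(default_factory=list)
    silent_failures: list = field(default_factory=list)  # steps with no obvious alert

    def moving_parts(self):
        return len(self.services) + len(self.queues) + len(self.databases)

    def too_wide(self, threshold=6):
        # Flags flows at or above the rule-of-thumb count of moving parts.
        return self.moving_parts() >= threshold


# Hypothetical review of a signup flow.
signup_review = FlowReview(
    name="signup",
    services=["api", "auth", "billing", "email-worker"],
    queues=["welcome-email"],
    databases=["users-db", "billing-db"],
    unclear_ownership=["subscription status"],
    silent_failures=["welcome-email enqueue"],
)
```

Here `signup_review.moving_parts()` is 7, so `signup_review.too_wide()` is true, and the two short lists at the end name where to start.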
Then choose one boundary to tighten or merge. Do not try to redraw the whole system in one pass. One clean change is usually enough to expose the next one.
A common example is a separate service that mostly passes requests to the main app and stores a small copy of the same data. If it has no clear reason to exist, fold it back in. If you must keep it, remove one extra queue, cache, or duplicate table around it.
After that, write ownership rules in plain language. Keep them short enough that a new engineer can read them in two minutes. "The app owns signup" and "Only the worker sends email" are better than abstract rules nobody remembers.
Aim for a system that one person can explain on one page without hand-waving. Then run the same user flow again after your change. If the path is shorter and the failure points are easier to name, the system is better.
If you want a second opinion, Oleg Sotnikov at oleg.is works with startups as a Fractional CTO on architecture cleanup and practical AI adoption. That kind of outside review is often most useful when the product still feels manageable, not after service sprawl has already become normal.
Frequently Asked Questions
What does "small enough" mean in software architecture?
It means one person can trace a normal user action from the app to the database and any background job without guessing. If signup, billing, or checkout needs a long tour across repos, queues, and dashboards, the system has grown too wide.
When should I split code into a new service?
Split only when you face a real problem now, like very different scaling needs, stricter isolation for sensitive data, or a hard compliance boundary. If you only want cleaner code, keep one deploy and make the module boundary tighter inside the app.
How can I tell if we split too early?
You will feel it in daily work. One feature starts touching several repos, the same two services ship together every week, local setup gets slow, and service names need long explanations in meetings.
Do queues fix slow requests?
Usually no. A queue helps with long jobs like email, video processing, or large imports, but it will not fix a bad query or too many internal calls. Fix the slow path first, then add a queue only if the work truly belongs outside the request.
How do I shrink an architecture without a full rewrite?
Start with one common user flow like signup or checkout and write every read, write, and side effect in plain language. Then merge the parts that always change together, pull repeated business rules into one module, and keep the data close to the code that owns it.
Why do AI coding tools work better on smaller systems?
AI tools do better when they can see one clear picture of the code. In a compact system, the model can follow the request path, shared types live in one place, and reviewers can check the change without hunting across several repos.
How many services does an early SaaS product usually need?
For many SaaS products, one API, one worker, and clear modules inside each can carry a lot more load than teams expect. You can stay there until traffic, security, or uptime needs force a real split.
What user flow should I review first?
Trace a boring, high-traffic path first, not an edge case. Signup, first project creation, payment, or report export will show you the real request path, the data owners, and the spots where failures hide.
How do I know if a service still earns its place?
Ask one blunt question: if you remove this service next week, what problem returns right away? If nobody can answer in one sentence, the service probably adds more overhead than value.
When should I ask for outside architecture help?
Bring in outside help when the team can no longer explain the full path of a normal request, incidents take too long to trace, or changes keep crossing too many repos. A practical review from someone like Oleg Sotnikov can help you cut extra boundaries before they turn into normal pain.