Jan 23, 2026·8 min read

Monolith vs microservices for your first 20 engineers

Monolith vs microservices for a team of 20: compare release pain, hiring costs, and failure modes so you pick the simpler setup now.

Why this choice matters before 20 engineers

The monolith vs microservices argument stops being abstract the moment your team starts paying for it in normal work. One bad release can take the product down. One new hire can spend months just learning how things fit together. One small feature can turn into a week of coordination.

Early on, almost any architecture feels fine. The product is still young, the team is small, and the founders still know most of the code. Later, the same decision gets expensive. Architecture shapes how people debug, how they release, and what kind of engineers you need to hire.

Most teams feel that cost in three places: outages, hiring, and releases. Outages show how failures start, how far they spread, and how hard they are to trace. Hiring shows how much experience you need and how long onboarding takes. Releases show whether one change moves quickly or waits on several teams, repos, and deployment steps.

Team stage matters more than trend-following. A team of 3 engineers has very different problems from a team of 12, even if both want the same future scale. Small teams usually lose more time to coordination than to raw system limits. They do not need architecture that looks impressive on a diagram. They need architecture they can still understand at 2 a.m. during an incident.

For teams under 10 engineers, the biggest risk is often self-inflicted complexity. A setup with many services can create network issues, version drift, duplicate tooling, and fuzzy ownership long before traffic demands it. A monolith has its own problems, but at that size it is often easier to read, test, and fix.

Closer to 20 engineers, the pressure changes. More people edit the same code. Release conflicts rise. One team can block another. Domain boundaries start to matter more. At that point, the question is no longer about fashion. It is about whether the current setup still matches the way the team works.

If your team is somewhere between 3 and 20 engineers, do not look for one perfect answer that stays right forever. Look for the tradeoff that fits now, then revisit it as the team grows.

When a monolith fits best

A monolith usually works best when the product still changes every week and the team is small enough to talk through most problems together. In that stage, teams often get more from less complexity. One codebase makes cause and effect easier to trace.

If checkout breaks after a pricing change, engineers can inspect one repo, one set of logs, and one release. That sounds simple because it is. Early teams need that direct line between change and outcome.

It also makes hiring easier. A new engineer learns one app, one deploy path, and one test flow. They can fix bugs and ship features sooner because they do not need to understand service boundaries, API contracts, queue failures, and cross-service auth on day one.

A single database helps more than many teams expect. Reports are easier to write. Transactions are easier to reason about. When someone asks, "Which users paid and then canceled within 30 days?" the answer often comes from one query, not a chain of sync jobs and half-matched data copies.
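As a rough illustration, assume a hypothetical schema with a `payments` table and a `cancellations` table in the same database. The table names, column names, and sample rows below are invented for this sketch. With everything in one place, the question above can be a single join:

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (user_id INTEGER, paid_at TEXT);
    CREATE TABLE cancellations (user_id INTEGER, canceled_at TEXT);
    INSERT INTO payments VALUES
        (1, '2026-01-01'), (2, '2026-01-01'), (3, '2026-01-05');
    INSERT INTO cancellations VALUES
        (1, '2026-01-20'), (2, '2026-03-15');
""")

# Users who paid and then canceled within 30 days of that payment.
QUERY = """
    SELECT DISTINCT p.user_id
    FROM payments AS p
    JOIN cancellations AS c ON c.user_id = p.user_id
    WHERE julianday(c.canceled_at) - julianday(p.paid_at) BETWEEN 0 AND 30
    ORDER BY p.user_id
"""


def paid_then_canceled_within_30_days(connection):
    """Return the matching user ids from one query against one database."""
    return [row[0] for row in connection.execute(QUERY)]


print(paid_then_canceled_within_30_days(conn))
```

If payments and cancellations lived in two services with two databases, the same answer would need an export, a sync job, or an API that joins the data in application code.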

Shipping speed is usually the strongest reason to stay with a monolith. Most early products do not slow down because the code lives together. They slow down when each small change picks up extra steps. Every extra service adds plumbing work: network timeouts, retries, tracing, deployment rules, and version mismatches.

A monolith is often the right fit when one team still owns most features, product changes are frequent, data moves across many features in the same user flow, and a single application can handle traffic with ordinary scaling.

Picture a 7-person SaaS team building billing, admin tools, and reporting. In a monolith, they can change one shared model, run one test pipeline, and deploy once. Put those same features into four services and the team starts spending part of the week updating contracts, fixing integration bugs, and checking whether one release broke another.

For the first stretch of growth, boring architecture is often the smart move.

When microservices start to make sense

Microservices earn their keep when one part of the product starts living a different life from the rest. Team size alone is a weak reason to split. If the whole app still changes together, ships together, and breaks together, a monolith is usually simpler.

A split starts helping when one area moves much faster than the rest. Maybe the core product changes every week, while billing hardly changes at all. Or maybe the main app is calm, but a background job that processes files or AI tasks changes constantly. In cases like that, forcing both parts into one release cycle creates friction you do not need.

Signs the boundary is real

A service boundary works best when the business already has a clear domain, not when the team is guessing. Payments, search, video processing, and identity often work because they have their own rules, risks, and owners.

Ownership matters just as much. If three teams still edit the same logic every week, splitting the code will not create clean ownership. It just moves the same mess onto the network.

Scaling can also justify a split, but only in a narrow case. If one workload spikes far above the rest, separate deployment can save money and reduce stress. A normal web app with a document-processing pipeline that handles ten times more jobs at month end is a good example. That pipeline may deserve its own service. The rest of the system may not.

The cost of splitting too early

Every new service adds network calls, API contracts, deploy steps, logs, alerts, and failure points. A bug that was once a function call can turn into a timeout, a retry loop, or mismatched data between services.
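A minimal sketch of that shift, with no real network involved. The function names, the exception class, and the retry policy are all assumptions made for illustration. The pricing logic stays the same, but once it sits behind a simulated remote call, the caller also has to own timeouts and retries:

```python
import time


class PricingTimeout(Exception):
    """Stand-in for a network timeout. No real network call happens here."""


def price_in_process(cart_total_cents, discount_percent):
    # In a monolith this is a plain function call: it returns or it raises.
    return cart_total_cents * (100 - discount_percent) // 100


def call_with_retries(remote_call, attempts=3, backoff_seconds=0.0):
    # Behind a network hop, the caller needs a retry policy as well.
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return remote_call()
        except PricingTimeout as error:
            last_error = error
            time.sleep(backoff_seconds * attempt)
    raise last_error


def make_flaky_remote(failures_before_success):
    # Simulates a remote pricing service that times out a few times first.
    state = {"calls": 0}

    def remote_call():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise PricingTimeout("pricing service timed out")
        return price_in_process(10_000, 15)

    return remote_call, state


remote, state = make_flaky_remote(failures_before_success=2)
print(call_with_retries(remote), "after", state["calls"], "calls")
```

The retry wrapper is the easy part. The harder questions are whether the remote operation is safe to repeat and what the user sees while the caller waits, and neither question exists for the in-process call.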

That is why the split should come after you see a real boundary, not before. If one domain has a different pace of change, clear ownership, and very different load, microservices can help. If not, you are probably buying complexity early.

How failures spread in each model

A monolith usually breaks in a loud, obvious way. One bad deploy, one memory leak, or one broken database migration can knock over a large part of the app at once. That hurts, but the upside is clarity. The team sees one incident, opens one codebase, and makes one rollback decision.

Monolith failures

With a monolith, the blast radius is often wide but easy to see. If login fails, the team can usually reproduce the problem on one machine, inspect one set of logs, and trace one request path from start to finish. For a small team, that matters more than people admit. Four engineers can swarm one visible outage and often restore service fast.

A simple example: a bad release changes how user sessions are stored. Suddenly checkout, admin pages, and account settings all start failing. That is ugly, but the fix is usually direct. Revert the release, restore the old behavior, and sort out the bug after the site is stable.

Microservices failures

Microservices tend to fail in chains. One service gets slow, which causes timeouts in another service, which backs up a queue, which creates stale data somewhere else. Users do not always see a clean outage. They see strange symptoms: duplicate emails, missing updates, spinning screens, partial checkouts, or reports that lag by 20 minutes.

That makes debugging much harder. Instead of one process and one log stream, the team may need to trace a request across five service hops, a message queue, and two versions of the same API. Version drift makes this worse. Service A expects one field. Service B stopped sending it last week. Now only some requests fail.
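Here is a small sketch of that drift. It assumes a hypothetical order payload where the producer's newer version dropped a `currency` field. The field names and the check itself are illustrative, not a real contract format. An explicit check at the boundary at least turns "only some requests fail" into a list you can read:

```python
# Two payload shapes in flight at once: the producer's old and new versions.
old_payload = {"order_id": 1, "total_cents": 1200, "currency": "USD"}
new_payload = {"order_id": 2, "total_cents": 900}  # "currency" was dropped

# What the consuming service still expects (hypothetical contract).
REQUIRED_FIELDS = ("order_id", "total_cents", "currency")


def missing_fields(payload):
    """Return the required fields that this payload does not carry."""
    return [field for field in REQUIRED_FIELDS if field not in payload]


def process_batch(payloads):
    """Split a batch into accepted order ids and rejected (id, missing) pairs."""
    accepted, rejected = [], []
    for payload in payloads:
        missing = missing_fields(payload)
        if missing:
            rejected.append((payload.get("order_id"), missing))
        else:
            accepted.append(payload["order_id"])
    return accepted, rejected


accepted, rejected = process_batch([old_payload, new_payload])
print("accepted:", accepted)
print("rejected:", rejected)
```

In a monolith the same change would usually surface as a failing test or a type error in one codebase, before any request saw it.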

This is where small teams often get surprised. A monolith can fail bigger, but microservices can fail in stranger ways. Strange failures eat time.

For teams under 20 engineers, one visible failure is often cheaper than five small failures hidden inside retries, caches, and async jobs. Big incidents are painful. Murky incidents are usually worse because they pull half the team into detective work before anyone can fix the real problem.

What hiring costs look like

Architecture changes hiring costs long before headcount reaches 20. The bill is not just salaries. It includes training time, on-call stress, slower interviews, and the number of people you need before work feels calm again.

A monolith usually lets you hire strong generalists first. That is cheaper in practice. One good engineer can fix a bug in checkout, add an admin screen, review a database change, and ship it in the same week. You may still want specialists later, but you can delay that need while the product is still changing fast.

Microservices change the shape of the job much earlier. You start looking for people who understand service boundaries, API contracts, deployments, logs, metrics, tracing, queues, and failure handling across services. That narrows the hiring pool. Interviews also get harder, because you are not only checking coding skill. You are checking whether someone can work safely in a distributed system.

The hidden costs show up quickly. You spend more time hiring infra-minded engineers. On-call gets heavier because each service can alert, fail, or deploy differently. New hires take longer to ramp up because they have to learn many repos and local setups. Feature work needs more team coordination when one change touches several services.

On-call load is where many teams feel the pain first. In a monolith, one deploy process and one alert path can cover a lot of ground. In microservices, each service tends to collect its own rules, dashboards, and edge cases. Even if each service is small, the total operational noise wears people down. Then you either pay with burnout or you pay by hiring earlier.

Training costs rise too. A new engineer can often run a monolith with one repo, one database, and a short setup guide. In a microservices setup, that same engineer may need five repos, local containers, test data, service credentials, and a mental map of who calls what. Something that takes one afternoon in a monolith can take a week to feel normal.

That is why many experienced CTOs keep the first system simpler than the org chart suggests. If you only have a handful of engineers, paying for broad problem-solvers usually beats paying for distributed-system overhead.

How releases change at 5, 10, and 20 engineers

Release pain changes shape as the team grows. The fix at five engineers is different from the fix at twenty.

With a 5-person team, one release train usually wins. Everyone still knows most of the code, changes touch the same areas, and a shared rhythm keeps surprises low. If you split into services too early, you often trade one release queue for five smaller ones. Now each release needs version checks, API coordination, and extra debugging when something breaks across boundaries.

Around 10 engineers, the monolith starts to feel heavier, but the pain usually starts with boring stuff. Tests take too long. Pull requests sit open for days. Two teams keep editing the same files. That is frustrating, but it still does not prove you need microservices. Many teams get real relief from faster CI, better test selection, clearer code ownership, fewer manual release steps, and smaller deploys.
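Test selection is the least obvious item in that list, so here is a minimal sketch. It assumes a hypothetical layout where code under `app/<area>/` is covered by `tests/<area>/`, and where a change to a shared area or to anything outside `app/` falls back to the full suite. The layout and the `SHARED_AREAS` set are assumptions, not a standard:

```python
from pathlib import PurePosixPath

# Assumption: changes in these areas can affect everything, so run all tests.
SHARED_AREAS = {"core"}


def select_test_dirs(changed_files, all_areas):
    """Pick test directories for a change, falling back to the full suite."""
    full_suite = sorted(f"tests/{area}" for area in all_areas)
    touched = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        known = len(parts) >= 2 and parts[0] == "app" and parts[1] in all_areas
        if not known or parts[1] in SHARED_AREAS:
            # Unknown or shared code: be safe and run everything.
            return full_suite
        touched.add(parts[1])
    return sorted(f"tests/{area}" for area in touched)


areas = {"billing", "reporting", "core"}
print(select_test_dirs(["app/billing/invoice.py"], areas))
print(select_test_dirs(["app/core/db.py"], areas))
```

A rule this simple only works if the folders reflect real dependencies. If billing code quietly imports reporting code, the selection will skip tests that should have run.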

By 20 engineers, ownership starts to matter more than pure code size. You can still ship from a monolith, but only if teams know which areas they own and can change them without waiting on three other groups.

This is where service boundaries can help, but only when they match real team boundaries. If one team owns billing, another owns auth, and another owns analytics, separate services may reduce release collisions. If the domains are still tangled, services just hide the same mess behind network calls.

Release pain is a clue, not a verdict. A bad release process can come from weak tests, slow builds, unclear ownership, or too many handoffs.

A small example makes the point. If a 10-person team waits two hours for tests and merges everything on Friday, the first fix is not service extraction. The first fix is faster feedback and smaller releases. If a 20-person team has three product areas with separate roadmaps and frequent cross-team blocking, then architecture may need to change with the org chart.

How to decide step by step

Most teams do better when they treat this as a sequence of small tests, not a one-time architecture bet. For a young product, the right direction usually gets clearer when you look at change, failure, and release pain in that order.

Start with the parts of the product your team touches every week. If billing, onboarding, search, or reporting changes in almost every sprint, write that down. Frequent change matters more than neat diagrams, because the busiest part of the codebase is where structure starts to help or hurt.

Then look at the last few incidents and slow releases. Use facts, not opinions. Which module breaks most often? Which area makes deploys take longer because tests run forever, migrations feel risky, or two teams keep stepping on each other?

A simple scorecard helps:

  1. List the product areas that changed in the last 6 to 8 weeks.
  2. Mark the ones that caused incidents, rollbacks, or long release waits.
  3. Check whether any one area needs very different scale, data rules, or uptime.
  4. If one boundary is obvious, split that single part first.
  5. Measure release speed, failure rate, and team friction before you split more.
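The first four steps can be sketched as a small script. The field names, the sample numbers, and the rule for picking a candidate are assumptions made for illustration. Your own scorecard may weigh incidents and slow releases differently:

```python
from dataclasses import dataclass


@dataclass
class AreaStats:
    name: str
    changes: int           # step 1: changes in the last 6 to 8 weeks
    incidents: int         # step 2: incidents and rollbacks
    slow_releases: int     # step 2: long release waits
    different_rules: bool  # step 3: different scale, data rules, or uptime


def pain(area):
    # Assumption: incidents and slow releases count equally.
    return area.incidents + area.slow_releases


def first_split_candidate(areas):
    """Step 4: return the single most painful area that lives by its own rules."""
    candidates = [
        area for area in areas
        if area.changes > 0 and pain(area) > 0 and area.different_rules
    ]
    if not candidates:
        return None  # no obvious boundary yet, so keep the app together
    return max(candidates, key=pain).name


# Made-up sample data.
areas = [
    AreaStats("billing", changes=9, incidents=1, slow_releases=2, different_rules=False),
    AreaStats("document_processing", changes=4, incidents=3, slow_releases=2, different_rules=True),
    AreaStats("admin", changes=2, incidents=0, slow_releases=0, different_rules=False),
]
print(first_split_candidate(areas))
```

The function returns one name or nothing on purpose. Step 5 still applies: measure the result before asking the scorecard for a second candidate.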

That third step matters a lot. A service boundary makes sense when one area really lives by different rules. Maybe your public API needs stricter uptime than an internal admin tool. Maybe file processing needs heavy compute while the rest of the app does not. Maybe customer data rules force one part of the system to be handled more carefully. If none of that is true, splitting early often buys you extra work, not relief.

Keep the first split small and boring. Do not carve the whole app into six services because it looks cleaner. One boundary, such as billing or document processing, is enough to learn whether separate deploys actually reduce pain.

Then watch the result for a few weeks. Did releases get easier? Did on-call improve? Did one team move faster without making debugging worse? If the answer is mixed, keep the rest together.

That is where many teams slip. They see one successful extraction and rush into a full breakup. Resist that urge. Keep most of the system in one place until the new shape proves itself through fewer incidents, simpler releases, and less team drag.

A realistic team example

Picture a small SaaS company with one web app for customers, a billing flow tied to subscriptions, an admin area for support, and a reporting screen that shows usage and revenue. The team starts with 5 engineers, then grows to 8.

At that size, one repo usually keeps life simpler. Everyone can run the product locally, change the UI and backend in one branch, and ship a fix without opening four pull requests in four services. When a customer reports a billing bug, one engineer can trace the request from the web app to the payment code and the database in the same codebase. That saves time every week.

This is where the debate stops being theory. With 8 engineers, the main pain is usually delivery speed, not service isolation. A single app with clear folders, code owners, and a decent test suite often beats a pile of small services with separate pipelines and hard-to-debug network calls.
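One way to keep those folders honest is a small import check in CI. The sketch below assumes hypothetical top-level packages named `billing`, `reporting`, and `core`, plus a hand-written table of which package may import which. Both are invented for this example:

```python
import ast

# Hypothetical rule set: which internal packages each area may import.
ALLOWED = {
    "billing": {"billing", "core"},
    "reporting": {"reporting", "core"},
}


def boundary_violations(area, source, internal_areas):
    """Return internal packages that `source` imports but `area` may not use."""
    violations = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in internal_areas and top_level not in ALLOWED.get(area, set()):
                violations.add(top_level)
    return sorted(violations)


sample = "import billing.models\nfrom core import db\nimport json\n"
print(boundary_violations("reporting", sample, {"billing", "reporting", "core"}))
```

The check ignores relative imports and anything imported dynamically, so treat it as a tripwire and not as proof that the boundary is clean.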

Now move the same company to 15 engineers. The product has more customers. Reports take longer to generate. Nightly jobs import data, build summaries, and send account emails. Those reporting and nightly jobs are often the first clean split.

Reporting and background jobs can move out before the rest of the system. They have different load patterns, and if a reporting job runs slowly, you do not want it slowing down the customer-facing app. A separate worker or reporting service can make sense here because the boundary is real: different runtime needs, different failure patterns, and fewer daily code changes touching both sides.

Auth, billing, and the core product usually stay together longer. They change together all the time. A pricing update may touch permissions, plan limits, checkout logic, and account screens in one release. Split those parts too early and one small product change turns into coordination work across teams.

A good first move is boring on purpose: keep one repo, keep one main app, and split the noisy parts only when they have their own workload and their own release rhythm.

Mistakes that create pain fast

Early teams often copy the diagram of a much larger company. That looks smart on a whiteboard, but it creates work before the product even settles. You end up with extra deploys, more config, more alerts, and more places for simple bugs to hide.

A startup with six engineers does not need the same service map as a company with six hundred. If one feature changes every two weeks, keep the code close together until the shape of the product stops moving so much.

One mistake creates a lot of release pain: teams split services along today's org chart instead of along stable domain boundaries in the code. A "growth service" or "mobile backend" may match today's team names, but those names change fast. The user flows people depend on change more slowly. If one signup flow touches four services because four teams own pieces of it, every release turns into coordination.

Another common mess starts with async messaging too early. Queues and events can help later, but they make debugging harder on day one. If your tests are weak, logs are hard to search, and rollbacks take guesswork, async flows will hide failures instead of isolating them.

The usual warning signs are easy to miss. Teams add events before they can trace one request from start to finish. They treat a slow deploy as proof that the monolith is wrong. They split a service, launch it, and leave ownership fuzzy. They keep adding services even when one database change still touches them all.

Slow deploys often come from boring problems. Builds do more work than each change needs. Migrations lock tables. Test suites sprawl. CI jobs wait in line. Fix those first. Oleg Sotnikov has shown in production that careful CI/CD and lean infrastructure choices can remove a lot of release pain without a full rewrite.

Ownership matters even more after a split. Each service needs one team or one person who watches errors, updates dependencies, and knows the rollback steps. If nobody owns it, the service turns into a small abandoned monolith, except now it can fail over the network too.

Quick checks and next steps

Start with what your team can do this week, not with a diagram you hope will make sense later. The better option is often the one your current team can run without stress.

Before you change anything, ask a few plain questions. Can one team deploy safely every day without a long freeze, manual checklist, or late-night babysitting? Can a new engineer get the app running, make a small change, and ship it within a day? Can you trace one customer request from start to finish without guessing which service failed? Can you name one architecture change for the next quarter that solves a real pain instead of a future fear?

If two or three answers are "no," stop talking about a bigger rewrite. Fix the sharp edge in front of you. Maybe the release process is the problem, not the monolith. Maybe logs are weak, local setup takes four hours, or only one person understands deploys.

A good next step is small and specific. Move one noisy background job out of the app. Add request tracing across the existing codebase. Clean up module boundaries inside the monolith. Those moves teach you more than a big migration plan.
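Request tracing inside a monolith can start very small. The sketch below uses only the standard library: a `contextvars` variable holds a request id, and a logging filter stamps it onto every log line. The logger name, the handler functions, and the id format are assumptions made for illustration:

```python
import contextvars
import io
import logging
import uuid

# Holds the current request id; "-" means "outside any request".
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    def filter(self, record):
        # Attach the current request id to every record that passes through.
        record.request_id = request_id_var.get()
        return True


def build_logger(stream):
    logger = logging.getLogger("app.trace_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    return logger


def charge_card(logger):
    # Deeper code logs normally and never passes the request id around.
    logger.info("charging card")


def handle_checkout(logger, request_id=None):
    token = request_id_var.set(request_id or uuid.uuid4().hex)
    try:
        logger.info("checkout started")
        charge_card(logger)
        logger.info("checkout finished")
    finally:
        request_id_var.reset(token)


stream = io.StringIO()
handle_checkout(build_logger(stream), request_id="req-42")
print(stream.getvalue(), end="")
```

Searching the logs for one request id then gives the full path of that request. If you later move a job out of the app, passing the same id along keeps the trail intact.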

New hires are a useful test. If a backend engineer needs a week to understand where code lives, how to run it, and how to debug one failed request, your architecture already costs too much. If they can ship a small fix on day one, you still have room before a split makes sense.

An outside review can save months. Someone who has seen both startup chaos and production scale can usually tell whether you have a real scaling problem or just a messy release process. Oleg Sotnikov does this kind of advisory work through oleg.is, especially for startups and smaller teams that need a practical second opinion before making a costly architecture change.

Write down one change, one owner, and one deadline for the next quarter. Then measure the result. Did deploys get safer? Did onboarding get faster? Did incidents get easier to debug? Those answers matter more than the label you put on the architecture.

Frequently Asked Questions

Should a team under 10 engineers start with a monolith?

Usually, yes. If your team has fewer than 10 engineers and the product still changes every week, a monolith often gives you faster shipping, easier debugging, and simpler onboarding. You can still keep clean module boundaries inside it so you do not trap yourself later.

When do microservices start making sense?

Split when one part of the product starts living by different rules. Good first candidates include billing, file processing, reporting, or background jobs that have their own load, release rhythm, and owner. If the whole app still changes together, keep it together.

Does a monolith make outages worse?

A monolith often fails all at once. That hurts, but your team can usually trace one codebase, roll back one release, and restore service faster than they could untangle several partial failures across services. For a small team, obvious pain often costs less than murky pain.

Why do microservices feel slower for small teams?

Because every service adds network calls, contracts, deploy steps, logs, alerts, and version drift. A change that once stayed inside one app now crosses service boundaries and creates more places for timeouts, stale data, and coordination delays.

How does architecture affect hiring costs?

Hiring changes earlier than most teams expect. A monolith lets you hire strong generalists first and get them productive fast. Microservices push you toward engineers who already know tracing, queues, service contracts, and distributed failures, so hiring and ramp-up take longer.

What usually causes release pain around 10 engineers?

At that size, slow releases usually come from slow tests, large pull requests, weak ownership, and too many manual steps. Fix CI, test selection, deploy flow, and code boundaries first. Many teams get real relief there without splitting the system.

What should we split first if we decide to split anything?

Start with the noisiest isolated workload, not the most important feature. Background jobs, report generation, document processing, or AI task runners often make good first splits because they have different runtime needs and fewer daily changes from the main product flow.

Can we keep one repo and still prepare for growth?

Yes. One repo can work well for a long time if you keep modules clear, name owners, and avoid tangled dependencies. You do not need many services to prepare for growth; you need code that engineers can understand and change without stepping on each other.

How do I tell whether our problem is architecture or just a messy release process?

Look at the last 6 to 8 weeks of work. Check which areas changed most, caused incidents, or slowed releases, and ask whether one part truly needs different scale or uptime. If one boundary keeps showing up, split that part. If not, fix your release process and local setup first.

Should we get an outside review before a rewrite?

Yes, especially if your team keeps talking about a full rewrite. A second opinion often shows whether you have a real boundary to split or just weak CI, poor tracing, or fuzzy ownership. That kind of review can save months of churn and a lot of avoidable complexity.