Aug 20, 2025·8 min read

Docker Compose to Kubernetes migration with clear rules

Docker Compose to Kubernetes migration works better when teams set ownership, deploy rules, and rollback paths before they split live environments.

Table of Contents

Why this gets messy fast

A Docker Compose to Kubernetes migration gets confusing the moment both runtimes stay live at the same time. One team deploys a fix to Kubernetes, another restarts the old Compose stack, and now nobody knows which version users are hitting. The confusion usually starts before anyone notices.

The biggest problem is simple: the same service ends up running in two places. An API might handle public traffic in Kubernetes while the old Compose copy still runs a cron job, webhook, or internal admin task. Workers can process the same queue twice. Scheduled jobs can fire twice. A database migration can run from the wrong place and break both environments in one shot.

Shared dependencies make it worse. Secrets change in one runtime but not the other. One team updates environment variables in Compose while another edits ConfigMaps or Helm values in Kubernetes. Someone points the app to a new Redis instance, but an older container still talks to the old one. The system looks "mostly fine" right up until those small mismatches turn into a real outage.

This happens fast. A small team might move the public API to Kubernetes first, leave workers in Compose, and keep the same database behind both. Then a hotfix goes only to Kubernetes. The API starts writing data in a new format, but the old worker in Compose still expects the old format. Jobs fail, retries pile up, and the team burns hours trying to figure out whether the problem is in code, config, or traffic routing.

Outages usually start at the edges of the split. Health checks hit one version while real users hit another. Logs live in two places. Rollbacks drag on because nobody agrees which environment actually owns the service.

Each service needs one clear home during the split. One runtime owns it. One deploy path updates it. One rollback path restores it. If a service still depends on the other environment for a while, treat that as a temporary exception and write it down.

Decide who owns each service

Most migration problems are not about YAML. They are about people.

If two teams both think they own a service, nobody makes the hard call when a deploy fails at 6 p.m. Shared ownership sounds cooperative, but during a split it usually turns into waiting, guessing, and half-made decisions.

Before anything moves, assign one named owner to every service. A simple ownership sheet is enough. For each service, record the service name, one direct owner, who approves config changes, who leads incident response, and who can approve a move between Compose and Kubernetes.

Keep the rule strict: one service, one owner. Other people can review or help, but one person decides. That removes the usual back-and-forth when a service needs a port change, a secret update, or a quick rollback.

Config approval needs its own line because it often belongs to someone else. A backend lead may own the API, while an operations lead approves memory limits, secrets, and ingress changes. If you do not write that down, small edits turn into waiting games.

Incident ownership matters just as much. When a service fails, the team should know who joins first, who checks logs and metrics, and who can pause a rollout. During parallel environments, part of the stack still runs in Compose while another part runs in Kubernetes. Someone still has to own the outcome, even when the root cause sits on the other side.

The move itself also needs a gatekeeper. Do not let anyone shift a service just because the manifest looks ready. The person approving the move should check dependencies, traffic paths, secrets, health checks, and rollback steps.

A small team can do this without making it bureaucratic. One startup might assign the payment API to the backend lead, database config to the CTO, incident handling to the on-call engineer, and cut-line approval to one person only. It sounds strict, but it prevents drift.

Keep this ownership list in one shared document and update it every time something changes. Old assumptions break migrations more often than bad manifests.

Set deploy rules before the split

Parallel environments go bad when teams ship the same change in two places.

During a Docker Compose to Kubernetes migration, drift usually starts with tiny decisions. One team adds a flag in Compose, another changes a secret in Kubernetes, and after a week nobody can say which setup is correct.

Pick where new work goes before you move anything. One side should be the default home for new changes. Many teams send all new features to Kubernetes and limit Compose to bug fixes for services that still live there. That rule feels restrictive, but it cuts release mistakes and stops people from rebuilding the same behavior twice.

Environment variables also need one source of truth. Keep one approved list with variable names, defaults, and owners. Use that list to feed both environments. If a setting exists only in a Compose file or only in a chart, fix it right away. Small naming differences like REDIS_URL on one side and CACHE_URL on the other waste hours.

Be explicit about dual deploys too. Do not let people push to both sides because it feels safer. Limit that to a few named cases:

a planned traffic cutover
a data migration window
a rollback drill
an emergency fix after a failed release

Everything else should deploy only to the runtime that owns the service.

Release approval needs one home as well. Choose one place where a release becomes official, and stick to it. That might be a GitLab pipeline, a release ticket, or one CI job with a manual approval step. If one person approves Compose changes in chat while another approves Kubernetes in a dashboard, the process is already split.

Say the API now runs in Kubernetes but the worker still runs in Compose. The team should still approve both through the same release flow, use the same variable list, and record one rollback decision. Separate runtimes are hard enough. Separate rules make them worse.

Mark the shared parts first

Most migration trouble starts outside the containers.

The app moves, but the database, queue, cache, secrets, logs, and scheduled jobs often stay where they are for a while. If you do not map those shared parts early, both environments will change the same thing and nobody will notice until production breaks.

Start with the pieces that both sides still touch. That usually means the main database, Redis or another cache, message queues, secret values, and any shared storage. Write down which service reads from each one, which service writes to it, and who can change its settings. One owner per shared resource is a good rule.

Take a common example. The API still runs in Docker Compose, but the background worker moves to Kubernetes. Both use the same Postgres database and the same Redis instance. If the worker deploy adds a schema change or clears cache entries in a new way, the API can fail even though nobody touched the Compose stack. Schema changes, cache rules, and secret rotation need named owners before the split starts.

Do not skip the less obvious items. Shared files cause quiet failures all the time. A report generator may still write to a mounted folder that only the Compose host can see. A nightly cleanup job may run from cron on one server while a new Kubernetes Job tries to do the same work again. That is how teams end up with duplicate emails, deleted files, or missing data.

Keep a short record for each shared part:

what it is and where it lives
which services depend on it
who owns changes during the overlap
whether cron jobs or shared file paths touch it
which logs and metrics should match on both sides

Logs and metrics need the same care. If Compose sends logs one way and Kubernetes sends them another, you lose the full story during an incident. Use the same service names, alert rules, and basic labels where you can. That makes rollback much easier because both environments report the same signals.

Move services in a fixed order

Stop Drift Early

Get help setting one deploy path, one config source, and one owner per service.

Get Help

A split migration usually fails when teams move too much at once.

Pick one low-risk service first and keep the scope tight. Good first candidates are a reporting worker, an internal admin API, or a cron job that does not handle logins, payments, or core writes.

Freeze unrelated changes around that service while you move it. If developers keep shipping features at the same time, you stop testing the migration and start testing random code changes instead. Even a simple Docker Compose to Kubernetes migration gets noisy when the app changes under your feet.

Move the service in staging before you touch production. Run the same image, the same config, and the same health checks there. If staging traffic is too small to show real issues, run a controlled load test that looks like normal use.

A simple order works well:

Move one service with low business risk.
Leave its dependencies where they are on the first pass.
Test in staging until logs and metrics stay calm.
Shift a small slice of production traffic.
Wait for a steady period, then retire the old route.

After cutover, watch three things closely: errors, latency, and data writes. Errors tell you when requests fail. Latency shows whether the new path slows down under load. Data writes matter because a service can look healthy while quietly writing bad records or writing twice.

Picture a startup moving its email notification worker from Compose to Kubernetes. That is much safer than moving auth first because delayed emails hurt less than locked-out users. The team freezes changes to templates and queue logic, tests the worker in staging, then sends only part of the production queue to Kubernetes. If delivery time and failed jobs stay normal for a few days, they can switch the rest.

Keep the old path alive until the new one has handled a full business cycle without surprises. That should mean normal traffic, normal write patterns, and no manual fixes. Only then should you remove the old deploy job and call the move done.

A simple split example

Imagine your /api/orders service runs in Docker Compose on one VM, and you want to move only that API to Kubernetes first. That is a sensible early split because the service is small, easy to test, and usually sits behind one public endpoint.

Keep the endpoint unchanged. Clients should use the same domain and path before, during, and after the move. Put the traffic split in your proxy or ingress, not inside the app. That keeps the change easy to control and easy to reverse.

Leave the database where it is for the first release. Let both versions of the API talk to the same database with the same schema, as long as the new code stays backward compatible. Moving the app and the database at the same time creates two failure points, and that is where migrations go sideways.

Start with a small traffic slice to Kubernetes. Five percent is enough to learn something real without risking the whole service. The old Compose version still handles most requests, so rollback stays simple.

Compare both sides with the same signals. Check error rates, response times, restarts, and request volume. Read the logs too. A service can look healthy on a dashboard while filling logs with timeouts or retry storms. If your team already uses tools like Grafana, Prometheus, or Sentry, put both versions on the same view so differences are obvious.

Raise traffic in steps only if the numbers stay close. Move from 5 percent to 25 percent, then 50 percent, then all traffic. If the Kubernetes version starts throwing more 500 errors or gets slower under load, send traffic back to Compose first and debug after. A rollback should be one routing change, not a late-night rebuild.

Turn off the old Compose service only after the Kubernetes version has handled full traffic for a few days without surprises. Check background jobs, health checks, and scheduled tasks before you shut the old one down. That final cut matters more than the first deploy.

Build rollback paths you can use

Get Fractional CTO Support

Bring in Oleg for migration planning, architecture calls, and steady technical leadership.

Book CTO

A rollback plan is useless if the team has to argue about it during an outage.

Before you move live traffic, write a short path back to the old setup. Put the switch-back steps in the exact order the team will follow. Name the person who can make the call, who changes traffic, who checks the app, and who watches logs and errors. If traffic moves through a load balancer, proxy, or DNS record, say which setting changes first and how long the change usually takes.

A practical rollback note often covers four actions:

send traffic back to the Compose environment
deploy the last known good image tag there
restore the previous config or secret set
confirm health checks, login, and one real user flow

Keep old images, env files, secret references, and Compose manifests in a place the on-call team can reach quickly. Do not leave them on one engineer's laptop or buried in a chat thread from last month. If access needs approval, fix that before cutover day.

Set a decision timer too. Pick a limit and stick to it. For example, if the error rate stays high for 15 minutes, or checkout fails in two consecutive checks, the team rolls back. Without a clear threshold, people wait, hope, and lose an hour while customers hit a broken system.

Practice one rollback before the first live cutover. Use staging if you have it, or run a controlled test with low-risk traffic. Start on Kubernetes, trigger a fake problem, and switch back to Compose while someone follows the written steps line by line.

That rehearsal usually exposes the real gaps. A secret is missing. An old image tag is gone. DNS takes longer than expected. It is much better to learn that on a quiet afternoon than during a release window.

Mistakes that create drift

The hardest part of a Docker Compose to Kubernetes migration is not the first move. It is the slow drift that starts after it.

One team updates the Compose file, another tweaks a Deployment, and a month later nobody trusts either environment. The service still works, so the mismatch stays hidden until release day. Then staging behaves one way, production behaves another, and rollback turns into guesswork.

Database changes do even more damage. Teams often change schemas before they move traffic because it feels faster. Usually it is not. If Kubernetes expects a new column but Compose still runs the old app, you now have two versions depending on one database in different ways. Even a small change, like making a field required, can break workers that still run on the old side.

Background jobs are another blind spot. Web traffic gets attention, but scheduled tasks, queues, and workers often get left in a gray area. If nobody writes down who owns a job, both environments may run it. That leads to duplicate emails, double billing, repeated imports, or two cleanup tasks fighting over the same records.

Early warning signs are usually obvious once you know what to watch for:

hotfixes land in only one config
secrets use different names or values
one side runs workers and the other side assumes it does
database migrations ship before traffic moves

Secrets drift breaks rollback faster than most teams expect. A rollback only works if the old environment can still start with the same credentials, endpoints, and feature flags it had before. If the Kubernetes side gets a new secret format, rotates a token, or points to a different bucket, the Compose side may come back up broken.

A small ownership file solves a lot of this. Keep one short record for every service: who can edit it, where jobs run, which secrets it uses, and which config is now read-only. If a change does not name one owner, it should not ship.

A short release checklist

Plan Your Cutover

Book a CTO review of service ownership, traffic shifts, and rollback before the split starts.

Book Review

Release day gets chaotic when one service still lives in Docker Compose and another already runs in Kubernetes. A short written check keeps the move boring, and boring is exactly what you want.

Use one shared checklist for every release during the migration:

Put one clear owner next to every service. Use a real name or team name, not "platform" or "backend" if that could mean three different people.
Pick one deploy rule for the release. For example, only Kubernetes gets new versions for migrated services, while Compose gets only emergency fixes for services that have not moved yet.
Confirm the target environment is healthy before you send traffic there. Check the app, its dependencies, and the health endpoint you actually use in production.
Verify logs, metrics, and alerts on the target side before cutover. If errors rise, the team should see them within minutes.
Keep rollback steps short enough to fit on one page. If the plan needs a long document and three approvals, nobody will follow it under pressure.

Say the API now runs in Kubernetes while the worker still runs in Compose. The API owner approves the release, the team deploys only through the Kubernetes pipeline, health checks pass, Grafana and Sentry show normal behavior, and the rollback note says who switches traffic back and how. That is enough structure to avoid a messy release.

Print the checklist, keep it in the release notes, or pin it in team chat. If people cannot find it in 10 seconds, they will guess.

Your next steps

Set the cut line before anyone touches production. Write down every service and assign one owner for the next 30 days: Docker Compose or Kubernetes. If both environments can deploy the same service, drift will start almost immediately.

Do this before the first broken release:

write the ownership list and keep it next to your release notes
move one small service first, such as a worker, cron job, or internal API
run that move in a normal release window and record the exact steps
test the rollback on the same day while the details are still fresh

Pick a service that will not hurt the business much if it misbehaves for a few minutes. A background worker is usually a better first move than your main web app. Avoid login, billing, and anything customers use directly until the team has one clean test behind it.

After the test, review what actually broke. Maybe one config value lived in two places. Maybe logs were easy to find in Compose but harder to trace in Kubernetes. Maybe rollback worked only because one engineer remembered an undocumented command. Those details matter more than a polished plan document.

Then tighten the rules. Remove duplicate deploy paths, shorten rollback steps, and decide who can approve changes while both environments run in parallel. A good Docker Compose to Kubernetes migration gets simpler after the first move, not messier.

Some teams can sort this out on their own. Others keep arguing about ownership, release order, or who makes the final call when risk goes up. In those cases, an outside technical lead can help. Oleg Sotnikov at oleg.is works as a Fractional CTO and startup advisor, and this kind of migration planning is a good fit for that role.

One small migration this week will teach your team more than another month of debate.