Jan 20, 2026·7 min read

AI migration script checklist for safer production changes

Use this AI migration script checklist to review locks, rollback steps, backfills, and edge cases before any model changes production tables.

Table of Contents

Why AI-written migrations break in production

A model can write SQL that looks clean and still freeze a busy table for 40 seconds. That is enough to pile up requests, block writes, and turn a normal deploy into an outage. The script may even work perfectly on a small test database, which makes the risk easy to miss.

The usual problem is simple: the model sees schema rules, not the full life of the app. It can suggest adding a column, changing a type, or filling missing values in one shot because the SQL is valid. On a live system, that same change can lock a table, force a full rewrite, or push storage and CPU hard enough to slow everything down.

Table size changes the risk more than many teams expect. A change that finishes in a second on 50,000 rows can take minutes on 50 million. Indexes make that gap even wider. So do foreign keys, triggers, and old rows with messy data. AI writes for the happy path. Production databases carry years of edge cases.

Traffic patterns matter just as much. A model does not know that your app gets a spike every morning at 9, or that one background job updates the same table all day. It usually will not ask which queries hit that table, how long transactions stay open, or whether workers retry on lock timeouts. Those details decide whether a migration is safe.

Backfill work adds another trap. AI may suggest updating every row in one statement. That can look tidy in review and still swamp live traffic, blow up replication lag, or keep locks open far too long. Small batches are less elegant on paper and much safer in real life.

Use AI as a fast draft writer, not as the final reviewer. Review first, test on realistic data, check lock risk, and write the rollback before anyone touches a production table.

What to gather before the model writes SQL

The checklist starts before any prompt. Models write better drafts when you give them facts instead of guesses. They cannot know whether a table has 20,000 rows or 200 million, whether writes spike every morning, or whether one old job still depends on a column you want to rename.

Put the basics in one note before anyone opens Claude, GPT, or another coding tool. Include the table and columns involved, the current row count, hourly read and write traffic, the code paths that still depend on the old schema, and whether you can deploy live or need a maintenance window. This takes a few minutes and often saves hours later.

Row count changes the whole plan. Adding a nullable column to a small table is often easy. Rewriting a large table, changing a type, or filling a new column from old data can turn into a long operation with locks, lag, or timeouts.

Traffic patterns matter just as much. A change that looks harmless at midnight can hurt badly at 10 a.m. if the app writes to that table all the time. Pull a real hourly view from database metrics or logs. If your busiest hour handles 30 times more writes than your quietest one, the model needs that context.

Code dependencies are where teams get surprised. A migration does not only touch the database. It can break background workers, admin exports, analytics queries, mobile app versions, and one-off scripts nobody remembers until they fail. Write down the exact code paths that depend on the old table or column names.

Decide on the maintenance window early, not after the SQL is already written. If the change needs a short pause in writes, say that up front. If the app must stay fully online, tell the model to produce a phased, backward-compatible plan.

A step-by-step review flow

Start with one plain sentence that defines the change. If the team cannot say it clearly, the model will guess. "Add a nullable customer_note column to orders and fill it for new rows only" is clear. "Update orders for better notes support" is not.

That sentence becomes the test for every later choice. It keeps the review focused and stops the model from mixing schema work, data fixes, and cleanup into one risky script.

Use this review flow before any SQL gets near production:

Ask the model for at least two safe approaches. One option might add a new column and backfill later. Another might create a new table shape and move traffic in stages. A single query often hides bad trade-offs.
Read every statement and ask what it locks, for how long, and on which table. Even a small ALTER TABLE can pause writes if the engine needs a stronger lock than expected.
Split the plan into separate steps. Change the schema first. Run any data backfill in batches later. Clean up old columns, indexes, or defaults only after the app proves the new path works.
Make sure each step has a clear stop point. If step two fails, the team should know whether to pause, retry, or roll back without making the next step harder.
Run the full plan on a recent copy of production data. Old staging data lies. Recent data shows row counts, nulls, odd strings, duplicate values, and slow queries the model never saw.

A small example helps. Say you need a new non-null status column on a large orders table. The unsafe version adds the column with a default, rewrites the whole table, updates every row, and drops old logic in one go. The safer version adds it as nullable, deploys app code that writes both columns, backfills in batches, verifies counts, and only then tightens the constraint.

That is the point of this checklist: let the model suggest options, then have a human cut the plan into calm, testable pieces.

Check lock risk first

Before reading the SQL line by line, ask what locks the change will take and how long they might stay in place. A migration can look harmless and still block writes, stall reads, or pile up waiting queries behind one slow statement.

AI often gets the syntax right but misses the runtime cost. That matters most on busy tables. If the model suggests an ALTER TABLE change, do not assume it is cheap just because the statement is short.

Start with the lock level. Some changes take an exclusive lock, even if only for a moment. On a quiet table, that may pass unnoticed. On a table that handles constant inserts or updates, a brief exclusive lock can turn into a real outage because sessions queue up fast.

Then check for work that rewrites the whole table. Adding a column with a heavy default, changing a data type, or forcing row-by-row updates can touch every record. Long index builds can cause similar pain if the SQL does not use the online or concurrent option your database supports.

Before approving the plan, answer five basic questions: does any statement block reads, writes, or both; will the database rewrite the full table; how large is the table right now; how many writes hit it each minute; and which step will run the longest?

That last question matters more than teams think. If one step may run for 20 minutes on production data, plan around that number, not the 5-second steps before it. Check row count, index size, and recent write volume. If staging data is close to production size, time the migration there and add a safety margin.

If the risk still looks high, change the plan instead of hoping production will be fine. Split the migration into smaller steps, create indexes separately, or move the change to lower-traffic hours. Nighttime is not always quiet, so use actual traffic patterns, not guesses.

This part is boring, but it prevents the worst kind of failure: a migration that technically works while the app locks up around it.

Write rollback steps before deployment

Review Your Migration Plan

Get a senior check on locks, backfill steps, and rollback before deployment.

Book Review

A rollback plan is part of the change, not something you add after the SQL works. Production failures rarely happen in a clean, all-or-nothing way.

Start by splitting each action into two groups: reversible and one-way. Adding a nullable column or creating a new table usually has a simple reverse step. Dropping a column, rewriting data in place, or shrinking a type often does not. When a step is one-way, say that clearly and name the fallback, whether that means restoring a backup or switching the app back to the old code path.

Write the reverse steps before the forward steps. It sounds strict, but it saves time. If you cannot explain how to undo phase two in a few clear lines, phase two is probably too large for one deployment.

A workable rollback plan should say where you can stop safely between phases, which app version works with both the old and new schema, which command or script undoes each reversible step, and what the team will do if the migration only partly finishes.

Partial success is common. A table change may finish while the app deploy fails. A new column may exist while the data copy stops halfway through. Plan for that middle state on purpose. Keep old reads working until the new path is proven. Keep old writes, or dual writes, until you confirm the new data is complete.

Clear stop points matter more than clever SQL. A simple phased plan is easier to reverse: add the new structure, deploy code that can handle both versions, move data, switch reads, then remove the old path later. Each phase needs a checkpoint where the team can pause, verify, and either continue or back out.

This is where AI often gets overconfident. A model can write valid forward SQL and still miss the human question that matters most: "If this stops halfway, what do we do in the next 10 minutes?" If the script does not answer that, it is not ready for production.

Plan data backfill without hurting live traffic

Backfill work is where many safe-looking migrations turn into production pain. The schema change may finish fast, but one giant update on a busy table can chew through CPU, fill write-ahead logs, slow replicas, and block normal queries.

Tell the model to avoid all-at-once updates unless the table is tiny and quiet. On live systems, small batches are usually safer because you can stop, inspect, and resume without turning a bad guess into an outage.

Keep the backfill small and observable

A good backfill job moves through rows in a predictable order, usually by primary key or creation time. It updates a limited batch, commits, waits a moment if needed, and then continues.

That gives you a few numbers worth watching while traffic stays live: rows updated per batch, query time for each batch, replica lag, lock waits or deadlocks, and the app error rate.

Pick a batch size you can explain, not one the model invented because it sounded reasonable. Start small, measure the real effect, and increase only if the database stays healthy.

Late writes matter too. While the script copies old rows, the app may still create new ones. If the new column or table keeps changing during the backfill, plan for dual writes, a second catch-up pass, or both.

A simple example with an orders table

Put Guardrails Around AI

Get practical help using AI for code and infra without handing over production decisions.

Book Advisory

Say you want to add a new billing_country field to an orders table. An AI tool may jump straight to NOT NULL because the app "needs" the field. That is where production trouble starts. Old rows do not have a value yet, and a fast-looking change can fail or block writes.

A safer version starts with a nullable column. That keeps the table usable while you move data in stages.

ALTER TABLE orders ADD COLUMN billing_country text NULL;

Now the app has room to change without breaking old data. New orders can start writing billing_country, while old orders still work.

If you already store similar data somewhere else, such as shipping_country, backfill in small batches. Do not try to update millions of rows in one shot. Small batches cut lock time, reduce load, and make it easier to stop if something looks wrong.

A simple batch approach looks like this:

UPDATE orders
SET billing_country = shipping_country
WHERE billing_country IS NULL
  AND id > 100000
  AND id <= 101000;

Run that pattern over ranges until you finish. Watch error rates, query time, and write latency while it runs. If traffic spikes, pause and continue later.

The app change matters just as much as the SQL. For a while, the app should write both the old source field and the new billing_country field. That overlap period gives you a safety net. If a job misses some old rows, new orders still stay correct.

Before you add a constraint, check the data instead of assuming the backfill worked. Count rows where billing_country is still null. Look for odd cases too, like blank strings, test orders, imported records, or canceled orders that skipped the normal flow.

When the column is clean and the app has written the new field long enough to prove it works, then add the constraint.

ALTER TABLE orders
ALTER COLUMN billing_country SET NOT NULL;

Slow, boring migrations usually beat clever ones when production tables are involved.

Mistakes teams make with AI-generated migrations

Review AI Migration SQL

Check real production constraints before one clean query turns into downtime.

Review SQL

Teams get burned when they treat model output like finished work. The SQL may look clean, but production databases have habits that a generic prompt will miss. PostgreSQL, MySQL, and managed cloud setups all behave differently when you add columns, rebuild indexes, change defaults, or touch large tables.

One common mistake is trusting generated SQL without checking database-specific details. A model might suggest a statement that works on a test instance but causes locking pain in the real setup. Small differences matter, like how indexes build, how replicas catch up, or how long a metadata lock lasts under load.

Another mistake is packing a schema change and a data backfill into one migration. That looks neat in a pull request. It feels terrible in production. If one deploy both adds a column and updates millions of rows, you can slow writes, create replica lag, and keep transactions open longer than expected.

Teams also forget that production has traffic, long-running queries, and replicas. A migration that finishes in 12 seconds on staging can drag on for minutes when an old transaction blocks it. During that time, queue workers pile up, app requests wait, and read replicas drift behind.

Rollback plans are often too optimistic. After a migration changes data, rollback is rarely a fast reverse button. If you rewrite values, split columns, or drop old fields too early, the old state may be gone. You may need a forward fix, a restore, or manual cleanup instead.

Tiny test datasets hide ugly surprises. Ten thousand rows will not show the same behavior as ten million. They also miss edge cases like nulls in old records, duplicate values, broken foreign keys, or old app versions still writing the previous shape.

A short review catches most of this. Check that the SQL matches your exact database engine and version. Split schema changes from heavy backfills. Ask what happens to replicas and long transactions. Assume data-changing rollbacks will be slow or incomplete. Test with realistic volume, not toy data.

Bad migration scripts rarely look reckless. They look reasonable right up to the moment they hit a busy table.

Quick checks and next steps

A migration can look safe in staging and still freeze writes in production for 30 seconds. That is enough to cause timeouts, queue buildup, and a long night for the team.

Use this checklist as a stop sign, not a formality. If any answer feels vague, slow down and fix the plan before the model touches live tables.

Ask how long the change can block reads or writes. Adding an index, changing a column type, or rewriting a large table can hold locks longer than expected.
Check whether you can stop after each step without damage. Good migrations have clean pause points, so you can deploy part one, watch the system, and continue only if it stays healthy.
Test on data that looks like production, not a tiny sample. A query that finishes in 2 seconds on 10,000 rows may run for 20 minutes on 80 million.
Look for ugly data before you run anything. Nulls, duplicates, broken defaults, and late writes often break the script, or worse, leave half-correct data behind.
Make sure rollback is real. If step three fails, the team should know exactly what to undo, what to keep, and how to confirm the database is consistent again.

One small habit helps a lot: write down the expected row count, runtime, lock risk, and fallback step next to each SQL statement. That forces clear thinking and makes review much faster.

Human review still matters, even with a strong model. AI can draft SQL fast, but it cannot judge production risk the way an experienced engineer can.

If your team wants a second review before a production run, Oleg Sotnikov at oleg.is does this kind of work as a Fractional CTO and advisor. A short review before deployment is usually cheaper than fixing a bad migration after users start seeing errors.

Frequently Asked Questions

Why does an AI migration work in staging but fail in production?

Because staging hides the hard parts. Production has bigger tables, live traffic, long transactions, old data, and replicas. A query that finishes fast on a small test set can lock a busy table or hammer the server on real data.

What should I tell the model before it writes SQL?

Give it facts, not guesses. Include row count, read and write volume, table and column names, database engine and version, code paths that still use the old schema, and whether you allow a maintenance window. With that context, the draft usually gets much closer to a plan you can review.

Should I put schema changes and backfill in one migration?

No. Split them. Change the schema first, then backfill in batches, then remove old parts after the app proves the new path works. That keeps each step smaller and easier to stop or reverse.

How do I check lock risk quickly?

Ask a simple question for every statement: what does it lock, and for how long? Then check table size, write rate, and whether the change rewrites the whole table or builds an index. If one step might run for minutes, change the plan before deploy.

When do I need a maintenance window?

Choose one when the app cannot stay compatible during the change or when a step blocks writes for too long. Decide early. If the app must stay online, use a phased plan with backward-compatible code and small backfill jobs.

What makes a rollback plan actually usable?

Write the rollback before the forward steps. Name the stop point after each phase, the app version that works with both schemas, the exact reverse command for reversible steps, and the fallback for one-way changes. If you cannot explain the next 10 minutes after a failure, the plan is too large.

How should I backfill a large table?

Run small batches in a steady order, usually by primary key or time range. Commit each batch, watch query time, replica lag, lock waits, and app errors, then continue only if the database stays healthy. Giant updates look neat in review and hurt real systems.

When is it safe to add a NOT NULL constraint?

Add the new column as nullable first and let the app write it for new rows. Backfill old rows, check for nulls and bad values, and wait until the app handles the new field without trouble. Then add NOT NULL as a separate step.

Which edge cases do teams miss before a migration?

Look for nulls, blank strings, duplicates, broken foreign keys, odd old records, and late writes that arrive during the backfill. Also check background jobs, exports, analytics queries, and old app versions. Those forgotten paths break migrations more often than the SQL does.

Is AI still useful for production migrations?

Yes, as a draft tool. It helps you generate options and rough SQL fast, but a human still needs to review locks, traffic, rollback, and data shape. If your team wants a second pair of eyes before production, Oleg Sotnikov reviews this kind of change as a Fractional CTO and advisor.