Nov 08, 2025·8 min read

One local command to prove a change and cut team rework

One local command to prove a change keeps lint, tests, and seed data in sync, so engineers, contractors, and AI helpers stop wasting time on rework.

Why handoffs break so often

Most handoffs fail in a dull, predictable way. One person says, "it works on my machine," pushes the branch, and moves on. The next person pulls it, runs the app, and hits a lint error, a broken test, or data that does not match the screenshots.

That usually starts with local drift. One laptop has an older lint rule cached. Another has a test database full of leftover records. A third machine still has sample data from last month, so a form shows options that no longer exist. The code might be fine. The environment tells a different story.

Lint rules cause more trouble than people expect. One developer saves files with one editor setup while a contractor uses another. The diff looks small, but the review fills with formatting noise, import order changes, and fixes that have nothing to do with the feature.

Tests fail for the same reason. Teams often assume everyone knows which services to start, which env file to copy, or which records to load before running the suite. Long-time team members carry that knowledge in their heads. Contractors and assistants do not. They follow the repo, miss one hidden step, and lose an hour guessing.

Seed data is often the worst part. If people create test records by hand, they all test against slightly different worlds. One person has an admin account with full access. Another has a half-broken account left over from an old migration. An assistant may generate code that passes against empty data but breaks as soon as real sample records appear.

The cost is easy to see. Review slows down. People leave comments for issues that a local check should have caught in two minutes. The original author makes a fix, pushes again, and the same cycle starts over. A simple feature can bounce between engineer, reviewer, contractor, and assistant three or four times before anyone talks about the actual product change.

A single proof command cuts through that mess. If everyone runs the same lint, test, and seed steps in the same order, the team spends less time arguing with the setup and more time working.

What one command should cover

A good proof command answers one plain question: "Can someone else pull this branch, run one thing, and trust the result?" If that answer changes from one machine to another, people start asking for screenshots, side notes, and manual checks.

The command should run the same checks in the same order every time. Start with the cheap failures and move to the slower ones. Run lint first because style errors and simple mistakes are fast to catch. Run unit tests next because they check logic without much setup. Run seeded app checks last because they show whether the app still works with known data.

That order matters. If lint fails, stop there. Nobody needs to wait for tests and app startup when a missing import or broken format already tells you the branch is not ready. Fail fast, print the error, and exit.

The output should stay short enough that a contractor, a founder, or a new engineer can read it without digging through noise. A small summary works better than a wall of logs. "lint: pass", "tests: pass", and "seed check: fail on checkout step" is usually enough. Save the long logs for verbose mode or CI artifacts.
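The ordering, fail-fast, and short-summary rules above can be sketched as a small runner. This is a minimal sketch, not a definitive implementation: the step names and the placeholder commands are assumptions, and a real repo would swap in its own lint, test, and seeded-check commands.

```python
import subprocess
import sys

# Hypothetical steps, cheapest first. Each command is a placeholder that only
# exits with 0 or 1; replace them with your repo's real commands.
STEPS = [
    ("lint", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("tests", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("seed check", [sys.executable, "-c", "raise SystemExit(1)"]),
]


def prove(steps):
    """Run steps in order, stop at the first failure, return (name, status) pairs."""
    results = []
    for name, cmd in steps:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        status = "pass" if proc.returncode == 0 else "fail"
        results.append((name, status))
        if status == "fail":
            break  # fail fast: slower steps after this one never start
    return results


def summary(results):
    """One short line per step that ran, for example 'lint: pass'."""
    return "\n".join(f"{name}: {status}" for name, status in results)


print(summary(prove(STEPS)))
```

With the placeholder steps, the summary reports lint and tests as passing and the seed check as failing. Captured output is kept out of the summary on purpose; a verbose flag could print `proc.stdout` and `proc.stderr` for the failing step.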

Seeded app checks belong in the command because unit tests do not catch every handoff problem. A page can pass tests and still break when the app starts with fresh data. One seeded run can catch missing migrations, bad defaults, broken permissions, or a form that only fails with real-looking records.

Use one name everywhere. If your laptop uses make prove, CI should use make prove too. If one person runs npm test, another runs a shell script, and CI runs something else, you now have three definitions of done. That is where rework starts.

Build the command step by step

Start with the part your team already trusts. In most codebases, that is lint. People may argue about test coverage or sample data, but they usually agree on formatting, imports, and basic code rules. Put that first. If the existing lint command works, keep it. Do not rebuild it just to make the new script look clever.

Then add tests, but only the ones people can run every day without dreading them. If the command takes 15 or 20 minutes, people will skip it and promise to run it later. Later often never comes. The daily set should finish in a few minutes. That is enough to build trust and still fit normal work.

Some checks need real records to mean anything. A page may load fine with an empty database, then fail the moment a user has a role, a plan, or a few linked records. Load fresh seed data before those checks run. Keep that data plain and predictable. One normal user, one admin, and one or two realistic records per case is usually enough to catch common breakage without turning seed files into a second app.

The script itself should stay simple. Run lint, then the fast test suite, then reset the database, load the same seed data, and run the seeded checks. Wrap those steps in one script instead of pasting three copied commands into chat or a task ticket. People forget flags. They skip setup. They run commands in the wrong order. One script removes that drift.

Name it plainly. check or verify is better than something cute. The goal is zero guessing.

Then test the script on a clean machine. Use a fresh laptop, a new dev container, or a wiped local setup. If the command fails there, it still depends on hidden state. That is the problem you want to find before someone else pulls your branch.

This kind of boring local proof is where a lot of team friction disappears. It is especially useful when engineers, contractors, and AI tools all touch the same feature.

Keep seed data boring and predictable

Seed data should help people verify a change, not turn every handoff into a small investigation. When a teammate, contractor, or assistant runs the same project setup, they should see the same users, the same orders, and the same test results.

Most seed problems come from trying to make fake data feel realistic with randomly generated values. That sounds nice, but it creates noise. If names, dates, IDs, or timestamps change on every run, people waste time asking whether the code changed or the data changed.

A good seed set is small and plain. Add only the records your checks need. If your local validation command runs lint, tests, and a quick app smoke check, seed only the data those steps depend on. A giant sample database slows setup and hides the few rows that actually matter.

Use fixed values every time. Pick one customer name, one account ID, one invoice date, and keep them stable. That makes screenshots easier to compare, test failures easier to read, and bug reports much clearer. "User 42 cannot log in" is far better than "some random user created this morning failed in my run."

The reset process matters just as much as the data itself. Every developer should rebuild the database the same way, with the same command and the same order of steps: drop, recreate, migrate, seed. No manual cleanup in the middle. No side note in chat about a table you need to clear first.

A simple rule set works well here. Create only the records your checks need. Hardcode names, dates, and IDs. Use the same reset script on every machine. Remove random generators from seed files. Keep timestamps fixed unless a test truly needs time movement.
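Those rules can be illustrated with a small reset script. This is a sketch under stated assumptions: it uses an in-memory SQLite database, a hypothetical `users` table, and made-up rows, and it folds the recreate and migrate steps into a single `CREATE TABLE` for brevity.

```python
import sqlite3

# Hypothetical fixed seed rows: hardcoded IDs, emails, roles, and timestamps.
# No random generators and no "now" timestamps.
SEED_USERS = [
    (1, "admin@example.test", "admin", "2025-01-01T00:00:00Z"),
    (42, "user@example.test", "member", "2025-01-01T00:00:00Z"),
]


def reset_and_seed(conn):
    """Drop, recreate, and seed in one fixed order, with no manual cleanup."""
    conn.execute("DROP TABLE IF EXISTS users")
    # Stand-in for the recreate and migrate steps of a real project.
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, email TEXT NOT NULL, "
        "role TEXT NOT NULL, created_at TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", SEED_USERS)
    conn.commit()


def snapshot(conn):
    """Return all rows in a stable order so two runs can be compared directly."""
    return conn.execute("SELECT * FROM users ORDER BY id").fetchall()


conn = sqlite3.connect(":memory:")
reset_and_seed(conn)
first_run = snapshot(conn)
reset_and_seed(conn)  # a second reset must rebuild the same world
assert snapshot(conn) == first_run
```

Because every value is fixed, the second reset produces exactly the same rows as the first, which is what makes screenshots and failure reports comparable across machines.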

Teams using AI coding tools feel this even more. If the seed changes on each run, the assistant may produce different diffs, different snapshots, or different guesses about what broke. Boring data keeps the conversation on the code.

Use the command in everyday handoffs

A handoff gets messy when the next person has to guess how to check the work. One proof command removes that guesswork. If the change is ready, the command passes on a clean machine with the expected seed data. If it fails, the work is still open.

Ask for the command output before review starts. That habit saves a surprising amount of time. A reviewer should not spend the first 20 minutes installing a missing package, repairing seed data, or finding out that tests only pass on one laptop.

Keep the rules the same for everyone. Contractors should get the same setup notes as full-time engineers, with the same command, the same seed steps, and the same expected result. Separate instructions create separate bugs.

A handoff note does not need much. Include the exact command to run, the last successful output or a short summary, any seed data changes that affect the feature, and anything the reviewer must add to local env settings.
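As one way to keep those notes consistent, the four items can be captured in a tiny structure and rendered the same way every time. This is only a sketch; the class name, field names, and sample values are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffNote:
    """The four items a handoff note needs. All names here are illustrative."""

    command: str
    result: str
    seed_changes: list[str] = field(default_factory=list)
    env_additions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the note as four plain lines, using 'none' for empty lists."""
        return "\n".join(
            [
                f"Command: {self.command}",
                f"Result: {self.result}",
                "Seed changes: " + ("; ".join(self.seed_changes) or "none"),
                "Env additions: " + ("; ".join(self.env_additions) or "none"),
            ]
        )


note = HandoffNote(
    command="make verify",
    result="lint: pass, tests: pass, seed check: pass",
    seed_changes=["customer 1 now has a VAT ID"],
)
print(note.render())
```

A fixed shape like this makes it obvious when something is missing, because an empty field shows up as "none" instead of silently disappearing.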

AI assistants need the same guardrails. If you use an assistant to write or edit code, make the task definition simple: the job is done only when the local validation command passes with current seed data. That keeps the assistant from stopping after it fixes the visible bug while leaving lint errors, broken tests, or stale fixtures behind.

Treat a failed run as incomplete work, not as a review task. Reviewers should judge the change, not repair the basics around it. If the command is red, send it back. That sounds strict at first, but it removes a lot of low-grade frustration from the handoff process.

It also keeps status honest. "Works for me" is vague. "Passes locally with seed v12 using make verify" is clear.

I would make this a team norm: no review starts without a passing run. Teams that follow it spend less time on environment issues and more time on the actual change.

A simple example from a feature change

A small form change can waste half a day if nobody checks it the same way. Picture a team adding a new "VAT ID" field to a customer billing form.

The engineer updates the form, adds the database column, and gets the screen working. On a quick manual check, the field appears, saves once, and looks done.

A contractor works on validation in the same branch. They tighten the rule so "VAT ID" must be present and at least 8 characters long. That sounds fine, but one old seed record still uses an empty value, so a fresh local setup now fails during seed.

An assistant adds a regression test for the new field. The test is close, but it includes an unused import and a variable name that breaks the team lint rule. The file looks harmless until someone runs the full checks.

Without a shared command, this turns into review ping-pong. The reviewer pulls the branch, hits a seed failure, asks the contractor to fix it, then notices the test does not pass lint, then finds the form test still fails because the engineer forgot to include vatId in the final submit payload.

With one command, the branch tells the truth before anyone opens review. A single local run such as ./prove-change or npm run prove can check the branch in the same order every time: lint first, then seed a clean database, then run the form test.

The output is blunt, which is exactly what you want. Lint fails on the assistant's unused import. Seed fails on the old empty "VAT ID" record. The test fails because the submitted payload still omits vatId.
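Two of those three failures can be reproduced in a few lines. The sketch below is hypothetical: the validation function, the seed records, and the payload are invented to mirror the story, and only the rule itself (present and at least 8 characters) comes from the example.

```python
MIN_VAT_ID_LENGTH = 8  # the contractor's rule from the example


def vat_id_errors(record):
    """Return a list of validation errors for one record's vatId field."""
    vat_id = record.get("vatId") or ""
    if not vat_id:
        return ["vatId is required"]
    if len(vat_id) < MIN_VAT_ID_LENGTH:
        return [f"vatId must be at least {MIN_VAT_ID_LENGTH} characters"]
    return []


# Hypothetical seed records; the second is the stale one with an empty value.
SEED_CUSTOMERS = [
    {"id": 1, "name": "Acme Ltd", "vatId": "GB123456789"},
    {"id": 2, "name": "Old Customer", "vatId": ""},
]

# Hypothetical submit payload that forgot the new field entirely.
SUBMIT_PAYLOAD = {"name": "Acme Ltd", "billingEmail": "billing@example.test"}

# Map each failing seed record's id to its errors.
seed_failures = {
    c["id"]: vat_id_errors(c) for c in SEED_CUSTOMERS if vat_id_errors(c)
}
payload_failures = vat_id_errors(SUBMIT_PAYLOAD)

print("seed:", seed_failures)
print("payload:", payload_failures)
```

The same rule flags both the stale seed record and the incomplete payload, which is why running the checks together surfaces both problems in one pass.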

Now each person fixes their own part before handoff. The assistant cleans the test file in two minutes. The contractor updates the seed record to match the new validation. The engineer adds the missing field to the request payload and reruns the command.

That is the real value of a single proof command. It does not make the team smarter. It removes guesswork, catches mismatched assumptions, and stops three small mistakes from landing on one reviewer at the same time.

Mistakes that waste time

Most teams do not lose hours on hard bugs. They lose them on avoidable friction. A local check that takes 20 minutes, floods the terminal, and still misses the real issue teaches people to skip it.

One common mistake is stuffing every end-to-end test into the default local run. Full browser flows matter, but they are a bad fit for the proof step people use all day. If a change to a form label or API field forces every engineer to wait for a giant suite, people stop trusting the process and start hand-waving results.

Seed scripts cause a similar mess when they try to recreate half the product. Teams often load dozens of users, old records, fake billing states, and random edge cases into local databases. Then the real failure hides under noise. A small feature change should not depend on a mountain of unrelated data just to show that it works.

A typical case is simple. A contractor updates a signup rule, the test fails, and everyone blames the code change. After 30 minutes, someone notices the seed script still creates a record for an outdated plan that no longer exists. The bug was in the setup, not in the feature.

Another slow drain is letting CI run a different command than local machines. If a developer runs test-local but CI runs extra setup, a different seed step, and another lint config, the team ends up with two definitions of done. "Works for me" turns into a daily argument.

The warning signs show up early. The local run takes longer than a short review. CI fails on steps nobody ran locally. The same test fails once, then passes on rerun. The output hides the first real error. People paste screenshots instead of sharing command results.

Flaky tests deserve less patience than many teams give them. A flaky test is a broken check. If people rerun until green and move on, they train everyone else to ignore failures too. Fix it, quarantine it for a short time, or remove it from the default path. Do not leave it there as background noise.
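One way to make flakiness visible instead of rerunning until green is to repeat a check a few times and label any mixed result. This is a sketch, not a feature of any particular test runner; the function name and the deterministic stand-in for a flaky test are assumptions.

```python
from itertools import cycle
from typing import Callable


def classify(check: Callable[[], bool], runs: int = 5) -> str:
    """Run a check several times and label it 'pass', 'fail', or 'flaky'."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes = {check() for _ in range(runs)}
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"  # mixed results: treat as a broken check


# Deterministic stand-in for a flaky test: passes, fails, passes, ...
_results = cycle([True, False])


def sometimes_passes() -> bool:
    return next(_results)


print(classify(lambda: True))
print(classify(sometimes_passes))
```

A check labeled "flaky" is a candidate for fixing, short-term quarantine, or removal from the default path, rather than another rerun.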

Noise in the script's output matters more than people think. Good output tells you what ran, what failed, and what to open next. That is enough. If your proof command prints 800 lines before the first useful clue, nobody will read it carefully.

The best local validation command is a little boring. It runs the same way on every machine, uses the same seed rules as CI, and fails fast when something breaks. Boring saves time.

Quick checks before you hand work over

A handoff should be boring. Someone pulls your branch, runs one command, and sees the same result you saw. If that does not happen, the next hour turns into screenshots, guesswork, and "can you try again?"

Start from a clean branch or a fresh checkout before you send anything over. That small reset catches a lot: forgotten generated files, local config drift, and changes that only pass because your machine still has old state lying around.

This matters even more when you rely on a single proof command. The command should work without your personal shortcuts. If you have to say "ignore that warning" or "run this extra script first," the handoff is not ready.

Give the seed step a quick check of its own. Do not assume it created what your change needs. Open the app or inspect the database and confirm that the exact records exist, especially if the work depends on a specific account type, feature flag, or sample project.

Keep that check simple. If a new settings screen needs one admin user and one team with billing turned on, make sure the seed creates exactly that. Vague seed data leads to strange failures later.
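That check can be as small as counting the records the change depends on. The sketch below assumes the seed rows are available as plain dictionaries; the field names and sample rows are hypothetical.

```python
def seed_problems(users, teams):
    """Return mismatches between the seeded rows and what the change needs."""
    problems = []
    admins = [u for u in users if u["role"] == "admin"]
    billing_teams = [t for t in teams if t["billing_enabled"]]
    if len(admins) != 1:
        problems.append(f"expected 1 admin user, found {len(admins)}")
    if len(billing_teams) != 1:
        problems.append(
            f"expected 1 team with billing on, found {len(billing_teams)}"
        )
    return problems


# Hypothetical rows as a seed step might leave them: the team exists,
# but billing was never switched on.
USERS = [{"id": 1, "role": "admin"}, {"id": 42, "role": "member"}]
TEAMS = [{"id": 7, "billing_enabled": False}]

print(seed_problems(USERS, TEAMS))
```

An empty list means the seed matches what the settings screen needs; anything else names the exact gap before a test has to discover it.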

When the command fails, read the first failure, not the last line. The last line often says only that the process exited with an error. The useful clue is usually higher up, where lint first complained, a test first broke, or the seed step first missed a required record.
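Finding the first failure can be automated with a simple scan. This is a rough sketch: the failure pattern and the sample log are assumptions, and real tools print different markers, so the pattern would need adjusting.

```python
import re

# Assumed failure markers; adjust to whatever your tools actually print.
FAILURE_PATTERN = re.compile(r"\b(error|failed|fail)\b", re.IGNORECASE)


def first_failure(log: str):
    """Return (line_number, text) for the first failure-looking line, or None."""
    for number, line in enumerate(log.splitlines(), start=1):
        if FAILURE_PATTERN.search(line):
            return number, line.strip()
    return None


# Hypothetical output from a proof command.
SAMPLE_LOG = """\
lint: pass
tests: running 42 cases
FAILED billing_summary: expected 'Total due' but got 'Amount due'
seed check: skipped
Process exited with an error
"""

print(first_failure(SAMPLE_LOG))
```

In the sample, the scan stops at the failing test on line 3 and never reaches the generic exit message on the last line.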

A quick mental sort helps: did setup break before the app started, did seed data create the wrong state, did lint catch a style or type issue, or did a test fail because behavior changed? That is usually enough to point you in the right direction.

Your handoff note should include the exact command you ran and the exact result. Paste the terminal output, or at least the short success or failure block, instead of writing "all good on my side." That gives the next person something concrete to compare.

A note like this works well: branch checked out from a clean state, command ran, seed created one demo workspace and one admin user, lint passed, one test failed in billing summary after the copy change. That is enough for another engineer, a contractor, or an assistant to pick up the work without a round of basic follow-up questions.

Five extra minutes here can save a full round of back-and-forth later. That trade is usually worth it.

What to do next

Pick one repo and make the command part of your normal definition of done. Do not start with every project at once. One team, one codebase, and one clear rule works better: if a change cannot pass the local check, it is not ready to hand off.

That rule matters most in mixed teams. Engineers, contractors, and AI assistants all work a little differently. A shared command closes that gap. When everyone runs the same lint, test, and seed steps, reviews get shorter and fewer fixes bounce back later.

Before you spread the workflow wider, trim the slow or shaky parts. A command that fails for random reasons gets abandoned fast, even if the idea is good. If one test flakes once a day, fix it or remove it. If seed data takes ten minutes to load, cut it down until people will actually run it.

A simple rollout is enough. Add the command to one active repo. Require it before merge or handoff. Remove unstable steps that waste time. Give one person ownership of the script and its docs. That last part matters more than it sounds. Tools change, models change, and package scripts drift. If nobody owns the command, it goes stale and people stop trusting it.

If you want outside help, keep the ask narrow. Ask for a review of the workflow, not a full rebuild on day one. This is the kind of practical engineering process work Oleg Sotnikov focuses on at oleg.is, especially for startups and small teams trying to tighten handoffs, reduce rework, and make AI-assisted development behave like part of the team instead of a source of extra noise.

The best next step is small and boring. Put one proof command in one repo this week. Run it on real feature work. Then count how many review comments, broken seeds, and "works on my machine" replies disappear after that.

Frequently Asked Questions

What should one proof command include?

Run lint, the fast test suite, and a seeded app check in one script. Add the database reset and seed step if your app needs real records to show the actual problem.

What order should the checks run in?

Start with the cheapest failure first. Run lint, then tests, then the seeded app check so people do not wait through slow steps when the branch already has a simple error.

How fast should the command be?

Keep the daily run short enough that people will actually use it. A few minutes is a good target for normal work; if it takes 15 or 20 minutes, people will skip it.

Why add seed data to the command?

Because unit tests miss setup problems. A seeded run catches broken migrations, bad defaults, missing permissions, and forms that only fail when real-looking data exists.

How much seed data do we need?

Use a small, fixed set of records that match the checks you run. One admin, one normal user, and a few stable records usually catch common breakage without turning setup into a project of its own.

Should local and CI use the same command?

Yes. Use the same command name and the same steps everywhere so local runs and CI mean the same thing. If they drift apart, the team ends up with two versions of done.

What if the command fails on a clean machine?

Treat that as a real bug in the setup, not as a weird one-off. Clean up hidden state, remove machine-specific assumptions, and keep testing until a fresh checkout runs the script the same way.

Should review start before the command passes?

No. Ask for a passing run first, then review the change itself. Reviewers should spend time on the product change, not on fixing missing packages, stale seed data, or skipped checks.

How should AI assistants fit into this workflow?

Give the assistant the same finish line you give a person: the task is done only when the proof command passes with current seed data. That keeps it from stopping after a partial fix.

How do we roll this out without making work slower?

Start with one active repo and one plain rule: no handoff without a passing run. Trim slow or flaky steps early, give one person ownership of the script, and adjust only after the team uses it on real changes.