Jul 27, 2024·8 min read

Repo rules for AI code that cut repeat review fixes

Repo rules for AI code help teams turn repeat review comments into tests, linters, and templates that catch mistakes before pull requests pile up.

Why the same comments keep coming back

The most annoying review comments are not the hard ones. They are the ones your team has already written ten times this month.

A pull request opens, and the same notes show up again: rename this function, add the null check, move this helper out of the page file, write the missing test. None of these fixes are big on their own. Together, they waste hours and slow every merge.

The pattern is familiar. A handler gets a vague name like processData instead of something tied to the feature. A nullable field gets used without a guard because the happy path worked once. A branch adds tests for success but skips empty input or failed requests. A file lands in the wrong place because the repo never made the preferred layout obvious.

AI tools make this worse when the repo leaves too much room for guessing. The model sees what already exists, follows the easiest pattern, and fills gaps with whatever looks plausible. If your codebase accepts mixed naming, uneven test coverage, or random file structure, the tool will repeat that mess with a lot of confidence.

That is why one-off fixes do not stick. A reviewer can clean up one pull request, but the next generated branch often brings back the same issue. The team keeps paying for the same lesson because nothing in the repo teaches it early.

Rules fix the problem at the right point in the workflow. A linter can reject weak names or unsafe access. A test can catch a missing null case before review. A template can push people to place files in the right folder and include the expected test file. These checks are boring. That is exactly why they help. Boring rules save human attention for the comments that need judgment.

Senior engineers should not spend their time typing "please add a null check" for the fifteenth time. If a mistake repeats, treat it like a repo problem, not a people problem.

What should become a rule first

If the same review note shows up every week, stop treating it like a one-off mistake. It is a good candidate for a rule. The first rules should target repeated, boring problems, not clever edge cases.

Pick comments that have one clear right answer. A rule works best when a tool can say pass or fail without a debate. "Add a test when business logic changes" is clear. "Make this code cleaner" is not.

A short filter helps before you automate anything:

Does this comment appear often?
Can a tool check it with a clear pass or fail?
Does it waste review time when humans check it by hand?
Could it cause bugs, security issues, or broken builds if it slips through?

The best first targets usually sit in the middle of review threads. Missing tests, unchecked errors, unsafe defaults, forgotten migrations, or files in the wrong place are good examples. Reviewers should not spend the same ten minutes on those points in every pull request.

Skip taste debates. If two senior engineers can both defend different answers, do not turn that into a hard rule yet. The same goes for rare edge cases. A mistake that happens twice a year does not deserve the same effort as one that appears every Friday afternoon.

Production risk matters more than annoyance. If a repeated issue can break auth, corrupt data, expose secrets, or bypass logging, move it to the top of the list. If it only changes how tidy the diff looks, leave it for later.

A simple ranking method works well. Give each repeated comment two scores from 1 to 5: review time and production risk. Add them together. Start with the highest totals.
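
That scoring fits in a few lines. The sketch below assumes each repeated comment is recorded with two hand-assigned scores; the `RepeatedComment` shape and the sample entries are made up for illustration.

```typescript
// Hypothetical shape for a repeated review comment; scores are assigned by hand.
type RepeatedComment = {
  note: string;
  reviewTime: number;     // 1 (cheap to check by hand) to 5 (eats review time)
  productionRisk: number; // 1 (cosmetic) to 5 (can break auth, data, or builds)
};

// Sort by the sum of both scores, highest total first.
function rankComments(comments: RepeatedComment[]): RepeatedComment[] {
  return [...comments].sort(
    (a, b) => b.reviewTime + b.productionRisk - (a.reviewTime + a.productionRisk),
  );
}

// Illustrative data only.
const ranked = rankComments([
  { note: "rename this helper", reviewTime: 2, productionRisk: 1 },
  { note: "add the null check", reviewTime: 3, productionRisk: 4 },
  { note: "missing test for failed request", reviewTime: 4, productionRisk: 4 },
]);

console.log(ranked.map((c) => c.note));
```

The top of the list is where the first rule goes. Ties can be broken by production risk, in line with the point above that risk matters more than annoyance.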

Small teams usually end up with a practical first batch: required tests for changed logic, lint checks for unsafe patterns, and pull request templates that ask for the same missing context every time. Those rules remove friction early and give reviewers more time to think about design, trade-offs, and product behavior.

Start with the easiest guardrails

Good repo rules should block the cheap mistakes first. Start with checks that already have clear tools and clear results. A formatter, a linter, and one standard test command catch a lot of review noise before a person reads the diff.

Put those checks in the same place for every repo. If one project uses make test, another uses npm test, and a third expects three separate commands, people forget. AI tools forget too. Pick one entry point for tests, one place to define import order, and one file that explains naming rules.

At this stage, simple setups beat custom automation. Run the formatter on every commit or in CI. Run the linter in CI with a short config. Document file naming, import order, and test commands in the repo root. Add a branch naming rule and a pull request template for repeated asks.

Templates help more than teams expect. If reviewers keep asking for the same context, make authors provide it up front. A pull request template can ask for user impact, test steps, config changes, and a rollback plan. A branch naming rule can stop vague names like fix-stuff and make branch purpose clear at a glance.
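
A branch naming rule can be a single pattern check. This is a minimal sketch, assuming a convention of a type prefix, a slash, and a kebab-case description; the prefixes, the pattern, and the `checkBranchName` helper are hypothetical, not a standard.

```typescript
// Assumed convention (not from any standard): <type>/<kebab-case-description>,
// for example "fix/invite-null-contact".
const BRANCH_PATTERN = /^(feat|fix|chore|docs)\/[a-z0-9]+(-[a-z0-9]+)+$/;

// Returns null when the name is fine, otherwise a message that says how to fix it.
function checkBranchName(name: string): string | null {
  if (BRANCH_PATTERN.test(name)) return null;
  return `Branch "${name}" should look like "fix/invite-null-contact": a type prefix, a slash, then at least two kebab-case words.`;
}

console.log(checkBranchName("fix/invite-null-contact")); // null
console.log(checkBranchName("fix-stuff"));
```

Requiring at least two words after the slash is a deliberate choice here: it rejects names like fix/stuff that follow the format but still say nothing.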

Keep each rule narrow. Teams resist a new rule that tries to solve five problems at once. They usually accept one that removes a single annoying repeat comment. "All new handlers need a test" is easy to understand. "All code must follow a smart architecture checklist" usually dies after a week.
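
A rule as narrow as "all new handlers need a test" can be checked with a short script. The sketch below assumes handlers live in src/handlers and tests sit next to them with a .test.ts suffix; that layout and the `handlersMissingTests` helper are assumptions for illustration.

```typescript
// Assumed layout (hypothetical): handlers live in src/handlers/<name>.ts and
// their tests sit next to them as src/handlers/<name>.test.ts.
function handlersMissingTests(changedFiles: string[], allFiles: string[]): string[] {
  const existing = new Set(allFiles);
  return changedFiles
    .filter((f) => f.startsWith("src/handlers/") && f.endsWith(".ts") && !f.endsWith(".test.ts"))
    .filter((f) => !existing.has(f.replace(/\.ts$/, ".test.ts")));
}

// Example input: two changed handlers, only one of which has a test file.
const missing = handlersMissingTests(
  ["src/handlers/resendInvite.ts", "src/handlers/pauseSubscription.ts", "README.md"],
  [
    "src/handlers/resendInvite.ts",
    "src/handlers/resendInvite.test.ts",
    "src/handlers/pauseSubscription.ts",
  ],
);

for (const file of missing) {
  console.log(`${file} has no test file next to it. Add ${file.replace(/\.ts$/, ".test.ts")}.`);
}
```

In CI, the two lists would come from the diff and the working tree. The function stays pure so it is easy to test on its own.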

This matters even more when AI writes part of the code. The most common mistakes cluster around the same basics: missing tests, messy imports, odd file placement, and pull requests with no useful context. You do not need a fancy internal platform to fix that. You need a few boring rules that run every time.

Turn review comments into tests

The fastest way to cut repeat review work is simple: stop arguing with the same bug in every pull request, and make the repo catch it first. Good rules often start as one sentence a reviewer keeps typing.

Write that sentence in plain language. Skip team slang. A good rule sounds like this: "If an API call fails, return the right status code and a stable error code." Anyone on the team can understand that, including the person who updates the test six months later.

Then shrink the mistake into the smallest example you can make. Do not start with a full feature branch. Build a tiny case that shows the bad behavior in a few lines of code or one request and one response. Small examples are easier to keep, easier to run, and much easier to trust.

The workflow is straightforward: pick one review comment that appears every week, create a tiny example that reproduces it, add a test that fails on the bad version, add the fix, and run that test in CI so the branch stops before review.

Say your team keeps telling AI tools not to swallow database errors. Make a small handler that catches the error and returns "ok" anyway. Write a test that expects a real error response. When AI code repeats the old mistake, the branch fails in CI before a reviewer has to point it out again.
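
Here is a minimal sketch of that swallowed-error case, written without a test framework so it stands alone. The `Db` interface, the handler names, and the ORDER_SAVE_FAILED code are hypothetical.

```typescript
// Hypothetical result shape and database interface, for illustration only.
type HandlerResult = { status: number; body: { ok: boolean; errorCode?: string } };
type Db = { saveOrder(id: string): Promise<void> };

// The bad version: catches the error and reports success anyway.
async function saveOrderSwallowing(db: Db, id: string): Promise<HandlerResult> {
  try {
    await db.saveOrder(id);
  } catch {
    // Error ignored: this is the mistake the rule targets.
  }
  return { status: 200, body: { ok: true } };
}

// The fixed version: a failed write becomes a real error response.
async function saveOrderHandler(db: Db, id: string): Promise<HandlerResult> {
  try {
    await db.saveOrder(id);
    return { status: 200, body: { ok: true } };
  } catch {
    return { status: 500, body: { ok: false, errorCode: "ORDER_SAVE_FAILED" } };
  }
}

// A fake database that always fails, so the test needs no real connection.
const failingDb: Db = {
  saveOrder: async () => {
    throw new Error("connection lost");
  },
};

// The rule as a test: a database failure must not produce an "ok" response.
async function returnsErrorOnDbFailure(
  handler: (db: Db, id: string) => Promise<HandlerResult>,
): Promise<boolean> {
  const result = await handler(failingDb, "order-1");
  return result.status >= 500 && result.body.ok === false;
}
```

Run against `saveOrderSwallowing`, the check returns false, which is the failing test. Run against `saveOrderHandler`, it returns true. In a real repo the same assertion would live in the team's test runner.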

CI matters because local habits drift. One developer runs the full suite, another skips it, and rushed code slips through. If the branch cannot merge while the test fails, the rule becomes real.

Every rule also needs an owner. Pick one person or one team to update the test when the codebase changes. Without ownership, old tests turn noisy, people stop trusting them, and someone starts bypassing them "just for now." That is how rules decay.

If a comment shows up often enough to annoy a reviewer, it is probably ready to become a test. Once the repo catches it on its own, review can focus on design and product choices instead of the same fix again.

Use lint rules for style and safety

If the same pull request keeps coming back with an unused helper, a forgotten await, or a loose any, review is happening too late. A linter catches those problems in seconds.

AI-generated code often looks fine at a glance. Then the small issues start to show up: dead imports, async calls that never get awaited, types that got widened because the model could not infer them cleanly. These mistakes are not rare. They are exactly the kind of repeat work a machine should block before a human reads the diff.

A short ruleset usually covers most of the pain. Fail on unused imports and variables. Flag promises that are created but never awaited or handled. Disallow any where a real type should exist. Keep import order consistent. Stop cross-folder imports that break your architecture. Enforce simple naming rules for files, components, hooks, and tests.
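
Most of that list maps onto existing ESLint rules. The sketch below shows only a rules map and assumes plugin setup happens elsewhere: the @typescript-eslint rules come from typescript-eslint, import/order comes from eslint-plugin-import, and no-floating-promises needs type-aware linting. The server glob is a hypothetical layout. File naming rules are left out because they depend on which plugin the team picks.

```typescript
// A sketch of a rules map for an ESLint config. Plugin registration and parser
// options are omitted; this object would be spread into the config's "rules".
const repeatCommentRules = {
  // Unused imports and variables (the base no-unused-vars rule is usually
  // turned off when this one is on).
  "@typescript-eslint/no-unused-vars": "error",
  // Promises that are created but never awaited or handled.
  "@typescript-eslint/no-floating-promises": "error",
  // any where a real type should exist.
  "@typescript-eslint/no-explicit-any": "error",
  // Consistent import order.
  "import/order": ["error", { "newlines-between": "always" }],
  // Folder boundaries. The glob is a hypothetical layout, and in a real config
  // this entry would apply only to frontend files.
  "no-restricted-imports": [
    "error",
    {
      patterns: [
        {
          group: ["**/server/**"],
          message: "Frontend code must not import server-only modules.",
        },
      ],
    },
  ],
};

console.log(Object.keys(repeatCommentRules));
```

Five entries is a reasonable ceiling for a first pass. Each one maps to a comment reviewers already type by hand.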

Folder boundaries matter more than many teams expect. If frontend code starts pulling in server-only modules, or feature code reaches deep into another feature's internals, the repo gets messy fast. AI tools do this a lot because they optimize for whatever import path seems to work, not the boundary your team meant to keep.

Keep lint rules readable. If the config is so large that nobody knows what it enforces, the team will ignore it until it breaks something. Start with a short set of rules that fix repeat comments, then add more only when the need is obvious.

Use templates when code needs the same shape

When new code follows the same pattern every time, a template saves more review time than another written guideline. People skip docs, especially when they are moving fast. A starter file gives them structure before they make the usual mistakes.

This works well for new API endpoints, background jobs, UI components, migration scripts, and admin tools. If each one needs the same tests, logging, error handling, and docs note, put that shape in the repo from the start. Then the author edits a working frame instead of building from memory.

A good template does more than create empty files. It includes the parts reviewers keep asking for: a test stub, a short docs section, and common safety checks. In practice, a new endpoint template might already include request validation, a sample test file, a place to document error codes, and a note for metrics or alerts.

Placeholders matter. They should force a real choice, not let people leave vague defaults behind. Good prompts ask someone to choose the auth mode, state failure behavior, list side effects, and add a rollback plan.
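
One way to make placeholders force a choice is to ship them as values a check can reject. This is a sketch under assumptions: the `HandlerSpec` fields mirror the prompts above, and the "TODO" marker and `unfilledFields` helper are made up for illustration.

```typescript
// Hypothetical template for a new handler. Every field is required, and the
// template ships with "TODO" or blank values so a check can reject an unedited copy.
type AuthMode = "public" | "user" | "admin";

type HandlerSpec = {
  name: string;
  authMode: AuthMode | "TODO";
  onFailure: string;    // what the caller sees when the handler fails
  sideEffects: string;  // emails, writes, queue messages; "none" is a valid answer
  rollbackPlan: string;
};

const FIELDS = ["name", "authMode", "onFailure", "sideEffects", "rollbackPlan"] as const;

// Returns the names of fields that still hold a placeholder or are blank.
function unfilledFields(spec: HandlerSpec): string[] {
  return FIELDS.filter((field) => {
    const value = spec[field].trim();
    return value === "" || value === "TODO";
  });
}

const freshCopy: HandlerSpec = {
  name: "pauseSubscription",
  authMode: "TODO",
  onFailure: "TODO",
  sideEffects: "none",
  rollbackPlan: "",
};

console.log(unfilledFields(freshCopy)); // [ 'authMode', 'onFailure', 'rollbackPlan' ]
```

Writing "none" for side effects passes the check, because that is a decision. Leaving the placeholder in place does not.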

That catches lazy copy-paste work early. It also makes AI-generated code less slippery, because the model has to fill in specific decisions instead of producing a generic shell.

Templates need upkeep. If the repo changes and the template still uses old patterns, it spreads bad code faster than no template at all. Review templates on a schedule, or any time the team changes testing, logging, naming, or folder layout.

A simple rule helps: if reviewers leave the same structural comment three times, make a template.

A simple example from one feature branch

A small product team kept getting the same review comments on AI-written TypeScript code: missing null checks and no tests. The branch looked simple enough. It added a "resend invite" action for admins.

The first pull request passed a quick manual check, but the code assumed every account had a contact record. One customer account did not.

// Bug: account.contact can be null, so this line throws for an account with no contact record.
const email = account.contact.email;
await inviteService.resend(email);
return { ok: true };

Review notes were short and repetitive:

Guard against account.contact being null
Add a test for the missing contact case
Do not patch it with ! just to silence the type error

That pull request went through three review rounds. The author fixed the null check, then added a test, then adjusted the error response. Nothing in the repo stopped the same mistake from showing up again in the next branch.

So the team made three small changes instead of relying on memory.

They added one lint rule: @typescript-eslint/no-non-null-assertion
They added one test for contact = null that fails on the unguarded code
They created a starter template for new action handlers with a guard clause and a matching test stub

The template was plain on purpose. It started with input checks before any service call:

if (!account.contact?.email) {
  throw new AppError("Missing contact email");
}
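
Put together, the guarded handler and its matching test could look roughly like this. The article does not show the team's real definitions, so the `Account` type, the `InviteService` interface, and this `AppError` class are stand-ins, and the test is written without a framework so it stands alone.

```typescript
// Stand-in types; the team's real definitions are not shown in the article.
class AppError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AppError";
  }
}
type Account = { id: string; contact: { email: string | null } | null };
type InviteService = { resend(email: string): Promise<void> };

async function resendInvite(account: Account, inviteService: InviteService) {
  // Guard first: no service call happens without a usable email.
  const email = account.contact?.email;
  if (!email) {
    throw new AppError("Missing contact email");
  }
  await inviteService.resend(email);
  return { ok: true };
}

// Matching test for the missing-contact case, using a fake service that records calls.
async function testMissingContact(): Promise<{ threwAppError: boolean; sent: string[] }> {
  const sent: string[] = [];
  const fakeService: InviteService = {
    resend: async (email) => {
      sent.push(email);
    },
  };
  let threwAppError = false;
  try {
    await resendInvite({ id: "acct-1", contact: null }, fakeService);
  } catch (err) {
    threwAppError = err instanceof Error && err.name === "AppError";
  }
  return { threwAppError, sent };
}
```

The test checks two things: the handler raises the expected error, and the invite service is never called. The second check would catch a guard placed after the service call.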

On the next branch, the team built a "pause subscription" action. The AI used the starter template, wrote the guard first, and copied the test shape from the earlier fix. The pull request was not perfect, but the usual comments were gone.

Review got much shorter too. The first branch took most of an afternoon because the reviewer had to ask for the same repair twice. The next one merged after one round with a single note about naming.

That is the point. You do not need a giant policy file. You need a small rule for the mistake that keeps coming back. If one branch teaches you that null checks and tests are missing, turn that lesson into something the repo can enforce on the next branch.

Mistakes that make rules backfire

Rules save time only when they stop repeat mistakes without slowing normal work. Teams usually get into trouble when they turn taste into policy, pile on custom checks, or force everyone through the same path.

The first trap is simple: not every review comment should block a merge. If one reviewer prefers shorter function names and another wants more comments, that belongs in examples or team notes, not CI. Save hard rules for problems that cost real time later, like missing tests, unsafe queries, broken types, or skipped error handling.

Another common mistake is building clever checks that nobody can fix. A custom script can feel smart on day one, then become dead weight a month later. If your team cannot explain what the check does, update it quickly, and test it when the codebase changes, do not ship it. A plain linter rule that people trust beats a homegrown gate that breaks every other Friday.

Build failures also need plain language. "Structure check failed" is weak. "API handler in file X is missing auth middleware" tells the author what went wrong. Adding the next step, such as "Add the shared auth middleware before merge," is even better because it points to the fix.
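
If a custom check does exist, it helps to make the message shape part of the check itself. This is a small sketch in which the `CheckFailure` record and the file path are hypothetical.

```typescript
// Hypothetical failure record for a custom check: where, what, and how to fix it.
type CheckFailure = { file: string; problem: string; fix: string };

// Every failure must name the file, the problem, and the next step.
function formatFailure(f: CheckFailure): string {
  return `${f.file}: ${f.problem}. Fix: ${f.fix}.`;
}

console.log(
  formatFailure({
    file: "src/handlers/resendInvite.ts",
    problem: "API handler is missing auth middleware",
    fix: "wrap the handler with the shared auth middleware before merge",
  }),
);
```

Because all three fields are required by the type, a check cannot report a failure without saying what to do about it.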

Teams also forget to remove old manual review steps after automation lands. If a test already proves that generated SQL uses the safe helper, reviewers should stop checking that by hand in every pull request. Keeping both steps trains people to ignore one of them.

Leave room for exceptions. Migration scripts, emergency patches, generated files, and vendor code often need a different path. Make that path explicit: allow an override, ask for a short reason, and assign one person to approve it. That keeps the rule useful without turning it into a wall.
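
That explicit path can be very small. The sketch below assumes an override is recorded with the rule name, a reason, and an approver, for example in the pull request description. The record shape, the rule name, the approver name, and the ten-character minimum are illustrative assumptions.

```typescript
// Hypothetical override record, for example parsed from a pull request description.
type RuleOverride = { rule: string; reason: string; approvedBy: string };

// Assumed policy: one named approver per rule (the name is a placeholder).
const APPROVERS: Record<string, string> = {
  "handlers-need-tests": "alice",
};

// An override counts only with a real reason and the approver assigned to that rule.
function isValidOverride(o: RuleOverride): boolean {
  const hasReason = o.reason.trim().length >= 10; // a short sentence, not "n/a"
  return hasReason && APPROVERS[o.rule] === o.approvedBy;
}

console.log(
  isValidOverride({
    rule: "handlers-need-tests",
    reason: "Generated client code, covered by upstream tests",
    approvedBy: "alice",
  }),
); // true
```

A rule with no entry in the approver map cannot be overridden at all, which keeps the escape hatch explicit.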

If a rule creates more discussion than the original mistake, the rule needs work.

Quick checks before you merge

A rule earns its place only if it stops a mistake your team sees often. If it catches one odd edge case from six months ago, it will feel like punishment, not help.

Before you make any new check required in CI, read it like a new teammate would. When it fails, the message should point to the problem in plain language and suggest the fix. "Auth check missing in API handler" is useful. "Validation failed" is not.

Speed matters more than many teams admit. If one more test adds three or four minutes to every pull request, people will resent it fast. Keep the default path light. Run the small, common checks on every branch, and save heavier jobs for nightly runs or a later stage.

A quick pre-merge pass helps. Check that the rule targets a common review comment. Read the failure message out loud and see if it makes sense. Measure how much time it adds to CI. Try it on real open branches, not toy examples. Write down the one case where a person can skip it.

That last step prevents a lot of friction. Some rules need a human exception, but that exception should be rare and easy to explain. A short note in the repo is enough if it says who can skip the rule, when, and how to record the reason in the pull request.

Teams that work with AI-generated code usually learn this after a few painful rollouts: a rule that blocks normal work is still a bad rule. Test it on actual feature branches first. If it catches the right problem, stays fast, and speaks clearly, then make it part of the merge path.

What to do next

Open the last month of pull requests and find three comments that keep coming back. Pick the comments that burn time every week, not the ones that sound dramatic. Missing edge-case tests, loose null checks, repeated file structure, and forgotten logging are usually good places to start.

Then turn each one into a different guardrail. Make one a test, one a linter rule, and one a template. That split works well because each problem has a natural home. A repeated behavior bug belongs in a test. A repeated style or safety fix belongs in linting. A repeated code shape belongs in a template or snippet.

Keep the first pass small. Apply the rule to one active part of the repo, then watch how the next few pull requests go. If developers keep fighting the rule, it is probably too broad. If nobody notices it because automation catches the issue early, you picked the right target.

Track a few simple numbers for two or three weeks: review rounds per pull request, time from open to merge, and reopened pull requests after merge. Those numbers tell you if the rule removed rework or just moved it somewhere else. If review rounds drop from three to one on the same kind of change, that is a real win. If merge time stays flat but reopened pull requests rise, tighten your tests before you add more lint rules.
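
Those three numbers are easy to compute from exported pull request data. This is a sketch with made-up figures; the `PullRequestStats` fields are hypothetical and would be mapped from whatever your Git host exposes.

```typescript
// Hypothetical pull request record; field names are illustrative.
type PullRequestStats = {
  reviewRounds: number;
  hoursOpenToMerge: number;
  reopenedAfterMerge: boolean;
};

// Averages for review rounds and merge time, plus the share of reopened pull requests.
function summarize(prs: PullRequestStats[]) {
  const count = prs.length;
  if (count === 0) return { avgReviewRounds: 0, avgHoursToMerge: 0, reopenedRate: 0 };
  const sum = (pick: (p: PullRequestStats) => number) => prs.reduce((t, p) => t + pick(p), 0);
  return {
    avgReviewRounds: sum((p) => p.reviewRounds) / count,
    avgHoursToMerge: sum((p) => p.hoursOpenToMerge) / count,
    reopenedRate: prs.filter((p) => p.reopenedAfterMerge).length / count,
  };
}

// Made-up numbers: the same kind of change before and after a rule landed.
const beforeRule = summarize([
  { reviewRounds: 3, hoursOpenToMerge: 30, reopenedAfterMerge: false },
  { reviewRounds: 3, hoursOpenToMerge: 26, reopenedAfterMerge: true },
]);
const afterRule = summarize([
  { reviewRounds: 1, hoursOpenToMerge: 8, reopenedAfterMerge: false },
  { reviewRounds: 1, hoursOpenToMerge: 6, reopenedAfterMerge: false },
]);

console.log(beforeRule, afterRule);
```

Compare like with like. The summary only means something when both samples cover the same kind of change.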

If your team wants outside help, Oleg Sotnikov at oleg.is works as a Fractional CTO and startup advisor for startups and smaller companies. He helps teams put practical repo standards, testing, infrastructure, and AI-augmented development workflows in place without turning the process into a burden.

Frequently Asked Questions

How do I know a review comment should become a repo rule?

Make it a rule when the same comment shows up again and again, and a tool can check it with a clear pass or fail. Good first candidates include missing tests, null checks, unsafe error handling, and files landing in the wrong place.

What should I automate first?

Start with the boring issues that eat review time every week. Missing tests for changed logic, loose type safety, forgotten awaits, and weak pull request context usually give fast wins.

Should I use a test, a linter, or a template?

Use a test for behavior, a linter for style or safety, and a template for repeated structure. If the problem changes what the code does, write a test. If the problem changes how the code looks or how safely it runs, use linting. If authors keep building the same file shape from memory, add a template.

Why does AI-generated code keep making the same mistakes?

AI tools copy the patterns they see and fill gaps with whatever looks close enough. If your repo accepts mixed naming, uneven tests, or messy folder layout, the model will keep producing more of the same.

How small should a new repo rule be?

Keep the first version narrow and easy to explain. A rule like "new handlers need a test" works well because people understand it fast and CI can check it without debate.

Do I need custom automation for this?

Most teams do not need custom scripts at the start. A formatter, a linter, one standard test command, and a couple of templates solve a lot of repeat review work with less upkeep.

How do I keep CI from getting too slow?

Run fast checks on every branch and move heavier jobs to a later stage or a nightly run. If a new rule adds too much time to every pull request, trim the scope or pick a cheaper check first.

Who should own repo rules?

Give each rule an owner who updates it when the codebase changes. Without ownership, tests get noisy, templates go stale, and people start bypassing the checks.

When should I allow exceptions to a rule?

Allow exceptions for cases like emergency fixes, migrations, generated files, or vendor code. Keep the escape hatch simple: require a short reason and one named approver so the team does not turn every exception into an argument.

How can I tell if the new rules are actually working?

Watch review rounds per pull request, time from open to merge, and reopened pull requests after merge. If review rounds drop and the same comments stop showing up, the rule helps. If merge time grows and bugs still slip through, adjust the rule.