Apr 22, 2025

Test selection for generated changes that saves CI time

Test selection for generated changes helps teams run only the checks that match each edit, cut CI wait time, and avoid full-suite runs for small updates.

Why small edits still trigger every test

A lot of generated edits are tiny. One prompt updates a validation rule, renames a field, or changes a helper in one folder. The diff is small, but the CI pipeline often reacts as if the whole product changed.

Most teams start with the safest rule they can defend: run everything on every change. It is easy to explain, easy to automate, and hard to argue with when nobody fully trusts the link between files and tests. If something slips through, people blame the selective logic first.

There is a practical reason for that caution. Many codebases grew without a clear map between file paths and test scope. A change in one place can affect shared types, build steps, generated clients, or deployment settings somewhere else. When nobody can say what depends on what, full-suite runs feel safer.

That default tends to stick for a few simple reasons. Teams rarely maintain dependency maps as the repo changes. Shared folders slowly attract more and more imports. Generated code lands in places that look local but affect several modules. And CI rules stay simple because nobody wants to debug a missed test later.

The cost shows up quickly. Reviews wait on long pipelines. Branches pile up. Developers rebase more, rerun more, and waste time proving that a tiny change did not break half the product. On a busy team, that drag eats into the whole week.

Slow checks also change behavior. People stop sending small cleanups. Some batch unrelated edits into one pull request to make the wait feel worth it. Others skip local checks and hope CI catches problems later. None of that improves quality.

That is why test selection for generated changes matters. The goal is not to test less. The goal is to match each change to the cheapest checks that still make sense, so a small edit does not trigger hours of unrelated work.

Decide what counts as a small change

Start with the repository tree. A small change is not about line count. Ten lines in billing can matter more than two hundred lines in help docs.

File paths give you rules people can read and trust. Commit messages are often vague, and generated code can hide real impact. If you want test selection for generated changes to work, sort files by path first and decide on checks second.

Some paths are usually safe enough for a light set of checks. Docs, copy, and style updates rarely need the whole suite. That usually includes documentation files such as README files, changelogs, and help content, along with translation files, product copy, stylesheets, icons, and comments or examples that do not affect runtime.

Those changes still need basic protection. Run linting, formatting, spellcheck, or a quick UI build when the path suggests it. That catches obvious mistakes without spending another twenty minutes on database and API tests.

Other paths should go straight into a stricter bucket. Authentication, billing, permissions, shared API contracts, and database code are rarely small, even when the diff looks tiny. One line in a migration, token check, or pricing rule can break a real user flow.

The same is true for shared code that many services import. If a generated change touches a core package, treat it as risky until you prove otherwise. Teams lose faith in selective test runs when a supposedly small change breaks checkout or login.

Set a default rule for everything that does not match a known path. Unknown files are not safe files. Send them to a medium tier instead, such as lint, unit tests, and a build for the affected package.

That default gives you room to improve the map without missing things quietly. Over time, you can move folders into lighter or stricter groups based on real failures instead of guesses.
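
The three buckets above can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool: the path patterns and the function names `tier_for` and `tier_for_change` are hypothetical and should be replaced with your own layout. Strict patterns are checked first so a risky path is never downgraded by a lighter rule.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration; adjust them to your repository.
LIGHT = ["docs/*", "*.md", "locales/*", "*.css", "assets/icons/*"]
STRICT = ["auth/*", "billing/*", "permissions/*", "db/*", "api/contracts/*"]

RANK = {"light": 0, "medium": 1, "strict": 2}


def tier_for(path: str) -> str:
    """Return 'strict', 'light', or 'medium' for one changed file."""
    # Strict wins over light, so billing/README.md is still treated as risky.
    if any(fnmatchcase(path, pattern) for pattern in STRICT):
        return "strict"
    if any(fnmatchcase(path, pattern) for pattern in LIGHT):
        return "light"
    # Unknown files are not safe files: they land in the medium tier.
    return "medium"


def tier_for_change(paths: list[str]) -> str:
    """A whole change takes the strictest tier among its files."""
    return max((tier_for(p) for p in paths), key=RANK.__getitem__, default="medium")
```

With these example patterns, a change that touches both `docs/faq.md` and `db/migrations/001.sql` resolves to the strict tier, because one risky file is enough to raise the whole change.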

Map files and folders to the right checks

Most repositories already contain the first draft of a test map. If your code lives in folders like web/, api/, mobile/, or infra/, use that structure. Good selective test runs start with names the team already understands, not with a rules engine nobody wants to read.

Keep the first version boring. A top-level folder often tells you enough about which checks matter. You can add finer rules later for shared libraries, generated code, or awkward cross-project dependencies.

A simple map can be as direct as this:

  • web/ runs frontend unit tests, type checks, and a quick build
  • api/ runs backend unit tests and contract checks
  • mobile/ runs mobile lint and the fastest simulator smoke test
  • infra/ runs formatting, validation, and policy checks
  • docs/ runs spelling or markdown checks only

Pick the fastest useful checks for each area. The word "useful" matters. If a folder change almost never breaks browser journeys or broad integration tests, do not trigger them by default. Save the slower suites for wider changes, nightly runs, or cases where shared code changed.

You also need a short list of checks that always run no matter which files changed. Keep that list small or the whole idea falls apart. For most teams, one build sanity check, one security or secret scan, and one smoke test are enough.

Store the mapping in the repository, close to the code. A small YAML or JSON file in a ci/ folder works well because anyone can review it in the same pull request that changes the code layout. That matters when someone adds a new folder and forgets to map it.

Keep the file readable. People should understand it in under a minute. If one path maps to ten jobs and several exceptions, the logic is already too hard.

A quick example makes the benefit obvious. If a generated change touches only docs/faq.md, run markdown checks and stop. If it touches api/routes/ and web/src/hooks/, run backend and frontend checks together, plus the short always-run list. That gives you useful coverage without paying for the whole suite on every edit.
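
As a rough sketch, the folder map and the always-run list could look like this in Python. The check names, the `FALLBACK` set, and `select_checks` are assumptions made for illustration; the same data could just as well live in a small YAML or JSON file under ci/.

```python
# Hypothetical check names that mirror the folder map described above.
CHECKS_BY_PREFIX = {
    "web/": ["frontend-unit", "typecheck", "web-build"],
    "api/": ["backend-unit", "contract-checks"],
    "mobile/": ["mobile-lint", "simulator-smoke"],
    "infra/": ["infra-format", "infra-validate", "policy-checks"],
    "docs/": ["markdown-checks"],
}
# Short list that runs no matter which files changed.
ALWAYS_RUN = ["build-sanity", "secret-scan", "smoke-test"]
# Assumed medium-tier default for paths the map does not know yet.
FALLBACK = ["lint", "unit-tests", "package-build"]


def select_checks(changed_files: list[str]) -> list[str]:
    """Combine the checks for every touched area, without duplicates."""
    selected: list[str] = []
    for path in changed_files:
        matched = next(
            (checks for prefix, checks in CHECKS_BY_PREFIX.items()
             if path.startswith(prefix)),
            FALLBACK,
        )
        for check in matched + ALWAYS_RUN:
            if check not in selected:
                selected.append(check)
    # Keep the always-run checks at the end so area checks are listed first.
    area = [c for c in selected if c not in ALWAYS_RUN]
    return area + ALWAYS_RUN
```

Under these assumptions, `select_checks(["docs/faq.md"])` yields the markdown checks plus the three always-run checks, and a change to `api/routes/` and `web/src/hooks/` yields the backend and frontend checks together.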

Build the first version in small steps

Start with what your team changes most often. Look at a week or two of pull requests and write down the paths that show up again and again: docs, UI files, API handlers, database migrations, build scripts, and shared libraries.

That short list is a safer starting point than trying to map the whole repo at once. If you begin with every folder, you will create too many weak rules and people will stop trusting the results.

For each common path pattern, write one plain rule. Keep it narrow. If someone edits docs/, run spelling and markdown checks. If they change a React component, run the UI lint step, the unit tests for that app, and maybe a quick build. If they touch a migration file, run schema checks and database tests.

A good first version of test selection for generated changes maps each rule to one small test set before it tries anything clever. The goal is to save time on obvious cases, not to predict every side effect on day one.

A simple rollout usually looks like this:

  • Pick 5 to 10 path patterns that appear every week
  • Map each pattern to the cheapest relevant checks
  • Send unclear or mixed changes to the full suite
  • Keep a log of what the rule chose and what failed later

The fallback matters. When rules overlap, or a change touches shared code, run everything. The same goes for generated code that lands in several folders when the real source of truth is not obvious. A conservative fallback feels less clever, but it keeps confidence high.
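
One way to express that conservative fallback, together with the log of what the rule chose, is sketched below. The rule patterns, the `SHARED` list, and the names `choose_checks` and `log_line` are hypothetical. A stricter first version could also send any change that spans several rules to the full suite.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical first-version rules: a handful of frequent path patterns.
RULES = {
    "docs/*": ["spelling", "markdown"],
    "src/ui/*": ["ui-lint", "ui-unit", "ui-build"],
    "db/migrations/*": ["schema-checks", "db-tests"],
}
SHARED = ["src/common/*"]      # assumed location of shared code
FULL_SUITE = ["full-suite"]


def choose_checks(changed_files: list[str]) -> dict:
    """Pick checks per file; fall back to everything when a match is unclear."""
    if not changed_files:
        return {"checks": FULL_SUITE, "reason": "no changed files reported"}
    checks: list[str] = []
    for path in changed_files:
        if any(fnmatchcase(path, pattern) for pattern in SHARED):
            return {"checks": FULL_SUITE, "reason": f"shared code touched: {path}"}
        hits = [pattern for pattern in RULES if fnmatchcase(path, pattern)]
        if len(hits) != 1:
            # Zero hits means an unmapped path; two or more means overlapping rules.
            return {"checks": FULL_SUITE, "reason": f"{len(hits)} rules match {path}"}
        for check in RULES[hits[0]]:
            if check not in checks:
                checks.append(check)
    return {"checks": checks, "reason": "every file matched exactly one rule"}


def log_line(pr: str, changed_files: list[str]) -> str:
    """One JSON line per run, so later failures can be traced back to the rule."""
    decision = choose_checks(changed_files)
    return json.dumps({"pr": pr, "files": changed_files, **decision}, sort_keys=True)
```

Writing the reason next to the chosen checks makes the weekly review easier, because a missed failure can be matched to the exact rule that skipped the test.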

After the first week, review the results with the team. Look for two problems: rules that missed breakage and rules that still trigger too much work. Tighten both.

One team might notice that changes in src/ui/ often pull in shared helpers from src/common/. That is a sign to widen the rule a little. Another team may learn that markdown edits almost never need app tests. That rule can stay fast.

Keep the first version small enough that people can understand it, question it, and improve it.

Run the cheapest checks first

Order matters more than most teams admit. If the first job in CI takes ten minutes, people stop trusting selective test runs and hit "run all" just to feel safe.

Start with checks that are fast, local, and easy to explain. Formatting, linting, and type checks catch a surprising amount of breakage in under a minute, especially for generated changes that touch many files but barely change behavior.

A simple order works well:

  • Run format checks first
  • Run lint and type checks next
  • Run small unit tests for the touched area after that
  • Run wider integration tests only when the changed files sit in shared code or cross service boundaries
  • Keep the full suite for risky paths, release branches, or cases where the file map gives a weak match

This order saves time because each step answers a different question. Format and lint tell you whether the code is clean enough to read and merge. Type checks tell you whether the change broke contracts. Small unit tests tell you whether the edited logic still works.

After that, widen the net only when the file path suggests wider impact. A change in a button component does not need database tests. A change in an auth helper, billing module, shared schema, or build script often does.

This is where test selection for generated changes becomes practical. Generated code can touch many files at once, so you need cheap filters before the expensive confidence checks. Otherwise one small edit in a narrow area can still wake up the whole pipeline.

Keep a fallback rule for uncertainty. If a file has no mapping yet, or if the generator changed a shared template, run the broader set. That is slower, but it protects trust while your mapping improves.

On a lean CI setup, this can cut a lot of waste. A pull request that fails formatting in 20 seconds should stop there. There is no reason to wait for integration tests that would fail ten minutes later for the same change.
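
The fail-fast order can be sketched as a small runner. Each stage here is a plain Python callable that returns True or False; in a real pipeline it would wrap a formatter, linter, or test command. The name `run_in_order` and the stage names are illustrative assumptions.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]


def run_in_order(stages: list[Stage]) -> list[tuple[str, str]]:
    """Run stages cheapest first and skip everything after the first failure."""
    results: list[tuple[str, str]] = []
    failed = False
    for name, check in stages:
        if failed:
            # A later, slower stage never starts once a cheap one has failed.
            results.append((name, "skipped"))
            continue
        ok = check()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results


# Stand-in stages listed from cheapest to most expensive.
example = run_in_order([
    ("format", lambda: False),      # fails in seconds
    ("lint-and-types", lambda: True),
    ("unit", lambda: True),
    ("integration", lambda: True),
])
```

In this example the format stage fails, so the remaining three stages are recorded as skipped and never execute.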

A simple example from one pull request

A code assistant opens a pull request with five changed files of four types. The edits are small, but they touch different parts of the project, so one blanket test run would waste time.

src/ui/ProfileCard.tsx
src/ui/ProfileCard.test.tsx
db/migrations/20260410_add_customer_status.sql
.github/workflows/ci.yml
README.md

The UI change is just text in a profile card. That should trigger lint, a quick type check if your team uses one, and the component test for ProfileCard. You do not need checkout tests, mobile tests, or a full browser run because a label changed in one screen.

The migration needs a different path. Run schema checks, apply the migration to a throwaway database, and run the API tests that depend on that table. If the migration changes a column used by customer endpoints, those endpoint tests should run even if the frontend barely changed.

The CI file deserves attention first. Before any app tests start, validate the pipeline file itself. If the workflow syntax is broken, fail early and stop there instead of spending minutes on tests that will never reach deployment.

The README edit is the easiest case. A markdown check and a spelling check are enough. If README.md is the only changed file, the pull request should finish in a minute or two.

When all four file types appear in the same pull request, the system should combine the relevant checks instead of falling back to the whole suite. A reasonable result looks like this:

  • UI lint and the touched component test
  • Schema validation and migration dry run
  • API tests tied to the changed database area
  • CI pipeline validation
  • Markdown and spelling checks

That still skips slow browser journeys, load tests, and unrelated service tests. For many teams, that turns a 30-minute pipeline into something closer to 8 or 10 minutes. That is the whole point of selective test runs: match each file to the smallest set of checks that can still catch real problems.
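
The plan for this pull request can be worked through in code. The sketch below is only an illustration: the check names and the functions `checks_for` and `plan` are hypothetical, and the rule that workflow files are validated first follows the ordering described above.

```python
from pathlib import PurePosixPath

WORKFLOWS = ".github/workflows/"


def checks_for(path: str) -> list[str]:
    """Map one changed file to the smallest set of checks (names are made up)."""
    p = PurePosixPath(path)
    if path.startswith(WORKFLOWS):
        return ["ci-pipeline-validation"]
    if path.startswith("db/migrations/"):
        return ["schema-validation", "migration-dry-run", "api-tests:db-area"]
    if path.startswith("src/ui/"):
        # ProfileCard.tsx and ProfileCard.test.tsx both map to the same component test.
        component = p.name.split(".")[0]
        return ["ui-lint", f"component-test:{component}"]
    if p.suffix == ".md":
        return ["markdown-check", "spelling-check"]
    return ["full-suite"]  # assumed fallback for anything unmapped


def plan(changed_files: list[str]) -> list[str]:
    """Combine checks for all files, validating workflow files before anything else."""
    # sorted() is stable, so files keep their order apart from workflows moving first.
    ordered = sorted(changed_files, key=lambda f: not f.startswith(WORKFLOWS))
    result: list[str] = []
    for path in ordered:
        for check in checks_for(path):
            if check not in result:
                result.append(check)
    return result


pr_files = [
    "src/ui/ProfileCard.tsx",
    "src/ui/ProfileCard.test.tsx",
    "db/migrations/20260410_add_customer_status.sql",
    ".github/workflows/ci.yml",
    "README.md",
]
pr_plan = plan(pr_files)
```

For the five files in this example, `pr_plan` starts with the pipeline validation and then lists the UI, migration, API, and markdown checks once each, even though two files belong to the same component.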

Common mistakes that break trust

Trust drops the moment selective test runs miss a bug that the full suite would have caught. One bad miss can make developers stop believing the map, and then they ask CI to run everything again. That is why test selection for generated changes needs a bias toward safety, not optimism.

The first mistake is mapping too narrowly. A rule like "changes in this folder only run this tiny test set" looks neat until that folder imports shared code from somewhere else. Common libraries, config files, build scripts, database schemas, and API contracts often affect more than one area. If a path can touch shared behavior, map it to broader checks.

Another mistake is freezing the rules while the repo keeps moving. Teams rename folders, split packages, add shared utilities, and move tests. The map gets stale faster than people expect. If nobody updates it, CI starts making strange choices. Review the mapping any time the repo structure changes, and give one person or team clear ownership.

Too many slow checks can break trust too. If one path triggers five long suites, people stop seeing the point of selective runs. A small UI text fix should not wait on heavy database and load tests unless that file really affects them.

A few habits keep the system honest:

  • Keep a visible fallback rule for unknown or risky changes
  • Recheck shared folders more often than isolated ones
  • Remove duplicate triggers when two rules call the same slow test
  • Treat generated edits and human edits with the same risk rules

That last point matters. AI can make careful, tiny changes, but it can also touch code in ways that look local and still break shared behavior. Do not give generated edits a free pass. They need the same guardrails and the same fallback path.

Hidden fallback rules create their own problems. If developers do not know when CI switches to broader checks, they cannot tell whether a run was selective or conservative. Put that logic plainly in logs and docs. Imperfect rules are easier to trust than secret ones.

What to measure after rollout

Test selection for generated changes only works if people trust it. That trust comes from numbers, not guesses.

Start with the miss rate. If a selective run says a change is safe, but the full suite, staging, or production later finds a bug that the skipped tests would have caught, count it. Even a low miss rate matters. One missed payment bug can wipe out the time you saved on fifty clean pull requests.

A simple rule helps: log every escaped defect for 30 to 60 days, then check whether your path rules skipped a test that should have run. If they did, fix the rule and note the pattern. You want fewer misses over time, not perfect numbers on day one.
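
A miss rate can be computed from a simple run log. The record fields and the function `miss_rate` below are assumptions for illustration; the definition used here is the share of selective runs where a later defect would have been caught by a skipped test.

```python
from dataclasses import dataclass


@dataclass
class Run:
    pr: str
    selective: bool            # True if CI skipped part of the suite
    escaped_defect: bool       # a bug surfaced later (full suite, staging, production)
    skipped_test_would_catch: bool = False


def miss_rate(runs: list[Run]) -> float:
    """Fraction of selective runs that skipped a test which would have caught a bug."""
    selective = [r for r in runs if r.selective]
    if not selective:
        return 0.0
    misses = [r for r in selective if r.escaped_defect and r.skipped_test_would_catch]
    return len(misses) / len(selective)


# Made-up log entries, only to show the calculation.
log = [
    Run("pr-1", selective=True, escaped_defect=False),
    Run("pr-2", selective=True, escaped_defect=True, skipped_test_would_catch=True),
    Run("pr-3", selective=True, escaped_defect=True, skipped_test_would_catch=False),
    Run("pr-4", selective=True, escaped_defect=False),
    Run("pr-5", selective=False, escaped_defect=False),
]
rate = miss_rate(log)
```

In this invented log, one of four selective runs counts as a miss, so `rate` is 0.25. The defect in pr-3 is not counted because no skipped test would have caught it, which keeps the number focused on the path rules rather than on gaps in the suite itself.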

Watch the time savings

Four more measures tell you whether the system is paying off in daily work:

  • total CI minutes before and after rollout
  • average time from pull request open to the first useful test result
  • average review time for small changes
  • percent of pull requests that use selective runs

CI minutes are easy to track, but review time often tells the better story. If developers get feedback in 6 minutes instead of 25, they keep moving. Reviewers spend less time waiting for green checks before they merge a typo fix or a small UI change.

Ask the team where the rules feel noisy. Developers notice broad rules fast. If changing one file in a docs folder still triggers half the backend checks, people stop believing the mapping. Short feedback works best here. A quick note in the pull request template or a simple weekly review is enough.

Keep the rules in sync with the codebase. Teams move folders, split services, rename packages, and create shared libraries. Old mappings quietly rot. Then your selective runs become too wide, or worse, too narrow.

A monthly check is usually enough. Compare recent file moves against the mapping table, update rules for new services, and remove entries that no longer point to anything. If you treat the map like real code, it stays useful.
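
Part of that monthly check can be automated. The sketch below compares the prefixes in the mapping with the folders that currently exist; `audit_map` and the sample folder names are hypothetical, and the list of folders is passed in so the caller can collect it however the repo allows.

```python
def audit_map(mapped_prefixes: list[str], repo_dirs: list[str]) -> dict[str, list[str]]:
    """Report mapping entries with no folder, and folders with no mapping entry."""
    mapped = {prefix.rstrip("/") for prefix in mapped_prefixes}
    present = {folder.rstrip("/") for folder in repo_dirs}
    return {
        "stale": sorted(mapped - present),      # entries that point to nothing
        "unmapped": sorted(present - mapped),   # new folders nobody mapped yet
    }


# Example with invented folder names: legacy/ was removed, payments/ was added.
report = audit_map(
    mapped_prefixes=["web/", "api/", "docs/", "legacy/"],
    repo_dirs=["web", "api", "docs", "payments"],
)
```

Here `report` lists `legacy` as stale and `payments` as unmapped. Running a check like this in CI would also catch the case where someone adds a new folder and forgets to map it.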

Quick checks and next steps

Write one rule for files your map does not recognize. If the team has to guess, people stop trusting the system. A safe default is simple: unknown paths run a broader set of checks than known low-risk edits. That costs a bit more, but it avoids silent misses.

Keep one clear owner for the mapping file. One person does not need to write every rule, but one person should approve changes, remove old entries, and settle edge cases. Shared ownership sounds nice until nobody fixes a bad mapping.

Start small. Pick one repo with active pull requests and a test suite that already hurts a little. Test selection for generated changes is much easier to judge in one place than across ten repos with different habits, naming, and risk levels.

A short checklist helps:

  • Define the fallback rule for unknown paths
  • Assign one owner for the mapping file
  • Track skipped tests and later failures together
  • Review missed cases once a month
  • Expand only after one repo works well

That monthly review matters more than people expect. Look at changes that slipped through, tests that should have run, and folders that now mean something different from what they meant three months ago. Repos drift. Your map has to move with them.

If a team copies the same setup everywhere too early, it usually copies the same mistakes too. One service may map cleanly by file path, while another needs tags, ownership data, or dependency rules. Earn confidence in one repo first, then reuse only the parts that held up.

Keep the first version boring. A plain YAML or JSON file that maps paths to checks is often enough to prove the idea. You can add smarter logic later if the simple version saves time without causing misses.

If your team needs help untangling CI rules or building practical AI-first development workflows, Oleg Sotnikov at oleg.is works as a Fractional CTO and startup advisor. He helps small and mid-sized teams with infrastructure, automation, and AI-augmented software development when internal bandwidth is tight.

Frequently Asked Questions

What is test selection for generated changes?

It means your CI chooses tests from the files a change touches instead of running everything every time. If a bot edits docs/ or one UI component, CI runs the cheapest checks that fit that area and skips unrelated suites.

Why not run the full test suite on every change?

Full-suite runs feel safe, but they slow reviews, pile up branches, and waste CI minutes on tiny edits. Selective runs keep the same goal of catching bugs, but they match each change to the smallest useful set of checks.

How should I decide what counts as a small change?

Do not use line count. Use file paths and risk. A 200-line docs edit may stay low risk, while a one-line billing or auth change deserves broader tests right away.

Which files usually need only light checks?

Start with places like docs/, copy files, translations, styles, icons, and comments that do not change runtime behavior. For those paths, run formatting, linting, markdown checks, spellcheck, or a quick UI build if the folder needs it.

Which changes should trigger broader tests right away?

Treat auth, billing, permissions, database code, shared API contracts, build scripts, and common libraries as risky. Even tiny edits there can break login, checkout, deploys, or several apps at once, so run wider tests for those paths.

What should always run on every pull request?

Keep this set small. A build sanity check, a secret or security scan, and one smoke test usually give enough baseline protection without turning every selective run into another full pipeline.

How do I roll this out without losing trust?

Begin with 5 to 10 path patterns you see every week. Map each one to cheap, relevant checks, send mixed or unclear changes to the full suite, and log what CI chose so you can fix weak rules fast.

What if one pull request changes files in several parts of the repo?

Combine the checks for every touched area instead of defaulting to every test in the repo. If a pull request changes web/, api/, and a migration, run the frontend checks, backend checks, and database checks together, then stop there unless shared code or unknown paths raise the risk.

How do I handle unknown paths and shared code?

Use a clear fallback. If CI does not recognize a path, or if the change hits shared templates, common libraries, or generated code with wide impact, run a broader set of checks. Unknown should mean cautious, not light.

What should I measure after rollout?

Track miss rate first: count bugs that the full suite, staging, or production finds later and that a skipped test would have caught. Then watch CI minutes, time to first useful result, review time for small pull requests, and how often developers actually use the selective path rules.