AI generated regression tests, but show me the file
The top five results for this phrase describe what AI regression generators do. None of them show the artifact: the actual file the AI produces, where it lands on your disk, whether you can grep it, diff it, or check it into git. That is the only question that matters when you are going to maintain the thing for a year. This page answers it with a file path, a line number, and a watcher loop.
The question the SERP dodges
Open the first five Google results for “ai generated regression tests” and count how many show you the file the AI produces. The number is zero. You get marketing about self-healing selectors, anecdotes about a 70% reduction in manual work, lists of vendors. Every page stops before the interesting line: what is the on-disk format, and where does it land?
That omission is deliberate. If the answer is “a record in our SaaS”, the rest of your regression tooling is coupled to a subscription. If the answer is “a proprietary YAML export blob”, you can technically download it, but nothing outside the vendor UI can read it. Either way, your regression tests are not text, which means they cannot live in the same pull request as the feature they cover, and your team loses the only review workflow that actually catches regressions early: a readable diff.
Assrt is built on the opposite bet. The AI generator writes a markdown file. The file sits at a path you can `ls`. The watcher loop under the hood makes hand-edits first-class. The rest of this page is evidence for that claim, starting with the two files side by side.
What the AI generation looks like, two ways
```yaml
# exported from acme-test-cloud
# proprietary format, only runs inside the vendor SaaS
test_id: 0f8b3a2d-92e1-4c0a
name: Sign in with demo account
version: 7
author: ai-synthesizer@system
region: us-east-2
last_run: 2026-04-18T09:12:44Z
session_key: a9f0d1...REDACTED
actions:
  - kind: NAV
    target_selector_id: 148301
    payload_ref: scfg://bucket/14/22/8ff.json
  - kind: TYPE
    target_selector_id: 148312
    payload_ref: svar://demo_email_v3
  - kind: CLICK
    target_selector_id: 148317
    policy: auto_heal
  - kind: ASSERT_TEXT
    target_selector_id: 148322
    payload_ref: svar://welcome_text_v2
# note: this file is an export artifact,
# not a primary source. changes must be made
# in the web editor at app.acme-test.com
```
- `payload_ref` values point at a private bucket, unreadable outside the SaaS
- `target_selector_id` is an opaque integer mapped inside the vendor DB
- `session_key` ties the file to a single authenticated account
- The export is an artifact, not a source; primary edits happen in the vendor's web UI
A real AI generation, in full
This is what lands at /tmp/assrt/scenario.md after one assrt_plan call against a typical SaaS app with marketing pages plus auth. Three cases, each under ten lines, each independent, so case two does not depend on case one's state. The prompt caps the output at eight cases ("Generate 5-8 cases max" at server.ts line 236), so the first run is narrow and reliable, and you grow the file by hand as you cover more flows.
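The exact cases depend on the app under test; here is a representative file in the shape this page describes. The case names and steps below are illustrative, not real generator output:

```markdown
#Case 1: Demo account can sign in
1. Navigate to the login page
2. Type the demo email and password into the sign-in form
3. Click the Sign in button
4. Assert the welcome text appears

#Case 2: Wrong password shows an error
1. Navigate to the login page
2. Type the demo email and a wrong password, then click Sign in
3. Assert an error message appears

#Case 3: Pricing page shows all plan tiers
1. Navigate to the pricing page
2. Assert each plan tier card is visible
```

Plain prose, one header per case, no selectors. That is the whole format.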
The 14 lines that keep the file editable
Most AI test generators treat the output as one-shot: generate, display, save to their DB, never look at the file again. Assrt keeps a live fs.watch on the scenario file so human edits and AI generations both flow through the same path. Under the hood it is about fourteen lines of Node in the open source package.
Two details make this work. First, the debounce: a tight 1000ms window, short enough that the edit-to-sync loop feels instant, long enough that a multi-line save in vim fires only one sync. Second, the echo guard at line 136. When Assrt writes the plan itself, it stamps lastWrittenContent; on the next tick, the watcher reads the file, sees the content matches, and does nothing. Only edits the runtime cannot explain (in other words, your edits) propagate.
How the round-trip flows
The AI generation is the first write into the file. Everything after is a round-trip between your disk and the cloud plan record, mediated by that single fs.watch.
scenario.md round-trip
A full edit-and-sync in the terminal
This is the loop from the outside: one generation, one hand-edit, and the sync event, all visible in stdout. No browser tab, no vendor login. The resulting diff is what a reviewer sees in a pull request.
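The stdout itself varies by app; the reviewer-facing half is just a unified diff of the markdown, along these lines (the case text here is hypothetical):

```diff
 #Case 2: Wrong password shows an error
 1. Navigate to the login page
 2. Type the demo email and a wrong password, then click Sign in
-3. Assert the text "Wrong password" appears
+3. Assert the text "Incorrect email or password" appears
```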
Why this format wins for regression, specifically
A regression test is a test you expect to re-run dozens of times against a moving codebase. The hard part is not writing it the first time, it is deciding whether today's failure is your code breaking or your test drifting. That decision is much easier when the test is a markdown file in git: you look at the last commit that touched it, you read the prose, you know.
- **Readable diffs.** A two-line change in prose is a two-line diff. No screenshot of a dashboard, no vendor timeline, no "compare versions" modal. Your code reviewer reads the change the same way they read any other diff in the pull request.
- **Plan history per run.** `saveScenarioRun` stores the exact plan text as `planSnapshot` at scenario-store.ts line 169. If today's run fails and last week's passed, you can diff the two plan snapshots and see whether anyone edited the test in between.
- **Resilience to selector drift.** Because the plan says "click the Sign in button" instead of a CSS selector, the agent calls `snapshot()` on every run and picks a fresh `[ref=eN]` from the accessibility tree. A CSS rename does not break the test; a removed button does.
- **Portable failure artifacts.** Every run writes `/tmp/assrt/<runId>/` with a WebM video, numbered screenshots, events.json, and the pass/fail JSON. Attach the tarball to the bug; anyone opens it without an account.
What the on-disk format buys you
Every row below is about the file itself, not a feature the dashboard wraps around it.
| Feature | Typical SaaS AI tester | Assrt |
|---|---|---|
| On-disk format | Proprietary YAML export or DB row | Plaintext markdown with `#Case` headers |
| Where the test file lives | Vendor cloud, opaque session | /tmp/assrt/scenario.md (move into repo) |
| Diff in a pull request | Dashboard toggle between versions | `git diff` reads as prose |
| Round-trip human edits | Edit in web UI only | fs.watch + 1s debounce syncs back |
| Plan history per run | Hidden in the platform timeline | planSnapshot field per run record |
| Hand-edit without regenerating | Usually requires a new AI generation | Change the markdown, save |
| Export and take elsewhere | Locked-in format, unreadable outside | Copy the file, nothing else needed |
| Cost at team scale | ~$7.5K/mo typical seat pricing | $0 plus LLM tokens |
Prices are list-tier seats and exports as documented by each vendor's public pricing; your contract may vary.
Adopting AI generated regression tests without lock-in
Five steps; each is a file move or a command. None of them require signing into a platform.
Generate the first pass with `assrt_plan`
Point the plan tool at your URL. The generator (constrained by the prompt at server.ts line 236) writes 5-8 `#Case` blocks to /tmp/assrt/scenario.md. This is the only step that burns LLM tokens for the regression file itself; everything after it is text editing.
Move the file into your repo
`mv /tmp/assrt/scenario.md tests/regression/signup.md` and commit. You have now captured the regression plan as source. Next run just points `--plan` at the repo path. If you change nothing else, every rerun is a true regression: same plan text, same agent, same pass criteria.
Edit the file when the app genuinely changes
UI drift does not usually require an edit; the agent calls `snapshot()` at run time and picks a new ref. But when intent changes (a step is added, a field is renamed in the spec, a new assertion is needed), open the file and edit the prose. One-line diff, one PR, reviewed like any source change.
Re-run by scenarioId for stable history
After the first save, Assrt assigns a scenario UUID. `assrt_test({ url, scenarioId })` fetches the latest stored plan and runs it. Each run record persists `planSnapshot` (scenario-store.ts line 169) so you can always tell whether today's failure is because the test changed or the app did.
Tar the run directory as a CI artifact
Every run writes `/tmp/assrt/<runId>/` with the video, numbered screenshots, events.json timeline, and results JSON. In CI, upload that directory as a workflow artifact. No dashboard to sign into, no paid seat to review a failure; your teammates open a zip.
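A minimal GitHub Actions sketch of that step, assuming a prior step already has the app listening on localhost:3000; the artifact name is arbitrary:

```yaml
- name: Run regression plan
  run: npx assrt-mcp --url http://localhost:3000 --plan tests/regression/signup.md --headed=false

# Upload the whole run directory even when the run fails,
# so reviewers get the video, screenshots, and events.json.
- name: Upload Assrt run artifacts
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: assrt-regression-run
    path: /tmp/assrt/
```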
Generate the first scenario.md
Run `npx assrt-mcp` against your local dev server, let the planner write 5-8 `#Case` blocks to /tmp/assrt/scenario.md, then move the file into your repo. The watcher takes care of the rest.
Get started with Assrt →

Questions about AI generated regression tests
What format do most tools store AI generated regression tests in?
A vendor database row behind a dashboard. Momentic, Testim, mabl, Virtuoso, Functionize, Tricentis Tosca all persist the AI generated test as a record in their cloud, usually exposed to you as a visual editor and a proprietary YAML export you can't parse anywhere else. That means you can't grep across 400 tests, you can't diff a regression test in a pull request, you can't reconcile two copies after a merge, and you can't fork one to parameterize it without opening the vendor UI. The on-disk format matters more than any feature bullet on those pages, which is why none of the top-ranking guides mention it.
Where does Assrt put the generated test file?
At `/tmp/assrt/scenario.md`. The path is pinned in `assrt-mcp/src/core/scenario-files.ts` at line 16 as the `ASSRT_DIR` constant, then joined with `scenario.md`. Every call to `assrt_plan` or `assrt_test` writes the current plan text to that exact file. The metadata (scenario UUID, name, origin URL) lives next to it as `scenario.json`. Run artifacts land under `/tmp/assrt/<runId>/` with video, screenshots, events, and the pass/fail result. You can `tar czf regression.tgz /tmp/assrt/` and hand the whole thing to a teammate; everything is a boring filesystem path, nothing is locked to a session.
What does the AI actually generate inside scenario.md?
Five to eight `#Case N: name` blocks of plain English, max. That ceiling is explicit in `assrt-mcp/src/mcp/server.ts` inside the `PLAN_SYSTEM_PROMPT` at line 236: `Generate 5-8 cases max — focused on the MOST IMPORTANT user flows visible on the page.` Each block has a header like `#Case 1: Demo account can sign in`, followed by numbered steps in regular prose. At run time, the regex at `agent.ts` line 569 (`/(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi`) splits the file into scenarios and hands each block to the planner model. Because the file is markdown, GitHub renders it natively in PRs.
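The split is easy to reproduce. The regex below is the one quoted above; the sample plan text is hypothetical:

```typescript
// The header regex from agent.ts, applied to a hypothetical two-case plan.
const CASE_HEADER = /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi;

const plan = `#Case 1: Demo account can sign in
1. Navigate to the login page
2. Enter the demo credentials and submit
3. Assert the welcome banner appears

#Case 2: Wrong password shows an error
1. Enter a bad password and submit
2. Assert an error message appears`;

// Split on headers, drop the empty leading chunk, trim trailing blank lines.
const blocks = plan
  .split(CASE_HEADER)
  .map((b) => b.trim())
  .filter((b) => b.length > 0);

// blocks.length === 2; each element is one case body handed to the planner.
```

Note that the numbered steps inside a case never match, because the regex requires a Scenario, Test, or Case keyword before the digits.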
If I edit the generated file by hand, does my edit get lost on the next generation?
No. The file is watched by an `fs.watch` listener in `scenario-files.ts` at line 97. Every change kicks a `setTimeout` debounce set to 1000 milliseconds at line 102, and when that timer fires, `syncToFirestore` reads the current contents and calls `updateScenario` against the central API. So your hand-edit is the new source of truth one second after you save. If you regenerate instead, the newer plan overwrites the file, but your previous version is preserved inside the run record as `planSnapshot` (saved via `saveScenarioRun` in `scenario-store.ts`, line 169). You have a full plan history per run, keyed by run UUID.
How does this compare to a Playwright test file in my repo?
A Playwright spec is code: TypeScript that imports `test`, chains Playwright method calls, and gets compiled against your `playwright.config.ts`. An Assrt scenario file is intent: a markdown document a person wrote. Both drive the same engine (Assrt spawns `@playwright/mcp` under the hood), but the scenario file says what to verify and the spec says how to verify. The advantage for regression suites is that the scenario file does not break when a CSS selector changes. The regex parses headers, the agent calls `snapshot()` at run time to get a fresh accessibility tree, and the planner picks a new ref. The spec, by contrast, has to be republished whenever the DOM drifts.
Can I check the AI generated file into git alongside the app it tests?
Yes, and this is the point. Copy `/tmp/assrt/scenario.md` into your repo at something like `tests/regression/signup.md`, commit, push, and you have a regression test that sits in the same pull request as the code change that created the feature. Every diff is human-readable: a colleague reads `- Click the Confirm button` / `+ Click the Submit button` and understands the test change in one second. No vendor plugin, no visual editor, no separate account. Running the scenario is `npx assrt-mcp --url <url> --plan tests/regression/signup.md`.
What stops the fs.watch loop from syncing a generation back over itself?
A single-variable echo guard. When Assrt writes the plan to disk, `writeScenarioFile` sets `lastWrittenContent = plan` first (scenario-files.ts line 44). When the watcher fires, `syncToFirestore` reads the current contents and compares against `lastWrittenContent` (line 136); if they match, it skips the sync. The only writes that propagate to cloud storage are the ones the watcher cannot explain, i.e., human edits. This is the reason you can open the file in your editor, make a change, save, and see the central API receive a PATCH one second later without getting into a loop with the agent's own writes.
Does this only work for locally generated regression tests, or can the AI re-run the same plan later?
Each generation gets a UUID. Pre-run, the MCP server calls `saveScenario` and gets back a scenario ID (server.ts around line 410). Re-running later is `assrt_test({ url, scenarioId })`; the server calls `fetchScenario(scenarioId)`, pulls the saved plan text, and runs it against the current URL. This is how regression works in Assrt: the scenario ID is stable, the plan text is versioned, and each run produces a `planSnapshot` record so you can tell whether today's failure is because the test changed or because the app changed. The test IS the regression test, there is no separate compiled artifact.
Which model writes the AI generated regression tests?
By default, Claude Haiku 4.5 (model id `claude-haiku-4-5-20251001`, pinned at `assrt-mcp/src/core/agent.ts` line 9 as `DEFAULT_ANTHROPIC_MODEL`). Haiku is the smallest current Claude model and is used because plan generation is a cheap vision + text task: screenshot + accessibility tree in, 5-8 markdown case blocks out. You can override with `--model` to Sonnet 4.6 or to Gemini 3.1 Pro (default Gemini model is set at line 10). None of this is hidden; every inference call is made from your machine with your own API key, stored in macOS Keychain via `assrt-mcp/src/core/keychain.ts`.
Does the plan file work in CI, or is it tied to my laptop?
The plan file is portable text. In CI, check it into the repo, install the npm package, and run `npx assrt-mcp --url http://localhost:3000 --plan tests/regression/signup.md --headed=false`. The agent launches Chromium headless through `@playwright/mcp`, runs the scenario, and writes results under `/tmp/assrt/<runId>/`. Upload that directory as a workflow artifact and the plan, video, screenshots, and events.json show up in the Actions UI. Because the generator output is a text file, it does not depend on any dashboard or session that expires.
How many regression cases should one scenario.md hold?
Five to eight when auto-generated; as many as you want by hand. The 5-8 ceiling only constrains the AI generator (server.ts line 236) because narrow focused plans are more reliable for the first pass. When you edit the file yourself, add as many `#Case` blocks as you need; the regex at agent.ts line 569 does not care. A sensible pattern is one scenario.md per user flow (signup.md, checkout.md, settings.md), each holding 5-10 cases. The per-file cap keeps any single run under a few minutes and keeps failures easy to diff.
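One way that pattern looks on disk; file names and case counts here are illustrative:

```
tests/regression/
  signup.md      # 6 cases: account creation and sign-in
  checkout.md    # 8 cases: cart, payment, confirmation
  settings.md    # 5 cases: profile and preferences
```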
What if I want to diff two AI generations to see what changed?
That is just `git diff scenario.md` once the file is checked in. Because the format is unindented markdown with one header per case, a diff reads cleanly even for a non-tester. This is the specific capability vendor dashboards cannot match. When Momentic or Virtuoso regenerates a test, the change is a database mutation; to see it you open the UI, toggle two versions, and read a rendered timeline. With a markdown file under git, the regeneration is a diff in the PR, and your code reviewer can approve or reject it the same way they approve a source change.
1000 ms: the only number you need to remember about the watcher loop. It is pinned at line 102 of scenario-files.ts. Short enough to feel instant, long enough that a multi-line save fires only one PATCH.