Auto generate end to end tests where the share link works before the run starts
Most auto-generation tools hand you a share link only after the run finishes. You fire the test, wait 30 to 90 seconds for it to complete, then go back and copy the URL into your teammate's message. By then they have moved on. Assrt does it the other way around. The runner pre-saves the auto-generated scenario, claims a UUID, and emits paste-able artifact URLs in the response payload before the browser has even launched. Your teammate clicks the link in Slack while the test is still mid-flight, and the recording starts playing the moment the upload settles.
The whole pattern fits in three regions of one open source file. The pre-flight save lives at server.ts lines 407 to 425. The deterministic URL builder is at lines 676 to 685. The background artifact upload is fire and forget at lines 740 to 744. Every claim on this page points at a file path and a line number you can read in the open source assrt-mcp repo.
The shape of the problem
Two separate things tend to get bundled under "auto generate end to end tests": generating the test plan from a URL, and producing a usable artifact a team can collaborate on. Most tools do the first part well and the second part as an afterthought. The plan gets generated, the run executes, the run record gets a URL after the fact, and you scroll back through your terminal to find it. By the time the URL exists, the conversation that motivated the test has moved on.
The expensive thing is not the plan generation. Plan generation is one model call against three screenshots and 8000 characters of accessibility tree text, finishing in ten or fifteen seconds. The expensive thing is the latency between firing the test and being able to share the link with someone else. Pre-flight URLs collapse that latency into one HTTP round trip.
The pre-flight, in one block
The entire mechanism is five real lines of logic plus error handling. The pre-flight runs only when autoSave is true and there is no caller-supplied scenario ID. It calls saveScenario with the auto-generated plan plus metadata, gets back a real UUID, then rewrites /tmp/assrt/scenario.md so the file watcher syncs against the right scenario for the rest of the run. From that point on, every URL built from resolvedScenarioId is a real one.
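As a rough sketch of that gate: the function shape and names below are illustrative, not the real server.ts code; only saveScenario, the auto-save condition, and the local- pseudo-ID come from the behavior described on this page.

```typescript
// Illustrative sketch only: preflight() and SaveScenario are made-up names.
// The real logic lives at server.ts:407-425.
type SaveScenario = (plan: string) => Promise<{ id: string }>;

async function preflight(
  plan: string,
  callerScenarioId: string | undefined,
  autoSave: boolean,
  saveScenario: SaveScenario,
): Promise<string> {
  // Skip the pre-flight when auto-save is off or the caller
  // already supplied a scenario ID.
  if (!autoSave || callerScenarioId) {
    return callerScenarioId ?? `local-${Date.now()}`;
  }
  // One HTTP round trip claims the UUID before the browser launches.
  const { id } = await saveScenario(plan);
  return id; // every URL built from this ID is now a real one
}
```

The browser launch happens only after this promise resolves, which is why the URLs are valid before the run starts.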
“That is the entire mechanism for the share-link half. One HTTP round trip claims the UUID. The browser does not launch until the round trip returns. Everything you do with the URL after that point is independent of test duration.”
How the URL becomes paste-able
The runner controls the artifact filenames. Playwright's video recording lands at a fixed path the runner copies to recording.webm. The execution log is always called execution.log. Screenshot filenames are zero-padded indices the runner generates as the test progresses. Because the runner picks every name, buildCloudUrls can compute the full set of artifact URLs from just the scenario UUID and the run ID. Nothing in the URL depends on what the test does or how it terminates.
One HTTP round trip turns a freshly generated plan into a paste-able link
Scenario page URL
Resolves the moment the pre-save returns. Shows the auto-generated plan, the URL under test, optional variables, and (eventually) every run attached to this scenario. Pattern: /s/<uuid>. Built at server.ts:683 inside buildCloudUrls.
Video URL
Pattern: /s/<uuid>/runs/<runId>/recording.webm. Hits 404 during the run, hits the WebM file the moment the background upload at server.ts:740-744 completes. Filename is fixed at 'recording.webm' so the URL is computable upfront.
Log URL
Pattern: /s/<uuid>/runs/<runId>/execution.log. The log file is named exactly that at server.ts:601, so the deterministic URL is the same string every time.
Screenshot URLs
Pattern: /s/<uuid>/runs/<runId>/<idx>_step<n>_<action>.png. Each screenshot's filename is built at server.ts:468 as the run progresses, then included in the artifactNames.screenshots array passed to buildCloudUrls.
Why these URLs are computable upfront
The runner controls the file basenames: recording.webm and execution.log are fixed strings, and screenshot filenames are zero-padded indices the runner generates as it goes. Run a hundred scenarios and the URL shape is identical every time.
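A tiny helper in the documented shape makes the point concrete. The two-digit padding width is an assumption; only the <idx>_step<n>_<action>.png pattern comes from this page.

```typescript
// Hypothetical helper matching the documented screenshot filename
// pattern <idx>_step<n>_<action>.png; the padding width is assumed.
function screenshotName(idx: number, step: number, action: string): string {
  return `${String(idx).padStart(2, "0")}_step${step}_${action}.png`;
}
```

Because the runner generates these names itself, the full URL set is knowable before the first screenshot exists.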
The deterministic URL builder
Once the pre-flight has claimed the UUID and the run starts, the runner builds the artifact URLs in two passes. First it knows the basenames immediately because they are runner-controlled (execution.log, recording.webm). Second it accumulates the screenshot basenames as the test runs. The full cloudUrls object lands in the response payload before the function returns.
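A hedged sketch of what a builder like buildCloudUrls (server.ts:676-685) could look like. The base origin and the field names are placeholders; only the /s/<uuid>/runs/<runId>/<basename> path patterns come from this page.

```typescript
// Sketch only: BASE and the CloudUrls field names are assumptions.
const BASE = "https://example.com"; // placeholder origin

interface CloudUrls {
  scenario: string;
  video: string;
  log: string;
  screenshots: string[];
}

function buildCloudUrls(
  scenarioId: string,
  runId: string,
  screenshotNames: string[],
): CloudUrls {
  const run = `${BASE}/s/${scenarioId}/runs/${runId}`;
  return {
    scenario: `${BASE}/s/${scenarioId}`, // valid as soon as the pre-save returns
    video: `${run}/recording.webm`,      // fixed basename
    log: `${run}/execution.log`,         // fixed basename
    screenshots: screenshotNames.map((f) => `${run}/${f}`),
  };
}
```

Nothing in the output depends on what the test does or how it terminates, which is exactly why the URLs can ship in the response payload before the run starts.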
End to end timing, in one sequence
Here is the whole loop laid out as a sequence diagram. The horizontal axis is wall-clock time. The point of interest is the third message: the share link is in the response payload long before the browser opens. Every dotted line after that is happening while your teammate already has the URL.
Auto generation, pre-flight save, run, and background upload
The six steps in the lifecycle
Six steps, each firing from a single region of server.ts. The third step is where the URL becomes paste-able. Everything before it is local work; everything after it is artifact production that the URL eventually resolves to.
Auto generate the plan from a URL
assrt_plan launches a local Chromium, takes three screenshots at scroll offsets 0, 800, 1600, slices the accessibility tree to 8000 characters, sends that to claude-haiku-4-5-20251001 with the 18-line PLAN_SYSTEM_PROMPT, and writes the 5-8 #Case blocks to /tmp/assrt/scenario.md. server.ts:786-862. One model call. No proprietary intermediate format.
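The interesting constraint is the fixed input budget. A pure-logic sketch of assembling that input: the payload shape is an assumption, while the scroll offsets, the 8000-character cap, the model name, and the token limit are all from this page.

```typescript
// Sketch: the request shape is hypothetical; only the constants are documented.
const SCROLL_OFFSETS = [0, 800, 1600]; // three viewport screenshots
const MAX_A11Y_CHARS = 8000;           // accessibility-tree slice budget

function buildPlanInput(a11yTree: string, screenshotCount: number) {
  return {
    model: "claude-haiku-4-5-20251001",
    maxTokens: 4096,
    a11y: a11yTree.slice(0, MAX_A11Y_CHARS), // cap the tree text at the budget
    screenshotCount,                          // expected to equal SCROLL_OFFSETS.length
  };
}
```

Fixed budgets are what keep the plan step at one model call finishing in ten or fifteen seconds, regardless of how large the page is.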
Pre-flight save claims the UUID
Before the browser launches for the run, the runner POSTs the plan plus metadata to the scenario API. The API returns a UUID. The runner immediately rewrites /tmp/assrt/scenario.md with that UUID in the metadata header so fs.watch syncs the right scenario. server.ts:407-425.
Build the deterministic URLs
buildCloudUrls turns the scenario UUID, the run ID (a separate crypto.randomUUID() at line 429), and the intended artifact basenames into a flat object of paste-able URLs. The URLs are returned in the response payload. server.ts:676-685.
Run the actual test
Now the browser opens. The agent reads the plan, calls snapshot to get the live accessibility tree, picks ref IDs, drives the page, asserts on visible text and roles. The video is recorded to videoDir, screenshots to screenshotDir, log to runDir/execution.log. agent.run() at server.ts:573.
Save the run record
saveScenarioRun writes a row keyed on the scenario UUID with passedCount, failedCount, totalDuration, and the full reportJson. The video, log, and screenshots are uploaded in the same call's .then() handler. server.ts:723-746.
Background upload finalizes the URLs
uploadArtifacts is a fire-and-forget POST that lands the WebM, the log, and each screenshot at the URLs that were already in your teammate's clipboard. Failures log to stderr but do not affect the response. server.ts:740-744.
What it actually looks like in your terminal
Two commands. The first auto-generates a plan from a URL. The second fires the run and prints the share link two log lines into the response, before the browser has even opened. Pasting that line into Slack is the workflow that the rest of the design is in service of.
Side by side against the post-hoc URL pattern
Five rows. The left column is what every cloud-hosted auto-generator does today: the share link only exists once the run is done. The right column is the pre-flight pattern: the link is the contract, the artifacts are what the URL eventually serves.
| Feature | Typical SaaS auto-generator (post-hoc URL) | Assrt (pre-flight UUID + deterministic URLs) |
|---|---|---|
| When the share link becomes valid | After the run completes and the run record writes to the vendor's database. Median latency 30-90 seconds depending on test length. | After the pre-flight POST to the scenario API. One HTTP round trip, 100-300ms. The browser has not opened yet. |
| What clicking the link mid-run shows | 404 or 'run not found' until the test is done. Some vendors show a half-built page that hides the artifacts. | The scenario page with the auto-generated plan. Run section is empty until the upload completes, then populates without a refresh. |
| What happens if the run crashes | Share link may never become valid. The vendor's run record may not exist if the runner died mid-execution. | Scenario is already saved from the pre-flight, so the link resolves to a scenario page with zero attached runs. Still useful: shows what was supposed to be tested. |
| How the artifact filenames are picked | Vendor-controlled, often UUIDs that are computed post-run. The URL cannot be predicted from the outside. | Runner-controlled fixed names: recording.webm, execution.log, <idx>_step<n>_<action>.png. The URL is computable from the scenario UUID and run ID alone. |
| How to opt out | Usually impossible without changing tools. Self-hosting requires an enterprise plan. | Set ASSRT_NO_SAVE=1 (server.ts:404). The pre-flight is skipped, the scenario gets a local- prefixed pseudo-ID, the run runs locally only. |
Background upload, fire and forget
The artifacts upload to the deterministic URLs after the response has already been returned to the caller. The upload is wrapped in a .catch() so failures land in stderr but do not affect the response body. By the time this code runs, the share link has been in your teammate's clipboard for sixty seconds.
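The pattern itself is small enough to sketch. Everything below except the unawaited .catch() shape is illustrative; uploadArtifacts here is a stand-in parameter, not the real server.ts function.

```typescript
// Illustrative fire-and-forget wrapper around a background upload.
function fireAndForget(
  uploadArtifacts: () => Promise<void>,
  logError: (msg: string) => void, // stderr in the real runner
): void {
  // Deliberately not awaited: the response has already gone out.
  uploadArtifacts().catch((err) => {
    // Failures are logged but never surface in the response body.
    logError(`artifact upload failed: ${err}`);
  });
}
```

The design choice is that the share link's validity never depends on the upload succeeding; a failed upload leaves a 404 at a URL that was always a promise, not a guarantee of immediacy.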
The opt out, and what you trade away
Set ASSRT_NO_SAVE=1 before firing the run and the pre-flight does not happen. The check is one line at server.ts:404. The scenario gets a local- prefixed pseudo-ID, the run executes against your machine only, and the response carries no cloudUrls field. You still get the local Markdown plan at /tmp/assrt/scenario.md, the local video file, the local screenshots, the local results JSON. Air-gapped CI runners and offline development setups use this mode by default. The trade-off is that there is nothing to paste into Slack from this machine; the test is a local artifact only.
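The gate reduces to one boolean, roughly as follows. The `!process.env.ASSRT_NO_SAVE` check is quoted elsewhere on this page; the helper's name and the pseudo-ID suffix here are invented for illustration.

```typescript
// The env check is documented (server.ts:404); the rest of this
// helper (name, pseudo-ID suffix) is illustrative.
function resolveScenarioId(
  env: Record<string, string | undefined>,
  savedId?: string,
): string {
  const autoSave = !env.ASSRT_NO_SAVE; // any non-empty value opts out
  if (!autoSave || !savedId) {
    // Local-only run: no pre-flight, no cloudUrls in the response.
    return `local-${Math.random().toString(36).slice(2, 10)}`;
  }
  return savedId;
}
```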
One number to take with you
The pre-flight is one HTTP round trip, on the order of 100-300ms. From that point forward the share link is in your hand. The test still takes 30 to 90 seconds to run; the artifacts still take a few seconds to upload. But the latency between "I fired the test" and "my teammate has a clickable URL" is one round trip, not the full duration of the run. Multiplied across the dozens of times a week a dev fires an auto-generated regression test against staging, the difference is real.
Want to wire pre-flight share links into your auto-generation flow?
Bring a URL we can run an auto-generation against. We will pre-save the scenario, paste the link into a thread, then watch the run together while the artifacts upload behind it.
Frequently asked questions
What does it actually mean that the share link is valid before the test runs?
It means the moment you call assrt_test or fire `npx @m13v/assrt-mcp run`, the runner does one HTTP POST to the scenario API, gets back a UUID, builds the deterministic artifact URLs (page, video, log, screenshots, results), and returns those URLs in the response payload before Chromium has even launched. The implementation is at src/mcp/server.ts lines 407 to 425, with the comment 'Pre-flight: create scenario UUID BEFORE test execution so cloud URLs are deterministic'. You can copy that response into a Slack thread, post it on a PR, or pin it in a Notion doc, and the URLs will resolve to the right artifacts the second the upload finishes after the run. Other auto-generation tools assign the share link as a side effect of a completed run, so the URL does not exist until the test is done. Assrt inverts that: the URL is the contract, the artifacts are what the URL eventually resolves to.
How is this different from Playwright codegen, ZeroStep, or Mabl when it comes to auto generation?
Playwright codegen records you clicking through the app and emits TypeScript spec files to your local disk. There is no share link at all; you commit the file to your repo and your teammate reads it. ZeroStep runs scenarios in their cloud and generates a run record with a URL after the run finishes; the URL exists only post-hoc. Mabl, Testim, and Functionize all follow the same post-hoc pattern. Assrt's MCP runner does the auto-generation step locally (three screenshots, 8000 chars of accessibility text, one model call), then pre-saves the scenario to its scenario API to claim a UUID, then runs the test, then writes artifacts back to deterministic paths that the URL already pointed at. The auto-generation, the share link, and the run are three separate steps that can be observed independently.
What exactly is in the pre-flight save? What happens between the assrt_test call and the browser opening?
The pre-flight is one HTTP POST to the scenario API with a JSON body containing the auto-generated plan text, the scenario name, the URL under test, the optional pass criteria, the optional variables map, and the optional tags. The API responds with a UUID. Then the runner immediately rewrites the local file at /tmp/assrt/scenario.md with that UUID as part of the metadata header (scenario-files.ts:42-48 calls writeScenarioFile with the real ID), so the fs.watch debounce loop that syncs edits during the run will sync against the right scenario. Only after this is the browser launched (server.ts:527, McpBrowserManager.launchLocal). The whole pre-flight is one round trip plus one local file write, on the order of 100-300ms.
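The header rewrite is simple string surgery. Here is a sketch under an assumed metadata format, a single `id:` line at the top of the file; the real format handled by scenario-files.ts may differ.

```typescript
// Assumes the metadata header is a single "id: <value>" line; the
// actual scenario.md format may differ.
function rewriteScenarioId(fileText: string, realId: string): string {
  const header = `id: ${realId}`;
  return /^id: .*$/m.test(fileText)
    ? fileText.replace(/^id: .*$/m, header) // swap the pseudo-ID for the UUID
    : `${header}\n${fileText}`;             // or prepend if no header yet
}
```

After this write, fs.watch-driven syncs during the run target the real scenario rather than a stale local ID.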
Why does this matter? It feels like a small thing.
It is small in lines of code and large in workflow consequence. The Reddit-y dev pattern that drove the design: someone in #engineering says 'something is off in the staging signup flow', a teammate fires an auto-generated regression test against staging, and pastes the run link into the thread. With post-hoc URLs you have to wait 30-90 seconds for the run to finish, then go back, copy the URL, paste it. With pre-flight URLs the link goes into the thread first, then the run happens, then teammates click and see the live recording the moment the upload completes. Same artifacts, three fewer context switches per incident. Multiplied across a team of 8 hitting this pattern 5 times a week, that is a real time saving without any change to the underlying test runner.
What gets uploaded in the background versus what is in the URL immediately?
The URLs themselves are computed deterministically from the scenario UUID, the run ID (a separate crypto.randomUUID() at server.ts:429), and the file basenames the runner intends to produce: 'recording.webm' for the video, 'execution.log' for the log, '<index>_step<n>_<action>.png' for each screenshot, and 'latest.json' for results. The buildCloudUrls helper at server.ts:683 turns those names into URLs the moment the run begins. The actual files land at those URLs only after the run completes and the background uploadArtifacts call resolves (server.ts:740-744, fired with .catch() so it does not block the response). If you click the URL during the run you get a 404; click it after the run and you get the file. The contract is 'this URL will eventually serve this artifact', not 'this URL serves this artifact right now'.
What if the run crashes? Does the share link still resolve?
If the run completes, the saveScenarioRun call at server.ts:723-746 writes a run record with whatever artifacts did make it (a partial video, a truncated log, the screenshots taken before the crash). The share link points at those partials. If the entire process is killed before saveScenarioRun fires, the scenario itself is still saved (because of the pre-flight at line 407), so the scenario page resolves and shows the plan with no run attached. The worst-case failure mode is a scenario page that exists with zero runs, which is still useful: a teammate can see what was supposed to be tested even if nothing actually was.
Where does the auto-generation prompt come from? Is the auto-generated test a black box?
The auto-generation prompt for a URL is one constant called PLAN_SYSTEM_PROMPT at src/mcp/server.ts lines 219 to 236. It is 18 lines long. It tells the model what tools the runtime agent has, what output format to use (#Case N: name followed by 3-5 actions), and six rules: each case must be self-contained, selectors must be specific (text, not CSS), only verify observable things (visible text, page titles, URLs, element presence), keep cases short, do not test what is behind authentication unless visible, generate 5-8 cases max. There is no hidden second prompt and no proprietary YAML output. The output is a Markdown string the runner writes to /tmp/assrt/scenario.md (scenario-files.ts:17). You can read every byte of the auto-generation pipeline in one sitting.
How do I actually fire an auto-generation today? What is the first command?
Through an MCP client like Claude Code: connect the assrt-mcp server (one config line, see the README in the assrt-mcp repo), then ask 'Use assrt_plan against http://localhost:3000'. Claude calls the MCP tool, the tool launches a local Chromium, takes three screenshots at scroll offsets 0, 800, 1600, slices the accessibility tree to 8000 characters, sends that plus the 18-line prompt to claude-haiku-4-5-20251001 with max_tokens 4096 (server.ts:829-834), and writes the resulting #Case blocks to /tmp/assrt/scenario.md. From the CLI: `npx @m13v/assrt-mcp@latest plan --url http://localhost:3000`. Both paths produce the same Markdown output and the same pre-flight UUID once you call the run step. The whole loop is auditable end to end.
Can I disable the pre-flight save? I do not want to send my plan to a remote API.
Yes. Set the environment variable ASSRT_NO_SAVE to any non-empty value before running. The check is at server.ts:404: `const autoSave = !process.env.ASSRT_NO_SAVE`. With auto-save off, the pre-flight does not happen, the scenario gets a `local-` prefixed pseudo-ID instead of a real UUID, and the run executes against your machine only. The trade-off is that you do not get cloud artifact URLs to share, but you do get every other piece of the pipeline: the local Markdown plan, the local video file, the local screenshots, the local results JSON. Air-gapped CI runners and offline development setups use this mode by default.
I came here from a Reddit thread complaining that auto-generated tests are too slow to share. Does this fix that?
Most of the perceived slowness is the time between 'I fired the test' and 'I have a URL my teammate can click'. Pre-flight URLs collapse that gap to one HTTP round trip, on the order of 100-300ms. The test itself still runs in real time (30-90 seconds for a typical signup or checkout flow with the AI agent), and the artifacts still take a few seconds to upload. But the share link is in your hand the moment you fire the test, which means the rest of the latency happens out of band. If your concern is 'auto generation is so slow that a teammate has to wait until I copy the URL', this design pattern resolves it. If your concern is 'the test itself takes too long', that is a different problem and the answer is to write tighter #Case files (3-5 actions per case, not 10).