Test automation software tools, measured by what lands on your disk.
Every roundup of this category compares features. Codeless versus coded. AI versus scripted. Cross-browser coverage. Enterprise SSO. Those are all fine. None of them answers the question a buyer should ask first: when the run finishes, what is on my filesystem? For Assrt the answer is a tree of concrete files per run, written to a directory you can tar. For the SaaS platforms that share this shelf the answer is usually a URL. That difference is the whole piece.
The shelf label versus the install footprint
Every list in this category puts open-source libraries (Playwright, Cypress, Selenium) next to hosted platforms (Testim, Mabl, Functionize, testRigor, Katalon, Virtuoso) next to AI agents (QA Wolf, Momentic) and ranks all of them on the same feature axes. The axis those lists never draw is where your test actually lives after a run. For a library you installed, the answer is obvious: the test file is on your disk, the trace is on your disk, the video is on your disk. For a hosted platform, the answer is the same word every time: our cloud.
Buyers who treat "open source" as a proxy for "lives on my machine" often discover after signing up that several of the "open source" entries still require a free-tier SaaS dashboard to schedule, store results, or replay videos. The open repo is the client; the authoritative history lives elsewhere. The fair way to rank this category is to run each tool once, look at your disk, and ask: if I cancel tomorrow, do I still have this test?
The anchor: what one assrt run writes
Run npx assrt run --url http://localhost:3000 --plan checkout.md --video --json once and read the tree it leaves behind. Every path below comes out of assrt-mcp/src/mcp/server.ts lines 429-606 and src/core/scenario-files.ts lines 16-48.
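As a sketch, that tree looks like the layout below, assembled from the paths named on this page; the exact screenshot names and counts vary with the scenario.

```
/tmp/assrt/
├── scenario.md                  # the Markdown test plan
├── scenario.json                # plan metadata
├── results/
│   ├── latest.json              # alias for the most recent verdict
│   └── <runId>.json             # one verdict per historical run
└── <runId>/
    ├── screenshots/
    │   ├── 01_step1_navigate.png
    │   └── 02_step2_click.png
    ├── video/
    │   └── recording.webm
    ├── execution.log            # human-readable event stream
    ├── events.json              # same events, structured
    └── player.html              # standalone offline player
```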
The code that writes each of those files
These are the writes, in order. Nothing about the tree is generated by a helper library or abstracted behind a logging SDK. Four writeFileSync calls and two mkdirSync calls are the entire mechanism.
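As an illustrative reconstruction of that sequence (paraphrased from the description above, not copied from server.ts; the variable names are ours, and the sketch writes under os.tmpdir() so it runs anywhere):

```typescript
import { existsSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { randomUUID } from "node:crypto";

// Illustrative sketch of the two mkdirSync and four writeFileSync calls.
const runId = randomUUID();
const base = join(tmpdir(), "assrt-sketch");
const runDir = join(base, runId);

mkdirSync(join(runDir, "screenshots"), { recursive: true }); // mkdir #1
mkdirSync(join(runDir, "video"), { recursive: true });       // mkdir #2
mkdirSync(join(base, "results"), { recursive: true });       // pre-existing in assrt

const allEvents = [{ step: 1, tool: "navigate", ok: true }]; // placeholder events

writeFileSync(join(runDir, "execution.log"),                 // write #1: for grep
  allEvents.map(e => `step ${e.step} ${e.tool} ${e.ok ? "PASS" : "FAIL"}`).join("\n"));
writeFileSync(join(runDir, "events.json"),                   // write #2: for jq
  JSON.stringify(allEvents, null, 2));
writeFileSync(join(runDir, "player.html"),                   // write #3: offline player
  '<video src="video/recording.webm" controls></video>');
writeFileSync(join(base, "results", `${runId}.json`),        // write #4: the verdict
  JSON.stringify({ runId, passed: true }));

// By this point every file exists at a predictable path.
const written = ["execution.log", "events.json", "player.html"]
  .every(f => existsSync(join(runDir, f)));
```

The point of the sketch is the shape, not the code: synchronous writes, fixed names, no SDK between the run and the disk.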
Why zero-padded screenshot filenames matter
A small convention with an outsized effect on debugging. The filename is built from String(screenshotIndex).padStart(2, "0") at server.ts line 468, which means the directory listing you get from ls matches the execution order without a sort flag. Scroll the folder in Finder; you are watching the test replay.
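The convention is small enough to state as a function. This is a sketch of the expression quoted above; screenshotName is an illustrative name, not assrt's:

```typescript
// Builds the screenshot filename described above:
// zero-padded index, then step number, then action.
function screenshotName(index: number, step: number, action: string): string {
  return `${String(index).padStart(2, "0")}_step${step}_${action}.png`;
}

// Without padding, "10_..." sorts before "2_..." lexically.
// With padding, directory order equals execution order:
//   "02_step2_click.png" < "10_step10_click.png"
```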
The plan is the test; the plan is a file
The whole scenario is one Markdown file on your disk. Commit it. Diff it. Have two humans edit it. The agent reads the same file you do, and scenario-files.ts watches it with fs.watch so any hand edit syncs to the cloud record on a 1-second debounce.
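Because the plan is plain Markdown, tooling around it stays trivial. A sketch of splitting a plan into its #Case blocks, assuming one "#Case" heading per case (an illustrative parser, not assrt's own):

```typescript
// Splits a Markdown plan into its "#Case" blocks.
// Assumes each case starts with a line beginning "#Case".
function splitCases(plan: string): { title: string; body: string }[] {
  const cases: { title: string; body: string }[] = [];
  for (const line of plan.split("\n")) {
    if (line.startsWith("#Case")) {
      cases.push({ title: line.replace(/^#Case\s*/, ""), body: "" });
    } else if (cases.length > 0) {
      cases[cases.length - 1].body += line + "\n";
    }
  }
  return cases;
}

const plan = `#Case 1: Load the homepage
Assert the H1 is visible.
#Case 2: Add to cart
Click the first product.`;
const cases = splitCases(plan);
```

Anything that can split a string can lint, count, or diff your test suite; no vendor editor required.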
The verdict is a file too
Every scenario ends with a single JSON file at /tmp/assrt/results/<runId>.json, plus an alias at latest.json. Each entry is a structured list of scenario results: the assertions that ran, their evidence strings, the path to the video, and a count of screenshots. You can jq across every past run without asking a dashboard for permission.
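The same property in code: because every verdict is a JSON file in one directory, a failure scan is a loop, not an API call. A sketch assuming a minimal verdict shape of { runId, scenarios: [{ name, passed }] }; the real schema may differ:

```typescript
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

interface Verdict { runId: string; scenarios: { name: string; passed: boolean }[] }

// Returns run IDs whose verdict contains at least one failed scenario.
function failedRuns(resultsDir: string): string[] {
  return readdirSync(resultsDir)
    .filter(f => f.endsWith(".json") && f !== "latest.json") // skip the alias
    .map(f => JSON.parse(readFileSync(join(resultsDir, f), "utf8")) as Verdict)
    .filter(v => v.scenarios.some(s => !s.passed))
    .map(v => v.runId);
}

// Demo against a synthetic results directory:
const dir = mkdtempSync(join(tmpdir(), "assrt-results-"));
writeFileSync(join(dir, "a.json"),
  JSON.stringify({ runId: "a", scenarios: [{ name: "checkout", passed: true }] }));
writeFileSync(join(dir, "b.json"),
  JSON.stringify({ runId: "b", scenarios: [{ name: "checkout", passed: false }] }));
const failures = failedRuns(dir);
```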
The six writes, in order
Six filesystem operations cover the entire artifact pipeline. Every one corresponds to a real line in assrt-mcp/src/mcp/server.ts.
crypto.randomUUID() picks the run ID
Before Chrome starts, a UUID is generated so the on-disk path and the cloud URL share one identifier. /tmp/assrt/<runId>/ exists before a single screenshot is captured.
mkdirSync(runDir/screenshots) and mkdirSync(runDir/video)
Two subdirectories are created up-front, so every emit event knows where its artifact goes without an async path-resolution step (server.ts lines 431-432 and 541).
Every PNG lands as NN_stepN_<action>.png
String(index).padStart(2, '0') at line 468. A 47-step run sorts cleanly in any file browser; a 200-step run would need padStart(3), a bump we plan to make when the step cap moves.
Playwright writes video/*.webm; we rename to recording.webm
At lines 584-590 we diff the .webm files present before and after the session and rename the new one to recording.webm. One predictable filename per run, not a Playwright hash you cannot remember.
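The diff-and-rename step reduces to set arithmetic. A sketch of the idea (names illustrative, not the server.ts implementation):

```typescript
// Given directory listings captured before and after the browser
// session, returns the .webm files that appeared during it.
function newWebms(before: string[], after: string[]): string[] {
  const seen = new Set(before);
  return after.filter(f => f.endsWith(".webm") && !seen.has(f));
}

// The single new file would then be renamed to the predictable name,
// e.g. renameSync(join(videoDir, fresh[0]), join(videoDir, "recording.webm")).
const fresh = newWebms(["old-1a2b.webm"], ["old-1a2b.webm", "9f3c7d.webm"]);
```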
execution.log and events.json written on exit
Both are produced from the same allEvents array: the log is string-formatted for grep, the JSON is structured for jq and other machines. They are written back-to-back in the same exit path, so they can never disagree.
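A sketch of deriving both formats from one array (the event shape here is illustrative, not assrt's actual schema):

```typescript
interface RunEvent { step: number; tool: string; detail: string }

const allEvents: RunEvent[] = [
  { step: 1, tool: "navigate", detail: "http://localhost:3000" },
  { step: 2, tool: "assert", detail: "H1 visible: PASS" },
];

// Human-readable, one line per event: the grep target.
const logText = allEvents
  .map(e => `[step ${e.step}] ${e.tool}: ${e.detail}`)
  .join("\n");

// Machine-readable, same array: the jq target.
const jsonText = JSON.stringify(allEvents, null, 2);
// Both views are projections of one array, so they cannot drift apart.
```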
player.html generated with the video inlined as a <video> tag
Self-contained HTML at runDir/player.html. Open it directly or serve it on 127.0.0.1 with assrt's built-in Range-supporting server (server.ts lines 118-215); seek works on multi-megabyte .webm without a full download.
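Seeking works because the local server honors HTTP Range requests. A sketch of the parsing half of that (parseRange is an illustrative name; the real handler at server.ts lines 118-215 will differ):

```typescript
interface ByteRange { start: number; end: number }

// Parses a "Range: bytes=start-end" header against a known file size.
// Returns null for an absent or malformed header (the caller then
// sends 200 with the whole file); a valid range gets a 206 response
// with "Content-Range: bytes start-end/size".
function parseRange(header: string | undefined, size: number): ByteRange | null {
  const m = header?.match(/^bytes=(\d*)-(\d*)$/);
  if (!m || (m[1] === "" && m[2] === "")) return null;
  const start = m[1] === "" ? size - Number(m[2]) : Number(m[1]); // "-N" = last N bytes
  const end = m[1] !== "" && m[2] !== "" ? Number(m[2]) : size - 1;
  if (start < 0 || start > end || end >= size) return null;
  return { start, end };
}

// A <video> element seeking into a large .webm sends e.g. "bytes=5000000-",
// and the server replies 206 with only that tail of the file.
```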
From one run to every artifact
The flow below is the whole pipeline. A single CLI invocation fans out into a small set of named files on disk, and nothing is implicit or lazy: by the time the command returns, every file exists at a path you can predict.
assrt run -> /tmp/assrt/<runId>/ -> { screenshots/*.png, video/recording.webm, execution.log, events.json, player.html }
Hosted platform versus a tool that leaves a tree
Same category on a roundup shelf, very different answer to the question that matters after a year of use: where is my history?
Hosted test automation SaaS vs a local-artifact tool
Same shelf label, different install footprint.
| Feature | Hosted test automation SaaS | Assrt (local artifacts) |
|---|---|---|
| Where the test plan lives | Vendor DB; edits happen in the vendor's UI | /tmp/assrt/scenario.md on disk; git-friendly Markdown |
| Where the test log lives | Vendor DB; ZIP export if they allow it | /tmp/assrt/<runId>/execution.log + events.json, per run |
| Where the video lives | Vendor CDN; URL expires when account does | /tmp/assrt/<runId>/video/recording.webm (local .webm) |
| Offline replay | Requires dashboard login | Open /tmp/assrt/<runId>/player.html directly in any browser |
| Runtime | Vendor cloud executors | Local Chromium via Playwright MCP on your machine or CI |
| Package | Closed-source platform; per-seat contract | assrt-mcp on npm, MIT-licensed, one dev dependency |
| If the vendor vanishes | Dashboard returns 502; test history is gone | npx assrt still works from your node_modules cache |
The numbers for one representative run
A ten-turn guest-checkout scenario on a moderate-sized SPA leaves roughly one PNG per turn, one recording.webm, the execution.log/events.json pair, and the player; file counts and sizes grow roughly linearly with scenario length.
A nine-item portability check for any tool in this category
Run the tool once. Open the disk. Go through the list. Every entry a tool cannot honestly check off is the gap between its shelf label and actual software.
What a local-artifact tool looks like
- The plan is a plain text file you can open without any vendor app
- Every run writes a self-contained directory, not a pointer to a cloud record
- Screenshots are numbered so directory sort equals execution order
- The video is a real .webm next to the plan, not a streaming URL
- There is a standalone HTML player on disk that works offline
- The event log exists in both human and machine readable form
- The package is installable without an account (npx or npm install)
- The LLM call goes from your machine to the model vendor, not through the test-tool vendor
- A single tar of /tmp/assrt/ is the complete artifact of everything you ran
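The disk half of that checklist can be automated. A sketch that verifies a run directory against the file names this page documents (missingArtifacts is our name, not assrt's):

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Returns the expected artifacts missing from a run directory.
function missingArtifacts(runDir: string): string[] {
  const expected = ["execution.log", "events.json", "player.html",
                    join("video", "recording.webm")];
  const missing = expected.filter(f => !existsSync(join(runDir, f)));
  const shotsDir = join(runDir, "screenshots");
  if (!existsSync(shotsDir) || readdirSync(shotsDir).length === 0) {
    missing.push("screenshots/*.png");
  }
  return missing;
}

// Demo against a synthetic run directory that lacks player.html:
const runDir = mkdtempSync(join(tmpdir(), "assrt-check-"));
mkdirSync(join(runDir, "screenshots"));
mkdirSync(join(runDir, "video"));
writeFileSync(join(runDir, "screenshots", "01_step1_navigate.png"), "");
writeFileSync(join(runDir, "video", "recording.webm"), "");
writeFileSync(join(runDir, "execution.log"), "");
writeFileSync(join(runDir, "events.json"), "[]");
const gaps = missingArtifacts(runDir);
```

Point it at a real /tmp/assrt/<runId>/ and an empty return value is the portability claim, verified.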
Where Playwright, Cypress, and Selenium fit
These three pass the portability check on their own; they always have. They are libraries you write tests against, your test files are yours, and their runners dump traces to your disk. What this page argues is not that they should be replaced; it is that most of the other entries on the same roundup should be honest about how little they leave behind. Assrt sits a layer above Playwright, uses the Playwright MCP protocol under the hood to drive a real browser, and inherits Playwright's video/trace machinery for the captures on disk. If you already have a Playwright suite you love, keep it. Use Assrt for the surfaces where selectors drift faster than humans can chase them; keep the .spec.ts files for everything deterministic.
Run one scenario on your app; we read the tree together
Fifteen minutes, your URL, one Markdown case. We will run it live, then open /tmp/assrt/<runId>/ side by side and walk through what ended up on disk and why.
Book a call →

Frequently asked questions
What counts as a test automation software tool, versus a SaaS platform with a CLI?
A software tool runs on your machine, stores artifacts on your filesystem, and keeps working if the vendor's website is down. A SaaS platform runs tests in the vendor's cloud, stores traces in the vendor's database, and stops producing usable output the moment billing lapses. Most roundups mix the two under a single label because both are sold as 'tools for test automation.' The practical test: delete the vendor's dashboard URL from your browser history and cancel the account. If you still have every past test run, plan text, video, and assertion log on disk, it was software. If the test history is gone, it was a dashboard. Assrt is the first: every scenario is a Markdown file, every run is a directory under /tmp/assrt/, and the video player is a standalone HTML file that works on any localhost server.
Exactly what files end up on disk after one assrt run?
One call to assrt_test creates /tmp/assrt/<runId>/ and writes five things into it: screenshots/NN_stepN_<action>.png (one PNG per turn, zero-padded filename so ls sorts them in order), video/recording.webm (the full Playwright video for the session), execution.log (the human-readable event stream), events.json (the same events in parsed form), and player.html (a self-contained video player with timeline markers, generated fresh each run). At the parent level, /tmp/assrt/scenario.md is the test plan, /tmp/assrt/scenario.json is its metadata, /tmp/assrt/results/latest.json is the last verdict, and /tmp/assrt/results/<runId>.json is the verdict for every historical run. Tar the /tmp/assrt directory and you have a portable artifact nobody else is holding a copy of.
Why zero-padded screenshot filenames? Is that a small detail?
It is a small detail with a real consequence. The filename convention in assrt-mcp/src/mcp/server.ts line 468 is String(screenshotIndex).padStart(2, '0') + '_step' + currentStep + '_' + currentAction + '.png'. The zero-padding makes ls and every file browser sort the screenshots in true execution order, not lexical order. In a 47-step scenario, step 10's screenshot sorts before step 2's without padding; with padding, '10_step10_click.png' comes after '02_step2_click.png'. You get a flipbook of the run by scrolling a directory, not a sorted query in a hosted viewer. Few tools make this choice; most dump screenshots with a UUID or timestamp, which is precisely what breaks casual inspection.
Is it actually open source, or is it one of those 'open source with a cloud' setups?
The assrt-mcp package on npm is MIT-licensed and runs end-to-end on your machine. There is an optional cloud component at app.assrt.ai that gives each scenario a shareable URL, but the cloud is a mirror, not the source of truth. scenario-files.ts watches /tmp/assrt/scenario.md for edits and pushes them to Firestore on a 1-second debounce (lines 97-103), and buildCloudUrls in scenario-store.ts constructs the public URL deterministically so it works before the upload completes. If app.assrt.ai returns 500, writes to local scenario.md still succeed, the next assrt run still executes, the video player still plays. Cloud sync is skipped entirely if you register the scenario as local-only (scenarioId starts with 'local-', scenario-store.ts line 127). Delete the /tmp/assrt directory and your tests are gone, mirror or no mirror; the disk is the authority.
Every list puts Playwright, Cypress, and Selenium at the top. How is Assrt different from those?
Playwright, Cypress, and Selenium are libraries you program against. You write test.spec.ts, call page.locator('button.submit'), and the runner executes your code. They are software tools by every definition, and they are excellent. Assrt uses Playwright under the hood, via the Playwright MCP protocol, but the layer above is different: you do not write test.spec.ts. You write a Markdown plan with #Case blocks, and an agent turns that plan into runtime tool calls (navigate, click, type_text, assert) against the live accessibility tree. For someone who already maintains a Playwright suite, Assrt is not a replacement; it is a faster layer for the flows where selectors drift faster than humans can update them. For someone choosing their first automation tool, Assrt skips the selector layer entirely and writes plain Markdown scenarios you still own on disk.
What does the vendor keep if I only use the local install, no cloud account?
Nothing about your tests. The package is installed from npm, the browser is launched on your machine via Playwright MCP, the LLM call goes directly from your machine to Anthropic's API using your ANTHROPIC_API_KEY, and artifacts are written to /tmp/assrt/. We get zero scenarios, zero traces, zero screenshots. If you set ASSRT_TELEMETRY=0 no run metadata ever leaves your machine. The only exception is when you explicitly pass a scenarioId that was created against app.assrt.ai; then the scenario plan (the Markdown text) is synced to that cloud record so it survives a disk wipe. That sync is a read-write mirror, not an authority: the disk copy is always canonical. If the company vanishes tomorrow, you run npx assrt from your existing node_modules and everything works.
How do I read a past run without any vendor UI?
Open /tmp/assrt/<runId>/player.html in any browser; that is a standalone HTML file generated at server.ts line 619, with a small JavaScript player that loads video/recording.webm next to it. The timeline shows step markers, so you can click to jump to the turn where an assertion failed. For the event-level view, cat /tmp/assrt/<runId>/execution.log prints the scenario start, every tool call, every reasoning step, every assertion pass or fail, and the complete_scenario summary. For machine-readable post-processing, /tmp/assrt/<runId>/events.json has the same content as structured JSON. You can grep across every past run by running grep -r 'FAIL' /tmp/assrt/; you cannot do that against a vendor dashboard.
Is there a comparison of actual install footprints across the tools these lists rank?
Not that I have found, because the answer is embarrassing for most of the list. Of the 16 or so tools routinely ranked in this category, only four produce fully local artifact trees (Playwright, Cypress, Selenium, Assrt). The rest store the canonical test log, video, and assertion history in the vendor's cloud, with a 'download as ZIP' button that bundles some subset. The download button is the shelf label; it is not the same as a tool that writes a self-contained directory every run without a cloud round-trip. If you rank the roundup by 'post-run, is the test log on my filesystem?', most entries fail silently.
Can I keep the whole /tmp/assrt tree in git?
Plans go in git, artifacts usually do not. The convention we use: commit /tmp/assrt/scenario.md (or move it into your repo under tests/ if you prefer), ignore the /tmp/assrt/<runId>/ directories because they contain binary artifacts that bloat a repo. If you want an audit trail, upload the video and events.json to your own object store from CI; the run ID is stable and deterministic so the paths in a CI artifact server mirror the local paths. A small assrt-ci.sh one-liner (cp -r /tmp/assrt/$RUN_ID $ARTIFACTS_DIR) is enough. The key property is that the files exist as files, not as blobs held by a third party.
What is the smallest real command to produce this directory?
npx assrt run --url https://your.app --plan '#Case 1: Load the homepage and assert the H1' --video --json. That installs assrt-mcp from npm into your npx cache, launches Playwright, navigates to the URL, writes /tmp/assrt/<runId>/screenshots/, video/recording.webm, execution.log, events.json, and player.html, and prints the JSON verdict to stdout. No account, no login, no browser-redirected auth flow. ANTHROPIC_API_KEY is the only environment variable required; everything else has a sane default. Check the directory with ls -la /tmp/assrt/ once the command returns. That is the whole surface.