Automation tools in QA: rank them by what lands on disk
Every "top 10 automation tools in QA" listicle ranks by feature checklists. I want to propose a harder test: after one real run, what files are in the folder? If the answer is a vendor dashboard URL and nothing else, you do not own your tests. This page walks through the exact artifact folder Assrt writes, with file paths, sizes, and the source lines that produce each file.
What Assrt writes, every single run
Run one assrt_test call. Here is the exact tree on disk afterwards. No dashboard, no API calls, no server-side rendering required to see what happened. All paths verified in assrt-mcp/src/mcp/server.ts and assrt-mcp/src/core/browser.ts.
One MCP call, six artifacts
The agent takes three inputs (an MCP call, a markdown plan, a live page) and emits six files you keep forever. Nothing in the fan-out below depends on assrt.ai being online.
assrt_test fan-out
The six artifacts, one at a time
Each card below maps to a real file on disk after a run. The descriptions list the exact format, the exact location, and a tool you can use to open it with zero Assrt knowledge.
scenario.md
The test plan, in #Case N: name markdown. A one-character shell edit is a valid test rewrite. fs.watch reloads it with a 1-second debounce.
results/latest.json
Structured run report. Documented fields: runId, passed, passedCount, failedCount, duration, scenarios[], screenshots[], videoPlayerUrl, cloudUrls.
video/recording.webm
1600x900 VP9. Red cursor dot, click ripples, keystroke toasts, green heartbeat pulse in the corner. Plays in VLC; no vendor player required.
video/player.html
Single self-contained HTML5 page. Keyboard shortcuts 1, 2, 3, 5, 0 set playback to 1x, 2x, 3x, 5x, 10x. Space toggles play. Arrows seek +/-5s.
screenshots/
Numbered PNGs: 00_step0_init.png, 01_step1_screenshot.png, and so on. Step index zero-padded; any diff tool can compare two runs.
execution.log + events.json
Plaintext timestamped log plus full structured event trace. Feed events.json into any observability stack without an Assrt-specific parser.
The test plan, in full
This is a real example of what Assrt writes to /tmp/assrt/scenario.md. Two test cases, bullet steps, no selectors, no YAML structure tags. The parser is a single regex in assrt-mcp/src/core/agent.ts that splits on /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi. You can write this file by hand in any text editor and the agent will run it.
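To see how forgiving that parser is, here is a Python approximation of the quoted splitter regex applied to a hand-written plan (the plan text below is illustrative, not a file Assrt shipped; the real splitter lives in TypeScript in assrt-mcp/src/core/agent.ts):

```python
import re

# A hand-written plan in the #Case N: name format described above.
plan = """# Case 1: signup happy path
- open /signup
- type a valid email
- expect the success toast

# Case 2: signup rejects bad email
- open /signup
- type "not-an-email"
- expect an inline validation error
"""

# Python rendering of the documented regex:
# /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi
splitter = re.compile(r"(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*",
                      re.IGNORECASE)
cases = [chunk.strip() for chunk in splitter.split(plan) if chunk.strip()]

for case in cases:
    title, _, steps = case.partition("\n")
    print(title, "->", steps.count("- "), "steps")
```

Because the delimiter also accepts "Scenario" and "Test", with or without a leading #, a plan pasted from a ticket or typed from memory still splits into runnable cases.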
The result, structured and readable
And the matching run report, written to /tmp/assrt/results/latest.json and a historical copy at /tmp/assrt/results/{runId}.json. Every field documented. No XML, no vendor-specific enums, no compile step between you and the data.
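As a sketch of the report's shape: the field names below come from the documented schema on this page, while every value is made up for illustration (including the runId format).

```python
import json

# Field names from the documented schema; all values are illustrative.
report = {
    "runId": "run_20260412_093000",  # hypothetical run id
    "passed": False,
    "passedCount": 1,
    "failedCount": 1,
    "duration": 48213,  # ms
    "scenarios": [
        {"name": "signup happy path", "passed": True},
        {"name": "signup rejects bad email", "passed": False},
    ],
    "screenshots": [
        "screenshots/00_step0_init.png",
        "screenshots/01_step1_screenshot.png",
    ],
    "videoPlayerUrl": "video/player.html",
    "cloudUrls": [],
}

print(json.dumps(report, indent=2))
```

Note that videoPlayerUrl and the screenshot paths are relative, which is what keeps the run folder self-contained.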
Vendor YAML vs Assrt markdown
Same test, two very different files. The left one only compiles through the vendor's recorder. The right one is the entire test.
The same 'signup validation' test, in two tools
# Proprietary vendor test, compiled from a visual recorder.
# You cannot edit this file by hand without breaking the recorder.
id: 7f2c-4a8e-b
testName: signup_validation
steps:
  - $action: click
    $target: { $xpath: "//header//button[contains(@class,'primary')]" }
    $retryPolicy: { $maxAttempts: 3, $wait: 2000 }
  - $action: type
    $target: { $css: "form#signup input[name='email']" }
    $value: { $secret: "ref:env.TEST_EMAIL" }
  - $action: assertVisible
    $target: { $css: "div.toast.toast-success" }
meta:
  createdBy: recorder@vendor.io
  recorderVersion: 18.2.1
  proprietaryFlags: [ "internal_only", "compiled" ]

Real numbers from the implementation
Not marketing figures. These are constants and behaviors verified in the Assrt source tree. Every number below has a file-and-line reference available on request.
Video overlays
4 injected before recording
Red cursor dot, click ripples, keystroke toasts, and a green heartbeat pulse in the bottom-right corner that forces macOS compositor redraws. Implemented in assrt-mcp/src/core/browser.ts.
Player speeds
1x, 2x, 3x, 5x, 10x
Bound to keyboard keys 1, 2, 3, 5, 0 in the standalone player.html. Default on open is 5x. Arrow keys seek +/-5s. Space toggles play.
Retry backoff
5s / 10s / 15s
Linear, not exponential. The formula is literally (attempt + 1) * 5000. Fatal API errors skip retry entirely.
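The quoted formula, as plain arithmetic (assuming attempt is zero-indexed, which is what makes the delays come out to 5s, 10s, 15s):

```python
# Linear backoff from the formula above: delay_ms = (attempt + 1) * 5000.
def backoff_ms(attempt: int) -> int:
    return (attempt + 1) * 5000

# Delays before the first, second, and third retry.
delays = [backoff_ms(a) for a in range(3)]
print(delays)  # [5000, 10000, 15000]
```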
The tax of tools that fail the portability test
If you have shipped with one of these tools, you have paid at least one of the following costs. Most teams pay several, every quarter.
A five-step portability test for any QA tool
Run this checklist on whatever automation tool you are evaluating next. If it fails step two or three, the artifacts belong to the vendor, not to you. If it passes all five, your tests survive acquisitions, price hikes, and migrations.
Open. Play. Parse. Zip. Survive.
Open the scenario in a plain text editor
Can you read and edit the test in vim, VS Code, or GitHub's web UI without launching the vendor tool? Assrt stores scenarios as /tmp/assrt/scenario.md in #Case N: format. If your tool fails this step, every future edit pays the recorder tax.
Play the recording outside the vendor's player
Assrt writes recording.webm as raw VP9. It opens in VLC, in ffmpeg, in any browser. The included player.html is a convenience, not a dependency. If your tool's replay only works inside a logged-in dashboard, you do not own that evidence.
Parse the run result with any JSON reader
results/latest.json has a documented schema (runId, passed, passedCount, failedCount, duration, scenarios[], screenshots[], videoPlayerUrl, cloudUrls). jq handles it, Python's json module handles it, Go's encoding/json handles it.
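Since the page names Python's json module as one of those readers, here is a minimal CI gate built on it; the field names are from the documented schema, and the gate function itself is an illustration, not part of Assrt:

```python
import json

def gate(report: dict) -> int:
    """Map a parsed results/latest.json dict to a process exit code."""
    print(f"{report['passedCount']} passed, {report['failedCount']} failed "
          f"in {report['duration']} ms")
    return 0 if report["passed"] else 1

# Typical CI use:
#   raise SystemExit(gate(json.load(open("/tmp/assrt/results/latest.json"))))
```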
Zip the folder, mail it, replay on another machine
The entire /tmp/assrt/{runId}/ directory is self-contained. No absolute URLs, no cloud references, no vendor SDK. A teammate on a plane with no internet can still watch the exact failure at 5x speed.
Survive a 10x price increase from the vendor
Assrt is MIT licensed and self-hosted. The scenario, the runner, the MCP server, and the browser profile all live on your laptop. If the project gets acquired and someone raises the price, nothing breaks. Test portability IS pricing insulation.
How Assrt passes each check
The same five tests, scored against what Assrt actually does. Every check reflects behavior you can verify by running npx assrt-mcp once.
Assrt portability scorecard
- Test plan opens in vim without the vendor app
- Video replay plays in VLC
- Result file has a documented JSON schema
- Run folder replays offline from a zip
- License lets you fork if pricing breaks
- Runtime sends zero network calls beyond Playwright and the LLM
Assrt vs paid SaaS automation tools in QA
The features that matter when you open the folder.
| Feature | Paid proprietary recorder | Assrt |
|---|---|---|
| Test plan format | Proprietary YAML / JSON emitted by a recorder | Plain markdown: #Case N: name followed by bullet steps |
| Run video format | Cloud-hosted replay behind auth | recording.webm (1600x900 VP9) + standalone player.html |
| Result access | Dashboard only; scrape an API to export | results/latest.json with documented schema |
| Retry policy | Selector retry table configured per step | Fresh accessibility snapshot on failure; 4-attempt LLM retry with 5s/10s/15s backoff |
| Offline replay | Requires vendor cloud | Zip the run folder, open on any machine |
| Pricing model | $5K to $7.5K per month at enterprise scale | MIT licensed; self-hosted; pay only your LLM bill |
| CI integration | Install the vendor's CLI + inject a token | Pass the markdown as the plan arg; read the JSON result |
Prices cited are public enterprise quotes from platforms in the same category, April 2026.
The one scenario the others cannot run
Offline replay of a failed test. Copy the run folder to a USB stick, hand it to a teammate without internet, open the player, and see the failure at 5x speed. The folder contains its own recording, its own player, its own plan, and its own structured result. Nothing in it phones home.
Anchor fact
Per-run folder at /tmp/assrt/{runId}/ ships a 1600x900 VP9 recording.webm next to a standalone player.html whose keyboard shortcuts 1, 2, 3, 5, 0 bind to 1x, 2x, 3x, 5x, 10x playback. No asset paths are absolute. No script tag points at an external origin. Zip the folder, unzip on another machine, the replay still works.
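The zip-and-replay check is two standard-library calls; the helper names below are hypothetical, and the only assumption is the one the anchor fact states, that every path inside the folder is relative:

```python
import pathlib
import shutil

def archive_run(run_dir: str, dest_dir: str) -> str:
    """Zip a self-contained run folder for hand-off to another machine."""
    run = pathlib.Path(run_dir)
    return shutil.make_archive(str(pathlib.Path(dest_dir) / run.name),
                               "zip", run)

def unpack_run(zip_path: str, dest_dir: str) -> pathlib.Path:
    """Unzip on the other machine; then open video/player.html inside."""
    out = pathlib.Path(dest_dir) / pathlib.Path(zip_path).stem
    shutil.unpack_archive(zip_path, out)
    return out
```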
Quickstart: ship one portable test in 3 minutes
One shell line installs the MCP server, one MCP call runs the test, and the run folder is written to /tmp/assrt/ on your machine. That is it.
Want to pressure-test your current QA tool against the portability checklist?
Bring your vendor's artifact folder. We will open every file live on the call and map it to the five-step test above.
Book a call →

Automation tools in QA: common questions
Why should I pick automation tools in QA by the artifacts they leave on disk, not by feature lists?
Feature lists rot. The artifact folder is what actually stays in your repo. If your automation tool stores scenarios inside a SaaS dashboard, videos behind an auth URL, and results inside a proprietary format, your QA suite dies the day the vendor raises the price or shuts down. The only way to know whether you own your tests is to open the folder the tool writes to after a run. For Assrt the folder is /tmp/assrt/{runId}/ and every file in it is either plain text, JSON, PNG, WebM, or standalone HTML. Nothing depends on the assrt.ai website to keep working.
What exact files does Assrt write on every test run?
The MCP server writes /tmp/assrt/scenario.md (the test plan in markdown, #Case N: name format), /tmp/assrt/scenario.json (UUID, name, URL), /tmp/assrt/results/latest.json (structured result), and a timestamped per-run folder /tmp/assrt/{runId}/ containing screenshots/00_step0_init.png, 01_step1_screenshot.png ..., video/recording.webm (1600x900 VP9), video/player.html (standalone HTML5 player with keyboard shortcuts 1, 2, 3, 5, 0 for speeds 1x, 2x, 3x, 5x, 10x and Space / Arrow keys for seek), execution.log, and events.json. See assrt-mcp/src/core/browser.ts for the video writer and assrt-mcp/src/mcp/server.ts for the scenario and result writers.
What does the recording actually look like?
Playwright MCP records the raw window at 1600x900 with the VP9 codec. Before recording starts, Assrt injects four visual overlays via JavaScript: a red cursor dot that follows the pointer, click ripples that expand out from clicks, keystroke toasts in the top-right corner that show each key press as it happens, and a green heartbeat pulse in the bottom-right corner used to force the macOS compositor to redraw reliably. All four are implemented in assrt-mcp/src/core/browser.ts around lines 29 to 98. The file is a plain .webm you can open in VLC or ffmpeg without Assrt at all.
How do retries work when the model or the browser flakes?
The agent retries up to 4 times total (1 initial + 3 retries) on LLM 429, 503, or an error matching /overloaded/i. Backoff is (attempt + 1) * 5000ms, so the delays are 5s, 10s, 15s. Fatal API errors (anything matching /tool_use|tool_result|invalid_request/i) skip retry and fail immediately. Browser tool failures are handled differently: instead of a sleep-and-retry, the catch block calls browser.snapshot() again, attaches the fresh accessibility tree to the tool result, and lets the model re-reason. That replaces the classic retry-with-the-same-selector loop you get from most automation tools in QA.
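The LLM half of that policy can be modeled in a few lines. This is a simplified Python sketch, not the TypeScript in agent.ts; it keeps the described shape (1 initial attempt + 3 retries, message-matching to classify errors, fatal errors raised immediately) and uses a generic RuntimeError where the real code sees API error objects:

```python
import re
import time

# Error classification by message, as described above.
FATAL = re.compile(r"tool_use|tool_result|invalid_request", re.IGNORECASE)

def call_llm_with_retry(call, max_attempts=4, sleep=time.sleep):
    """One initial attempt plus three retries with linear backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError as err:
            if FATAL.search(str(err)):
                raise                       # fatal API errors skip retry
            if attempt == max_attempts - 1:
                raise                       # attempts exhausted
            sleep((attempt + 1) * 5.0)      # 5s, then 10s, then 15s
```

The browser-failure path is deliberately absent here: as the answer above says, a failed tool call does not sleep and retry, it re-snapshots the accessibility tree and lets the model re-reason.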
Is the video player really standalone?
Yes. player.html is a single self-contained HTML5 file that loads recording.webm from the same directory. It has no external script tags, no vendor SDK, no analytics pixels. Keyboard shortcuts: 1, 2, 3, 5, 0 set playback speed to 1x, 2x, 3x, 5x, 10x; Space toggles play; Left/Right seek by 5 seconds. The default speed on first open is 5x, and auto-opening the player can be disabled with autoOpenPlayer: false on the MCP call. You can zip the run folder, mail it to a teammate on a plane, and the replay still works.
Where are scenarios stored, and can I version them in git?
Runtime source of truth is /tmp/assrt/scenario.md, which is plain markdown. Editing the file triggers fs.watch with a 1000ms debounce that syncs edits back to Firestore for cross-device sharing (assrt-mcp/src/core/scenario-files.ts, debounce timer). But the markdown is the canonical artifact, and you can commit the same file into your repo. In CI you pass the markdown as the 'plan' parameter on assrt_test, bypass the watcher, and log only the structured result. git diff shows real prose changes, so PR reviews work.
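The reload behavior is a trailing-edge debounce: a burst of file events collapses into one sync, fired once the file has been quiet for the full window. A generic Python sketch of the same pattern (the real implementation uses Node's fs.watch with a 1000ms timer):

```python
import threading

class Debouncer:
    """Trailing-edge debounce: run fn once, delay_s after the last trigger."""
    def __init__(self, delay_s: float, fn):
        self.delay_s, self.fn = delay_s, fn
        self._timer = None

    def trigger(self):
        if self._timer is not None:
            self._timer.cancel()   # each new event restarts the window
        self._timer = threading.Timer(self.delay_s, self.fn)
        self._timer.start()
```

Five rapid saves from an editor therefore produce one sync, not five, which is why hand-editing scenario.md in vim does not hammer Firestore.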
How does Assrt compare to paid SaaS automation tools in QA on pricing and lock-in?
Paid enterprise QA automation platforms land in the $5K to $7.5K per month range once you add parallel execution and test data features. Tests live in their web UI, video replays require their cloud, and exporting the suite typically means scraping a JSON API. Assrt is MIT licensed, the test agent and MCP server run entirely on your machine, and the per-run artifact folder is enough to replay, debug, and attach to a Jira ticket without an account anywhere. The only optional cloud call is the scenario sync to Firestore, which you can disable with ASSRT_NO_SAVE=1.
Does Assrt generate real Playwright code like competitors promise?
Not as .spec.ts files, and that is deliberate. Generated Playwright code with explicit selectors becomes maintenance debt the moment the DOM moves. Assrt instead stores the scenario as prose (#Case N: name followed by bullet steps) and resolves each step against the live accessibility snapshot through Playwright MCP at runtime. The tool calls themselves are Playwright calls (click, type_text, select_option, navigate, press_key, evaluate, assert), defined in assrt-mcp/src/core/agent.ts lines 16-196, so the browser driver is genuinely Playwright. What you do not get is a brittle .spec.ts sitting in your repo waiting to break.
What does a good 'portability test' look like for any QA automation tool?
Five checks. One, can you open the test file in a plain text editor without running the tool? Two, does a run produce a video you can play in VLC, or only a replay viewer that requires a login? Three, is the result a machine-readable file with a documented schema, or an HTML page you have to scrape? Four, can you zip the run folder and mail it to someone who has never heard of the tool, and can they still see what failed? Five, does the pricing model still let you keep running the tool if the vendor 10x's the price tomorrow? Assrt passes all five by design; most proprietary automation tools in QA fail at least three.
How do I try Assrt end to end without committing?
One shell line: npx assrt-mcp. That registers the MCP server with your coding agent (Claude Code, Cursor, or anything supporting MCP). Then from your editor call the assrt_test tool with a URL and a markdown plan. The first run creates ~/.assrt/browser-profile (persistent Playwright profile), writes /tmp/assrt/scenario.md, produces /tmp/assrt/results/latest.json, and auto-opens the standalone video player at playback speed 5x. Nothing is sent anywhere you did not explicitly configure.