A test automation tool where the test file is a Markdown document, not code
Every other guide about test automation tools compares programming languages, parallel shards, and pricing tiers. None of them answers the only question an AI-era team should be asking: what exactly is the file your tests live in, and who is allowed to edit it? Assrt's answer is one path, one format, and one watcher. This is the whole design, and the article no buyer's guide will print.
The gap
Comparison articles sort by language. The actual question is about the file.
Type "test automation tool" into any search bar and you will get a wall of listicles: Selenium 4, Cypress, Playwright, Mabl, Testim, QA Wolf, BrowserStack. They sort by supported languages, parallel run count, and starter plan price. They almost never describe the single most consequential thing about the tool you are about to adopt, which is the format your tests live in and who can edit that format once the product is live.
Two historical answers have dominated. A script-based tool (Selenium, Cypress, Playwright) stores tests as code in a language you picked. A recording-based tool (Mabl, Testim, QA Wolf) stores tests as a proprietary YAML or binary tree inside its cloud. Both lock you in, just differently. Scripts demand rewrites every time the UI shifts. Recordings demand that their vendor stay alive.
Assrt does neither. Your plan is a plain English file. The path is not a secret. The format is Markdown. And the file is watched, so edits propagate without a build step. If you want to understand whether this matters for your team, keep reading. If you want to skip to the anchor, it starts three sections down at /tmp/assrt/scenario.md.
- Playwright MCP tools wired in: agent.ts lines 16–196
- Debounce before cloud sync: scenario-files.ts line 102
- License cost per seat: free, vs $7,500+/mo enterprise AI testing SaaS
The anchor
One path, one format: /tmp/assrt/scenario.md
When you run assrt_test via the MCP server or the CLI, the first thing the tool does is write your plan to disk. Not a temp buffer, not an in-memory object: an actual Markdown file at a predictable path. Here is what ends up in that file:
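The exact contents depend on your plan. The #Case block convention and the {{KEY}} placeholders are the ones this article describes; the step wording below is an illustrative sketch, not a canonical example from the tool:

```markdown
#Case Log in and reach the dashboard
Open {{BASE_URL}} and click the "Sign in" button.
Type the test user's email and submit the form.
Assert the dashboard heading greets the user by name.
```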
That is the entire specification. There is no hidden step list, no generated Playwright .spec.ts sitting behind it, no XPath cache. At the next run the file is read back in, handed to an LLM agent, and the agent interprets each #Case block against a fresh accessibility tree. That is the whole design.
And the file is watched
Line 97 registers a Node fs.watch callback on the scenario file. Every edit, whether you made it in your editor or a coding agent made it mid-session, resets a 1000ms debounce timer. When the timer fires, syncToFirestore() pushes the new content to shared storage so the next run, on your machine or a teammate's, sees the latest plan. No build pipeline, no artifact upload.
How the file moves
Three editors, one source of truth, no compile step
scenario.md is the only artifact that matters
Anatomy of a run
What actually happens when you press run
The plan gets written to disk
writeScenarioFile() at scenario-files.ts line 42 dumps your #Case text verbatim to /tmp/assrt/scenario.md. An accompanying scenario.json stores id, name, url, and updatedAt.
fs.watch starts watching the file
startWatching() on line 90 installs the Node watcher with a 1000ms debounce. Any future edit triggers syncToFirestore() so teammates see the change.
The preflight checks the target URL
Before burning time on Chrome, Assrt does a HEAD request to your URL with an 8000ms timeout. A wedged dev server fails fast with an actionable error instead of manifesting as an opaque MCP disconnect.
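A sketch of that preflight under the stated numbers (a HEAD request, an 8000ms budget); the function shape, error text, and injectable fetch are assumptions for illustration, not the tool's actual code:

```typescript
// Fail fast on a wedged dev server: one HEAD request with a hard timeout,
// instead of letting the problem surface later as an opaque disconnect.
// `fetchImpl` is injectable so the check is testable without a network.
export async function preflight(
  url: string,
  timeoutMs = 8000,
  fetchImpl: typeof fetch = fetch,
): Promise<void> {
  const res = await fetchImpl(url, {
    method: "HEAD",
    signal: AbortSignal.timeout(timeoutMs),
  }).catch((err: Error) => {
    throw new Error(`Preflight could not reach ${url}: ${err.message}`);
  });
  if (!res.ok) {
    throw new Error(`Preflight failed: HEAD ${url} answered ${res.status}`);
  }
}
```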
Playwright MCP launches the browser
Local by default. --headed for a visible window, --isolated for an in-memory profile, --extension to attach to your running Chrome via CDP. The browser boots once and is reused across scenarios.
The LLM rereads scenario.md and drives the browser
claude-haiku-4-5-20251001 by default. It receives the accessibility tree, picks one of eighteen tools (navigate, click, type_text, wait_for_stable, http_request, and so on), and iterates until it can call complete_scenario.
Assertions, a video, and the results file land on disk
Every assert() call is logged. A .webm recording is auto-opened in a player at 5x speed. Results sit at /tmp/assrt/results/latest.json so the coding agent can Read them without a network call.
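Because the results land as plain JSON at a known path, the post-run check is one file read. In this sketch only the path comes from the article; the field names and helper functions are assumptions about what such a schema might look like:

```typescript
import { readFileSync } from "node:fs";

// Illustrative shape only: the real schema may differ.
interface RunResults {
  passed?: boolean;
  assertions?: Array<{ description?: string; ok?: boolean }>;
}

// Pure parse step, separated out so it is trivially testable.
export function parseResults(json: string): RunResults {
  return JSON.parse(json) as RunResults;
}

// Read the last run's results straight off disk, no network call needed.
export function readLatestResults(path = "/tmp/assrt/results/latest.json"): RunResults {
  return parseResults(readFileSync(path, "utf8"));
}
```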
What it looks like
A single run, from the shell
Put the plan inline or in a file. No config, no describe/it boilerplate.
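A minimal invocation, using the package name and flags quoted elsewhere in this article (`--url`, `--plan`); whether `--plan` accepts both inline text and a file path is an assumption here:

```shell
# Inline plan, straight from the shell:
npx @assrt-ai/assrt run --url http://localhost:3000 \
  --plan "#Case Log in and assert the dashboard loads"

# Or hand it the file you already own:
npx @assrt-ai/assrt run --url http://localhost:3000 --plan /tmp/assrt/scenario.md
```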
The coding agent loop
The first test automation tool your coding agent can actually fix
Because the plan is a Markdown file at a known path, the same agent that wrote your feature can fix the broken test. There is no specialised test DSL to learn. The agent reads /tmp/assrt/scenario.md with its ordinary Read tool, edits one line, and the file watcher picks up the change for the next run. That is the whole loop:
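Sketched end to end (the step sequence is illustrative; the paths and the debounce come from this article):

```
1. Run fails        → agent Reads /tmp/assrt/results/latest.json
2. Agent Reads        /tmp/assrt/scenario.md and spots the stale step
3. Agent Edits        one line of the plan with its ordinary Edit tool
4. fs.watch fires   → the 1000ms debounce settles, the plan syncs
5. Next run reads     the updated file and goes green
```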
How this differs from what you already have
Assrt vs a typical script-based or recorded automation tool
| Feature | Script / recording tool | Assrt |
|---|---|---|
| Where your test plan lives | Compiled .spec.ts files, or a proprietary YAML DSL inside a SaaS UI | Plain Markdown at /tmp/assrt/scenario.md, synced to cloud via fs.watch |
| What your coding agent can do with it | Read only, or regenerate from scratch and overwrite your diff | Read and Edit the file with normal tools, changes picked up on next run |
| What happens when a selector changes | Script breaks, CI red until someone rewrites getByRole | LLM rereads the plan, looks at the accessibility tree, finds the new label |
| Source of the test driver | Closed binary, vendor-hosted cloud | Open source TypeScript. Self-host it or run npx @assrt-ai/assrt |
| Typical team license | $7,500/mo and up for enterprise AI testing SaaS | Free. You pay the Anthropic or Google bill for the model, nothing else |
| What you keep if you walk away | Nothing. The YAML does not run anywhere else | A Markdown file and a Playwright MCP config. Both open formats |
Assrt is not a drop-in replacement for high-scale shard/parallel-heavy CI setups. It is optimised for the dev-loop case where a coding agent writes the feature, writes the test, and fixes the test.
What the agent can do
18 tools, defined in one file, readable in ten minutes
Page interaction
navigate, snapshot, click, type_text, select_option, scroll, press_key, wait, wait_for_stable, screenshot, evaluate. The agent gets an accessibility tree, picks a ref like e5, and acts on it.
Disposable email and OTP
create_temp_email spins up a disposable address. wait_for_verification_code polls the inbox. The system prompt teaches the agent to paste OTPs into split single-character inputs via a single ClipboardEvent, rather than typing one field at a time.
External API verification
http_request lets the agent poll Telegram, Slack, GitHub, or any webhook endpoint to verify that an action in the web app produced the expected external effect.
Assertions and suggestions
assert logs a pass/fail with evidence. suggest_improvement flags UX bugs the agent spotted while running the plan, so every test run doubles as a light product review.
Continuous page discovery
Every URL the agent visits is queued for auto-discovery. A secondary model generates extra test ideas for that page in the background, up to MAX_DISCOVERED_PAGES = 20.
Variables and pass criteria
Pass variables as {{KEY}} placeholders and they get interpolated into the plan. Pass passCriteria as free text and the agent must verify every condition or mark the scenario failed.
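The {{KEY}} contract is simple enough to sketch in a few lines. This is an illustration of the behaviour described above, not the tool's actual code:

```typescript
// Replace every {{KEY}} placeholder with its variable value; placeholders
// with no matching variable are left intact so the mistake stays visible.
export function interpolate(plan: string, vars: Record<string, string>): string {
  return plan.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) => vars[key] ?? placeholder);
}
```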
Want to see scenario.md on your own app?
Bring one user flow that keeps breaking. We will turn it into plain English in a file you own, run it live, and show you the video.
Book a call →

Frequently asked questions
What is a test automation tool in 2026, and why are plain English plans relevant?
A test automation tool runs browser interactions against a real application and reports pass/fail without a human clicking through. Until recently that meant a compiled script (Selenium, Cypress, Playwright) or a point-and-click recorder that exports a proprietary YAML DSL (Mabl, Testim, QA Wolf). Both assume the test is static code. In an agent-native tool like Assrt, the plan is plain text that an LLM interprets at runtime. The practical payoff is that minor UI edits (a relabelled button, a reordered form) do not break the test, and a coding agent can open the same file and patch it the way a human would.
Where does Assrt actually store my test plan?
On disk at `/tmp/assrt/scenario.md`. The file is created by `writeScenarioFile()` in `src/core/scenario-files.ts` (line 42 in the assrt-mcp source). Results for the last run land at `/tmp/assrt/results/latest.json`, and per-run artifacts live at `/tmp/assrt/<runId>/`. Every assrt_test call rewrites `scenario.md` with the current plan, which means Claude Code, Cursor, or any other file-tool-equipped agent can Read and Edit it without a custom API.
How does Assrt detect when I or the agent edit the Markdown file during a session?
`startWatching()` calls Node's `fs.watch(SCENARIO_FILE, { persistent: false })` (line 97 in scenario-files.ts). The change event is debounced for 1000ms, then `syncToFirestore()` pushes the new content to the shared store. The next assrt_test run reads the updated plan on disk before handing it to the agent. One caveat: scenarios with IDs prefixed `local-` are skipped, so offline-only plans stay local.
If the plan is plain English, what actually drives the browser?
Playwright MCP. Assrt wraps `@playwright/mcp` and exposes a fixed tool list (eighteen tools in `agent.ts` lines 16 through 196) to the LLM. The model gets an accessibility snapshot of the current page, decides which tool to call (`click`, `type_text`, `press_key`, `wait_for_stable`, etc.), and Assrt proxies the call into Playwright. The agent also owns scenario orchestration, disposable email creation, and OTP pasting for split code inputs.
Which model runs my tests, and can I change it without rewriting anything?
Default driver is `claude-haiku-4-5-20251001`, set on line 9 of `assrt-mcp/src/core/agent.ts`. You can pass `--model` to the CLI or `model` to the MCP tool to swap in another Anthropic model, or `--provider gemini` for Google's `gemini-3.1-pro-preview`. The plan file does not change. You pay the model provider directly; Assrt itself is free and open source.
How do I run it without leaving data in someone else's cloud?
Run the CLI locally with `npx @assrt-ai/assrt run --url <your-url> --plan <...>`. Launch is local by default. Pass `--isolated` to keep the browser profile in memory only, no disk persistence. Skip scenario sync by setting `ASSRT_NO_SAVE=1` or by using a `local-` scenario ID. The test plan stays as a file you own, the video recording lands in `/tmp/assrt/<runId>/video/recording.webm`, and no cloud account is required to run a full end-to-end test.
Does 'the LLM interprets the plan every run' mean my tests are flaky?
Less flaky than you expect, and the failure mode is different. The agent grounds every step in the accessibility tree it just pulled from the page, so the run usually survives a one-off DOM change. What fails is genuinely ambiguous steps ("click the button" on a page with ten buttons). The fix is to make the plan more specific, the same way you would make a bug report more specific, not to rewrite selector code. Pass `passCriteria` to enforce explicit conditions the agent must verify.
How does this differ from recording-based tools like Mabl, Testim, or QA Wolf?
Recording tools capture selectors at record time and store them in a proprietary format. You get locked into their cloud and their DSL. If their service goes down or raises prices, your tests are stuck. Assrt stores the plan in a Markdown file you can commit to git, runs on open-source Playwright MCP, and is driven by a commodity LLM API. 'Export to plain code' is not a feature you have to ask for; plain text is the default storage format.