An AI testing tool whose test is a file, not a row in a database
Every article on AI testing tools I've read this year compares features. Self-healing locators, natural-language recorders, flake detection, dashboards. None of them asks what the test itself is, or where it lives, or whether you can take it with you when the vendor relationship ends.
Assrt's answer is three literal paths on your disk: /tmp/assrt/scenario.md, /tmp/assrt/scenario.json, /tmp/assrt/results/latest.json. The plan is Markdown. The watcher is fs.watch with a 1000 ms debounce. The browser driver is Microsoft's own @playwright/mcp package. Nothing about the contract requires Assrt to exist tomorrow. That is the thing the feature comparisons miss.
“The authoritative copy of your test never leaves your disk. Everything Assrt does sits on top of a file you can open, diff, and commit.”
scenario-files.ts (lines 16-48)
What "AI testing tool" usually means, and why that framing is a trap
The category as it is marketed right now is a managed SaaS with an LLM glued on top. You record a flow in their visual editor. The recording is stored as rows in their Postgres instance. The AI helps you author assertions and heal locators when something drifts. The dashboard is lovely. The price is between roughly $1,000 and $7,500 per month for a team. Your tests are features of their product, not files on your machine.
This shape works, right up until you decide to leave. At that point the question that should have been asked on day one finally comes up: what is the test, actually? If the answer is "a thousand rows in their internal schema, accessible only through their UI," you are not leaving cleanly. You are either rewriting the suite against a new tool or paying to keep the lights on. The tool chose you, not the other way around.
Assrt was built the other way. The primary artifact is a Markdown file on your disk. Everything else in the product is a convenience layered on top of that file.
The contract: three paths
Open assrt-mcp/src/core/scenario-files.ts. The first 20 lines define the entire filesystem surface Assrt exposes: three path constants, nothing more.
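The shape of that block is easy to picture. A sketch, with the three documented paths and constant names that are assumptions rather than the file's real identifiers:

```typescript
import * as path from "path";

// Sketch of the filesystem contract. The three paths are the documented ones;
// the constant names here are illustrative, not necessarily what
// scenario-files.ts actually exports.
const ASSRT_DIR = "/tmp/assrt";
const SCENARIO_MD_PATH = path.join(ASSRT_DIR, "scenario.md");               // the plan
const SCENARIO_JSON_PATH = path.join(ASSRT_DIR, "scenario.json");           // metadata
const RESULTS_LATEST_PATH = path.join(ASSRT_DIR, "results", "latest.json"); // last run
```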
What those three paths give you, at a glance
/tmp/assrt/scenario.md
Your plan. Plain Markdown, one `#Case N: name` heading per scenario, plain English steps below. Written at scenario-files.ts line 45 on every run. Edit it in your editor of choice; fs.watch picks up the change.
/tmp/assrt/scenario.json
Metadata: id, optional name, target URL, updatedAt timestamp. Written beside the plan at line 46. Tells Assrt which scenario this file belongs to so reruns work across sessions.
/tmp/assrt/results/latest.json
Last run output: per-case pass/fail, assertions, timing, step log. Plus a timestamped sibling keyed by run ID (line 82) so you build a local history without any database.
fs.watch, 1000 ms debounce
Every edit to scenario.md fires a watcher callback (line 97). The callback cancels any pending timer and schedules a new one 1000 ms out. Only the last save wins. The file is never out of sync with your intent.
No 'import/export' step
Because the plan is already a file, there is nothing to import into Assrt and nothing to export from it. The file IS the test. Commit it to git next to your code and you own your test suite, not a license to it.
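The `#Case N: name` format is small enough that a parser fits in a dozen lines. A sketch in TypeScript, assuming only the format described above (this is not Assrt's actual parser):

```typescript
interface ScenarioCase {
  n: number;       // the case number from the heading
  name: string;    // the case name from the heading
  steps: string[]; // plain-English steps listed under it
}

// Parse a plan like "#Case 1: Login works" followed by "- step" lines.
function parsePlan(md: string): ScenarioCase[] {
  const cases: ScenarioCase[] = [];
  let current: ScenarioCase | undefined;
  for (const rawLine of md.split("\n")) {
    const line = rawLine.trim();
    const heading = line.match(/^#\s*Case\s+(\d+):\s*(.+)$/);
    if (heading) {
      current = { n: Number(heading[1]), name: heading[2].trim(), steps: [] };
      cases.push(current);
    } else if (current && line.startsWith("-")) {
      current.steps.push(line.replace(/^-+\s*/, ""));
    }
  }
  return cases;
}
```

Any other runner that can read this structure can run the same file, which is the whole point of the format.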
How the watcher works, and why 1000 ms is exactly the right number
The loop that keeps your on-disk plan and your cloud storage in sync is 22 lines long. It opens a non-persistent fs.watch on the plan file, and every time the watcher fires, it resets a timer. One second after your most recent keystroke, the file content is pushed to Firestore. The next assrt_test call reads the file back off disk, so your edits always land before the next run.
1000 ms was chosen because that is how long humans pause between save and second-guessing. Shorter debounces fire multiple syncs for a single edit burst. Longer debounces delay the "my change is live" feeling past the threshold where you trust it. One second lands exactly when you expect it, which is what you want from a watcher you should not have to think about.
The data flow end to end
Your editor is a peer of the test runner, not a client of a SaaS UI
The browser driver: not a fork, just @playwright/mcp
The other place vendor lock-in usually hides is the browser runner. Most AI testing tools ship a proprietary fork of Playwright or a custom driver you have to learn. Assrt does not. The browser layer is Microsoft's own @playwright/mcp package, spawned over stdio by the Assrt agent at line 284 of browser.ts. The 18 tool names the agent calls (navigate, snapshot, click, type_text, the rest) are standard @playwright/mcp tools. Any other MCP client that spawns the same package can drive the same browser in exactly the same way.
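Nothing about that wire protocol is proprietary either. MCP is JSON-RPC 2.0 over the child process's stdio, so one tool call is just a small JSON message. A sketch of what a client sends for a single browser action (in practice the MCP SDK handles ids, framing, and responses for you):

```typescript
// Build the standard MCP `tools/call` request for one browser action.
// The envelope is JSON-RPC 2.0; the tool name comes from the server's
// advertised tool list, e.g. the navigate/snapshot/click set named above.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): string {
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    method: "tools/call", // the standard MCP method for invoking a tool
    params: { name, arguments: args },
  });
}
```

Because this is the same message shape every MCP client emits, swapping the client above the browser layer changes nothing about the browser layer itself.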
What a real run looks like from the terminal
The full lifecycle of a test, step by step
You call assrt_test with a URL and a plan
Either from Claude Code via the MCP tool, or from your terminal with `assrt run`. The plan can be inline text or a path to an existing Markdown file in your repo.
Assrt writes your plan to /tmp/assrt/scenario.md
Via writeScenarioFile (scenario-files.ts line 42). The same call writes scenario.json with the scenario id, name, and URL. From this moment on, the file is the authoritative copy of your test.
fs.watch starts, with a 1000 ms debounce
startWatching (line 90) opens a non-persistent watcher on scenario.md. Any edit reschedules a 1-second timer; the last edit within that window wins. No race, no double-sync.
@playwright/mcp spawns and drives the browser
browser.ts line 284 resolves the @playwright/mcp package and spawns it over stdio. The AI agent issues standard tool calls (navigate, snapshot, click) and the Playwright MCP server executes them.
Results land at /tmp/assrt/results/latest.json
writeResultsFile (line 77) emits the latest run plus a sibling keyed by run ID. Step log, assertions, pass/fail, timing. No database, just JSON you can pipe into grep or jq.
You edit. You commit. You walk away.
Because all three files live on your disk under paths you control, nothing Assrt does is required to keep the test. Copy /tmp/assrt/scenario.md into your repo, run `git add`, you are done.
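Because the results are plain JSON, a pass/fail summary is a few lines of code rather than a dashboard query. A sketch, with field names that are assumptions based on the description above, not Assrt's exact schema:

```typescript
// Hypothetical shape of results/latest.json; real field names may differ.
interface CaseResult {
  name: string;
  passed: boolean;
  durationMs?: number;
}
interface RunResults {
  runId: string;
  cases: CaseResult[];
}

// One-line health check over the latest run.
function summarize(results: RunResults): string {
  const passed = results.cases.filter((c) => c.passed).length;
  return `${passed}/${results.cases.length} cases passed`;
}
```

Point it at `JSON.parse(fs.readFileSync("/tmp/assrt/results/latest.json", "utf8"))` and you get the same answer a jq one-liner would.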
One run, one Markdown plan, three on-disk artifacts:

Edit scenario.md → fs.watch fires → 1 s debounce → assrt_test reads file → @playwright/mcp runs it → results/latest.json
The shape of the category, in one table
Every other tool in this space competes on features inside their UI. Assrt competes on what survives on your disk.
| Feature | Typical AI testing SaaS | Assrt |
|---|---|---|
| Test artifact | Row in vendor database, exported as bespoke JSON | Markdown at /tmp/assrt/scenario.md you own |
| Edit workflow | Their recorder UI, inside their app | Your editor. fs.watch syncs on save with 1s debounce |
| Browser driver | Proprietary fork or wrapper | @playwright/mcp spawned at browser.ts line 284 |
| Leave the tool | Cancel subscription, tests become unreachable | Copy three files, commit to git, continue |
| Hosting | SaaS, your data on their servers | Self-hosted. Local Chromium. Your API keys. |
| Price to run your own tests | Seat fees plus usage, often $1,000-$7,500 per month | $0. Bring your Anthropic or Gemini key. |
| Audit the runner | Closed source | 1,087 lines at assrt-mcp/src/core/agent.ts to read |
The field we are in
Tools that show up in "best AI testing tool" guides. Most are SaaS products. All of them store your tests inside their application.
By the numbers

- 3 paths on disk define the entire contract
- 1000 ms fs.watch debounce at scenario-files.ts line 102
- 1,087 lines of open TypeScript in agent.ts
- $0 to self-host with your own API key
So what do you do with this
If you are picking a tool to write browser tests with an AI agent, read the other guides about self-healing locators and dashboards. They are correct about the features that tools compete on. Then ask the question those guides skip: where is the test? If the answer is anywhere other than a file in a path you control, you will feel the cost of that answer every time you try to change the runner, move between machines, audit the suite, or walk away.
If the answer needs to be a file, there are a few options. You can write raw Playwright by hand, which is fine if you have engineers whose full-time job is test authoring. You can use Playwright MCP directly without a scenario layer, which works but throws out the pass/fail structure. Or you can use Assrt, which stores your plan as Markdown, uses @playwright/mcp underneath, and stays out of the way of your editor.
The install is one command: npx @assrt-ai/assrt setup. Your first test is a Markdown file you can write in your editor. Everything after that happens under a path you control.
Want to watch it run against your app?
30 minutes. Your URL. Your repo. Live test plan in Markdown, video of the run, nothing stored behind a login.
Book a call →

Frequently asked
What exactly is the 'test file' when you use Assrt as an AI testing tool?
A Markdown file at /tmp/assrt/scenario.md. Each scenario is a heading of the form `#Case N: name` followed by plain English steps. You can open it in any editor, grep it, diff it against a previous version, or paste it into a pull request description. The file is written on every assrt_test call at scenario-files.ts line 45 and read back on the next run. Two sibling files complete the contract: /tmp/assrt/scenario.json holds metadata (id, name, url, updatedAt) and /tmp/assrt/results/latest.json holds the most recent run output. The three paths are defined as constants at scenario-files.ts lines 16-20. That is the whole filesystem surface.
Why does the test format matter compared to features like self-healing locators or no-code authoring?
Because features depreciate, formats escape. Locator intelligence and no-code recorders live inside a vendor's UI, which means your tests live there too. If you cancel the subscription, the recorder JSON either stays locked in the vendor's cloud, exports as something only their engine understands, or emits brittle Playwright with bespoke custom commands. Markdown is none of those things. The #Case format defined in scenario-files.ts was chosen so a human reader, a grep pipeline, or a completely different test runner can all read the same file. You can move off Assrt by continuing to store the same Markdown and wire it to another runner yourself, because the file does not encode anything proprietary.
How does the fs.watch debounce actually work, and what is the 1000 ms number for?
Line 102 of scenario-files.ts starts a fs.watch on /tmp/assrt/scenario.md with persistent: false. Every mutation fires an event, but the handler does not sync immediately. It schedules a setTimeout for 1000 ms (line 102) and cancels any previous pending timer. If you save the file three times in half a second while iterating on a step, only one sync fires, one second after your last save. That matches how a human edits Markdown: save, tweak, save, tweak, save, walk away. The syncToFirestore call at line 141 then pushes the current text to cloud storage if you have an account. If you do not, the sync is a no-op; the local file is still the source of truth.
What browser driver sits underneath Assrt? Is it a proprietary recorder?
No. The browser layer is Microsoft's own @playwright/mcp package, spawned over stdio at assrt-mcp/src/core/browser.ts line 284, with its output directory at ~/.assrt/playwright-output (line 291). Assrt never ships its own clone of Playwright. The AI agent calls structured tools named navigate, snapshot, click, type_text, scroll, press_key, wait, evaluate — and those tools are @playwright/mcp tools. The 18 tool definitions at agent.ts lines 14-196 wrap them with testing-specific additions like create_temp_email and http_request, but the base is a standard Playwright MCP server that any other MCP client could also drive.
What happens to my test file if I leave Assrt tomorrow?
You already have it. Because /tmp/assrt/scenario.md is a real file on your real disk, nothing Assrt does is required to keep reading it. In practice the right move is to copy /tmp/assrt/scenario.md into your repo next to your Playwright project (or anywhere else your team looks for tests), commit it, and version it like any other text asset. The plan text survives on its own. If you want to keep running it, you can either keep using Assrt, or write your own test runner that reads the same #Case format; the format is documented in the README and the scenario-files.ts header. Either way there is no data export step, because your data never moved off your disk.
How do I actually run the first test, and what does the CLI output look like?
Install once with `npx @assrt-ai/assrt setup`, which registers the MCP server for Claude Code, Cursor, or any MCP client. Then either ask your coding agent to use assrt_test, or run `assrt run --url http://localhost:3000 --plan '#Case 1: Homepage loads\n- Verify the heading is visible'` from the terminal. Each run spawns a local @playwright/mcp instance over stdio (browser.ts line 284), executes the scenario, and writes results to /tmp/assrt/results/latest.json plus a timestamped sibling keyed by run ID (scenario-files.ts line 82). Add --video to record and auto-open a player, --extension to attach to your real Chrome, --headed to see the browser.
Is this a full replacement for tools like Testim or Mabl, or something else?
Different shape. Testim and Mabl are managed SaaS products whose primary artifact is a visual recorder session stored in their cloud; you get reliability features, flake detection, and dashboards in exchange for living inside their UI. Assrt is an open-source MCP server plus CLI that runs locally against your dev server, pointed at your existing Chrome or a headless Chromium, and emits Markdown plus JSON you own. The two overlap where both run a real browser and report pass/fail, but the artifacts and the economic model are different. If you are happy paying for a closed platform, Testim and Mabl are fine. If you want your test plan to live in your repo next to your code, Assrt is the one.
Can I edit scenario.md while a test is running, or will that break the run?
Edits during a run are safe. The current run holds the plan text in memory from when assrt_test started (server.ts around line 396 where writeScenarioFile is called). fs.watch on scenario.md picks up the new content and schedules the debounced syncToFirestore call, but the running scenario continues against its in-memory copy. The next assrt_test invocation reads the updated file off disk. This is useful: a failing scenario can surface a bad step, you edit the Markdown while the browser is still open, and you re-run without closing anything. The file is the single source of truth for 'what is the test', and it is always in sync with your editor.
Does it work without a cloud account?
Yes. If scenario-files.ts sees an id starting with 'local-' at line 94, it skips starting the fs.watch at all, because there is nowhere to sync to. The local file is still written, the test still runs, the results are still saved. An account at app.assrt.ai is only needed if you want sharing, run history in a dashboard, or cross-device continuation. Everything else is self-hosted by default: your Anthropic or Gemini key, your Chrome, your filesystem.
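The offline check described here reduces to a one-line predicate. A sketch, assuming only the 'local-' prefix convention above (the helper name is mine, not the one in scenario-files.ts):

```typescript
// Scenarios with ids beginning "local-" never start the watcher,
// because there is no cloud account to sync to.
function shouldStartWatcher(scenarioId: string): boolean {
  return !scenarioId.startsWith("local-");
}
```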