The other shape of the same line item
Test automation services, compiled down to three MCP tools and one file on disk
Every other page written on this topic describes a retained QA team, a quarterly SOW, and a catalog of capabilities. Fine. This page is the other version: what happens when the same catalog is a local MCP server, three registered tools, and four known paths on disk. The service is assrt_test, assrt_plan, and assrt_diagnose. The bill of materials is /tmp/assrt/scenario.md. The deliverable is a webm you can scrub at 5x.
The word "services" used to mean a team
Page after page on this topic has the same shape. A services firm opens with a vertical directory (BFSI, healthcare, retail), lists capability pillars (framework design, cross-browser execution, CI/CD integration, defect triage), and closes with a contact form. The actual delivery mechanism is a team of engineers, a project manager, a portal, a quarterly statement of work, and a retainer that starts somewhere around 7,500 dollars a month.
Every one of those line items still has to exist for a test to run. Someone writes the plan, someone launches a browser, someone asserts the result, someone watches the recording, someone files a defect when it breaks. What changed is that every one of those line items is now a tool call that a coding agent can make locally, against your real application, with your real Chrome profile and your real auth session.
This piece is an inventory. Three tools. Four file paths. One child process. Every claim is pinned to a line number you can open in your own checkout of m13v/assrt-mcp.
The three-tool catalog
A traditional services engagement has five or six capability pillars on its slide. The MCP-server version has three, because that is all you actually need: one tool to run a plan, one tool to make a plan, one tool to explain why a plan failed. They all live in the same file.
assrt_test
Run a #Case plan against a URL and return a structured pass/fail report. Takes plan text or a saved scenarioId, and returns passedCount, failedCount, per-scenario assertions with evidence, screenshot file paths, and a videoPlayerUrl you can click. Registered at server.ts line 335. This is the equivalent of the "regression suite execution" line item on a traditional services SOW, except the bill of materials is the arguments object, not a Statement of Work.
assrt_plan
Point it at a URL. It launches a local browser, scrolls through the page three times, collects snapshots and visible-text extracts, then asks the model (claude-haiku by default) to return #Case blocks in the exact executable format. Registered at server.ts line 768. This is where a services firm would charge for "test case design workshops."
assrt_diagnose
Takes a URL, a failed scenario, and the evidence that came back. Returns root-cause analysis plus a corrected #Case block that would pass. Registered at server.ts line 866. This collapses the "defect triage and test maintenance" service line item into one tool call.
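The three line items above can be sketched as types. This is a hedged sketch: the field names passedCount, failedCount, and videoPlayerUrl come from this page; every other field name here is an illustrative assumption, not the server's real schema.

```typescript
// Hypothetical shapes for the assrt_test call, inferred from the fields
// this page names. Anything not named above is an assumption.
interface AssrtTestArgs {
  url: string;
  plan?: string;        // inline #Case text
  scenarioId?: string;  // or a previously saved scenario UUID
  autoOpenPlayer?: boolean;
}

interface ScenarioResult {
  name: string;
  passed: boolean;
  evidence: string;
  screenshots: string[]; // file paths on disk
}

interface AssrtTestReport {
  passedCount: number;
  failedCount: number;
  scenarios: ScenarioResult[];
  videoPlayerUrl: string;
}

// One line of triage: did the run pass, and where is the video?
function summarize(report: AssrtTestReport): string {
  const total = report.passedCount + report.failedCount;
  const verdict = report.failedCount === 0 ? "PASS" : "FAIL";
  return `${verdict} ${report.passedCount}/${total} video: ${report.videoPlayerUrl}`;
}
```

The point of the sketch is the surface area: one arguments object in, one report object out, no portal in between.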
One markdown file
/tmp/assrt/scenario.md. Plain text. cat it, vim it, commit it. scenario-files.ts line 97 watches it with fs.watch, debounces edits for 1 second, and syncs the content back to cloud storage so a run on another machine picks up your fix. The service catalog is literally a file path.
Persistent browser profile
~/.assrt/browser-profile. First run logs you in; every subsequent run resumes the session. browser.ts line 313 mkdirs it, cleans stale singleton locks, and passes --user-data-dir to @playwright/mcp so cookies and localStorage survive across test runs.
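The launch prep described here fits in a few lines. One assumption flagged: the three file names below are Chrome's standard singleton files; the real cleanup in browser.ts may handle a different set.

```typescript
import { existsSync, mkdirSync, rmSync } from "node:fs";
import { join } from "node:path";

// Sketch of profile preparation: ensure the dir exists, then remove
// Chrome's singleton lock files so a crashed previous run cannot block
// the next launch. Cookies and localStorage in the profile are untouched.
function prepareProfileDir(dir: string): void {
  mkdirSync(dir, { recursive: true });
  for (const lock of ["SingletonLock", "SingletonCookie", "SingletonSocket"]) {
    const p = join(dir, lock);
    if (existsSync(p)) rmSync(p, { force: true }); // stale lock from a dead run
  }
}
```

Clearing only the singleton files is what makes the profile both reusable and crash-tolerant: session state survives, stale process claims do not.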
Scenario UUIDs
Every assrt_test run pre-saves the scenario with a UUID before execution (server.ts line 408) so cloud URLs are deterministic. Pass the UUID back in later with scenarioId and you re-run the exact same plan. No portal required; the ID is a plain string.
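A minimal sketch of that pre-save step, assuming a hypothetical cloud URL template (the real one lives in the server):

```typescript
import { randomUUID } from "node:crypto";

// Assign the UUID before the run so every artifact URL is known up front.
// The URL template here is illustrative, not the server's real one.
function preSaveScenario(plan: string): { scenarioId: string; plan: string; cloudUrl: string } {
  const scenarioId = randomUUID();
  return { scenarioId, plan, cloudUrl: `https://example.invalid/scenarios/${scenarioId}` };
}
```

Because the ID exists before execution, a later run can reference the exact same plan by passing the same string back in.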
scenario.md + profile + token → three tools
The service, on disk
A services firm will send you a PDF runbook listing deliverables. The MCP-server version has a runbook too; it is the filesystem. Four paths, all visible with ls, all readable with cat, none of them locked behind an SSO-gated portal.
The watcher is the whole "publish changes" story
In a typical services arrangement, updating a flaky test means a ticket, a sprint, a pull request against the vendor-owned test repo, and a propagation window before the change takes effect in the next nightly run. In the MCP-server version, it is a file save. The watcher below is the whole mechanism.
When you save /tmp/assrt/scenario.md inside an editor during or after a run, Node's fs.watch fires. A one-second debounce collects any follow-up keystrokes. Then the sync job reads the file, compares it to the last content the server itself wrote (to avoid echoing its own writes), and pushes the diff back to cloud storage. The next call to assrt_test (yours, or anyone else's who runs against the same scenarioId) picks up your fix.
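The two moving parts of that mechanism, separated from fs.watch so each is visible in isolation. This is a sketch; the assumption is that the real wiring in scenario-files.ts is equivalent.

```typescript
// Trailing-edge debounce: rapid saves collapse into one sync call.
function debounce<T extends unknown[]>(fn: (...args: T) => void, ms: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

// Echo check: content identical to the server's own last write is skipped,
// so the server does not re-upload edits it made itself.
function shouldPush(fileContent: string, lastServerWrite: string): boolean {
  return fileContent !== lastServerWrite;
}

// Assumed wiring, matching the prose:
//   fs.watch(SCENARIO_FILE, debounce(() => {
//     const content = readFileSync(SCENARIO_FILE, "utf8");
//     if (shouldPush(content, lastWritten)) syncToCloud(content);
//   }, 1000));
```

The debounce is why a burst of keystrokes produces one upload; the echo check is why the server's own writes never loop back through the sync.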
The first run, in six steps
A traditional onboarding is a kickoff call, access provisioning, environment parity verification, and then a first smoke run. Here is the MCP-server equivalent.
npx @assrt-ai/assrt setup
Registers the MCP server globally in your coding agent, installs a CLAUDE.md hook that reminds the agent to run tests after any user-facing change, and ships a local binary you can also invoke via plain bash. One command, no account, no onboarding call.
First assrt_test call against your local dev server
Your agent calls assrt_test with a URL and a plan. server.ts line 397 writes /tmp/assrt/scenario.md to disk. The singleton browser is lazily constructed (line 515). launchLocal spawns @playwright/mcp over stdio, kills any orphan Chrome processes pinned to the same profile dir, clears stale SingletonLock files, and starts recording video.
Agent runs your scenarios, cursor overlay shows what happened
The model drives Chromium through browser_click, browser_type, browser_snapshot, browser_evaluate. A visible red cursor dot is injected into the page (browser.ts CURSOR_INJECT_SCRIPT) so the recorded webm shows every interaction, not just final states. Each assertion becomes a log line tagged PASS or FAIL with evidence.
Results land as JSON, video auto-opens
The report is written to /tmp/assrt/results/latest.json and /tmp/assrt/results/<runId>.json. A self-contained player.html is generated next to the webm and opened in your default browser (unless you pass autoOpenPlayer: false). The video defaults to 5x playback, so a 90-second session reviews in 18 seconds.
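The artifact convention and the 5x arithmetic, restated as a sketch (the paths follow the prose; the helper names are mine):

```typescript
// Two copies of the same report: a stable "latest" path for quick checks
// and a per-run path so history is not clobbered.
function resultPaths(runId: string): { latest: string; run: string } {
  const base = "/tmp/assrt/results";
  return { latest: `${base}/latest.json`, run: `${base}/${runId}.json` };
}

// At Nx playback, review time is recorded duration divided by N.
function reviewSeconds(recordedSeconds: number, speed = 5): number {
  return recordedSeconds / speed;
}
```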
Something failed, you open assrt_diagnose
Pass the URL, the failing #Case, and the failure evidence back to assrt_diagnose. It returns root cause plus a corrected scenario in the same #Case format. Paste the corrected block into /tmp/assrt/scenario.md and re-run; scenario-files.ts line 97 auto-syncs your edit to cloud storage in the background.
Next week, you rerun a scenario by UUID
Pass scenarioId to assrt_test instead of plan. server.ts line 380 pulls the stored scenario (plus any passCriteria, variables, or tags that were saved with it) and hydrates it back into a run. There is no portal to log into, no project dashboard to share, no seat to provision.
What the terminal looks like on run one
One command to set up, one tool call from your coding agent, and the rest is the runner. No status meeting, no weekly check-in, no "we are blocked on access to the staging environment."
The retainer, line by line
None of this argues a services firm is wrong to exist. Some teams need a human on retainer and that is a reasonable choice. What has changed is that every item on the services catalog now maps to something the MCP server does directly. Here is that mapping, one row at a time.
| Feature | Traditional services firm | Assrt MCP server |
|---|---|---|
| What you are buying | Access to a team of QA engineers and a services portal | A local MCP server that exposes three tools to your coding agent |
| Where the tests live | Proprietary test case format inside the vendor portal | /tmp/assrt/scenario.md, plain markdown, greppable and diffable |
| How you edit a flaky step | File a ticket with the services team, wait for the sprint | Save the file; fs.watch syncs the change on the next keystroke pause |
| Browser used under the hood | Vendor-branded cloud grid or a managed Selenium farm | Your own Chromium via @playwright/mcp@0.0.70, persistent profile |
| Logged-in session testing | Services team adds a test user to the vendor secrets vault | extension: true reuses your real Chrome via the token at ~/.assrt/extension-token |
| Video recording | An upsell feature on the platinum plan | Always on; webm plus self-contained player.html auto-opens at 5x |
| Rerunning a specific scenario | Find it in the portal, re-queue, wait for a worker | Pass scenarioId (UUID) back into assrt_test; same plan hydrates |
| Failure diagnosis | Services engineer triages the failure next business day | assrt_diagnose returns root cause + corrected #Case in one call |
| Pricing shape | Per-seat or per-engagement retainer, usually quarterly | Self-hosted, MIT-licensed, your own model token spend |
| Source-of-truth artifact | Whatever the vendor portal exports when you ask nicely | Four known paths on disk (scenario.md, profile, token, output) |
The deliverable is a webm you can scrub
On a traditional services engagement, the evidence artifact is usually a screenshot bundle attached to a Jira ticket, maybe a Loom the QA engineer recorded after the fact. On the MCP-server version, every run records a full webm with a red cursor overlay injected into the page (browser.ts CURSOR_INJECT_SCRIPT draws the dot, the ripple on click, and a keystroke toast). A self-contained player.html is generated next to the video (server.ts line 618) with keyboard shortcuts for playback speed. Default is 5x, which turns a ninety-second run into an eighteen-second review. Space toggles play. Arrow keys seek. 1, 2, 3, 5, and 0 set playback to 1x, 2x, 3x, 5x, and 10x.
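Those speed shortcuts reduce to a small keymap. A sketch of the handler, with the mapping taken from the list above and the function shape assumed:

```typescript
// Keys 1/2/3/5/0 map to 1x/2x/3x/5x/10x, per the shortcuts listed above.
const SPEED_KEYS: Record<string, number> = { "1": 1, "2": 2, "3": 3, "5": 5, "0": 10 };

function speedForKey(key: string, current: number): number {
  return SPEED_KEYS[key] ?? current; // unknown keys leave the speed unchanged
}
```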
The player itself is served from a localhost HTTP server with byte-range support so the scrub bar is smooth (server.ts lines 168-198). It auto-opens in your browser when autoOpenPlayer is not explicitly set to false. For CI runs, pass autoOpenPlayer: false and the webm path alone is returned in the JSON report.
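Byte-range support is what keeps the scrub bar smooth: the player requests slices, the server answers 206 Partial Content. Here is a sketch of the Range-header parsing under standard RFC 7233 semantics; the real handler at server.ts lines 168-198 may differ in detail.

```typescript
// Parse a "bytes=start-end" Range header against a known file size.
// Returns null for unsatisfiable or malformed ranges (the caller would
// answer 416 or fall back to a full 200 response).
function parseRange(header: string, size: number): { start: number; end: number } | null {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header);
  if (!m || (m[1] === "" && m[2] === "")) return null;
  // "bytes=-N" means the final N bytes; otherwise start is explicit.
  const start = m[1] === "" ? Math.max(0, size - Number(m[2])) : Number(m[1]);
  const end = m[1] !== "" && m[2] !== "" ? Math.min(Number(m[2]), size - 1) : size - 1;
  return start <= end && start < size ? { start, end } : null;
}
```

Seeking in a video element works because the browser can ask for "bytes=N-" at any offset instead of re-downloading the whole webm.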
The path to check, if you are skeptical
If any of the above feels like marketing, here is the twenty-minute verification route. Clone github.com/m13v/assrt-mcp. Open src/mcp/server.ts at line 335. That is the full schema and body of assrt_test. Scroll to line 768 for assrt_plan and line 866 for assrt_diagnose. Three top-level registrations. No fourth hidden one.
Then open src/core/scenario-files.ts and read lines 16-20 for the path constants and line 97 for the watcher. Three files and about 200 lines are the whole public surface area of "where are my tests, and how do they stay synced." That is a catalog small enough to hold in your head.
Want a walkthrough with the files open?
Bring your repo; we will run assrt_test against your local dev server and show the four paths light up in real time.
Pinned claims, with file line numbers
If it is a server, not a services firm, what exactly do I get?
Three MCP tools, registered inside assrt-mcp/src/mcp/server.ts. assrt_test (line 335) runs a #Case block against a URL and returns a structured pass/fail report with screenshots, a webm recording, and an execution log. assrt_plan (line 768) navigates to a URL, takes three scrolling screenshots, and asks the model to generate a #Case catalog for the page. assrt_diagnose (line 866) takes a failed scenario and its evidence and returns a root-cause analysis plus a corrected #Case. These are not endpoints behind a portal; they are tools your coding agent (Claude Code, Cursor, any MCP client) can call directly, locally, through stdio.
Where are my tests actually stored?
One plain-markdown file. scenario-files.ts line 17 defines SCENARIO_FILE as /tmp/assrt/scenario.md. Every time assrt_test loads or runs a scenario, writeScenarioFile writes the plan to that path (server.ts line 397, then again with the real ID at line 420). You can cat it, vim it, git-commit it. A traditional services engagement would hand you a proprietary YAML dialect or a vendor-locked portal; Assrt hands you a file path the entire unix toolchain already knows how to operate on.
How does an edit to the markdown file get back to the cloud?
scenario-files.ts line 97 calls fs.watch on /tmp/assrt/scenario.md. Any edit triggers a 1-second debounce timer (line 100) that then calls syncToFirestore (line 130). The sync reads the current file content, compares it against the last content the server itself wrote (to avoid echo loops), and posts the diff to updateScenario. That means you can open the file in an editor during a run, fix a flaky step, save, and the next assrt_test call will use the updated plan. There is no publish button, no commit step, no separate CI integration.
What is in ~/.assrt, and why does it matter?
Three things. ~/.assrt/browser-profile is a persistent Chromium user-data-dir (browser.ts line 313). It holds cookies, localStorage, and logged-in sessions across test runs. That is what makes the extension-free local path usable for authenticated flows: log in once, and every subsequent run resumes the session. ~/.assrt/extension-token is a single-line file containing the PLAYWRIGHT_MCP_EXTENSION_TOKEN used when you connect to your real Chrome instead of a fresh profile (browser.ts line 224). ~/.assrt/playwright-output is where @playwright/mcp writes page snapshots as .yml files, which the runner then reads back and truncates at 120,000 characters (browser.ts line 296).
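The 120,000-character cap is a one-liner. A sketch; the truncation marker is my own, not the runner's:

```typescript
// Cap the snapshot text fed to the model, per the limit named above.
const SNAPSHOT_CHAR_LIMIT = 120_000;

function truncateSnapshot(yml: string, limit = SNAPSHOT_CHAR_LIMIT): string {
  if (yml.length <= limit) return yml;
  return yml.slice(0, limit) + "\n# [snapshot truncated]"; // marker is illustrative
}
```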
What does the runner actually use under the hood?
It wraps @playwright/mcp, pinned at 0.0.70. browser.ts line 284 resolves @playwright/mcp/package.json, joins cli.js, and launches the child process with StdioClientTransport from the MCP SDK. The spawn args at line 296 are --viewport-size 1600x900 --output-mode file --output-dir ~/.assrt/playwright-output --caps devtools, with --headless added for default runs. Every interaction the agent wants to perform (click, type, snapshot, evaluate) becomes a browser_* tool call against that child process. The underlying browser is vanilla Chromium driven by vanilla Playwright; no re-implementation, no custom protocol.
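Those spawn args, restated as a builder so the flag set is easy to scan. A sketch: the real code assembles them inline and resolves cli.js from @playwright/mcp/package.json before spawning.

```typescript
// Build the @playwright/mcp CLI arguments quoted above.
function playwrightMcpArgs(outputDir: string, headless: boolean): string[] {
  const args = [
    "--viewport-size", "1600x900",
    "--output-mode", "file",
    "--output-dir", outputDir,
    "--caps", "devtools",     // enables video recording
  ];
  if (headless) args.push("--headless"); // added for default runs
  return args;
}
```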
How is this different from hiring a managed QA team?
A managed QA team writes tests, runs them, and ships you reports. They bill per engineer, per month, usually at retainer rates that start around 7,500 dollars a month and climb. Assrt writes the same tests (via assrt_plan), runs them (via assrt_test), and diagnoses failures (via assrt_diagnose), but the compute is your machine and the price is your own model token spend. The value traditional services firms create by hiring, training, and retaining QA engineers is moved into the prompt inside agent.ts line 236, which is the instruction set for the model that acts as the QA engineer during each run.
What about the video recordings every services firm calls out as evidence?
Built in, no contract required. When assrt_test runs, browser.ts enables video via the devtools capability and calls startVideo/stopVideo around the run (server.ts lines 546 and 578). The webm lands in /tmp/assrt/<runId>/video/recording.webm. A self-contained player.html is generated alongside it (server.ts line 618) with keyboard shortcuts: space to play/pause, 1/2/3/5/0 to switch between 1x/2x/3x/5x/10x playback, arrow keys to seek 5 seconds. The player auto-opens in your browser by default (autoOpenPlayer defaults to true at server.ts line 403). Pass autoOpenPlayer: false if you do not want the window to pop.
Can it reuse my already-logged-in Chrome instead of a fresh browser?
Yes. Pass extension: true to assrt_test. browser.ts launchLocal (line 258) resolves the extension token from three places in priority order: explicit parameter, PLAYWRIGHT_MCP_EXTENSION_TOKEN env var, then the saved file at ~/.assrt/extension-token. If none exist, it throws ExtensionTokenRequired (line 12) with step-by-step instructions for obtaining the token via npx @playwright/mcp@latest --extension. Once you paste the token once, it is written to disk and every future call just works. This is the path for testing flows behind authentication, payment providers, or SSO where launching a brand new browser would fail.
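The three-step resolution order, sketched with injected readers so the precedence is explicit. The error text below is illustrative; the real ExtensionTokenRequired message lives at browser.ts line 12.

```typescript
// Resolve the extension token: explicit parameter, then env var, then the
// saved file at ~/.assrt/extension-token. Throws if none of the three exist.
function resolveExtensionToken(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
  readSavedFile: () => string | undefined,
): string {
  const token = explicit ?? env["PLAYWRIGHT_MCP_EXTENSION_TOKEN"] ?? readSavedFile();
  if (!token) {
    // Illustrative message; the real one includes step-by-step instructions.
    throw new Error("Extension token required: run `npx @playwright/mcp@latest --extension`");
  }
  return token;
}
```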
What does the first run actually cost?
One command and your own LLM tokens. npx @assrt-ai/assrt setup registers the MCP server with your coding agent, drops a CLAUDE.md hook, and wires the three tools into your next session. The default model is claude-haiku-4-5-20251001 (agent.ts line 9). A typical five-case run on a small app uses a handful of tool calls and costs single-digit cents in Anthropic tokens. There is no per-seat fee, no minimum commitment, no cloud runtime to provision. The code is MIT-licensed at github.com/m13v/assrt-mcp; self-hosting is the only hosting.
Which file should I open first if I want to verify any of this?
Four paths, in this order. assrt-mcp/src/mcp/server.ts for the three tool registrations (lines 335, 768, 866). assrt-mcp/src/core/scenario-files.ts for where your scenarios live and how they sync (lines 16-20 for the paths, 97 for the watcher). assrt-mcp/src/core/browser.ts for the @playwright/mcp spawn and the extension-token logic (lines 258-350). assrt-mcp/src/core/agent.ts for the tool schema the agent uses (lines 16-200). Reading those four segments end to end takes about twenty minutes and is the fastest way to see that the catalog really is just these pieces.