Run tests locally, self-hosted: the four files an AI-driven run writes to your disk, and the one env var that keeps them there.
Most articles on this topic tell you to npm install Playwright and call it a day. That works when the runner is a spec file a developer wrote by hand. When the runner is an AI agent that receives a live accessibility tree and picks one of 18 tools per turn, "self-hosted" means something more specific: the browser, the scenario, the results, and the profile all have to live on your machine, and you need a kill switch for anything that tries to phone home. This guide shows you every path Assrt touches, every env var that matters, and the exact two lines of source that form the offline fallback.
What most pages on this topic actually tell you
Open any of the top existing write-ups for this question and you get the same four-step shape: npm install, write a playwright.config.ts, set the webServer option to your localhost, run npx playwright test. Great advice when the test artifact is a .spec.ts file a developer wrote by hand. None of it helps when the artifact is the runtime trace of an AI agent that never committed a test file.
Those gaps become real problems the moment you want to run something against a private staging URL, an air-gapped laptop, or a regulated environment, exactly where a hosted AI testing platform cannot follow you. Self-hosting is not about frugality. It is about keeping the scenario plan, the browser profile, and the run results on a disk you own.
The four directories a local Assrt run touches
These paths are not a cache. They are the canonical storage the agent itself reads and writes. Every interaction with a live test flows through them.
Each path, and what puts it there
~/.assrt/browser-profile
Persistent Chromium profile. Cookies, localStorage, service workers, IndexedDB. Survives reboots so logins stick across runs. Created by browser.ts line 313 on first launch.
/tmp/assrt/scenario.md
The #Case plan, written in plain markdown. Watched by fs.watch with a 1-second debounce. Edit it by hand while the agent is idle — your changes get picked up on the next run.
/tmp/assrt/results/latest.json
Last run's structured output: scenarios, assertions, timings, error evidence. Same schema the MCP tool returns. Tail it from your CI runner instead of screen-scraping the CLI.
~/.assrt/playwright-output
Where accessibility-tree snapshots land as .yml files so the MCP transport doesn't choke on 2MB Wikipedia-sized trees. Set via --output-mode file at browser.ts line 296.
~/.assrt/extension-token
One-time token for --extension mode. Written on first use, read on every subsequent run. Only needed if you want the agent to attach to your real Chrome session.
~/.assrt/scenarios/local-<uuid>.json
The offline-mode fallback. When scenario-store.ts line 124 hits its catch block, this is where the scenario gets written — with a 'local-' prefix so the watcher knows never to try to sync it.
The kill switch: one env var, two lines of source
The only external endpoint the MCP server talks to (beyond the Anthropic API) is the scenario store at app.assrt.ai. That endpoint is overridable with ASSRT_API_URL and, more importantly, it is optional. When the POST fails, the code path at scenario-store.ts line 124 generates a local-only UUID and writes the scenario to disk. A second check at scenario-files.ts line 93 then disables the sync watcher for any ID with a local- prefix. That is the entire offline mode.
The anchor fact, in one place
When the scenario store POST fails — whether because you unset ASSRT_API_URL, pointed it at an unreachable self-hosted endpoint, or blocked app.assrt.ai at the network layer — this exact catch block runs. Note the local- prefix on line 124:
```typescript
} catch (err) {
  console.error("[scenario-store] Central save failed:",
    (err as Error).message);
  // Generate a local-only ID with a prefix so we know it's unsynced
  const crypto = await import("crypto");
  const localId = `local-${crypto.randomUUID()}`;
  writeLocal({ id: localId, plan: data.plan, ... });
  return localId; // <-- line ~131
}
```

The prefix is not cosmetic. It is how the watcher in the other file (scenario-files.ts line 93) decides never to attach to the file. One prefix, two files, a complete offline mode. Open the source after installing and verify it yourself: the relevant functions are saveScenario in scenario-store.ts and startWatching in scenario-files.ts.
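Reconstructed from that description, the guard in scenario-files.ts has roughly this shape (a sketch, not the verbatim source):

```typescript
// Sketch of the sync guard described above: IDs written by the offline
// fallback carry a "local-" prefix, and the watcher refuses to attach to them.
function shouldSync(scenarioId: string): boolean {
  // local-<uuid> scenarios exist only on disk; never start the debounced
  // sync loop for them.
  return !scenarioId.startsWith("local-");
}
```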
How the pieces connect during a self-hosted run
Three inputs flow into a single agent, which fans out to real services on your own infrastructure plus a single LLM call. Nothing in this diagram is a hosted Assrt component.
Self-hosted data flow — everything but the LLM call stays on your disk
Three launch modes, three levels of "local"
A CLI flag decides which of these modes you get. Every mode keeps the browser, the plan, and the results on your machine. The difference is how much state sticks around between runs.
Picking the right launch mode for a self-hosted run
Persistent profile (default)
Chromium profile at ~/.assrt/browser-profile survives reboots. Sign into Gmail, Shopify admin, or your staging dashboard once; every future run starts already authenticated. browser.ts line 313.
Isolated (no disk writes)
Pass --isolated. Browser profile lives in memory and dies with the process. Every run is a clean slate. Useful for untrusted apps or CI where every artifact is ephemeral. browser.ts --isolated flag.
Attach to real Chrome (--extension)
Pass --extension. The agent connects to your already-running Chrome (the one with your extensions, password manager, enterprise SSO session) via @playwright/mcp's extension bridge. First run saves a token to ~/.assrt/extension-token so subsequent runs just work.
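The three modes above reduce to an argument-selection step. Here is a sketch of how the flags might map to Playwright MCP arguments; only --user-data-dir, --isolated, and --extension are named in this article, and the exact wiring is an assumption:

```typescript
// Flag-to-argument mapping for the three launch modes described above.
// The argument names are assumptions for illustration, not the browser.ts code.
type LaunchMode = "persistent" | "isolated" | "extension";

function profileArgs(mode: LaunchMode, profileDir: string): string[] {
  switch (mode) {
    case "persistent":
      // Default: point Playwright MCP at a stable on-disk profile.
      return [`--user-data-dir=${profileDir}`];
    case "isolated":
      // In-memory profile; nothing survives the process.
      return ["--isolated"];
    case "extension":
      // Attach to the user's already-running Chrome via the extension bridge.
      return ["--extension"];
  }
}
```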
The persistent profile is why authenticated apps are testable at all
Cookies and service-worker storage survive across runs because the Chromium profile lives in a stable directory on disk. That is what turns a 5-minute Gmail sign-in dance into a one-time setup. Cleaning up Chromium's singleton lock is the part most homegrown profile-persistence implementations forget.
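Chromium writes a SingletonLock file (plus SingletonCookie and SingletonSocket) into the profile directory and will refuse to reuse a profile that still looks locked after a crash. A minimal cleanup sketch follows; the file names are standard Chromium behavior, but the actual logic in browser.ts may differ:

```typescript
import { existsSync, rmSync } from "node:fs";
import { join } from "node:path";

// Remove Chromium's stale singleton lock files so a crashed previous run
// doesn't make the next launch think the profile is still in use.
function clearSingletonLocks(profileDir: string): void {
  for (const name of ["SingletonLock", "SingletonCookie", "SingletonSocket"]) {
    const lockPath = join(profileDir, name);
    if (existsSync(lockPath)) rmSync(lockPath, { force: true });
  }
}
```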
A first self-hosted run, end to end
When ASSRT_API_URL is set to an empty string and the agent points at a local dev server, watch for the offline-mode log line in the run output: the scenario saves locally and the run keeps going.
Getting to your first self-hosted run in five steps
The shortest path from zero to a passing test against a private localhost URL, with zero outbound traffic except the Anthropic model call.
Install the CLI
npx @assrt-ai/assrt setup — registers the MCP server globally, installs a PostToolUse hook that nudges the agent after git commits, and appends a QA testing section to ~/.claude/CLAUDE.md. The install itself is local: the CLI lives inside node_modules.
Export ANTHROPIC_API_KEY
Or let the CLI pull a Claude Code OAuth token from the macOS Keychain. Either way, the model call goes straight from your machine to api.anthropic.com. No Assrt middle tier.
Write a #Case plan
A plan is markdown with #Case headers. Example: '#Case 1: Signup flow\nNavigate to /signup, fill the form with a disposable email, verify the dashboard heading.' Pipe it via stdin, paste it with --plan, or point at a file with --plan-file.
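The plan format is mechanical enough to split with a few lines. Here is a hedged sketch of pulling #Case sections out of a plan; it illustrates the format described above, it is not Assrt's parser:

```typescript
// Split a markdown plan into named cases on "#Case" headers.
// Format assumed from the example above; the real parser may differ.
interface PlanCase { title: string; body: string }

function splitPlan(plan: string): PlanCase[] {
  const cases: PlanCase[] = [];
  for (const block of plan.split(/^#Case\s+/m).slice(1)) {
    const newline = block.indexOf("\n");
    const title = (newline === -1 ? block : block.slice(0, newline)).trim();
    const body = newline === -1 ? "" : block.slice(newline + 1).trim();
    cases.push({ title, body });
  }
  return cases;
}
```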
Run against localhost
assrt run --url http://localhost:3000 --plan-file tests/signup.md. The agent does an 8-second HEAD preflight (browser.ts), spawns Playwright MCP over stdio, and starts calling its 18 tools against your real DOM. No tunneling, no port forwarding, no WebSocket to a cloud runner.
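The preflight step can be sketched with fetch and a timeout. The 8-second figure comes from the step above; everything else here is illustrative, not the browser.ts implementation:

```typescript
// HEAD-request preflight: confirm the target URL answers before spawning
// the browser. The 8s default matches the preflight described above;
// the logic itself is a sketch.
async function preflight(url: string, timeoutMs = 8_000): Promise<boolean> {
  try {
    const res = await fetch(url, {
      method: "HEAD",
      signal: AbortSignal.timeout(timeoutMs),
    });
    return res.status < 500; // any answering server counts as reachable
  } catch {
    return false; // unreachable, refused, or timed out
  }
}
```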
Optional: kill scenario sync
export ASSRT_API_URL="" — with an empty value, the store POST has nowhere valid to go. Every scenario save now hits the catch block at scenario-store.ts line 124, gets a local- ID, and stays on disk forever. Useful for air-gapped laptops, regulated environments, or just because.
Counting what leaves the box
Four concrete numbers, each verifiable in the Assrt MCP source.
Self-hosted Assrt vs. a hosted AI testing platform
Where the artifacts live is the fault line between the two approaches. Everything downstream (pricing, auditability, offline support, regulated-environment fit) follows from that one decision.
Where things actually live
| Feature | Hosted AI testing platform | Assrt (self-hosted) |
|---|---|---|
| Scenario plan | Proprietary YAML / DSL on their servers | /tmp/assrt/scenario.md (plain markdown) |
| Run results | Web dashboard behind login | /tmp/assrt/results/latest.json (structured) |
| Browser profile | Ephemeral worker VM per run | ~/.assrt/browser-profile (persistent) |
| Video recording | Cloud-hosted viewer URL | 127.0.0.1 local player, .webm on disk |
| Private localhost support | Tunnel or agent runner required | Direct — localhost IS the target |
| Offline / air-gapped mode | Unavailable | local-<uuid> fallback at scenario-store.ts line 124 |
| Outbound endpoints per run | Control plane + browser worker + storage | One (Anthropic) when ASSRT_API_URL is set to an empty string |
| License cost | $1,000 to $7,500+ per month | MIT, free |
| Source readable | Closed | github.com/assrt-ai/assrt-mcp |
What "self-hosted" actually buys you, feature by feature
Every item below maps to a named file or function in the MCP source. Install the package, open the file, verify the claim.
The real self-hosted surface, line by line
- Spawns a local Playwright MCP subprocess — no remote runner
- Browser profile at ~/.assrt/browser-profile survives reboots
- Scenario plan is plain markdown at /tmp/assrt/scenario.md
- Results are plain JSON at /tmp/assrt/results/latest.json
- ASSRT_API_URL env var redirects (or kills) scenario sync
- local- prefixed IDs skip the sync watcher entirely (scenario-files.ts line 93)
- Video recording stays at a 127.0.0.1 player URL — not a cloud viewer
- LLM call goes to Anthropic directly, no Assrt proxy
- MIT license; npm install, no signup to run
Want to run this against your private staging URL, live?
Bring a localhost dev server, an air-gapped laptop, or a regulated environment. We will walk through every path the agent touches on your disk and show you the kill switch in source.
Book a call →

Run-tests-locally questions, answered from the source
What exactly does 'self-hosted' mean when the runner is an AI agent, not a pre-written test file?
It means three things stay on your machine. First, the browser: spawned via @playwright/mcp version 0.0.70 as a local stdio subprocess (browser.ts line 286 logs 'spawning local Playwright MCP via stdio'). Second, the scenario text and all run results: written to /tmp/assrt/scenario.md and /tmp/assrt/results/latest.json (scenario-files.ts lines 17 to 22). Third, the persistent browser profile with cookies and logins: ~/.assrt/browser-profile (browser.ts line 313). The only outbound traffic on a default run is the LLM call to Anthropic and, if you leave ASSRT_API_URL unset, an optional scenario sync to app.assrt.ai that silently falls back to local-only when unreachable.
How do I stop the scenario plan from being uploaded anywhere?
Either set ASSRT_API_URL to a host you control (your own self-hosted scenario store), or leave it unset and block app.assrt.ai at the network layer. scenario-store.ts line 14 reads the env var with a fallback, so any non-200 response (including a connection refused) trips the catch block at line 124. That block generates an ID with a local- prefix, writes to ~/.assrt/scenarios/local-<uuid>.json, and returns. The watcher at scenario-files.ts line 93 then checks scenarioId.startsWith('local-') and returns early, so the 1-second debounced sync loop never fires. Two lines of source form a complete offline mode.
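The 1-second debounce mentioned above is a standard pattern. A minimal sketch, not the scenario-files.ts implementation:

```typescript
// Minimal debounce like the 1-second sync loop described above.
// Coalesces a burst of file-change events into a single sync call.
function debounce(fn: () => void, delayMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    clearTimeout(timer); // restart the countdown on every new event
    timer = setTimeout(fn, delayMs);
  };
}

// Usage sketch: fs.watch("/tmp/assrt/scenario.md", debounce(syncToStore, 1_000));
```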
Where do my test results actually live on disk?
Three places, all predictable: /tmp/assrt/scenario.md holds the plan text, /tmp/assrt/scenario.json holds metadata (id, name, url, updatedAt), and /tmp/assrt/results/latest.json holds the most recent run's structured output. Historical runs are keyed by runId at /tmp/assrt/results/<runId>.json. These are the paths the agent itself reads and writes — not a separate cache or export layer. If you want to pipe results into a CI system, tail the files directly.
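A CI gate over latest.json only needs a few lines. This sketch assumes a results shape with scenarios carrying assertions and a passed flag; the field names are guesses at the schema, so inspect latest.json on your own machine first:

```typescript
import { readFileSync } from "node:fs";

// Decide CI pass/fail from an Assrt results object. Field names
// ("scenarios", "assertions", "passed") are assumptions for illustration.
interface RunResults {
  scenarios: { name: string; assertions: { passed: boolean }[] }[];
}

function runPassed(results: RunResults): boolean {
  return results.scenarios.every((s) => s.assertions.every((a) => a.passed));
}

function loadLatest(path = "/tmp/assrt/results/latest.json"): RunResults {
  return JSON.parse(readFileSync(path, "utf8"));
}
```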
How does Assrt avoid re-logging-in every time it tests a page that requires auth?
Default mode persists a full Chromium profile at ~/.assrt/browser-profile. browser.ts line 313 creates the directory and passes it to Playwright MCP via --user-data-dir, so cookies, localStorage, and session tokens survive across test runs. Sign into Gmail once, the next 50 test runs against Gmail start already authenticated. If you prefer zero persistence, pass --isolated and the profile is in-memory only. If you want to use your real Chrome (with your real profile, real extensions, real bookmarks), pass --extension and the agent attaches to your running Chrome instance via a one-time token saved at ~/.assrt/extension-token.
What is the 'local-' prefix in the scenario ID and why does it matter for self-hosting?
It is a marker the watcher uses to decide whether to sync. When scenario-store.ts saveScenario() fails to POST to the central API (lines 107 to 131), it generates an ID using crypto.randomUUID() prefixed with 'local-', writes the scenario to ~/.assrt/scenarios/<id>.json via writeLocal(), and returns the local ID. Then whenever the scenario file is loaded or saved, scenario-files.ts line 93 short-circuits: 'if (scenarioId.startsWith("local-")) return'. No watcher attaches, no debounced sync fires, no network call happens. You can see the local- prefix in the scenario JSON file itself.
Can I run the whole stack without Chrome ever launching a visible window?
Yes. The default launch mode is headless. browser.ts line 348 inserts '--headless' into the Playwright MCP args whenever headed is false. So 'assrt run --url ...' with no extra flags runs fully offscreen. Pass --headed to see what the agent sees, --video to record the whole thing to a .webm plus an auto-opening local 127.0.0.1 player, or --extension to attach to your already-running Chrome. Video recording goes through the Playwright MCP's devtools capability, and the recording stays on your disk at the path printed in the run log.
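The headed/headless decision described above is a one-liner. A sketch; only the '--headless' string and the headless-by-default behavior are taken from this answer:

```typescript
// Sketch of the decision at browser.ts line 348 as described above:
// headless is the default, and --headed suppresses the flag.
function playwrightDisplayArgs(headed: boolean): string[] {
  // When headed is false (the default), run Chromium fully offscreen.
  return headed ? [] : ["--headless"];
}
```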
Do I need an API key, and if so, where does it go?
You need one LLM key: ANTHROPIC_API_KEY env var, or a Claude Code OAuth token the CLI pulls from the macOS Keychain automatically (cli.ts). The agent runs Claude Haiku 4.5 by default (agent.ts line 9: DEFAULT_ANTHROPIC_MODEL = 'claude-haiku-4-5-20251001') and calls Anthropic directly from your machine. There is no Assrt proxy, no Assrt-issued key, no vendor-managed credential. Your dev server is still localhost; only the model call leaves the box.
What's actually in the 18-tool agent surface, and can I verify it without trusting this page?
Eighteen tools defined in the TOOLS array between agent.ts lines 16 and 196: navigate, snapshot, click, type_text, select_option, scroll, press_key, wait, screenshot, evaluate, create_temp_email, wait_for_verification_code, check_email_inbox, assert, complete_scenario, suggest_improvement, http_request, wait_for_stable. No hidden SDK, no proprietary YAML schema, no cloud-only capabilities. Open the file after installing (node_modules/assrt-sdk/dist/cli.mjs if you npx-installed it, or the GitHub source directly) and count.
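If you want to allow-list or audit the agent's surface from a script, the eighteen names above can be pinned as a constant. The list is copied from this answer, not generated from agent.ts:

```typescript
// The 18 tool names listed above, as a checkable constant.
const TOOLS = [
  "navigate", "snapshot", "click", "type_text", "select_option", "scroll",
  "press_key", "wait", "screenshot", "evaluate", "create_temp_email",
  "wait_for_verification_code", "check_email_inbox", "assert",
  "complete_scenario", "suggest_improvement", "http_request",
  "wait_for_stable",
] as const;
```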
How does this compare to running a hosted AI testing platform?
The hosted platforms in this category price somewhere between $1,000 and $7,500+ per month for midmarket tiers and produce a proprietary test artifact (YAML, DSL, their own spec format) that you can't run anywhere else. Assrt is MIT licensed (LICENSE file in assrt-mcp root), npm-installable, and the 'test artifact' is a markdown #Case plan you edit as plain text. The only ongoing cost is the Anthropic invoice for Haiku calls during runs — typically a few cents per scenario. Zero per-seat fee, zero platform fee, zero vendor lock.