MIT-licensed · Agentic test runner · Bring your own LLM key · Self-hostable

Automated open source testing: when the runner itself is a forkable file

Most guides on this keyword hand you a list: Playwright, Selenium, Cypress, Appium, Robot Framework, Allure, BugBug. Useful, and also the same list every year. Every item on it still asks you to write the tests yourself. I want to cover the other category: open-source AGENTIC testing, where the plan you commit is prose and the LLM writes the Playwright calls at runtime. I wrote this because the whole thing, in Assrt, collapses into one MIT-licensed TypeScript file you can read in a sitting.

Matthew Diakonov, Written with AI

Published April 20, 2026 · 12 min read

Open source, one layer up

The plan is prose. The runner is a forkable file.

Plan lives at /tmp/assrt/scenario.md, written in English

Agent loop is 1,087 lines in src/core/agent.ts (MIT)

18 tools in a single TOOLS array, lines 16-196

Provider is --model: Claude or Gemini, your pick

ANTHROPIC_API_KEY first; no Assrt account required



Agent loop = 1,087 lines in one file: src/core/agent.ts

18 LLM-callable tools, declared in TOOLS array at lines 16-196

ANTHROPIC_API_KEY in env is the first credential path (keychain.ts:34)

assrt-mcp MIT · @playwright/mcp Apache-2.0 · @anthropic-ai/sdk MIT · @google/genai Apache-2.0 · @modelcontextprotocol/sdk MIT · ws MIT · zod MIT · typescript Apache-2.0

Every runtime dependency in the Assrt agent is MIT or Apache 2.0. No AGPL, no source-available shim, no enterprise-only paywall on the agent loop itself.

What the SERP for this keyword actually covers

Search automated open source testing and the first ten results are framework roundups: BugBug, Momentic, Guru99, SoftwareTestingHelp, Cetpa, TestGuild, Aqua Cloud, Allure Report, Opensource.com, Apidog. Every one lists Playwright, Selenium, Cypress, Appium, Robot Framework, and maybe one AI fuzzer. All of those are real open-source tools. All of them still ask the developer to write the test code. That is the shape of the market most of this keyword assumes.

What every top result covers

Open-source runners, hand-written tests

The framework is free. The scripts are not: someone has to author each locator, each assertion, each wait. The license says MIT; the maintenance bill says 40 hours a sprint.

The gap this guide fills

Open-source runner and open-source agent

The runner is @playwright/mcp (Apache 2.0). The agent on top is MIT, 1,087 lines, one file. You bring the LLM key. The plan you commit to git is English prose.

Where the open-source pieces meet at runtime

The agent is the hub. It receives a prose plan and a live ARIA tree and produces a passed-or-failed report, a webm video, and JSON results. Every piece on the left and right is its own open-source project with its own license. No vendor-locked glue.

Assrt runtime, every arrow is an open-source boundary

The entire agent surface area, in one array

If you want to audit what a closed SaaS testing agent is allowed to do, you usually cannot. With Assrt, you open src/core/agent.ts at line 16 and read the TOOLS array until line 196. That is every capability the LLM has against your application. Eighteen tools, roughly three screens of TypeScript.

assrt-mcp/src/core/agent.ts (lines 9-196, abbreviated)

1,087 lines

“The agent, the provider switch, the tool definitions, the retry-on-failure loop. All in one file. Read it before you commit to any vendor.”

The LICENSE is three paragraphs

No field-of-use restriction, no commercial clause, no patent grant carve-out. If you have ever migrated off a source-available platform, you know how valuable this page is. If you have not, trust me: the license is often where vendors smuggle lock-in back in.

assrt-mcp/LICENSE

The auth path that keeps your key yours

Here are the first ten lines of key resolution. If ANTHROPIC_API_KEY is in the environment, it is used immediately and no other auth path runs. On non-Darwin platforms (Linux VMs, CI) that is the only path that works: there is no hidden fallback to an Assrt-hosted key.

assrt-mcp/src/core/keychain.ts:31-44
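The resolution order described above can be sketched as a pure function. This is an illustration under stated assumptions, not the contents of keychain.ts: the function name, the `readKeychain` callback, and the error message are all hypothetical.

```typescript
// Illustrative sketch only; names and shapes here are assumptions.
type KeySource = "env" | "macos-keychain";

interface ResolvedKey {
  key: string;
  source: KeySource;
}

// `readKeychain` is a hypothetical stand-in for the macOS Keychain lookup.
function resolveAnthropicKey(
  env: Record<string, string | undefined>,
  platform: string,
  readKeychain: () => string | undefined,
): ResolvedKey {
  // 1. The environment variable wins; no other path runs when it is set.
  const fromEnv = env["ANTHROPIC_API_KEY"];
  if (fromEnv) return { key: fromEnv, source: "env" };

  // 2. The Keychain fallback is attempted on macOS (darwin) only.
  if (platform === "darwin") {
    const fromKeychain = readKeychain();
    if (fromKeychain) return { key: fromKeychain, source: "macos-keychain" };
  }

  // 3. There is no hosted fallback: fail loudly instead.
  throw new Error("No API key found: set ANTHROPIC_API_KEY");
}
```

In a real process you would pass `process.env` and `process.platform`; taking them as parameters keeps the ordering easy to test.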

One line makes the cloud optional

The scenario-file watcher syncs plan edits to Firestore so teams can collaborate on shared scenarios. But any scenario whose ID starts with local- is excluded from sync entirely. This is the surgical on/off switch for regulated environments, local-only runs, and air-gapped CI. It is one line.

assrt-mcp/src/core/scenario-files.ts:90-98
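A minimal sketch of that prefix check, assuming the scenario ID is a plain string. The function names and the `upload` callback are hypothetical and are not taken from scenario-files.ts.

```typescript
// Illustrative sketch only; not the code in scenario-files.ts.
const LOCAL_PREFIX = "local-";

// A scenario syncs to the cloud unless its ID carries the local- prefix.
function shouldSyncToCloud(scenarioId: string): boolean {
  return !scenarioId.startsWith(LOCAL_PREFIX);
}

// `upload` is a hypothetical stand-in for the Firestore write.
// Returns true when an upload was attempted, false when it was skipped.
function syncScenario(scenarioId: string, upload: (id: string) => void): boolean {
  if (!shouldSyncToCloud(scenarioId)) return false;
  upload(scenarioId);
  return true;
}
```

The point of the design is that the opt-out travels with the scenario itself, so no separate config flag can drift out of step with it.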

Self-hosted in one terminal session

No sign-up, no dashboard, no cloud redirect. Clone, build, export a key, run. If your CI box can reach your LLM provider it can run this. Every log line is local; every artifact is on disk.

Local install and first run

Six steps from git clone to running tests you own

The same path, annotated. Nothing here requires an Assrt account. If step three (exporting an API key) also feels like vendor dependence, remember: both Anthropic and Google sell raw model access with per-token pricing, no minimum, and the option to cancel anytime.

Clone the MCP server repo

git clone the assrt-mcp repo. That is the agent, CLI, and MCP server. No binary blobs, no vendored SDKs, no pinned closed dependency. The full LICENSE file is three paragraphs of MIT.

Read src/core/agent.ts end to end

1,087 lines. The TOOLS array is between lines 16 and 196. The system prompt is a template string beginning around line 198. The tool dispatch loop is near the bottom. You can finish reading before your coffee goes cold.

Set ANTHROPIC_API_KEY (or GEMINI_API_KEY)

Export your own provider key. keychain.ts:34 reads env first, falls back to Claude Code's macOS Keychain entry on darwin only, and errors otherwise. No Assrt account required.

Write a plan file in English

Drop a /tmp/assrt/scenario.md with #Case blocks. Each step is one English sentence naming an intent (click the Sign in button, type a disposable email). There is no selector to store.

Run with --isolated --json

assrt run --url http://localhost:3000 --plan-file tests.md --isolated --json writes structured results to stdout. Pipe to CI. Keep the markdown in git. Delete the /tmp/assrt folder anytime.

Fork and modify freely

Add a custom tool (e.g. webhook_signature_verify), swap the telemetry stub, rename the provider. MIT lets you ship a patched binary to clients without asking anyone.
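To make the plan-file step concrete, here is a rough sketch of splitting a prose plan into cases and steps. The exact #Case syntax is an assumption on my part: I treat any line starting with `#Case` (with an optional number and colon) as a heading and every other non-empty line as one English step. `parsePlan` and `PlanCase` are hypothetical names, not Assrt's parser.

```typescript
// Illustrative sketch only; the real plan grammar may differ.
interface PlanCase {
  title: string;
  steps: string[];
}

function parsePlan(markdown: string): PlanCase[] {
  const cases: PlanCase[] = [];
  let current: PlanCase | undefined;

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    // Assumed heading form: "#Case", optional number, optional ":", then a title.
    const heading = /^#\s*Case\b(?:\s+\d+\b)?\s*[:.-]?\s*(.*)$/i.exec(line);
    if (heading) {
      current = { title: (heading[1] ?? "").trim(), steps: [] };
      cases.push(current);
    } else if (current) {
      // Drop a leading list marker ("- ", "* ", "1. ") and keep the sentence.
      current.steps.push(line.replace(/^(?:[-*]|\d+[.)])\s+/, ""));
    }
    // Lines before the first heading are ignored.
  }
  return cases;
}
```

Note what is absent: no selectors, no waits, no locator strings. Each step stays a sentence until the agent resolves it against the live page.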

Swap the model, keep the plan

agent.ts line 9 pins the Anthropic default, line 10 pins the Gemini default. The --model flag is the only thing that changes between these two runs. Your scenario.md is untouched; your results are still a JSON report.

Same plan, two providers

1,087 lines in src/core/agent.ts (the whole agent loop)

18 LLM-callable tools, declared in one TOOLS array

2 providers supported out of the box (Anthropic, Gemini)

$0 license fee, vs up to $7,500/mo at closed competitors

Three flavors of "automated open source testing"

The keyword groups together things that are not the same. Here is the cross-section: a framework-only OSS stack, a closed AI QA platform, and Assrt. All three can run a browser. Only one is MIT-licensed AI and self-hostable at the same time.

Feature	Typical alternatives	Assrt (MIT, agent + prose)
License	Closed SaaS (Momentic, QA Wolf, Testsigma): no source, quarterly contracts	MIT (LICENSE line 1). Fork it, sell it, ship it.
What you write	Framework-only OSS (Playwright, Selenium, Cypress): you hand-write every click() and locator	English #Case blocks in /tmp/assrt/scenario.md
What the agent writes	Framework-only OSS: no agent. Closed SaaS: proprietary YAML you cannot port	Real @playwright/mcp calls, resolved per run against a live ARIA tree
Where the decision loop lives	Closed SaaS: server-side, unauditable. Framework OSS: no such loop exists	One file, src/core/agent.ts, 1,087 lines, readable in one sitting
LLM provider	Closed SaaS: vendor-chosen model, silently changes. Framework OSS: n/a	Anthropic (agent.ts:9) or Gemini (agent.ts:10), switch via --model
Who owns the API key	Closed SaaS: vendor keys, bundled into monthly fee. Framework OSS: n/a	You. ANTHROPIC_API_KEY env var, read first (keychain.ts:34)
Cloud dependency at runtime	Closed SaaS: hard-required. Framework OSS: none	Optional. local- scenario IDs skip all sync (scenario-files.ts:94)
Artifact you keep when the vendor disappears	Closed SaaS: a recording you cannot replay without their runner. Framework OSS: your code, but no agent	A markdown file and a .webm video
Starting price	Closed SaaS: up to $7,500/month. Framework OSS: $0, plus the cost of hand-written scripts forever	$0 (you pay LLM tokens to a provider you pick)

Six things this particular open-source shape buys you

Each card points at a specific file or line number. No aspirational marketing.

1,087 lines, one file

The agent decision loop, every tool the LLM can call, and the retry-on-failure path all live in src/core/agent.ts. Read it cover to cover; there is no hidden server, no second brain elsewhere.

18 tools in one array

Lines 16-196 of agent.ts define every capability the agent has: navigate, snapshot, click, type_text, select_option, scroll, press_key, wait, screenshot, evaluate, create_temp_email, wait_for_verification_code, check_email_inbox, assert, complete_scenario, suggest_improvement, http_request, wait_for_stable.

Two providers, one --model flag

DEFAULT_ANTHROPIC_MODEL on line 9, DEFAULT_GEMINI_MODEL on line 10. Swap at runtime. Your plan file never changes.

Your API key, your runs

keychain.ts:34 checks ANTHROPIC_API_KEY in env before anything else. Set it and the runner never talks to an Assrt-hosted endpoint. Your LLM bill is on your Anthropic or Google invoice.

Cloud sync is opt-out by ID

scenario-files.ts:94: any scenario whose ID begins with local- is excluded from Firestore sync. Run in a private network, on an air-gapped laptop, or inside a regulated VPC.

Wraps @playwright/mcp

The only browser dependency. Also open source (Apache 2.0, from the Playwright team). If you swap it for a custom MCP server, Assrt's agent loop still runs.
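Because the 18 tool names are listed above, they can double as an allowlist when you review a fork. The sketch below diffs a fork's tool names against that list. `diffTools` is a hypothetical helper I am introducing for illustration, and how you extract the names from a fork's TOOLS array is left to you.

```typescript
// The 18 tool names as listed in this guide.
const EXPECTED_TOOLS = [
  "navigate", "snapshot", "click", "type_text", "select_option", "scroll",
  "press_key", "wait", "screenshot", "evaluate", "create_temp_email",
  "wait_for_verification_code", "check_email_inbox", "assert",
  "complete_scenario", "suggest_improvement", "http_request", "wait_for_stable",
] as const;

// Hypothetical audit helper: which capabilities did a fork add or drop?
function diffTools(actual: readonly string[]): { added: string[]; removed: string[] } {
  const expected = new Set<string>(EXPECTED_TOOLS);
  const seen = new Set<string>(actual);
  return {
    added: actual.filter((name) => !expected.has(name)),
    removed: EXPECTED_TOOLS.filter((name) => !seen.has(name)),
  };
}
```

An empty `added` list means the fork gives the LLM no capability beyond the ones named here.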

Bring a closed-platform test contract. Leave with a forkable repo.

Thirty minutes. You share one test recorded in a closed SaaS testing tool. We port it to a prose scenario.md, run it live against the MIT Assrt agent, and walk you through the exact files you would own on your own disk.

FAQ on automated open source testing and the agent-on-top shape

How is this different from Playwright, Selenium, or Cypress, which are already open source?

Playwright, Selenium, and Cypress are browser automation frameworks. They are open source runners that execute tests you write by hand in code: locator strings, waits, assertions, the whole thing. Assrt sits one layer above that. The plan you write is English prose (#Case blocks in /tmp/assrt/scenario.md), and an MIT-licensed LLM agent translates each sentence into real @playwright/mcp calls at runtime. In other words: Playwright is what the agent uses; Assrt is the agent. Both are open source, but they solve different halves of the problem. If you already love Playwright, you can keep using it as the driver and move only your intent layer into prose.

What does 'the whole agent loop is one file' actually mean?

Literally that. /Users/you/assrt-mcp/src/core/agent.ts is 1,087 lines of TypeScript. It contains the TOOLS array (lines 16-196, 18 tools), the system prompt, the provider dispatch (Anthropic on line 9, Gemini on line 10), the conversation turn loop, and the try/catch error recovery that re-snapshots on any failure. If you want to audit what the LLM can do to your browser, you do not have to read a server repo, a worker repo, and a dashboard repo. You read one file. That is what MIT + self-contained actually buys you compared to a closed platform.
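The re-snapshot-on-failure idea is simple enough to sketch in a few lines. This is a minimal illustration, assuming a tool executor and a snapshot function are injected. All names and the result shape are hypothetical and are not copied from agent.ts.

```typescript
// Illustrative sketch only; not the recovery code in agent.ts.
interface ToolCall {
  name: string;
  input: Record<string, unknown>;
}

type ToolResult =
  | { ok: true; output: string }
  | { ok: false; error: string; snapshot: string };

async function runToolWithRecovery(
  call: ToolCall,
  execute: (call: ToolCall) => Promise<string>,
  snapshot: () => Promise<string>,
): Promise<ToolResult> {
  try {
    return { ok: true, output: await execute(call) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // On any failure, capture a fresh view of the page so the model can
    // re-plan against current state instead of a stale tree.
    return { ok: false, error: message, snapshot: await snapshot() };
  }
}
```

The failed result goes back to the model as an ordinary tool response, so recovery is just another turn of the conversation rather than a special code path.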

Do I need an Assrt cloud account to use it?

No. keychain.ts lines 33-38 check for ANTHROPIC_API_KEY in env before any other auth path. If it is set, the runner uses it directly and makes zero calls to any Assrt endpoint. scenario-files.ts:94 explicitly skips Firestore sync for any scenario whose ID starts with local-. If you pass --isolated and use a local- scenario, the entire data path is: your disk, your LLM provider, the target URL, back to your disk. app.assrt.ai exists for teams that want shareable run URLs and history; it is never on the hot path for a test to pass.

Can I swap Claude for another model?

Yes, via --model. Line 9 of agent.ts defines DEFAULT_ANTHROPIC_MODEL = 'claude-haiku-4-5-20251001'. Line 10 defines DEFAULT_GEMINI_MODEL = 'gemini-3.1-pro-preview'. Passing --model gemini-3.1-pro-preview (or any other Gemini model ID) routes the conversation through @google/genai instead of @anthropic-ai/sdk. Adding a new provider is a pull request, not a ticket: implement the tool-call loop in core/agent.ts for your provider, point the --model flag at it, and ship. The dispatch table is right there in the file.
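As a rough sketch of what a model-to-provider switch can look like: the two default model IDs below are the ones quoted above, but routing by model-ID prefix and falling back to the Anthropic default when no flag is passed are my assumptions, not confirmed behavior. `providerForModel` and `resolveModel` are hypothetical names.

```typescript
// Illustrative sketch only; the real dispatch in agent.ts may work differently.
type Provider = "anthropic" | "gemini";

const DEFAULT_MODELS: Record<Provider, string> = {
  anthropic: "claude-haiku-4-5-20251001",
  gemini: "gemini-3.1-pro-preview",
};

// Assumption: the provider is inferred from the model ID prefix.
function providerForModel(model: string): Provider {
  if (model.startsWith("gemini")) return "gemini";
  if (model.startsWith("claude")) return "anthropic";
  throw new Error(`Unsupported model: ${model}`);
}

// Assumption: with no --model flag, the Anthropic default is used.
function resolveModel(
  flag: string | undefined,
  fallback: Provider = "anthropic",
): { provider: Provider; model: string } {
  const model = flag ?? DEFAULT_MODELS[fallback];
  return { provider: providerForModel(model), model };
}
```

Adding a third provider in this shape means one more union member, one more default, and one more branch, which is roughly the size of change the pull-request claim implies.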

What does 'real Playwright, not proprietary YAML' mean in practice?

Closed vendors typically give you a visual recorder that emits a proprietary recording file (YAML, JSON, or a custom binary). Play it back inside the vendor runner and it works; try to run it anywhere else and it does not. Assrt's runtime artifact is different. The scenario is a markdown file, and every browser action the agent takes is a regular @playwright/mcp tool call (browser_navigate, browser_snapshot, browser_click). If you swap Assrt for a raw Playwright MCP loop you drive yourself, the same underlying tool calls still work. The plan is prose, and the calls below it are standard Playwright. Nothing in your on-disk artifact is vendor-specific.

What is actually licensed MIT vs Apache vs something else?

assrt-mcp is MIT (see LICENSE lines 1-3, Copyright 2026 Assrt). Its only browser dependency is @playwright/mcp ^0.0.70 from the Playwright team (Apache 2.0). The agent pulls from @anthropic-ai/sdk ^0.39.0 (MIT) and @google/genai ^1.46.0 (Apache 2.0). The MCP framework dependency is @modelcontextprotocol/sdk ^1.29.0 (MIT). Every runtime dependency is either MIT or Apache 2.0. There is no AGPL component, no 'source available' license, no 'open core with a closed enterprise feature' split.
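If you want to keep that claim true in your own fork, a small license gate is enough. The sketch below uses the package and license pairs listed in this guide. The `disallowedLicenses` helper is hypothetical, and in practice you would feed it data from your lockfile or a license scanner rather than a hand-typed map.

```typescript
// Licenses this guide treats as acceptable.
const ALLOWED_LICENSES = new Set(["MIT", "Apache-2.0"]);

// Package -> license, as listed in this guide.
const DEPENDENCY_LICENSES: Record<string, string> = {
  "assrt-mcp": "MIT",
  "@playwright/mcp": "Apache-2.0",
  "@anthropic-ai/sdk": "MIT",
  "@google/genai": "Apache-2.0",
  "@modelcontextprotocol/sdk": "MIT",
  ws: "MIT",
  zod: "MIT",
  typescript: "Apache-2.0",
};

// Hypothetical helper: names of packages whose license is not on the allowlist.
function disallowedLicenses(licenses: Record<string, string>): string[] {
  return Object.entries(licenses)
    .filter(([, license]) => !ALLOWED_LICENSES.has(license))
    .map(([name]) => name);
}
```

Run it in CI and fail the build when the returned list is non-empty, so a new AGPL or source-available dependency cannot slip in through a routine upgrade.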

How do I run this in CI without sending anything to a third-party service except my LLM provider?

Three environment pieces: ANTHROPIC_API_KEY (or GEMINI_API_KEY), a plan file (/tmp/assrt/scenario.md, or a checked-in markdown file passed via --plan-file), and --isolated to keep the browser profile in memory. The CLI supports --json which writes a structured TestReport to stdout. Pipe that to your CI artifact store. The cloud-sync code path is scenario-files.ts, and it only activates if the scenario ID is not prefixed with local-; if you are running --plan-file directly, the local-only path is what kicks in. The only outbound network call is to your LLM provider for tool-use inference. Everything else is your own infra.
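A CI gate over the --json output could look something like the sketch below. The report shape is a loud assumption: I do not know the real TestReport fields, so `scenarios`, `name`, and `passed` are hypothetical placeholders you would replace after inspecting actual output.

```typescript
// Hypothetical report shape; the real TestReport fields may differ.
interface ScenarioResult {
  name: string;
  passed: boolean;
}

interface TestReport {
  scenarios: ScenarioResult[];
}

// Names of scenarios that did not pass.
function failedScenarios(reportJson: string): string[] {
  const report = JSON.parse(reportJson) as TestReport;
  return report.scenarios.filter((s) => !s.passed).map((s) => s.name);
}

// 0 when everything passed, 1 otherwise, suitable as a process exit code.
function exitCodeFor(reportJson: string): number {
  return failedScenarios(reportJson).length === 0 ? 0 : 1;
}
```

In a pipeline you would read stdin to a string, pass it to `exitCodeFor`, and exit with the result, which keeps the pass or fail decision in code you own.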

How do I extend the agent with a custom tool?

Add an entry to the TOOLS array in src/core/agent.ts (anywhere between lines 16 and 196 is fine; order does not matter). Each tool is a plain object with name, description, and input_schema. Then add a case to the tool dispatch switch further down in the same file to implement it. Rebuild with npm run build, reinstall from your fork, and the LLM can now call your tool by name. That is the whole extension model: no plugin API, no vendor approval, no marketplace review. Changes to your fork take about as long as adding a function to any TypeScript file.
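Here is a hedged sketch of that two-part change, using the webhook_signature_verify idea as the example. The name, description, and input_schema fields follow the description above. The `ToolDefinition` type, the `dispatchTool` signature, and the HMAC-SHA256 hex scheme are my assumptions for illustration, not the shape of the real switch.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed shape: a plain object with name, description, and input_schema.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

// Part 1: the entry you would append to the TOOLS array.
const webhookSignatureVerify: ToolDefinition = {
  name: "webhook_signature_verify",
  description: "Check that a webhook payload matches its HMAC-SHA256 hex signature.",
  input_schema: {
    type: "object",
    properties: {
      payload: { type: "string", description: "Raw request body" },
      signature: { type: "string", description: "Hex signature to check" },
    },
    required: ["payload", "signature"],
  },
};

function verifySignature(payload: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(createHmac("sha256", secret).update(payload).digest("hex"));
  const given = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given);
}

// Part 2: the matching case in a dispatch switch. The secret comes from the
// caller (for example the environment), not from the model's input.
function dispatchTool(name: string, input: Record<string, string>, secret: string): string {
  switch (name) {
    case webhookSignatureVerify.name:
      return verifySignature(input["payload"] ?? "", input["signature"] ?? "", secret)
        ? "signature valid"
        : "signature invalid";
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
```

Keeping the secret out of input_schema is a deliberate choice in this sketch: the model only sees the payload and the signature, never the key.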

What happens if the Assrt company disappears tomorrow?

You keep running. The MIT license grants irrevocable rights to use, modify, and redistribute. The runtime artifacts are a markdown plan file and webm videos you already own. The only external service in the hot path is your LLM provider, which you picked. If the Assrt org disappears from GitHub, your existing clone still works and you can fork it. Contrast with a closed SaaS: when the vendor disappears, so do your recordings, your dashboards, and your test runs. The 'open source' part of automated open source testing is insurance against exactly this.

Does open source mean slower or lower quality than paid tools?

Not here. The runtime is real @playwright/mcp (the same engine closed vendors build on top of), and the model can be Claude Haiku or Gemini Pro, both frontier-tier for this workload. The difference between Assrt and a $7,500/month closed platform is not browser engine quality. It is packaging: onboarding dashboards, screenshot libraries, video sharing, managed test runners. You can add those yourself on top of the MIT core, or accept that your test artifacts are raw files and move on. Quality of the underlying automation is bounded by Playwright and the LLM, not by the license.
