Open Source Stack

Open source software testing is four files on your disk, not a free UI on a proprietary backend.

The 2026 version of open source software testing is more than picking Playwright. It is choosing a stack where the scenario format, the AI agent that runs the scenarios, the browser driver, and the artifacts are all inspectable and self-hostable. Here is what that actually writes to disk, file by file, with the anchor lines of code that make it work.

Matthew Diakonov
9 min read
4.9 from developers running Assrt locally
Scenarios live in your repo as plain text, not a vendor database
Real Playwright code, not a proprietary YAML dialect
MIT licensed end to end. Zero cloud dependency. Zero vendor lock-in.

Why “just use Playwright” is an incomplete answer

Every article about open source software testing lists the same five runners. Selenium. Playwright. Cypress. Puppeteer. pytest. They are all genuinely open source. They are also half of a testing stack.

The other half is the format your scenarios live in, the layer that decides what to run, the artifact you get back, and where any of that gets stored. A test runner that emits real code is only open source end to end if the thing that wrote that code, the thing that stored it, and the thing that reads the results are also open source. Most paid testing platforms score well on exactly one of those four. The runner. Then they store your scenarios as rows in their database, hide the agent prompt behind a SaaS boundary, and give you a dashboard URL instead of a file.

The right test for whether your testing stack is open source is not “is Playwright in the tech stack.” It is “if the vendor disappeared tomorrow, would my tests still run?” If the answer is no, you have a free UI on a proprietary backend.

The four layers of an end-to-end open source stack

Every testing stack in 2026 has the same four layers whether the vendor shows them to you or not. Format. Agent. Runner. Artifact. An open source stack is one where all four layers are MIT- or Apache-licensed code you can read.

The four layers, what they are in assrt-mcp, and what the artifact looks like

Format — the plain-text scenarios file, with /tmp/assrt/scenario.md as the live working copy
Agent — the assrt_test MCP tool
Runner — real Playwright via @playwright/mcp
Artifact — latest.json, recording.webm, player.html

Layer 1: the format is plain text, not a proprietary DSL

The scenarios file is the part most paid tools get wrong. They encode tests in a JSON schema with their internal step types (open_url, click_selector, wait_for_ms) and store it in their database. That schema is the lock-in. The open source answer is a markdown-ish file with #Case N: headers and English sentences. The whole parser is a regex and a split, which means the format is editable by any human, any LLM, and a future runner you have not written yet.

tests/smoke.txt
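An illustrative version of that file. The case names and steps here are made up; only the #Case N: header convention is the real contract:

```
#Case 1: Home page loads
Open the home page
Check the headline and nav render

#Case 2: Signup flow
Open /signup
Fill the form with a throwaway email
Submit and check the confirmation screen appears
```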

The parser that turns this file into executable cases is twelve lines, sitting in src/core/agent.ts:620-631. It splits on a single regex and returns an array of { name, steps }. If you want to migrate your scenarios to a different runner, you write twelve lines of your own parser, or you just keep the file and re-read the MIT source.

assrt-mcp/src/core/agent.ts
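The real twelve lines live in the repo above. As a hedged sketch of the same idea — the function name, the exact regex, and the sample file contents here are illustrative, not the actual source:

```typescript
// Illustrative sketch of a #Case parser: split a plain-text scenarios
// file on "#Case N:" headers and return { name, steps } records.
// The real parser is in src/core/agent.ts:620-631; names here are made up.
interface ParsedCase {
  name: string;
  steps: string[];
}

function parseScenarios(text: string): ParsedCase[] {
  const cases: ParsedCase[] = [];
  // Match headers like "#Case 1: Login works" at the start of a line.
  const header = /^#Case\s+\d+:\s*(.+)$/gm;
  const matches = [...text.matchAll(header)];
  matches.forEach((m, i) => {
    const start = (m.index ?? 0) + m[0].length;
    const end = i + 1 < matches.length ? matches[i + 1].index ?? text.length : text.length;
    const steps = text
      .slice(start, end)
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
    cases.push({ name: m[1].trim(), steps });
  });
  return cases;
}

// A tiny inline sample in the same format as tests/smoke.txt:
const smoke = `#Case 1: Home page loads
Open the home page
Check the headline is visible

#Case 2: Login works
Open /login
Sign in with the test account
Check the dashboard greets the user`;

console.log(parseScenarios(smoke).map((c) => c.name)); // logs the two case names
```

The point of the exercise: if the whole format fits in a regex and a split, any future runner can re-adopt your scenarios with an afternoon of work.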

Layer 2: the TestAssertion primitive you can grep

The reporting layer in most closed tools is a web dashboard. You cannot write a shell script that says “fail the build if any assertion mentioning checkout regressed.” In an open source stack, every assertion a run produces is a typed record with three fields: what you asked it to prove, whether it proved it, and the exact evidence string the agent captured when it checked. Three fields, one line of grep to find what broke.

assrt-mcp/src/core/types.ts

The shape is intentionally small. A TestReport is a URL, an array of ScenarioResult, counts, and a timestamp. A ScenarioResult has steps and assertions. An assertion has a human-readable description, a boolean, and a string of evidence. That is the whole data model. You can query it with jq, diff it between commits, or feed specific evidence strings back into the agent for a re-run.
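As a sketch of that data model, using the field names quoted in this article (the authoritative definitions are in src/core/types.ts; the name field on ScenarioResult and the failures helper are assumptions for illustration):

```typescript
// Sketch of the data model described above. Field names follow the
// article's description; the real source is src/core/types.ts (MIT).
interface TestAssertion {
  description: string; // what you asked it to prove
  passed: boolean;     // whether it proved it
  evidence: string;    // what the agent saw when it checked
}

interface ScenarioResult {
  name: string; // assumed field, for illustration
  steps: string[];
  assertions: TestAssertion[];
}

interface TestReport {
  url: string;
  scenarios: ScenarioResult[];
  passedCount: number;
  failedCount: number;
  totalDuration: number;
  generatedAt: string;
}

// Because the report is plain JSON on disk, "what broke" is a filter:
function failures(report: TestReport): TestAssertion[] {
  return report.scenarios.flatMap((s) => s.assertions.filter((a) => !a.passed));
}
```

The same filter is a one-liner in jq against latest.json; the TypeScript version is just the typed view of it.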

Layer 3: the browser is real Playwright, not a shim

The runner layer is the one where most platforms are genuinely OK. They already use Playwright or Selenium under the hood. What you want to verify is that the runner is talking to a real Playwright instance on your machine, not a headless runner in their cloud with your credentials. assrt-mcp wraps @playwright/mcp and its dependency tree lists @modelcontextprotocol/sdk, @anthropic-ai/sdk, and @google/genai. No vendor-internal bridge. No headless farm. The browser opens on your machine and the run either passes or fails against the network your machine can see.

Layer 4: the artifact is four files, not a dashboard URL

This is the uncopyable part of the stack. Closed platforms show you the results in their web UI. An open source stack writes them to disk in formats any program can read, and serves the video itself from a local HTTP server on an ephemeral port. If the internet disappears, you can still replay any run by double-clicking a file.

artifact layout on disk
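A plausible layout, using the paths named elsewhere in this article; the exact directory for the recording and player may differ in the real repo:

```
/tmp/assrt/
├── scenario.md              # live working copy of the plan
└── results/
    ├── latest.json          # TestReport: assertions, counts, timestamp
    ├── recording.webm       # Playwright's video of the run
    └── player.html          # self-contained player, served locally
```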

The video player is the part people do not expect. When a run finishes, assrt-mcp generates a player.html next to the recording. It is fully self-contained: hardcoded 1x, 2x, 3x, 5x, and 10x speed buttons; Space to toggle play/pause; arrow keys to seek five seconds; number keys to jump speed. It is served via a tiny Node HTTP server on an ephemeral port with Range-request support so the webm seekbar works. The code for all of that is at src/mcp/server.ts:35-111. No SaaS video host. No auth token. If you close the terminal, the file stays on disk and still plays.
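A sketch of the Range-request handling that makes the seekbar work, using only Node's built-in http and fs modules. The real server is at src/mcp/server.ts:35-111; the function names and structure here are illustrative:

```typescript
import * as http from "http";
import * as fs from "fs";

// Parse a "bytes=start-end" Range header against a known file size.
// Returns null for absent or malformed headers (caller sends the whole file).
function parseRange(
  header: string | undefined,
  size: number
): { start: number; end: number } | null {
  const m = header?.match(/^bytes=(\d*)-(\d*)$/);
  if (!m || (m[1] === "" && m[2] === "")) return null;
  // "bytes=-500" means the final 500 bytes; "bytes=500-" means 500 to EOF.
  const start = m[1] === "" ? size - Number(m[2]) : Number(m[1]);
  const end =
    m[1] === "" || m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  if (start < 0 || start > end) return null;
  return { start, end };
}

// Serve one file, honoring Range so the <video> seekbar can jump around.
function createVideoServer(filePath: string): http.Server {
  return http.createServer((req, res) => {
    const size = fs.statSync(filePath).size;
    const range = parseRange(req.headers.range, size);
    if (range) {
      res.writeHead(206, {
        "Content-Type": "video/webm",
        "Content-Range": `bytes ${range.start}-${range.end}/${size}`,
        "Content-Length": range.end - range.start + 1,
        "Accept-Ranges": "bytes",
      });
      fs.createReadStream(filePath, range).pipe(res);
    } else {
      res.writeHead(200, { "Content-Type": "video/webm", "Content-Length": size });
      fs.createReadStream(filePath).pipe(res);
    }
  });
}

// Listening on port 0 asks the OS for an ephemeral port:
// createVideoServer("/tmp/assrt/results/recording.webm").listen(0);
```

Without the 206 / Content-Range handshake, browsers fall back to downloading the whole webm before seeking works, which is exactly the behavior the ephemeral server avoids.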

What an actual run looks like in a terminal

No account. No dashboard login. No web upload. One command, one webm file, one JSON report, one exit code you can gate CI on. If you want the agent to run this for you instead, it is an MCP tool called assrt_test.

npx assrt run --url http://localhost:3000 --plan-file tests/smoke.txt --json

Free UI on proprietary backend vs. fully open

The easiest way to audit your current testing vendor is to ask them to export every test in a format that runs tomorrow without their platform. The answers you get are usually: “you can export the last run” or “you get a Playwright script per test.” Neither is the same as owning the format and the artifact. Here is the concrete difference:

What fully open source actually means

Feature | Free UI / paid cloud | assrt-mcp stack
Scenarios live in your repo | In the vendor's database; export is best-effort | tests/*.txt plain text files, version-controlled
Scenario format parser is open | Proprietary schema, vendor endpoint | 12 lines at src/core/agent.ts:620-631
Agent prompt is inspectable | Behind a SaaS API | src/core/agent.ts — MIT, readable, fork-able
Test assertions as typed records | Dashboard rows, not queryable files | TestAssertion { description, passed, evidence } in types.ts
Runner is real Playwright | Sometimes; usually a headless farm with your creds | @playwright/mcp on your machine, your network
Video and player artifact | Dashboard player; link rots when plan ends | recording.webm + self-contained player.html on disk
License of every core module | Closed source | MIT
Works offline | No; needs vendor cloud | Yes: local runner, local artifact, ephemeral localhost server
Cost at steady state | $500 to $7,500/month, per seat | $0, plus the LLM inference bill if you use the AI layer

The mental model

A testing stack is open source if the vendor disappearing tomorrow would not delete your tests.

Tests as plain text in your repo. An agent whose prompts are in an MIT file. A runner that opens a browser on your machine. Results as JSON on disk and a webm you can play in any browser. If any one of those is a SaaS endpoint you do not own, the stack has a lock-in point. Name it and decide whether that is the right trade for what you are getting.

Standing up the stack in four steps

This is the part that sounds like it should take a week. It is one setup command, one scenarios file, one CI gate, and an optional agent hook. Everything else is reading the same artifact files from different places.

1

Install the open source MCP server

Run npx @assrt-ai/assrt setup. The CLI registers the assrt MCP server globally for Claude Code or Cursor, drops a small QA reminder into your global CLAUDE.md, and writes a PostToolUse hook that nudges the agent to run assrt_test after it touches user-facing code. No account. No paid tier. The source for setup is in the same MIT repo.

2

Check in a plain-text scenarios file

Create tests/smoke.txt (or wherever you want) with #Case N: headers and English sentences. Commit it. The agent reads this file, writes results to /tmp/assrt/results/latest.json, and uses /tmp/assrt/scenario.md as the live working copy while it iterates. If you want to delete Assrt tomorrow, your scenarios are still sitting in your repo as text you can hand to any other runner.

3

Gate CI on failedCount from the TestReport JSON

npx assrt run --json writes a TestReport shape (url, scenarios[], passedCount, failedCount, totalDuration, generatedAt) to stdout. A single jq -e '.failedCount == 0' check is the whole CI gate. It works in GitHub Actions, Vercel build hooks, Fly deploy checks, or a local pre-push git hook. Nothing to configure beyond the exit code.
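If you would rather not depend on jq, the same gate is a few lines of Node. This sketch assumes the report has been captured to a file; the gate.js filename and the GateInput type are illustrative, and only the field names come from the TestReport shape described in this article:

```typescript
import * as fs from "fs";

// The subset of TestReport the gate needs.
interface GateInput {
  url: string;
  passedCount: number;
  failedCount: number;
}

// Decide the CI exit code from a TestReport object: nonzero iff anything failed.
function gateExitCode(report: GateInput): number {
  return report.failedCount > 0 ? 1 : 0;
}

// Usage: npx assrt run --url ... --json > report.json && node gate.js report.json
if (process.argv[2]) {
  const report: GateInput = JSON.parse(fs.readFileSync(process.argv[2], "utf8"));
  if (gateExitCode(report) !== 0) {
    console.error(`${report.failedCount} scenario(s) failed against ${report.url}`);
  }
  process.exit(gateExitCode(report));
}
```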

4

Let the agent extend scenarios and passCriteria

When the same AI agent edits your code and has MCP access to assrt_test, it will extend the scenarios file each time it touches a new route or component. You seed a starter list once. The agent grows it in place. Because the file is plain text, every extension is a diff a human can review in a PR.

What every layer is licensed as

The word “open source” hides a lot of variance. The important question is whether every layer of the stack is OSI-approved and self-hostable. These are the actual license strings for the components in the assrt-mcp stack:

assrt-mcp agent — MIT
Playwright — Apache-2.0
@playwright/mcp — Apache-2.0
Model Context Protocol SDK — MIT
#Case format parser — MIT (12 lines)
TestReport/TestAssertion types — MIT
video player.html — MIT, self-contained
localhost HTTP Range server — MIT, in-repo
CLI entrypoints — MIT

The numbers that matter

A good test for whether a stack is genuinely open source is how small the critical pieces are. If the scenario parser needs a five-service backend to read a file, the format is the lock-in. These are the real sizes for the parts that matter.

12
lines in the #Case parser
3
fields on TestAssertion
4
artifact files per run
$0
license cost, any scale

Where the paid platforms still win

This would be dishonest without the other side. The genuinely open source stack gives up three things a paid platform ships out of the box. A hosted dashboard with SSO. A managed parallel runner farm. A support SLA. If your team needs all three on day one and does not want to self-host, pay the vendor. You are paying for operations, not capability.

What you want to avoid is paying for operations and getting a format that only runs in their cloud. Those are separable products. A team using an open source stack can still run it on any managed CI (GitHub Actions, Vercel, Fly), pipe the TestReport JSON into an in-house dashboard, and keep every scenario file portable. If a vendor later ships a better dashboard, you point it at the same files. The lock-in is in the format; the rest is commoditized.

Want a walkthrough of a fully open source testing stack on your repo?

20 minutes, your code on screen, and you leave with a tests/ folder, a working MCP hook, and a CI gate against TestReport JSON.

Book a call

Frequently asked questions

What counts as open source software testing in 2026?

A stack where every layer is inspectable and swappable: the test format (not a proprietary YAML dialect), the runner (real Playwright, Selenium, Cypress, not a wrapper that re-exports them over a cloud API), the AI layer if you use one (prompts and scoring logic you can read), and the artifacts (files on disk you can diff, grep, and version). If any layer encodes your tests in a format only the vendor can read, or stores them in a cloud account you cannot self-host, it is not fully open source. It is a free UI on a proprietary backend.

Isn't 'open source testing tools' just a list of Playwright, Selenium, Cypress, and pytest?

Those are open source test runners. They are half of a testing stack. The other half is the format your scenarios live in, the agent or script that decides what to run, the reporting artifact, and the recording layer. A modern team picks all of those deliberately. A list of runners without a position on the format layer and the artifact layer is an incomplete answer.

Where are the tests stored when I run an open source AI testing tool?

In a genuinely open setup, in your repo as plain text, next to the code. For example the assrt-mcp layout writes the active plan to /tmp/assrt/scenario.md and the most recent run to /tmp/assrt/results/latest.json, both of which are regular files you can open in any editor. In a 'free UI on proprietary backend' setup, your tests live in the vendor's database and you cannot export them in a format that still runs without their cloud.

Does real Playwright code actually matter, or is it OK to use a no-code recorder?

It matters the moment you want to migrate. If the recorder emits real Playwright TypeScript into your repo, a future team can open it in any editor, run it with npx playwright test, and it works. If the recorder stores the test as a JSON document in the vendor's cloud with their opcodes (open_url, click_selector_id_42, assert_text_id_7), migrating means rewriting. The format is the lock-in, not the UI.

What does the TestAssertion primitive look like in assrt-mcp?

It is three fields: description, passed, evidence. This is the full definition at src/core/types.ts:13-17 of the open-source assrt-mcp repo. Each assertion the agent makes during a run becomes one of these objects in the final TestReport JSON. Because it is just a TypeScript interface in an MIT-licensed file, you can grep assertions[].evidence across every historical run, write alerts on regressions, and feed specific evidence strings back into the agent for a re-run.

How are videos of the test run handled without a cloud vendor?

Playwright records a .webm file directly. assrt-mcp generates a self-contained player.html next to the recording (src/mcp/server.ts:35-111) with hardcoded 1x/2x/3x/5x/10x speed buttons and Space/arrow-key shortcuts, and serves it over a tiny Range-request HTTP server on an ephemeral localhost port. No SaaS, no auth, no upload. If you close the terminal, the files stay on disk and still open in any browser.

Is the AI layer actually open source, or just the runner?

In assrt-mcp the agent itself is the open part. The scenario parsing regex is in src/core/agent.ts:620-631. The pass-criteria injection that turns plain English into a MANDATORY verification contract sits at src/core/agent.ts:670-672. The tool definitions are in src/mcp/server.ts. The model provider (Claude, Gemini) is pluggable. You still pay an inference provider, but the orchestration, prompts, and evaluation logic are MIT licensed in the repo.

Why do teams pay $7.5K/month for closed testing platforms if open source covers it?

Historically for three things: a hosted recorder, a dashboard UI, and managed run infrastructure. In 2026 all three are tractable to self-host. The recorder is real Playwright codegen. The dashboard is a JSON TestReport rendered by any UI layer. The runners are containers on GitHub Actions, Vercel, Fly, or a dev machine. The remaining reason to pay is support and SSO, not capability.

How do I adopt an open source software testing stack without a big migration?

Do not migrate. Add it next to what you have. Check in a plain-text scenarios file, wire npx @assrt-ai/assrt setup to install the MCP server locally, let the agent extend the scenarios each time it touches code, and gate CI on failedCount from the TestReport JSON. Existing Jest, Vitest, or Playwright suites keep running. The open source AI layer sits on top.