Open Source Recipe

How to open-source your testing: three files replace the vendor cloud; everything else is just getting out of their way.

This is a recipe, not a think piece. The end state is a repo where your scenarios are a plain-text file, your reporting is a six-field JSON, and your recordings are webm plus a self-contained player.html that works offline. Everything between you and those three files is optional.

Matthew Diakonov
10 min read
4.9 from teams running their testing fully open source with Assrt
Scenarios live in your repo. The parser is 12 lines of MIT TypeScript.
TestReport JSON has six fields. One jq expression is a valid CI gate.
Video plays offline forever via a self-contained player.html on disk.

The contract is three files, not a framework

Most guides tell you to open source your testing by picking a runner. That is the wrong first question. Runners are almost all open source already. Playwright is Apache. Selenium is Apache. Cypress is MIT. The lock-in is rarely in the runner. It is in the format your scenarios live in, the reporting artifact your CI reads, and where the video of a failing run is stored.

Open sourcing your testing is the much smaller problem of moving those three things onto your disk, in formats a stranger could open without your vendor account. Once they are on disk, the runner choice is interchangeable.

What gets moved, from closed to open

  • Your scenarios → tests/smoke.txt
  • Your run results → report.json
  • Your recordings → recording.webm

The repo holds tests/; run artifacts land in /tmp/assrt/.

Before and after, for a real project

This is what the transition looks like in file-system terms for a small SaaS. Nothing about the product changes. What changes is where the tests are and what format they are in.

What the test assets look like before vs after

Scenarios are strings of text inside the vendor's web UI. There is no file in the repo that lists what the tests are. Results are web pages behind a login. Videos expire at the end of the plan. If the vendor invoice lapses, the tests do not run and you cannot export them in a format that runs elsewhere.

  • No tests/ folder in the repo
  • Scenarios are rows in someone else's database
  • Recordings are hotlinked SaaS URLs that rot
  • Reporting is a dashboard, not a file CI can parse
  • Canceling the vendor deletes your tests

Artifact 1: the scenarios file

The format is the piece most paid tools get wrong. They encode tests in a JSON schema with their own step opcodes and store it server-side. Migrating means rewriting. The open source move is a markdown-ish file with #Case N: headers and English sentences. A future runner reads it with a regex.

tests/smoke.txt
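As an illustration of the format (these cases are invented, not from a real suite):

```text
#Case 1: Sign-in happy path
Open the home page
Click "Sign in"
Enter valid credentials and submit
Expect the dashboard to load

#Case 2: Empty cart checkout
Open the cart
Click "Checkout"
Expect an error explaining the cart is empty
```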

The parser that turns this file into executable cases is exactly twelve lines. If Assrt stopped shipping tomorrow and you wanted to keep the file, you could reimplement those twelve lines in any language and keep moving. That is the point of picking a format that parses with a regex instead of a compiler.

assrt-mcp/src/core/agent.ts (lines 620-631)
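The real parser lives at those lines. As a sketch of the idea (not a copy of the shipped code), a regex-and-split parser fits in about a dozen lines of TypeScript:

```typescript
// Sketch of a #Case parser: one regex and a split. Illustrative of the
// idea, not the actual twelve lines in agent.ts.
export interface ParsedCase {
  name: string;
  steps: string[];
}

export function parseCases(text: string): ParsedCase[] {
  return text
    .split(/^#Case \d+:/m) // one chunk per case, header text first
    .slice(1)              // drop anything before the first header
    .map((chunk) => {
      const [name, ...rest] = chunk.split("\n");
      return {
        name: name.trim(),
        steps: rest.map((s) => s.trim()).filter(Boolean),
      };
    });
}
```

Any language with regular expressions can reproduce this in a few minutes, which is the whole portability argument.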

Artifact 2: the TestReport JSON

The reporting layer is where paid tools earn recurring revenue. You end up with a dashboard URL that you have to log into every time. The open source move is a JSON file with six fields. Diffable, grep-able, and gate-able from any shell. If your CI cannot read a plain file, the problem is not testing, it is CI.

assrt-mcp/src/core/types.ts (lines 28-35)
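A hedged reconstruction of that type, built from the six fields named elsewhere in this piece; the shape of ScenarioResult is an assumption:

```typescript
// Reconstruction of the six-field TestReport; the real types.ts may
// differ in detail. ScenarioResult's members here are an assumption.
export interface ScenarioResult {
  name: string;
  passed: boolean;
}

export interface TestReport {
  url: string;                  // app under test
  scenarios: ScenarioResult[];  // one entry per #Case
  totalDuration: number;        // milliseconds
  passedCount: number;
  failedCount: number;
  generatedAt: string;          // ISO timestamp
}
```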

Six fields is the whole schema the CI job cares about. If any vendor report you are paying for cannot be flattened into something isomorphic to this, their reporting format is a lock-in point and a reason the team cannot rip it out without a migration project. That is precisely what opening up your testing removes.

Artifact 3: the video and its self-contained player

Playwright writes recording.webm to disk on its own. The missing piece teams usually pay for is a player with speed controls and keyboard shortcuts. assrt-mcp generates that player as a single player.html file next to the recording. It has hardcoded 1x, 2x, 3x, 5x, and 10x speed buttons, Space toggles play/pause, and arrow keys seek five seconds. The generator is at src/mcp/server.ts:35-111. A tiny local HTTP server with Range-request support serves the webm so the seek bar works. No SaaS, no auth, no upload.
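Such a server is small. A hedged sketch (not the assrt-mcp implementation) of a localhost server that honors Range requests so the video seek bar works:

```typescript
// Minimal localhost video server with Range support. Illustrative only;
// the actual server in assrt-mcp may differ.
import { createServer, type Server } from "node:http";
import { statSync, createReadStream } from "node:fs";

export function serveVideo(file: string, port: number): Server {
  return createServer((req, res) => {
    const size = statSync(file).size;
    const range = req.headers.range; // e.g. "bytes=500-" from <video>
    if (!range) {
      // No Range header: send the whole file.
      res.writeHead(200, { "Content-Type": "video/webm", "Content-Length": size });
      createReadStream(file).pipe(res);
      return;
    }
    const [startStr, endStr] = range.replace("bytes=", "").split("-");
    const start = Number(startStr);
    const end = endStr ? Number(endStr) : size - 1;
    // 206 Partial Content lets the player seek without re-downloading.
    res.writeHead(206, {
      "Content-Range": `bytes ${start}-${end}/${size}`,
      "Accept-Ranges": "bytes",
      "Content-Length": end - start + 1,
      "Content-Type": "video/webm",
    });
    createReadStream(file, { start, end }).pipe(res);
  }).listen(port);
}
```

The point is that "a video player with seeking" is about twenty-five lines of standard library, not a SaaS feature.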

The reason this matters for open-sourcing is that every other artifact of a test run is pointless if the video of the failure is hotlinked from a vendor whose trial ends in a week. If you can double-click player.html in six months and it still plays, the debug loop survives.

The four steps, in the order you do them

Every step maps to one of the three artifacts, plus a final wire-up into CI. None of them require tearing down an existing test suite. You add the open source layer alongside whatever you already have and retire the proprietary one when you are ready.

The open-source-your-testing recipe

  1. Install the open source test agent

    npx @assrt-ai/assrt setup registers the MCP server globally, installs a post-tool-use reminder into your global CLAUDE.md, and gives you the assrt CLI. No account. MIT licensed. Everything lives in your npm tree.

  2. Extract scenarios into tests/smoke.txt

    Copy whatever your scenarios are today (vendor UI, Notion, a Google Doc, an SDET's head) into a file with #Case 1:, #Case 2: headers and plain-English steps. The file is the new source of truth. Commit it.

  3. Run once locally, get three artifacts on disk

    npx assrt run --url http://localhost:3000 --plan-file tests/smoke.txt --video --json drops report.json, a webm, and player.html into /tmp/assrt/<runId>/. Open player.html to replay the run. Open report.json with jq.

  4. Gate CI on report.json .failedCount

    In GitHub Actions, Vercel, Fly, or a local pre-push hook, run the same CLI with --json > report.json and add jq -e '.failedCount == 0' report.json as the pass/fail check (jq -e exits nonzero when the expression is false). Upload /tmp/assrt/ as a build artifact for history.

The CI gate, in full

This is the entire CI config. It is short on purpose. Opening up your testing should not require a platform team or a new dashboard. A GitHub Actions job, a jq expression, and an upload-artifact step covers running, gating, and history.

.github/workflows/assrt.yml
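A hedged sketch of what that workflow could contain; step layout, action versions, and the exact gate expression are assumptions, not copied from assrt's docs:

```yaml
# Sketch of a minimal workflow; adapt names and versions to your repo.
name: assrt
on: [push]
jobs:
  smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Start your app here so it answers on the --url below.
      - run: npx assrt run --url http://localhost:3000 --plan-file tests/smoke.txt --video --json > report.json
      # jq -e exits nonzero when the expression is false, failing the job.
      - run: jq -e '.failedCount == 0' report.json
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: assrt-run
          path: /tmp/assrt/
```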

What a single local run looks like

One command, one exit code, four artifact files. No dashboard. No login. No upload. The agent reads the scenarios file, runs the browser on your machine, and writes the results to disk. The paths are the same every time because they are constants exported from src/core/scenario-files.ts:16-20.

npx assrt run --plan-file tests/smoke.txt --video --json
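If a CI image lacks jq, the same gate is a few lines of Node. A sketch against the six-field report (the path argument is whatever your run wrote; this is not shipped tooling):

```typescript
// gate.ts — hypothetical CI gate over the TestReport JSON.
import { readFileSync } from "node:fs";

export function gate(reportPath: string): number {
  const report = JSON.parse(readFileSync(reportPath, "utf8"));
  console.log(
    `${report.passedCount} passed, ${report.failedCount} failed ` +
      `in ${report.totalDuration}ms`
  );
  // Nonzero return means at least one scenario failed.
  return report.failedCount > 0 ? 1 : 0;
}
```

Wire it up with process.exit(gate("report.json")) in a one-line script and CI fails exactly when the jq check would.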

The mental model

You have open sourced your testing when pulling the plug on every vendor leaves the test suite untouched.

Three files in your file system. One regex for the format. One jq expression for the CI gate. One webm plus player.html for every recording. Everything beyond that is an operations choice (hosted dashboard, SSO, a runner farm) and you can pay for it if you want, but none of it should own the test artifacts themselves.

The numbers that make the recipe small enough to adopt

Open sourcing a testing stack fails when the pieces are big enough to look like a project of their own. The critical parts here are deliberately small so that a team can adopt them in one sitting.

3
artifacts you own (text, json, webm+html)
12
lines in the #Case parser
6
fields on the TestReport JSON
$0
vendor cost for the testing loop

The licenses on every piece you touch

The phrase “open source testing” only means something if every layer you depend on is actually OSI licensed and self-hostable. These are the actual licenses for each component this recipe uses.

  • assrt test agent — MIT
  • #Case parser — MIT (12 lines)
  • TestReport types — MIT
  • player.html generator — MIT
  • localhost Range HTTP server — MIT
  • Playwright — Apache-2.0
  • @playwright/mcp — Apache-2.0
  • Model Context Protocol SDK — MIT
  • jq — MIT
  • GitHub Actions runner — MIT

What you keep paying for, and what you stop paying for

Opening up the testing loop does not mean every bill goes to zero. It means the bills go to the right places. You keep paying compute costs (CI minutes, an LLM if you use the AI layer) because those are commodity inputs. You stop paying for a format, a dashboard URL, and a video player that only works inside a login wall, because those are things a team can own in one afternoon.

If you still want a hosted dashboard with SSO and a support SLA, buy one. But buy it knowing the test artifacts are yours, the format is portable, and the vendor is a UI over files you own, not the other way around.

Want someone to walk through the three-file switch on your codebase?

20 minutes, screenshare on your repo, and you leave with a tests/ folder, a report.json checked into CI, and a webm you can replay offline.

Book a call

Frequently asked questions

What does 'open source testing' actually mean in practice, not in philosophy?

Three disk artifacts are the whole contract: a scenarios file in plain text, a JSON TestReport that any CI can read, and a recording (webm plus self-contained player.html) that plays offline. If those three files live in your file system and your repo, your testing is open source. If any of them lives in a vendor's database or cloud, you still have lock-in even if the runner underneath is Playwright.

How long does it actually take to open source testing on a typical project?

About an hour for a small repo and a weekend for a medium one. The long pole is extracting scenarios from wherever they live today (a Testim account, a Notion page, a Jira test-plan, a Google Doc). Once the scenarios are in a plain-text file with #Case N: headers, running them locally is one command (npx assrt run --plan-file tests/smoke.txt --json) and wiring CI is one jq check against the TestReport JSON's failedCount field.

Do I need to rewrite my existing Playwright tests to open-source my testing?

No. Playwright scripts that already live in your repo are already open source in the sense that matters: the format is inspectable, the runner is real, the artifact is a file. What you are opening up is the stuff that is NOT already in your repo, usually the scenario list, the reporting layer, and the video recording. Keep the Playwright scripts you have and add a tests/*.txt file of higher-level #Case scenarios the agent drives on top.

What is the exact shape of the TestReport JSON I should gate CI on?

Six fields, defined in src/core/types.ts:28-35 of the open source assrt-mcp repo: url (string), scenarios (array of ScenarioResult), totalDuration (number in ms), passedCount (number), failedCount (number), generatedAt (ISO timestamp). A single jq -e '.failedCount == 0' report.json check is a valid CI gate in any shell. If a vendor gives you a reporting artifact that does not fit this shape or something equivalent, you cannot diff runs or script alerts without their UI.

Where do the test scenarios actually live on disk when I run an open source AI test agent?

At /tmp/assrt/scenario.md while a run is active, with metadata at /tmp/assrt/scenario.json and historical results at /tmp/assrt/results/<runId>.json. These paths are exported as the PATHS constant from src/core/scenario-files.ts:16-20 of assrt-mcp. The active plan is a live working copy an agent can edit; the canonical scenarios stay in your repo at tests/smoke.txt or wherever you commit them.
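A sketch of what that constant might look like; the key names are illustrative, only the path strings come from the description above:

```typescript
// Hypothetical shape of the PATHS constant in scenario-files.ts;
// the real export may use different key names.
export const PATHS = {
  activeScenario: "/tmp/assrt/scenario.md",
  scenarioMeta: "/tmp/assrt/scenario.json",
  resultFor: (runId: string) => `/tmp/assrt/results/${runId}.json`,
} as const;
```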

How is the video recording handled without a cloud vendor?

Playwright writes recording.webm directly to disk. assrt-mcp generates a self-contained player.html next to it (code at src/mcp/server.ts:35-111) with hardcoded 1x/2x/3x/5x/10x speed buttons, Space/arrow keyboard shortcuts, and the video served over a tiny localhost HTTP server with Range-request support. No upload, no auth, no SaaS link that rots. Double-click player.html in six months and it still works.

What if I do not want an AI layer and I just want to open-source the scenario and reporting format?

The format and the reporting are separable from the agent. The #Case parser is 12 lines at src/core/agent.ts:620-631, a single regex and a split. You can write your own runner in any language that reads the same format, emits the same TestReport JSON, and drops a webm in the same directory structure. Existing runners (Playwright Test, Jest, pytest, Mocha) can also be wrapped in about 50 lines to produce the same output shape.

What happens to my tests if Assrt disappears tomorrow?

Nothing, because Assrt never held them. The scenarios are in your repo. The TestReport JSON is in your CI artifacts. The recordings are in your build output. The parser is 12 lines and MIT licensed. The test agent is in the same MIT repo and will keep running; even without the agent, the scenarios are plain English a human can execute or a different runner can parse. That is the whole point of the contract being three files.

Can I open source my testing and still use a managed CI runner like GitHub Actions or Vercel?

Yes. Open sourcing the testing is about the format and the artifact, not about where the compute runs. You can run assrt inside GitHub Actions, Vercel build hooks, Fly deploys, or a local pre-push git hook, and you still own every output file. The managed CI is just a machine that writes files; you pipe those files to wherever you want (S3 bucket, GitHub artifact, in-house dashboard), and the lock-in is zero because the format is portable.

What is the minimum set of commands to get a fully open source testing loop running today?

Four commands. (1) npx @assrt-ai/assrt setup registers the MCP server and installs a reminder hook. (2) Write tests/smoke.txt with #Case N: headers and plain-English steps. (3) npx assrt run --url http://localhost:3000 --plan-file tests/smoke.txt --video --json runs it locally and drops the artifact files. (4) Add jq -e '.failedCount == 0' tmp-report.json as a pass/fail check in CI. Nothing else is required for a loop that works in your repo and out of anyone else's cloud.