Real Playwright Tests, Not Automation Scripts: How Assrt Eliminated the .spec.ts File

Most AI testing tools promise to "generate Playwright tests." What they actually generate is code files: files you check into version control, maintain when the UI changes, debug when selectors break, and eventually delete when maintenance costs outweigh the value. Assrt works differently. Instead of generating test scripts for you to run later, an AI agent interprets your test scenarios in plain English and drives real Playwright browser APIs at runtime. The Playwright calls are genuine. The test artifact is not a TypeScript file.

1. The Code Generation Trap

When most tools say they "generate Playwright tests," they mean one of two things. Either they produce .spec.ts files with hardcoded selectors and assertions, or they output proprietary YAML that wraps Playwright underneath. Both approaches share the same fundamental problem: they create a code artifact that immediately starts decaying.

A generated .spec.ts file works perfectly the moment it is created. The selectors match the current DOM. The assertions pass against today's UI. But the moment a developer changes a button label, rearranges a form, or updates a component library, the file becomes a maintenance liability. Someone has to update the selectors. Someone has to fix the assertions. Someone has to run the suite and figure out which failures are real bugs and which are stale tests.

Proprietary YAML formats add another layer to this problem. The tests are locked to a specific vendor's runtime. If you outgrow the tool or the vendor raises prices (and $7,500/month is not uncommon for enterprise testing platforms), your YAML files are worthless outside their ecosystem. You cannot run them with standard Playwright. You cannot port them to another tool without rewriting every test from scratch.

The root issue is not the quality of the generated code. The issue is that generating code is the wrong abstraction for tests that need to survive UI changes.

2. How Assrt Actually Works: The Agent as Test Runner

Assrt takes a fundamentally different approach. Instead of generating code that a test runner executes, an AI agent is the test runner. You write (or let AI generate) test scenarios in plain English. The agent reads each scenario, interprets the steps, and drives a real Playwright browser session to execute them.

The architecture is a two-stage pipeline. First, a planning model (Gemini) analyzes a page's screenshots, DOM structure, and interactive elements to generate test scenarios in a simple Markdown format in which each scenario starts with a "#Case N:" header line. Each case contains 3 to 5 steps describing what to do and what to verify. Second, an execution model (Claude Haiku) reads those scenarios and translates each step into real Playwright API calls inside an ephemeral browser VM.

There is no intermediate code generation step. The agent does not produce a .spec.ts file and then run it. The agent reads "click the Login button" and calls browser_click with the appropriate element reference. It reads "verify the dashboard shows Welcome text" and calls browser_wait_for with that text string. Every interaction is a genuine Playwright MCP tool invocation against a real Chromium browser.

This means the agent adapts in real time. If a button label changed from "Submit" to "Save Changes," the agent uses Playwright's accessibility tree snapshot to find the right element by role and context, not by a hardcoded selector from three months ago. The scenario still says "click the save button." The agent figures out which element that refers to right now, in the current state of the page.
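To make the idea of resolving an element by role and context concrete, here is a minimal sketch. It is not Assrt's implementation: the real agent relies on a language model reading the accessibility snapshot, while this example substitutes a crude word-overlap score. The AxNode shape, the resolveByIntent helper, and the sample page are all hypothetical.

```typescript
// Hypothetical shape of one node in an accessibility snapshot.
interface AxNode {
  ref: string;  // opaque element reference usable in a later click
  role: string; // e.g. "button", "link", "textbox"
  name: string; // accessible name, usually the visible label
}

// Lowercase a phrase and split it into words for a crude overlap score.
function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Pick the node with the wanted role whose name shares the most words with the hint.
// This stands in for the model's judgment; it is an assumption for illustration only.
function resolveByIntent(nodes: AxNode[], role: string, hint: string): AxNode | undefined {
  const wanted = new Set(words(hint));
  let best: AxNode | undefined;
  let bestScore = 0;
  for (const node of nodes) {
    if (node.role !== role) continue;
    const score = words(node.name).filter((w) => wanted.has(w)).length;
    if (score > bestScore) {
      best = node;
      bestScore = score;
    }
  }
  return best;
}

// The button was renamed from "Submit" to "Save Changes"; the scenario still says "save".
const page: AxNode[] = [
  { ref: "e1", role: "textbox", name: "Email" },
  { ref: "e2", role: "button", name: "Cancel" },
  { ref: "e3", role: "button", name: "Save Changes" },
];

const target = resolveByIntent(page, "button", "save");
```

The point of the sketch is that the lookup runs against the page as it is now, so a renamed label changes the match rather than breaking a stored selector.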

3. The 10 Real Playwright APIs Under the Hood

This is not a wrapper that pretends to use Playwright. The Assrt agent calls actual Playwright MCP tools through convenience methods defined in browser.ts. Each method maps directly to a Playwright browser operation:

navigate(url): Calls browser_navigate. Opens a URL in the real Chromium instance.
snapshot(): Calls browser_snapshot. Returns the full accessibility tree of the current page.
click(element, ref?): Calls browser_click. Clicks an element by accessibility reference, with a visible cursor animation.
type(element, text): Calls browser_type. Types text into an input field with keystroke visualization.
selectOption(element, values): Calls browser_select_option. Selects values from dropdown elements.
screenshot(): Calls browser_take_screenshot. Captures a JPEG screenshot for visual evidence.
pressKey(key): Calls browser_press_key. Sends keyboard events like Enter, Tab, or Escape.
scroll(x, y): Calls browser_scroll. Scrolls the page by pixel coordinates.
waitForText(text): Calls browser_wait_for. Waits until specific text appears on the page, used for assertions.
evaluate(expression): Calls browser_evaluate. Executes arbitrary JavaScript in the page context.

Each of these methods includes cursor overlay injection and keystroke visualization, so test runs produce watchable video recordings showing exactly what the agent did. The click method waits 400ms for a CSS cursor glide animation before executing, so the recorded video is human-readable. This is not a headless script blasting through a page in 200ms. It is a visible, auditable browser session.
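The mapping from convenience method to tool name can be pictured with a small sketch. This is not the contents of browser.ts: the tool names are the ones listed above, but the ToolCall type, the argument shapes, and the localhost URL are assumptions made for illustration, and only a few of the ten methods are shown.

```typescript
// A tool call as a plain request object. The argument shapes are assumptions.
type ToolCall = { name: string; arguments: Record<string, unknown> };

// Hypothetical builders: each convenience method yields the matching tool request.
const browser = {
  navigate: (url: string): ToolCall => ({ name: "browser_navigate", arguments: { url } }),
  snapshot: (): ToolCall => ({ name: "browser_snapshot", arguments: {} }),
  click: (element: string, ref?: string): ToolCall => ({
    name: "browser_click",
    arguments: { element, ref },
  }),
  type: (element: string, text: string): ToolCall => ({
    name: "browser_type",
    arguments: { element, text },
  }),
  waitForText: (text: string): ToolCall => ({ name: "browser_wait_for", arguments: { text } }),
};

// One plausible sequence of tool calls for a login scenario.
const login: ToolCall[] = [
  browser.navigate("http://localhost:3000/login"), // hypothetical dev server URL
  browser.snapshot(),
  browser.type("email field", "test@example.com"),
  browser.click("Login button"),
  browser.waitForText("Welcome"),
];
```

Keeping the requests as plain data makes the sequence easy to log or inspect before it is sent to whatever transport talks to the browser.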

4. Scenarios, Not Scripts: What the Test Input Looks Like

Traditional test automation requires you to learn an API, write code in a specific language, and structure your tests according to a framework's conventions. Assrt scenarios are plain Markdown. Here is what an actual scenario looks like:

#Case 1: Login with valid credentials
Navigate to the login page.
Fill in test@example.com in the email field.
Fill in the password field with a valid password.
Click the Login button.
Verify that the dashboard loads with "Welcome" text visible.

#Case 2: Empty form submission shows validation
Navigate to the login page.
Click the Login button without filling in any fields.
Verify that an error message appears for the email field.

The agent's parseScenarios function splits this text on the #Case delimiter using a regex pattern. Each block becomes an independent test scenario with a name and steps. The agent then executes each scenario in sequence, collecting pass/fail results, screenshots, and assertion evidence into a structured JSON report.
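The splitting step described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the actual parseScenarios source: the function name parseScenariosSketch, the Scenario type, and the exact regex are hypothetical.

```typescript
interface Scenario {
  name: string;    // text after "#Case N:"
  steps: string[]; // one entry per non-empty line in the block
}

function parseScenariosSketch(markdown: string): Scenario[] {
  // Split just before every line that starts with "#Case <number>:".
  // The lookahead keeps each header attached to its own block.
  const blocks = markdown.split(/^(?=#Case\s+\d+\s*:)/m);
  const scenarios: Scenario[] = [];
  for (const block of blocks) {
    const lines = block
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
    if (lines.length === 0) continue;
    const header = /^#Case\s+\d+\s*:\s*(.*)$/.exec(lines[0]);
    if (!header) continue; // text before the first #Case header is ignored
    scenarios.push({ name: header[1], steps: lines.slice(1) });
  }
  return scenarios;
}

const sample = [
  "#Case 1: Login with valid credentials",
  "Navigate to the login page.",
  "Click the Login button.",
  "",
  "#Case 2: Empty form submission shows validation",
  "Navigate to the login page.",
].join("\n");

const parsed = parseScenariosSketch(sample);
```

For this sample, parsed holds two scenarios: the first with two steps, the second with one.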

Because scenarios are plain text, they are version-controllable, diffable, and readable by anyone on the team. A product manager can write a scenario. A QA engineer can refine it. A developer can review it in a pull request. Nobody needs to understand Playwright's locator API, assertion library, or test runner configuration.

And because the scenarios are interpreted at runtime rather than compiled into code, the same scenario works across UI changes without modification. "Click the Login button" works whether that button is a <button>, an <a> styled as a button, or a <div role="button">. The agent resolves the reference against the live accessibility tree each time it runs.

5. What You Stop Maintaining

The practical difference comes down to what disappears from your workflow:

1. No selector maintenance. You never write page.locator('button[data-testid="submit"]') and then update it when someone renames the test ID. The agent finds elements by intent, not by CSS selector.
2. No test framework configuration. No playwright.config.ts, no test fixtures, no custom reporters, no retry logic. The agent handles retries, timeouts, and error recovery natively.
3. No CI/CD pipeline integration work. Because tests run in ephemeral VMs triggered by an MCP call, there is no Docker image to build, no browser binary to install, and no flaky headless Chrome configuration to debug.
4. No vendor lock-in. Your scenarios are Markdown files. If you stop using Assrt tomorrow, your test cases are still perfectly readable descriptions of what your application should do. Port them to any tool, run them manually, or use them as acceptance criteria.
5. No test code review overhead. Pull requests that add or change test scenarios contain readable English sentences, not Playwright API calls wrapped in async functions with try/catch blocks.

The trade-off is real: you lose the fine-grained control of hand-written Playwright code. If you need to test a specific WebSocket message sequence or validate an HTTP response header, a .spec.ts file is still the right tool. Assrt is for the 80% of E2E testing where the question is "does this user flow work from start to finish?" and the test should be as easy to write as describing the flow in a sentence.

Frequently Asked Questions

Is Assrt actually using Playwright, or is it a different engine?

It uses real Playwright. The agent communicates with a Playwright MCP (Model Context Protocol) server that controls a genuine Chromium browser instance. Every click, navigation, and assertion is a standard Playwright browser operation. You can verify this by examining the browser.ts source file, which maps each convenience method (navigate, click, type, etc.) directly to a Playwright MCP tool call.

Can I export Playwright .spec.ts files from Assrt?

Assrt does not generate .spec.ts files by design. The scenarios are intentionally kept as plain Markdown because that format survives UI changes without selector maintenance. If you need traditional Playwright scripts for specific low-level tests, you can use Playwright's built-in codegen tool alongside Assrt for the scenarios that require it.

How does the agent handle dynamic content and loading states?

The agent uses Playwright's browser_wait_for tool to wait for specific text or elements to appear before proceeding. It also takes accessibility tree snapshots (browser_snapshot) to understand the current page state before interacting with elements. If an element is not yet visible, the agent waits and retries rather than failing immediately.
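The wait-and-retry behavior can be pictured as a polling loop with a time budget. The sketch below is a generic illustration, not Assrt's or Playwright's code: the waitUntil helper, its default timings, and the simulated page text are all assumptions.

```typescript
// Poll a check until it succeeds or the time budget runs out.
async function waitUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    if (Date.now() >= deadline) return false; // give up once the budget is spent
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated page whose text only shows "Welcome" on the third poll.
let polls = 0;
const pageText = (): string => (++polls >= 3 ? "Welcome back" : "Loading...");

// Resolves to true once the text appears, or false after one second.
const appeared: Promise<boolean> = waitUntil(() => pageText().includes("Welcome"), 1000, 10);
```

A loop like this reports a failure only after the budget is exhausted, which is what separates a slow page from a genuinely missing element.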

What happens when a test fails? How do I debug it?

Each test run produces a structured JSON report with per-scenario pass/fail results, assertion evidence (including screenshots), and step-by-step execution logs. The agent also records video of the entire browser session with cursor animations and keystroke overlays, so you can watch exactly what happened. For root cause analysis, you can pass the failure to assrt_diagnose, which analyzes the error and returns a corrected scenario.

Is Assrt free? What is the catch compared to $7,500/month competitors?

Assrt is open-source and free to self-host. The MCP server runs locally on your machine, and tests execute in ephemeral VMs. There is no cloud dependency, no per-seat pricing, and no usage limits imposed by Assrt itself. You pay only for the compute your VMs use and the AI model API calls. The total cost for most teams is a few dollars per month, not thousands.

Can I run Assrt tests in CI/CD pipelines?

Yes. Since Assrt exposes an MCP interface, any CI system that can run a Node.js process can trigger test scenarios programmatically. The scenarios are stored in Markdown files that live in your repository. The MCP server handles browser provisioning, test execution, and result collection. No Docker images with pre-installed browsers required.

How does this compare to Playwright Test Agents?

Playwright Test Agents (Planner/Generator/Healer) still produce .spec.ts code files as their output. The Generator turns a plan into runnable Playwright code that you commit and maintain. Assrt skips the code generation step entirely. The agent interprets scenarios and drives Playwright APIs directly, which means there is no generated code to heal or maintain in the first place.
