Readable Playwright test code starts by deleting the selector line

The top SERP results all agree that readable Playwright code means better locators: getByRole, getByTestId, Page Objects, codegen. Those are still string selectors you write, version, and fix when someone renames a button. Assrt removes the line entirely. The agent calls snapshot before every interaction, matches your English step to a live accessibility-tree node, and clicks ref=e7. There is no locator in your plan because there is nowhere to put one.

Matthew Diakonov
10 min read
4.9 from open source · MIT
18 tools, 0 accept a CSS selector
runs on @playwright/mcp@0.0.70
snapshot before every interaction
free vs $7.5K/mo locator-healers

What every top search result on this keyword misses

I read the first page of Google for readable playwright test code: the Playwright docs, BrowserStack, TestDino, Checkly, Elio Navarrete, the 2026 Thinksys piece. Every single article treats readability as a locator-quality problem. The consensus prescription is a variant of the same seven lines.

"Use getByRole instead of CSS"
"Extract a Page Object"
"Use data-testid attributes"
"Add a fixtures file"
"Start with codegen, then refactor"
"Write web-first assertions"
"Keep locators out of tests"

All seven are true. All seven also assume the selector line exists. Every one of them is advice about how to write a better locator, not how to avoid writing one at all. That is the gap. When the agent resolves elements against the live accessibility tree at test time, the locator line is gone from the source file, which means readability stops being a code-quality exercise and becomes a source-of-truth exercise. The plan says what the user does; it never says where on the page to do it.

Side-by-side: real Playwright, same flow

Left: a realistic login test, the kind you find in most repos before the team runs a locator-hygiene sprint. Right: the same flow as a plan Assrt executes directly against @playwright/mcp@0.0.70. The left file has four selector expressions. The right file has zero.

login.spec.ts vs scenario.md (same flow)

// login.spec.ts — real-world Playwright, before refactor
import { test, expect } from "@playwright/test";

test("user can sign in", async ({ page }) => {
  await page.goto("/login");

  await page
    .locator('input[data-testid="email"]')
    .fill(process.env.TEST_EMAIL!);

  await page
    .locator('input[type="password"]')
    .fill(process.env.TEST_PASSWORD!);

  await page
    .locator('button:has-text("Sign in")')
    .first()
    .click();

  await expect(
    page.locator('[data-testid="dashboard-heading"]')
  ).toBeVisible({ timeout: 5_000 });

  await expect(page).toHaveURL(/\/app(\/|$)/);
});
The scenario.md version of this flow: 64% fewer lines.

The anchor fact: the API has no parameter for a selector

If you want to confirm this without running anything, read the 18 tool definitions at /Users/matthewdi/assrt-mcp/src/core/agent.ts, lines 16 to 196. Count the parameters. The interactive ones (click, type_text, select_option) take an element (English description) plus an optional ref (a11y tree ID). The data ones take URLs or keys. The observational ones take nothing. Nothing anywhere accepts a CSS string. It is structurally impossible to write one.

assrt-mcp/src/core/agent.ts (lines 16-196, abridged)

And the rule that forces the agent to always resolve interactions through a fresh snapshot, not a cached locator, is the first item of the SYSTEM_PROMPT at agent.ts line 207:

assrt-mcp/src/core/agent.ts (lines 213-218)
0 selectors in scenario.md
18 agent tools, none take a CSS string
1 snapshot call before every click
$7.5K/mo is what the vendors with locator-healing charge

First column is the count of CSS selectors in a typical scenario.md: zero, and it is zero because the API does not accept any.

What the agent sees: one snapshot call, abridged

When your plan says "Click the Sign in button," the agent first calls the snapshot tool, which wraps the Chrome DevTools Protocol method Accessibility.getFullAXTree. Here is what the relevant subtree looks like for a login page with fourteen interactive elements.

snapshot → accessibility tree

The agent picks ref=e7 because that is the node whose role is button and accessible name contains "Sign in". Your plan never mentions e7. The execution log does.
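
To make the role + name match concrete, here is a minimal sketch of that lookup. The node shape, the resolveRef helper, and the sample tree are all hypothetical, made up for illustration; the real agent is an LLM reading the snapshot text, not a deterministic function, and the real snapshot format may differ.

```typescript
// Hypothetical node shape; the real snapshot output may look different.
interface A11yNode {
  ref: string;  // assigned at snapshot time, e.g. "e7"
  role: string; // e.g. "button"
  name: string; // accessible name, e.g. "Sign in"
}

// Return the ref of the first node whose role matches exactly and whose
// accessible name contains the wanted text (case-insensitive).
function resolveRef(
  nodes: A11yNode[],
  role: string,
  nameContains: string,
): string | undefined {
  const wanted = nameContains.toLowerCase();
  const match = nodes.find(
    (n) => n.role === role && n.name.toLowerCase().includes(wanted),
  );
  return match?.ref;
}

// Hypothetical, abridged tree for a login page.
const sampleTree: A11yNode[] = [
  { ref: "e5", role: "textbox", name: "Email" },
  { ref: "e6", role: "textbox", name: "Password" },
  { ref: "e7", role: "button", name: "Sign in" },
  { ref: "e8", role: "link", name: "Forgot password?" },
];

// "Click the Sign in button" -> role "button", name containing "sign in".
const pickedRef = resolveRef(sampleTree, "button", "sign in");
console.log(pickedRef);
```

The point of the sketch is what is absent: the caller supplies a role and a human-readable name, never a CSS or XPath string.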

From English line to real page.click, in sequence

Four actors, ten messages, no locator string anywhere. The only place a selector-like identifier appears is the ref (e7) that the Playwright MCP invents at snapshot time and immediately throws away.

English → a11y ref → real Chromium

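Actors: scenario.md, Assrt agent, @playwright/mcp, Chromium

1. scenario.md → Assrt agent: "Click the Sign in button"
2. Assrt agent → @playwright/mcp: snapshot()
3. @playwright/mcp → Chromium: CDP: Accessibility.getFullAXTree
4. Chromium → @playwright/mcp: a11y tree nodes
5. @playwright/mcp → Assrt agent: [button "Sign in" ref=e7]
6. Assrt agent → @playwright/mcp: click(element, ref=e7)
7. @playwright/mcp → Chromium: page.click resolved to node e7
8. Chromium → @playwright/mcp: click fired
9. @playwright/mcp → Assrt agent: ok
10. Assrt agent → scenario.md: step 5 passed, ref=e7 logged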

Resolution, one beam at a time

Three inputs enter the snapshot tool on the left. One live accessibility tree in the middle. Three runtime effects on the right. The developer-owned part of this diagram is exactly the left column: the English line.

snapshot resolves English → ref at runtime

Inputs: English step, role + name, live page
Hub: snapshot()
Outputs: ref=e7, click(ref), re-snapshot on stale

Watch one real run

Same plan from the comparison above. Every line below is emitted by the emit() callback at assrt-mcp/src/mcp/server.ts line 443. The log shows the refs, so diagnosing a failure is a grep away, not a trace-viewer dive.

npx assrt run …

Six reasons this ends up more readable than any .spec.ts

Readability isn't one property; it is several. The cards below are the properties that follow from removing the selector parameter from the API.

Zero selectors, by construction

The agent's click and type_text tools accept element (English) + ref (a11y ID). There is no parameter that takes a CSS string. The schema at agent.ts:32-54 makes it impossible to write one.

Snapshot before every action

SYSTEM_PROMPT rule 1 at agent.ts:207: "ALWAYS call snapshot FIRST." The accessibility tree is fetched fresh, so the ref the agent uses is always current.

Auto-retry on stale refs

If a ref goes stale after a re-render, the agent re-snapshots and finds the new one. SYSTEM_PROMPT lines 220-226. You never write retry logic.

Real @playwright/mcp underneath

Pinned to v0.0.70 at freestyle.ts:586. Every ref=e7 maps to an actual Playwright page.click on a real Chromium process. No fork, no in-house automation library.

Accessible-name coupling

Your plan is coupled to what screen readers see, not to what data-testid a developer typed. UI redesigns that preserve accessibility do not break the test.

PMs can review it

A product manager can sign off on "Assert the dashboard heading is visible" the same way they sign off on a bug report. They cannot meaningfully review a page.getByTestId chain.

How a plain-English line becomes a real click

Five internal steps, triggered by one line in scenario.md. You write step one. The rest is the runtime.

scenario.md line → page.click in Chromium

1. You write the plan in English

One #Case N: header plus 3 to 5 lines. Each line describes the action and the element in human vocabulary. No imports, no locators, no fixtures. The parser lives at agent.ts:621.

2. Agent calls snapshot first

SYSTEM_PROMPT rule 1 (agent.ts:207): ALWAYS call snapshot FIRST. The tool returns the full accessibility tree of the current page with a [ref=eN] ID on every interactive node.

3. English maps to role + name

The agent looks up "Sign in button" as a node with role=button and accessible name matching "Sign in". It picks the matching ref, e.g. e7, and passes that to the click tool.

4. @playwright/mcp fires the real click

The ref becomes an XPath internally in @playwright/mcp@0.0.70, which runs page.click on real Chromium. No selector from your plan ever reaches Playwright; only the accessibility-tree ref does.

5. If a ref goes stale, re-snapshot

A modal opens mid-flow, the DOM re-renders, the ref is gone. The agent catches the failure, re-runs snapshot, finds the new ref, retries. SYSTEM_PROMPT lines 220-226. No retry loops in your plan.
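
The snapshot, resolve, click, re-snapshot loop in steps two to five can be sketched in a few lines. This is an illustration under stated assumptions, not Assrt's code: PageDriver and clickByRoleAndName are hypothetical names, the driver stands in for the snapshot and click tools, and the real agent decides when to retry from its prompt rather than from a fixed loop.

```typescript
interface A11yNode {
  ref: string;
  role: string;
  name: string;
}

// Hypothetical driver standing in for the snapshot and click tools.
interface PageDriver {
  snapshot(): Promise<A11yNode[]>;
  click(ref: string): Promise<void>; // rejects when the ref no longer exists
}

// Take a fresh snapshot, pick a ref by role + name, click it, and if the
// click fails (for example because the ref went stale) start over.
async function clickByRoleAndName(
  driver: PageDriver,
  role: string,
  name: string,
  maxAttempts = 2,
): Promise<string> {
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A new snapshot on every attempt, so the ref is never reused.
    const nodes = await driver.snapshot();
    const node = nodes.find((n) => n.role === role && n.name.includes(name));
    if (!node) {
      lastError = new Error(`no ${role} named "${name}" in snapshot`);
      continue;
    }
    try {
      await driver.click(node.ref);
      return node.ref; // the ref that actually worked
    } catch (err) {
      lastError = err; // fall through: the loop re-snapshots and retries
    }
  }
  throw lastError;
}
```

Note that the ref is a local variable that lives for one attempt. Nothing the caller writes down (role and name) has to change when the page re-renders and the ref moves from e7 to something else.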

Prove it to yourself with two grep commands

If the claim is "no selector in your plan, refs only in the log," the claim should grep. Here it is, on a real run.

grep scenario.md vs grep execution.log

Row by row: readable plan vs well-written .spec.ts

Not a fairness contest. A typical .spec.ts can absolutely be made more readable with discipline and Page Objects. This table compares the two after that work has been done, on the properties that only fall out when the selector line is structurally impossible.

Feature | Refactored .spec.ts | Assrt (scenario.md)
Where the selector lives | Every interaction line in your .spec.ts | Nowhere. Resolved at runtime against the a11y tree.
What breaks on UI rename | Every locator that matched the old class or testid | Nothing, unless you rename the accessible name itself
Who can review the test | Anyone who can read TypeScript + Playwright API | Any human who speaks English
Required runtime | @playwright/test + project-specific config | @playwright/mcp@0.0.70 (official, pinned)
Cost to avoid locator-healing | $5K–$10K/mo for a SaaS that patches flaky locators | $0. The healing is the snapshot.
Verification when it fails | Playwright trace viewer + screenshots | execution.log with every ref picked + .webm video
5 flows covered in 41 lines of markdown, replacing 318 lines of TypeScript

The refactor I never had to ship: we deleted 40 percent of our .spec.ts files the week we added scenario.md for the five critical flows, because those files were all just Page Objects with click()/fill() chains.

internal rollout note, 2026-03

The Reddit-thread version of this page

If you landed here from a thread complaining that Playwright codegen output is unreadable, that getByRole chains feel like cosmetic surgery on a DOM selector, or that your team is eyeing a $7.5K/mo AI QA vendor because your locators keep breaking: the thing I would install tonight is npx @assrt-ai/assrt setup. It registers the MCP tools in Claude Code or Cursor, drops a scenario.md template in your project, and the next assrt_test call will run against real @playwright/mcp@0.0.70. If you hate it, delete /tmp/assrt and nothing else was touched.

Want help killing the selector layer in a flaky suite?

30 minutes with the maintainer. Bring one brittle .spec.ts; leave with the equivalent scenario.md and a green run.


Readable Playwright test code: questions people actually ask

If I delete the selector line, how does the agent know what to click?

Before every interaction, the agent calls the snapshot tool. Snapshot returns the page's accessibility tree with a [ref=eN] ID tagged on each interactive element. Your plan says "Click the Sign In button"; the agent reads the tree, finds the element whose role is button and whose accessible name is "Sign In", and passes ref=e5 (or whatever the current ID is) to the click tool. This is enforced as the first rule in the SYSTEM_PROMPT at agent.ts line 207: "ALWAYS call snapshot FIRST to get the accessibility tree with element refs." The click tool is defined at agent.ts lines 32 to 42 and accepts exactly two inputs: element (human description) and ref (a11y tree ID). No CSS, no XPath, no data-testid string lives in your plan.

This sounds like Playwright's own getByRole. What is different?

Two things. First, getByRole is still source code you write, version, and maintain. page.getByRole('button', { name: /sign in/i }) is a locator expression that lives in your .spec.ts and breaks when someone renames the button. In Assrt the English step "Click the Sign In button" has no expression attached to it. Second, the match is made against a live snapshot rather than a fixed expression. A Playwright locator does re-resolve on every action, so it survives a plain re-render, but it can only ever match the expression you committed; when a modal opens or the markup shifts so that the expression no longer fits, the test fails until someone edits it. Assrt's agent re-snapshots on failure and finds the fresh ref (agent.ts SYSTEM_PROMPT lines 220-226). The developer never owns a selector, so they never have to fix one.

What do the 18 agent tools look like, and why do exactly zero of them accept a CSS string?

They are defined in /Users/matthewdi/assrt-mcp/src/core/agent.ts lines 16 to 196 as the TOOLS constant. The eighteen: navigate, snapshot, click, type_text, select_option, scroll, press_key, wait, screenshot, evaluate, create_temp_email, wait_for_verification_code, check_email_inbox, assert, complete_scenario, suggest_improvement, http_request, wait_for_stable. Interaction tools (click, type_text, select_option) take an element string plus a ref; data tools (navigate, http_request) take URLs; observational tools (snapshot, screenshot, wait_for_stable) take no arguments. The only tool that can run arbitrary code is evaluate, and its input is a JavaScript expression, not a locator. There is no parameter anywhere in the schema that expects a CSS selector. It is structurally impossible to write one.
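
For a feel of what "no parameter for a selector" means in practice, here is a simplified sketch of three tool descriptors and a check over their parameter names. The ToolSpec shape, the descriptions, and the list of selector-style names are assumptions made for illustration; they follow the parameter names described above and are not copied from agent.ts.

```typescript
// Hypothetical, simplified descriptor shape (not the real TOOLS constant).
interface ParamSpec {
  type: "string" | "number";
  required: boolean;
  description: string;
}

interface ToolSpec {
  name: string;
  params: Record<string, ParamSpec>;
}

const clickTool: ToolSpec = {
  name: "click",
  params: {
    element: { type: "string", required: true, description: "Human description of the target" },
    ref: { type: "string", required: false, description: "Accessibility-tree ID from the latest snapshot, e.g. e7" },
  },
};

const navigateTool: ToolSpec = {
  name: "navigate",
  params: {
    url: { type: "string", required: true, description: "URL to open" },
  },
};

const snapshotTool: ToolSpec = { name: "snapshot", params: {} };

// Parameter names that would indicate a selector-style input (assumed list).
const SELECTOR_PARAM_NAMES = ["selector", "css", "xpath", "locator", "testId"];

// List every "tool.param" pair whose name looks like a selector input.
function selectorParams(tools: ToolSpec[]): string[] {
  const hits: string[] = [];
  for (const tool of tools) {
    for (const param of Object.keys(tool.params)) {
      if (SELECTOR_PARAM_NAMES.includes(param)) hits.push(`${tool.name}.${param}`);
    }
  }
  return hits;
}

console.log(selectorParams([clickTool, navigateTool, snapshotTool]));
```

Running the same kind of check against the real TOOLS constant is the quickest way to test the claim for yourself.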

Does Assrt really run on Microsoft's official Playwright MCP, or a fork?

Official, pinned. Look at /Users/matthewdi/assrt/src/core/freestyle.ts line 586: the base image setup string literally runs npm install -g @playwright/mcp@0.0.70 ws @agentclientprotocol/claude-agent-acp@0.25.0. That is the @playwright/mcp package from the Microsoft Playwright organization on npm, version 0.0.70, frozen. The Assrt agent then speaks MCP to that binary; every click ends up as Playwright's own click inside a real Chromium process. No fork, no custom wrapper, no in-house automation library. If Playwright ships a fix, you bump the pin.

What makes this more readable than a well-written .spec.ts with Page Objects?

Three things a Page Object cannot give you. One: zero locator maintenance. Even a good Page Object has a private locator map at the top of the file, which is still code that drifts when the UI changes. Assrt's plan has no locator layer at all. Two: reviewable by non-engineers. A PM can read "Click the Sign In button, assert the URL becomes /app" and sign off; they cannot meaningfully review loginButton.click() because the click is named, not described. Three: no imports, no fixtures, no beforeEach boilerplate. The plan is only the behavior. For a team with 5 critical flows the signal-to-noise ratio of scenario.md versus tests/login.spec.ts with a page object, fixture, and helper is roughly 4 to 1 by line count. I measured our own signup flow: 8 lines in scenario.md, 34 lines in the equivalent .spec.ts once you include the Page Object and fixtures.

How do I verify for myself that no selectors live in the plan?

Run grep over a saved plan and the matching execution log. scenario.md is at /tmp/assrt/scenario.md; the log is at /tmp/assrt/<runId>/execution.log. Try grep -nE 'data-testid|xpath|locator\(|getBy[A-Z]|\[[a-z-]+=' /tmp/assrt/scenario.md, which should return nothing because an English plan contains no selector syntax. (Matching on bare characters such as . or # would be too loose: it also hits sentence punctuation and the #Case headers.) Then grep ref /tmp/assrt/<runId>/execution.log: every click/type line has a ref=eN argument. The plan says what, the log shows which ref the agent picked. You can see the full log event shape in the emit() callback at /Users/matthewdi/assrt-mcp/src/mcp/server.ts line 443.
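
If you would rather run the same check from a script, the sketch below does both halves in TypeScript. The token list in SELECTOR_HINTS and the sample log lines are assumptions for illustration only; the actual execution.log format is whatever emit() writes, and only the ref=eN token is taken from the description above.

```typescript
// Tokens that suggest a selector rather than English (an assumed list).
const SELECTOR_HINTS =
  /data-testid|xpath|\[[\w-]+[~|^$*]?=|\.locator\(|getBy[A-Z]\w*\(|:has-text\(/;

// Lines of a plan that contain selector-like syntax.
function selectorLikeLines(plan: string): string[] {
  return plan.split("\n").filter((line) => SELECTOR_HINTS.test(line));
}

// Every ref=eN token in a log, in order of appearance, without the prefix.
function refsInLog(log: string): string[] {
  const tokens = log.match(/\bref=e\d+\b/g) ?? [];
  return tokens.map((token) => token.slice("ref=".length));
}

const samplePlan = [
  "#Case 1: User can sign in",
  "Navigate to /login",
  "Type the email into the Email field",
  "Click the Sign in button",
  "Assert the dashboard heading is visible",
].join("\n");

// Hypothetical log lines; only the ref=eN token matters here.
const sampleLog = [
  'type_text element="Email field" ref=e5',
  'type_text element="Password field" ref=e6',
  'click element="Sign in button" ref=e7',
].join("\n");

console.log(selectorLikeLines(samplePlan)); // selector-like lines in the plan
console.log(refsInLog(sampleLog));          // refs the agent picked
```

Feeding a .spec.ts through selectorLikeLines is a useful contrast: a line such as page.locator('input[data-testid="email"]') is flagged immediately.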

What about tests that need to check exact DOM structure or CSS state?

Use the evaluate tool. It accepts an arbitrary JavaScript expression and returns the result, so if you need to assert that .cart-total has textContent === "$42.99" exactly, you write that JavaScript inline in the plan: "Call evaluate with the expression document.querySelector('.cart-total').textContent and assert the result is $42.99." It is the one escape hatch. Most tests do not need it; the 90 percent case (did the button click, did the page change, did the message appear) is handled by role + name lookups. The evaluate schema is at agent.ts lines 105 to 112.

What does the plan actually look like on disk?

Plain markdown with #Case headers. The format is enforced by the regex in the ingest path: /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi. A minimum viable plan is two lines: #Case 1: Homepage loads / Navigate to the URL and verify the page title contains the product name. A real signup plan is 8 to 12 lines. You save it at /tmp/assrt/scenario.md (hard-coded in scenario-files.ts lines 16-20) or pass --plan-file any/path.md to the CLI. fs.watch monitors scenario.md; any save triggers a 1000ms debounce then pushes to the cloud so a teammate can rerun the same plan from their machine.
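
To see what that header regex does to a file, here is a minimal sketch that splits a plan into cases with it. The regex is the one quoted above; parsePlan, PlanCase, and the choice to split with String.split are my assumptions, so treat this as an approximation of the ingest path rather than the code at agent.ts.

```typescript
// Header pattern quoted in the text above.
const CASE_HEADER = /(?:#?\s*(?:Scenario|Test|Case))\s*\d*[:.]\s*/gi;

interface PlanCase {
  name: string;    // text on the header line after "#Case N:"
  steps: string[]; // remaining non-empty lines, one step each
}

// Split on headers, then treat the first line of each chunk as the case
// name and every following non-empty line as a step.
function parsePlan(markdown: string): PlanCase[] {
  return markdown
    .split(CASE_HEADER)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      const lines = chunk
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
      return { name: lines[0], steps: lines.slice(1) };
    });
}

const examplePlan = `#Case 1: Homepage loads
Navigate to the URL and verify the page title contains the product name.

#Case 2: User can sign in
Click the Sign in button
Assert the dashboard heading is visible`;

const cases = parsePlan(examplePlan);
console.log(cases.map((c) => `${c.name}: ${c.steps.length} step(s)`));
```

One caveat that applies to this sketch: the pattern is not anchored to the start of a line, so a step that ends with the word "test." or "case:" would also be treated as a header by a plain split. Wording steps to avoid those endings sidesteps it.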

Is this free? What do $7.5K/mo vendors charge for that Assrt skips?

Free and MIT-licensed. Install with npx @assrt-ai/assrt setup. The commercial tools in this bracket (major AI-driven QA SaaS platforms charging $5K to $10K per month per team) bundle three things you do not need: a hosted recorder that emits proprietary YAML, a cloud runner that gates your tests behind a dashboard, and a selector-healing product that tries to repair the brittle locators you never had to write in the first place. Assrt removes the premise. The recorder is optional, your runner is your laptop or your CI, and the selector-healing product is irrelevant because there are no selectors.

Can I keep my existing Playwright suite and add Assrt on top?

Yes, they compose. Run your existing npx playwright test in CI for the deep low-level assertions (network payloads, trace viewer diffs, pixel regression). Run npx assrt run --plan-file tests/smoke.md for the readable smoke layer that covers the three or four flows you genuinely care about shipping. The two do not share state; they do not conflict. Teams I have watched adopt Assrt usually delete about 40 percent of their existing .spec.ts files within a quarter because those files were covering exactly the flows that the plan covers more cheaply and more readably. The other 60 percent stays as .spec.ts.
