Head-to-Head Comparison

Assrt vs QA Wolf: Your Tests Should Not Break When You Rename a CSS Class

QA Wolf writes Playwright scripts with CSS selectors and locators that target your DOM structure. Assrt reads the accessibility tree at runtime and targets elements by role and name. One approach ties your tests to your markup. The other does not. This comparison explains why that distinction matters more than pricing or test format.

By the Assrt team | April 12, 2026 | 7 min read

0 selectors in test plans

“Assrt test plans contain zero CSS selectors. Every element interaction goes through accessibility tree refs that are recomputed on each step.”

agent.ts: snapshot tool returns [ref=eN] references for all interactions

1. The Selector Maintenance Problem

Many end-to-end test failures are not caused by actual bugs. They are caused by selectors that stopped matching. A developer renames a CSS class, restructures a component, or swaps a div for a section element. The feature still works, but the test breaks.

This is often one of the largest sources of maintenance work in a Playwright or Cypress test suite. Teams adopt conventions like data-testid attributes to mitigate it, but that means modifying production code for testing purposes. It also means a developer must remember to add test IDs to every new component, and every test author needs to know which IDs exist.

QA Wolf and Assrt handle this problem in fundamentally different ways. QA Wolf minimizes selector breakage through experienced engineers who choose stable locator strategies. Assrt eliminates selectors entirely.
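
To make the failure mode concrete, here is a minimal TypeScript sketch. It models a rendered button as a plain object; the `UiElement` type and both lookup functions are hypothetical illustrations written for this comparison, not code from Playwright, QA Wolf, or Assrt. It shows a class-based lookup failing after a rename while a role-and-name lookup is unaffected.

```typescript
// Hypothetical, simplified model of a rendered element.
interface UiElement {
  tag: string;
  classes: string[];
  role: string; // accessible role, e.g. "button"
  name: string; // accessible name, e.g. "Sign In"
}

// Class-based lookup: coupled to the markup.
function findByClass(page: UiElement[], className: string): UiElement | undefined {
  return page.find((el) => el.classes.indexOf(className) !== -1);
}

// Role-and-name lookup: coupled to what the user perceives.
function findByRoleAndName(page: UiElement[], role: string, name: string): UiElement | undefined {
  return page.find((el) => el.role === role && el.name === name);
}

const before: UiElement[] = [
  { tag: "button", classes: ["btn-primary"], role: "button", name: "Sign In" },
];

// A refactor renames the class. The feature itself is unchanged.
const after: UiElement[] = [
  { tag: "button", classes: ["button-main"], role: "button", name: "Sign In" },
];

const classHitBefore = findByClass(before, "btn-primary") !== undefined; // true
const classHitAfter = findByClass(after, "btn-primary") !== undefined; // false
const roleHitAfter = findByRoleAndName(after, "button", "Sign In") !== undefined; // true
```

The class lookup is the only one that depends on a detail the user never sees, which is why it is the only one that stops matching.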

2. How Assrt Targets Elements Without Selectors

Assrt's test agent uses three MCP tools for element interaction, all defined in agent.ts:

// snapshot tool (agent.ts:28)
"Get the accessibility tree of the current page.
 Returns elements with [ref=eN] references
 you can use for click/type.
 ALWAYS call this before interacting with elements."

// click tool (agent.ts:34)
{
  element: "Human-readable description, e.g. 'Submit button'",
  ref: "Exact ref from snapshot, e.g. 'e5'"
}

// type_text tool (agent.ts:46)
{
  element: "Human-readable description",
  text: "Text to type",
  ref: "Exact ref from snapshot"
}

The flow works like this: before every interaction, the agent calls snapshot. The snapshot reads the browser's accessibility tree and returns every element with a role, a name, and a temporary ref like [ref=e5]. The agent then calls click or type_text with that ref and a human-readable description of the target element.

The refs are ephemeral. They exist only for the current page state. On the next snapshot, the tree is rebuilt from scratch and new refs are assigned. Nothing persists between steps. There is no selector to go stale.
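
The snapshot-then-act loop can be sketched as follows. This is an illustrative model only: `AxNode`, `snapshot`, `refFor`, and `click` are hypothetical stand-ins written for this article, not the contents of agent.ts, and the real snapshot output may look different. The sketch shows refs being assigned from scratch on every snapshot, so a ref is only meaningful against the snapshot that produced it.

```typescript
// Hypothetical accessibility-tree node: a role, a name, and optional children.
interface AxNode {
  role: string;
  name: string;
  children?: AxNode[];
}

// One snapshot maps ephemeral refs ("e1", "e2", ...) to nodes.
type Snapshot = Map<string, AxNode>;

// Walk the tree depth-first and assign refs from scratch on every call.
function snapshot(root: AxNode): Snapshot {
  const refs: Snapshot = new Map();
  let counter = 0;
  function visit(node: AxNode): void {
    counter += 1;
    refs.set(`e${counter}`, node);
    for (const child of node.children ?? []) visit(child);
  }
  visit(root);
  return refs;
}

// Look up the ref for a given role and name in one snapshot.
function refFor(snap: Snapshot, role: string, name: string): string | undefined {
  for (const [ref, node] of snap) {
    if (node.role === role && node.name === name) return ref;
  }
  return undefined;
}

// Resolve a ref against a snapshot; an unknown ref means the snapshot is stale.
function click(snap: Snapshot, ref: string, element: string): string {
  const node = snap.get(ref);
  if (!node) throw new Error(`Unknown ref ${ref} for "${element}"; take a new snapshot`);
  return `clicked ${node.role} "${node.name}"`;
}

const pageV1: AxNode = {
  role: "main",
  name: "",
  children: [
    { role: "textbox", name: "Email" },
    { role: "button", name: "Sign In" },
  ],
};

// A later page state adds a heading, so every ref after it shifts.
const pageV2: AxNode = {
  role: "main",
  name: "",
  children: [
    { role: "heading", name: "Welcome back" },
    { role: "textbox", name: "Email" },
    { role: "button", name: "Sign In" },
  ],
};

const snap1 = snapshot(pageV1);
const snap2 = snapshot(pageV2);
const ref1 = refFor(snap1, "button", "Sign In"); // "e3"
const ref2 = refFor(snap2, "button", "Sign In"); // "e4"
```

The button's ref changes from e3 to e4 between page states, but the lookup by role and name finds it both times. Nothing stored from the first snapshot is reused in the second.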

The test plan itself contains only natural language:

#Case 1: Login with valid credentials
1. Navigate to http://localhost:3000/login
2. Type "test@example.com" into the email field
3. Type "password123" into the password field
4. Click the Sign In button
5. Verify the dashboard heading is visible

No .btn-primary. No [data-testid="login-btn"]. No #submit-form > div:nth-child(3) > button. The AI agent figures out which element matches "the Sign In button" by reading the accessibility tree at the moment it needs to click.
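
Because the plan is plain text, it is also easy to process with ordinary tooling. As a rough sketch, the parser below splits a plan into cases and steps, assuming the `#Case N: title` header and numbered-step layout shown above. `parsePlan` and the `TestCase` type are hypothetical and are not Assrt's own parser.

```typescript
interface TestCase {
  id: number;
  title: string;
  steps: string[];
}

// Split a plan into cases; each numbered line becomes a step of the
// most recent "#Case N:" header. Other lines are ignored.
function parsePlan(plan: string): TestCase[] {
  const cases: TestCase[] = [];
  let current: TestCase | undefined;
  for (const raw of plan.split("\n")) {
    const line = raw.trim();
    const header = /^#Case\s+(\d+):\s*(.+)$/.exec(line);
    if (header) {
      current = { id: Number(header[1]), title: header[2], steps: [] };
      cases.push(current);
      continue;
    }
    const step = /^\d+\.\s+(.+)$/.exec(line);
    if (step && current) current.steps.push(step[1]);
  }
  return cases;
}

const plan = [
  "#Case 1: Login with valid credentials",
  "1. Navigate to http://localhost:3000/login",
  '2. Type "test@example.com" into the email field',
  '3. Type "password123" into the password field',
  "4. Click the Sign In button",
  "5. Verify the dashboard heading is visible",
].join("\n");

const parsed = parsePlan(plan);
```

Every step comes out as an ordinary sentence, with no selector for a parser or reviewer to interpret.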

3. How QA Wolf Targets Elements

QA Wolf's engineers write Playwright test scripts for your application. Playwright provides several locator strategies: page.getByRole(), page.getByText(), page.getByTestId(), and CSS selectors via page.locator().

QA Wolf's engineers are skilled at choosing stable locators. They prefer getByRole and getByText over raw CSS when possible, and they maintain your tests when selectors break. But even the best locator is a static string hard-coded into a test script. When a DOM change between test runs means the locator no longer matches, the script must be updated. That update requires a human engineer to notice the failure, diagnose whether it is a real bug or a selector mismatch, and edit the script.

QA Wolf's value proposition includes this maintenance. They guarantee zero flaky tests, which means they actively fix broken selectors as part of the service. But the maintenance labor is real and ongoing. It is a large part of why the service costs $8,000+ per month.

4. What Breaks and What Does Not

Here are five common frontend changes and how each tool responds:

Change	Assrt	QA Wolf
Rename CSS class `.btn-primary` to `.button-main`	No effect. Accessibility tree sees a button, not a class name.	Breaks if any locator targets that class. Engineer must update.
Move button from `<div>` to `<nav>` wrapper	No effect. The button's role and name are unchanged.	Breaks if locator uses a structural path. Safe if using getByRole.
Change button text from "Submit" to "Save Changes"	May need test plan update if step says "Click Submit." Agent often adapts.	Breaks if locator uses getByText("Submit"). Engineer must update.
Swap component library (e.g. Material UI to Radix)	No effect if the accessible roles and names stay the same.	Likely breaks most locators. Major maintenance event.
Remove the login feature entirely	Test fails (correctly). The feature is gone.	Test fails (correctly). The feature is gone.

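The pattern: Assrt tests break when what the user sees changes, meaning a feature is removed or a label is renamed beyond what the agent can match. QA Wolf tests break in those cases too, and can also break when only the markup changes. Assrt has one failure mode. QA Wolf has two.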

5. Side by Side Comparison

Dimension	Assrt	QA Wolf
Element targeting	Accessibility tree refs, recomputed every step	Playwright locators (CSS, role, text, testId)
Test format	Plain markdown (#Case N:), no code	Playwright JavaScript/TypeScript scripts
Selector maintenance	None. No selectors exist to maintain.	Included in managed service. Engineers fix broken selectors.
data-testid required	No. Reads semantic HTML and ARIA labels.	Often requested for stable selectors.
Price	Free (MIT license, LLM API costs only)	$8,000+/mo (median annual contract ~$90K)
Infrastructure	Local Chromium, your own VMs, or Claude Code session browser	QA Wolf managed cloud with parallel execution
Integration model	MCP server (Claude Code, Cursor, Windsurf, any MCP client)	CI/CD webhooks, dashboard, Slack notifications
License	MIT (open source)	Proprietary SaaS

6. When to Pick Which

Pick QA Wolf if you want a fully managed QA team that writes and maintains Playwright scripts for you. QA Wolf is a good fit when your frontend is stable, your team does not want to think about testing at all, and you can budget $96,000+ per year (12 months at $8,000+ per month). Their zero-flake guarantee means they absorb the selector maintenance cost on your behalf.

Pick Assrt if your frontend changes frequently, you are tired of tests breaking from markup changes that do not affect functionality, and you want tests that target what the user actually sees (accessible roles and names) rather than how the developer structured the HTML. Assrt is especially valuable during active development when components are being restyled, refactored, or migrated between UI libraries.

The deeper question is whether you want your tests coupled to your DOM or coupled to your user interface. A button is a button regardless of its CSS class, its parent element, or its component library. Assrt tests reflect that. QA Wolf tests, despite best practices, are always one refactor away from needing maintenance.

7. Frequently Asked Questions

What is an accessibility tree ref and how does Assrt use it?

An accessibility tree ref is a temporary identifier (like e5, e12, e47) assigned to each element when Assrt takes a snapshot of the page. The snapshot reads the browser's accessibility tree, which represents the page by role, name, and state rather than by HTML structure. Assrt's AI agent uses these refs to click buttons, type into fields, and verify content. Because refs are recomputed on every snapshot, they always describe the page as it currently is, regardless of how the underlying HTML or CSS has changed.

Does QA Wolf use CSS selectors in its tests?

QA Wolf writes Playwright test scripts for your application. Playwright scripts use locator strategies including role-based locators, text matching, data-testid attributes, and CSS selectors. QA Wolf's engineers choose stable locators when possible, but any locator hard-coded into a script can go stale, and CSS and structural selectors are tied directly to DOM structure. When your frontend team refactors component markup or renames CSS classes, those selectors can break even if the feature still works correctly.

Can Assrt tests break from DOM changes?

Assrt test plans contain natural language steps like 'Click the Sign In button' and 'Verify the dashboard heading is visible.' At runtime, the AI agent resolves these descriptions against the live accessibility tree. If you move a button from a div to a nav element, change its class from btn-primary to button-main, or restructure the surrounding markup, the accessibility tree still reports a button named 'Sign In' and the test still passes. The changes that can break an Assrt test are the ones a user would notice: removing the feature, or renaming an element so that it no longer matches the step's description.

What does an Assrt test plan look like compared to a QA Wolf Playwright script?

An Assrt test plan is plain markdown: '#Case 1: Login flow' followed by numbered natural language steps. No code, no selectors, no locators. A QA Wolf test is a Playwright script with code like page.locator('[data-testid="login-btn"]').click(). The Assrt plan is human-editable in any text editor and version-controllable in git. The Playwright script requires JavaScript knowledge to modify.

How does Assrt handle elements that have the same visible text?

The accessibility tree includes role, name, description, and position for each element. When multiple elements share the same text (like two 'Submit' buttons), the AI agent disambiguates using role (button vs link), surrounding context (which form section it belongs to), and the ref ID from the snapshot. The ref is a precise pointer that eliminates ambiguity without needing a unique CSS selector or data-testid attribute.
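
As a simplified illustration of that narrowing, the sketch below filters snapshot entries by role and name first, then by surrounding context. In Assrt the AI agent makes this judgment from the snapshot; the deterministic `resolve` function, the `SnapshotEntry` shape, and the `region` field are assumptions made for this example and are not Assrt's implementation.

```typescript
// Hypothetical flattened snapshot entry: ref, role, name, and the
// accessible name of the enclosing form or region, if any.
interface SnapshotEntry {
  ref: string;
  role: string;
  name: string;
  region?: string;
}

// Narrow by role and name first, then by surrounding context if needed.
function resolve(
  entries: SnapshotEntry[],
  role: string,
  name: string,
  region?: string
): string {
  let matches = entries.filter((e) => e.role === role && e.name === name);
  if (matches.length > 1 && region !== undefined) {
    matches = matches.filter((e) => e.region === region);
  }
  if (matches.length !== 1) {
    throw new Error(`Expected exactly one ${role} "${name}", found ${matches.length}`);
  }
  return matches[0].ref;
}

const entries: SnapshotEntry[] = [
  { ref: "e4", role: "button", name: "Submit", region: "Newsletter" },
  { ref: "e9", role: "link", name: "Submit" },
  { ref: "e12", role: "button", name: "Submit", region: "Checkout" },
];

const checkoutRef = resolve(entries, "button", "Submit", "Checkout"); // "e12"
const linkRef = resolve(entries, "link", "Submit"); // "e9"
```

Calling `resolve(entries, "button", "Submit")` without a region throws, because two buttons share that name. That is the case where a test step needs to say which form it means.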

Do I need to add data-testid attributes to my code for Assrt?

No. Assrt reads the accessibility tree, which derives from semantic HTML, ARIA labels, and visible text. If your app uses standard HTML elements (button, input, a, h1) with reasonable labels, Assrt can interact with every element without any test-specific markup. QA Wolf's engineers often request data-testid attributes to stabilize their selectors, which means modifying production code for testing purposes.

What does Assrt cost compared to QA Wolf?

Assrt is MIT licensed and free. You pay for LLM API calls (Claude Haiku or Gemini) that power the test agent, typically a few cents per run. QA Wolf starts around $8,000 per month for managed test coverage, with median annual contracts around $90,000. Assrt runs on your machine or your own infrastructure with no per-seat or per-test pricing.

Tests That Survive Your Next Refactor

Assrt targets elements by what they are, not where they sit in the DOM. Add it to your coding agent and run tests that do not break when you restyle your app.
