QA Wolf vs Assrt: One Tests What You Scope, the Other Finds What You Missed
QA Wolf's human QA team writes tests for the pages you tell them about. Assrt's AI agent discovers pages you forgot to mention, generates test cases for them in parallel, and runs everything in a real browser. This comparison focuses on that difference: manual test scoping versus automatic test surface expansion.
“Assrt discovers up to 20 pages per test run and generates cases for 3 concurrently. QA Wolf tests only the pages in your contract scope.”
`agent.ts`: `MAX_DISCOVERED_PAGES=20`, `MAX_CONCURRENT_DISCOVERIES=3`
1. The Scoping Problem Neither Tool Talks About
Every QA tool comparison focuses on the same things: pricing, test format, flakiness handling, CI integration. Those matter. But there is a more fundamental question that gets ignored: how does the tool decide what to test?
With QA Wolf, the answer is simple. Their team tests what you tell them to test. You define the scope (your checkout flow, your dashboard, your settings page), and their engineers write Playwright scripts for those specific flows. If you ship a new page and forget to update the scope, that page has zero test coverage until someone notices.
This is normal for managed QA. It is how most testing works. But it means your test coverage is always a subset of your actual application surface. The gap between "pages that exist" and "pages that are tested" grows every time a developer ships a new route without telling the QA team.
Assrt handles this differently. When it runs a test scenario, it reads the page, follows internal links, and generates test cases for pages it discovers along the way. The test surface expands automatically as part of the run.
2. How Assrt Discovers Pages During a Test Run
The implementation lives in agent.ts in Assrt's core. Here is what happens step by step:
- During the first test scenario, the agent reads the accessibility tree of the current page. This tree contains every internal link on the page.
- The agent collects these URLs and normalizes them (stripping trailing slashes, resolving relative paths). It deduplicates against URLs already in the test plan and URLs already visited.
- A filter removes URLs matching known non-testable patterns: /logout, /api/, javascript:, about:blank, data:, and chrome: URIs are all excluded.
- The remaining URLs are queued for discovery. Assrt processes up to 3 pages concurrently (MAX_CONCURRENT_DISCOVERIES=3) and caps total discovery at 20 pages per run (MAX_DISCOVERED_PAGES=20).
- For each discovered page, Assrt navigates to it, takes a screenshot, reads the accessibility tree, and sends both to Gemini vision. Gemini generates test cases in the same #Case N: markdown format.
- The new cases are streamed back to the UI (or to your coding agent) and appended to the test plan. They can be run immediately or saved for later.
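The collect-normalize-filter steps above can be sketched in a few lines of TypeScript. This is a hypothetical reconstruction, not the actual agent.ts code: the function names `normalizeUrl` and `collectDiscoverable` are assumptions for illustration; only the constants and skip patterns come from the description above.

```typescript
// Sketch of Assrt-style link discovery (assumed names, not the real agent.ts).
const MAX_DISCOVERED_PAGES = 20; // per-run discovery cap
const SKIP_PATTERNS = ["/logout", "/api/", "javascript:", "about:blank", "data:", "chrome:"];

function normalizeUrl(href: string, base: string): string | null {
  try {
    const url = new URL(href, base); // resolves relative paths against the page URL
    url.hash = "";
    let s = url.toString();
    // Strip trailing slash so /pricing and /pricing/ dedupe to one entry.
    if (s.endsWith("/") && url.pathname !== "/") s = s.slice(0, -1);
    return s;
  } catch {
    return null; // malformed href: not a candidate for discovery
  }
}

function collectDiscoverable(hrefs: string[], base: string, visited: Set<string>): string[] {
  const queue: string[] = [];
  for (const href of hrefs) {
    // Drop known non-testable patterns before doing any URL work.
    if (SKIP_PATTERNS.some((p) => href.includes(p))) continue;
    const normalized = normalizeUrl(href, base);
    if (!normalized) continue;
    // Internal links only: same origin as the page under test.
    if (new URL(normalized).origin !== new URL(base).origin) continue;
    // Dedupe against URLs already visited or already queued.
    if (visited.has(normalized)) continue;
    visited.add(normalized);
    queue.push(normalized);
    if (queue.length >= MAX_DISCOVERED_PAGES) break; // cap total discovery
  }
  return queue;
}
```

Given a homepage with links to `/pricing`, `/pricing/`, `/api/health`, and `/docs`, this yields just the two testable internal pages, each once.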
The result: you point Assrt at your homepage, and it finds your pricing page, your docs, your login page, and your blog. It generates test cases for each. You did not have to list those pages. You did not have to know they existed.
```
# You write this:
#Case 1: Verify homepage loads and navigation works
1. Navigate to http://localhost:3000
2. Verify the main heading is visible
3. Click the "Pricing" link in the nav

# Assrt discovers and generates these:
#Case 2: Verify pricing page displays plan tiers
1. Navigate to http://localhost:3000/pricing
2. Verify at least two pricing tiers are visible
3. Verify each tier has a "Get Started" button

#Case 3: Verify docs page loads with sidebar navigation
1. Navigate to http://localhost:3000/docs
2. Verify a sidebar with navigation links is present
3. Click the first link in the sidebar
4. Verify the page content updates
```

This is not a conceptual feature. It is running code. The constants, the URL filter patterns, and the concurrent discovery logic are all in the open source repository, readable and modifiable.
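The 3-at-a-time cap on discovery can be implemented with a small worker pool. The sketch below shows the general pattern, assuming a `discoverPage` callback that screenshots and analyzes one URL; it is an illustration of bounded concurrency, not the actual agent.ts implementation.

```typescript
// Minimal bounded-concurrency worker pool (illustrative; real logic is in agent.ts).
const MAX_CONCURRENT_DISCOVERIES = 3;

async function discoverAll<T>(
  urls: string[],
  discoverPage: (url: string) => Promise<T>, // navigate, screenshot, generate cases
): Promise<T[]> {
  const results: T[] = new Array(urls.length);
  let next = 0;
  // Each worker pulls the next unclaimed URL; at most 3 pages are in flight.
  async function worker(): Promise<void> {
    while (next < urls.length) {
      const i = next++;
      results[i] = await discoverPage(urls[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(MAX_CONCURRENT_DISCOVERIES, urls.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results; // results stay in input order regardless of completion order
}
```

Because JavaScript is single-threaded, the `next++` claim is race-free: each worker claims an index synchronously before awaiting the slow page analysis.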
See discovery in action
Point Assrt at any URL and watch it find pages you forgot to test. MIT licensed, one line to install.
Get Started →

3. How QA Wolf Handles New Pages
QA Wolf's model is straightforward: you scope the work, their team executes it. When your product adds a new page or flow, the process looks like this:
- Your team ships the new page to staging or production.
- Someone on your team notifies QA Wolf (via Slack, email, or their dashboard) that a new flow needs coverage.
- A QA Wolf engineer reviews the page, writes Playwright test scripts, and adds them to your test suite.
- The tests run on QA Wolf's infrastructure as part of your regular test schedule.
This works well for stable, well-defined applications where the page inventory changes slowly and the team has good communication with their QA provider. The weakness is the gap between "page shipped" and "test written." During that gap, the page is live but unverified.
For fast-moving teams shipping multiple pages per week, this gap compounds. The QA Wolf team is always a step behind the development team, not because they are slow, but because the model requires explicit communication for every new piece of test scope.
4. Side by Side Comparison
| Dimension | Assrt | QA Wolf |
|---|---|---|
| Test scoping | Automatic discovery during test runs (up to 20 pages) | Manual scoping by your team, executed by QA Wolf engineers |
| New page coverage | Discovered and tested in the same run | Requires notification to QA team, then manual test creation |
| Price | Free (MIT license, LLM API costs only) | $8,000+/mo (median annual contract ~$90K) |
| Discovery concurrency | 3 pages analyzed in parallel per run | N/A (no automatic discovery) |
| Test format | Markdown (#Case N:), git-committable | Playwright scripts on QA Wolf's infrastructure |
| Integration model | MCP server (works with Claude Code, Cursor, Windsurf) | Managed service with CI/CD webhooks and dashboard |
| URL filtering | Built-in patterns skip /logout, /api/, javascript:, etc. | Human judgment during test authoring |
| Infrastructure | Local Chromium or ephemeral VMs, self-hosted | QA Wolf's cloud with 100% parallel execution |
| License | MIT (open source) | Proprietary SaaS |
5. When to Pick Which
Pick QA Wolf if you want a fully managed QA operation, have a stable application with a well-defined page inventory, and can budget $96,000+ per year. QA Wolf excels when your product surface changes slowly and you want human engineers responsible for test reliability.
Pick Assrt if you are already using an AI coding agent (Claude Code, Cursor, Windsurf), ship new pages frequently, and want test coverage to expand automatically as your app grows. Assrt is especially useful during active development, where the coding agent discovers and tests pages in the same session where it wrote the code.
The two tools are not mutually exclusive. You could use QA Wolf for your stable regression suite and Assrt for rapid coverage of new features and pages that haven't been scoped into QA Wolf's contract yet. But if the reason you searched "QA Wolf vs Assrt" is because you want test coverage that keeps up with your shipping speed without manual scoping, Assrt's automatic discovery is the feature that matters most.
6. Frequently Asked Questions
How does Assrt discover pages I didn't explicitly ask it to test?
During the first test scenario, Assrt's agent reads the accessibility tree and collects every internal link on the page. It normalizes the URLs, deduplicates them, and filters out patterns like /logout, /api/, and javascript: URIs. Then it opens up to 3 discovered pages concurrently, screenshots each one, and sends the screenshot and accessibility tree to Gemini vision, which generates test cases for them. The process caps at 20 discovered pages per run so it stays focused.
Does QA Wolf have any form of automatic page discovery?
No. QA Wolf operates as a managed service where their human QA engineers write and maintain tests for pages you explicitly scope in your contract. If a new page ships and nobody tells the QA Wolf team about it, it goes untested until someone notices. Their model optimizes for reliability on known flows, not for discovering unknown ones.
Can I disable page discovery in Assrt if I only want to test specific pages?
Yes. Discovery only triggers during the first scenario of a run. If you pass a specific scenarioId to assrt_test, it re-runs exactly that scenario without discovery. You can also write a focused test plan with a single #Case targeting one URL, and the agent will stay on that page.
What does Assrt cost compared to QA Wolf?
Assrt is MIT licensed and free. You pay for LLM API calls to power the test reasoning, typically a few cents per test run. QA Wolf starts at $8,000 per month for 200 tests, with a median annual contract around $90,000 according to public pricing data.
What format are Assrt test scenarios stored in?
Plain markdown files using a #Case N: format, saved to /tmp/assrt/scenario.md. You can commit them to git, edit them in any text editor, or let your coding agent modify them. There is no proprietary syntax. QA Wolf's tests are Playwright scripts maintained on their infrastructure and not directly portable.
Can Assrt replace QA Wolf for teams that need full regression suites?
Assrt is built for a different workflow. It integrates into your AI coding agent via MCP so the agent that writes code also tests it in real time. For full regression suites with hundreds of stable tests running on every deploy, QA Wolf's managed team may still be the right fit if you have the budget. Assrt excels at fast feedback during development and at catching gaps in test coverage through page discovery.
How does Assrt handle pages behind authentication during discovery?
Assrt has built-in disposable email support. It can create temporary email addresses, wait for OTP codes, and fill them in to complete signup or login flows. If a discovered page requires authentication, the agent attempts the auth flow using these tools. For pages behind SSO or enterprise auth, you would need to provide session cookies or pre-authenticated state.
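As a rough illustration of the wait-for-OTP step, a poll-the-inbox loop might look like the following. The function names (`waitForOtp`, `fetchInbox`), the polling intervals, and the 4-to-8-digit code regex are all assumptions for this sketch, not Assrt's actual API.

```typescript
// Hypothetical sketch: poll a disposable inbox until an OTP code arrives.
async function waitForOtp(
  fetchInbox: () => Promise<string[]>, // returns plain-text message bodies
  timeoutMs = 60_000,
  pollMs = 2_000,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    for (const body of await fetchInbox()) {
      // Most OTP emails contain a standalone 4-8 digit code; grab the first match.
      const m = body.match(/\b\d{4,8}\b/);
      if (m) return m[0];
    }
    await new Promise((r) => setTimeout(r, pollMs)); // wait before polling again
  }
  throw new Error("No OTP received before timeout");
}
```

The extracted code would then be typed into the verification field like any other form input during the test run.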
Stop Scoping Tests Manually
Add Assrt to your coding agent's MCP config. Point it at any URL. Watch it discover pages you forgot existed and generate test cases for each one.