A test automation tools comparison matrix, scored on the rows that actually cost you later

Almost every comparison matrix online ranks tools by features you can read off a marketing page: supported languages, parallel runners, assertion syntax. Those columns tell you whether you can start. They say nothing about the bill you pay in month twelve. This matrix scores the four rows that do.

Matthew Diakonov
The short answer (verified 2026-06-16)

No single tool wins every row, so do not look for the “best” cell. Pick by constraint, scoring your shortlist on four axes that predict 12-month cost: how you author tests, where they run, who fixes them when the UI moves, and what leaving costs. Free and open source keeps tests in your repo (Playwright, Selenium, Cypress, Assrt). A managed service trades money for the labor of writing and maintaining them (QA Wolf). And if you want AI to draft scenarios but still keep and edit the result, Assrt authors them as plain-English #Case blocks you own.

The matrix, filled in

Eight tools people actually shortlist, compared on cost, browser reach, and three of the cost-predicting criteria: where tests run, how you author them, and what you keep if you leave. That last column, “what you keep,” is the one most published matrices leave out.

| Tool | Cost | Where tests run | Browsers | How you author tests | What you keep if you leave |
| --- | --- | --- | --- | --- | --- |
| Selenium | Free (Apache 2.0) | Your machine / your CI | Chrome, Firefox, Safari, Edge | Code (Java, Python, C#, Ruby, JS) | You keep the code |
| Cypress | Free runner (MIT) + paid Cloud | Your CI; Cloud for smart parallel | Chromium, Firefox, WebKit (experimental) | Code (JavaScript / TypeScript) | You keep the code |
| Playwright | Free (Apache 2.0) | Your machine / your CI | Chromium, Firefox, WebKit | Code (.spec files: TS, JS, Python, .NET, Java) | You keep the code |
| Testim | Commercial | Vendor cloud | Chrome-focused | Recorder + low-code editor | Locked to the platform |
| Mabl | Commercial | Vendor cloud | Multi-browser | Low-code recorder + AI | Locked to the platform |
| QA Wolf | Managed service (no public pricing) | Vendor-run on your behalf | Multi-browser (Playwright underneath) | Their team writes Playwright for you | You receive Playwright, but coverage is service-dependent |
| Momentic | Commercial | Vendor CLI + platform | Chromium / Chrome only (Safari, Firefox on roadmap) | Proprietary YAML (momentic.config.yaml) | YAML is not portable to other runners |
| Assrt | Free (open source) | Your machine / your CI | Chromium, Firefox, WebKit | Plain-English #Case Markdown, run by an AI agent | You keep the scenarios and run them yourself |

Sources: each tool’s own documentation: Playwright, Selenium, Cypress, Momentic, QA Wolf, and the Assrt repository. QA Wolf pricing is not officially published; figures are from third-party reports. Verified 2026-06-16.
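
If you would rather filter the matrix than read it, here is a minimal Python sketch that encodes three of its columns as yes/no flags and shortlists by constraint. The flag names (free, own_ci, keeps_artifact) and the shortlist helper are illustrative, not part of any tool, and the True/False values are one reading of the cells above; adjust them if you read a cell differently.

```python
from typing import NamedTuple


class Tool(NamedTuple):
    name: str
    free: bool            # core tool costs nothing to license
    own_ci: bool          # runs on your machine / your CI
    keeps_artifact: bool  # tests stay usable if you leave


# One reading of the matrix cells, reduced to booleans (an assumption).
MATRIX = [
    Tool("Selenium",   True,  True,  True),
    Tool("Cypress",    True,  True,  True),   # free runner; Cloud is paid
    Tool("Playwright", True,  True,  True),
    Tool("Testim",     False, False, False),
    Tool("Mabl",       False, False, False),
    Tool("QA Wolf",    False, False, True),   # you receive Playwright code
    Tool("Momentic",   False, False, False),
    Tool("Assrt",      True,  True,  True),
]


def shortlist(matrix, *, free=False, own_ci=False, keeps_artifact=False):
    """Return names of tools that satisfy every constraint set to True."""
    return [
        t.name
        for t in matrix
        if (t.free or not free)
        and (t.own_ci or not own_ci)
        and (t.keeps_artifact or not keeps_artifact)
    ]


if __name__ == "__main__":
    print(shortlist(MATRIX, free=True, own_ci=True))
    print(shortlist(MATRIX, keeps_artifact=True))
```

Booleans throw away nuance (Cypress Cloud, QA Wolf's service-dependent coverage), so treat the output as a first cut before you read the full cells.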

The four rows that actually predict cost

A matrix is only as good as its columns. Drop the rows that every marketing page already answers and keep the ones that decide whether you are still using the tool, and still sane, a year from now.

Score these, not feature counts

  • How you author tests. Code needs an engineer to maintain it; a recorder needs re-recording on every UI change; proprietary YAML needs the vendor's runner; natural language can be edited by anyone on the team.
  • Where tests run. Your own CI versus a vendor cloud decides whether a pricing change or an outage can take your suite down.
  • Who fixes a test when the UI moves. This is where most of the real cost lives, and it is almost never a column in published matrices.
  • What leaving costs. If the artifact only runs on the vendor's platform, your tests are a lease, not an asset.

The row most matrices put first, and the row that should replace it

Compare the row a typical matrix leads with against the one that predicts your actual bill. Same tools, very different buying decision.

The typical first row: supported languages and assertion style. Every tool gets a green check, the grid looks complete, and you learn nothing about the next twelve months.

  • Counts features you can read off a landing page
  • All cells trend green, so it cannot break a tie
  • Silent on maintenance, ownership, and exit cost

The one authoring row no other tool has

Selenium, Cypress, and Playwright author tests as code. Testim and Mabl record clicks. Momentic writes YAML. Assrt is the only row in the matrix whose test is a plain-English Markdown block. Here is exactly what one looks like, taken from the scenario file the tool writes to /tmp/assrt/scenario.md:

#Case 1: Log in with valid credentials
Click the "Sign in" button in the header.
Type a valid email into the Email field.
Type the matching password into the Password field.
Click "Log in".
Verify the dashboard heading is visible.

#Case 2: Reject an empty password
Click "Sign in".
Type a valid email, leave Password empty.
Click "Log in".
Verify an inline "Password is required" error appears.

That file is editable in place and auto-syncs as you change it. Each #Case is self-contained, and verification steps start with words like Verify, Check, or Confirm. The assrt_test tool runs the cases in a real Playwright-driven browser and writes structured results to /tmp/assrt/results/latest.json. You can verify the format yourself in the open-source repository.
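
Because the format is plain text, you can inspect a scenario file with a few lines of your own code. The sketch below is not Assrt's parser; it is a small Python reader written against the example above, and it assumes headers look like "#Case N: title" and that verification steps begin with Verify, Check, or Confirm. The names Case and parse_scenarios are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Assumed header shape, inferred from the example: "#Case 1: Some title"
CASE_HEADER = re.compile(r"^#Case\s+(\d+):\s*(.+)$")
VERIFY_WORDS = ("verify", "check", "confirm")


@dataclass
class Case:
    number: int
    title: str
    steps: list[str] = field(default_factory=list)

    @property
    def verifications(self) -> list[str]:
        """Steps that start with a verification word."""
        return [s for s in self.steps if s.lower().startswith(VERIFY_WORDS)]


def parse_scenarios(text: str) -> list[Case]:
    """Split scenario text into cases; blank lines are ignored."""
    cases: list[Case] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = CASE_HEADER.match(line)
        if match:
            cases.append(Case(int(match.group(1)), match.group(2)))
        elif cases:
            cases[-1].steps.append(line)
    return cases


SAMPLE = """\
#Case 1: Log in with valid credentials
Click the "Sign in" button in the header.
Type a valid email into the Email field.
Type the matching password into the Password field.
Click "Log in".
Verify the dashboard heading is visible.

#Case 2: Reject an empty password
Click "Sign in".
Type a valid email, leave Password empty.
Click "Log in".
Verify an inline "Password is required" error appears.
"""

if __name__ == "__main__":
    for case in parse_scenarios(SAMPLE):
        print(case.number, case.title, len(case.steps), len(case.verifications))
```

A reader like this is handy for a quick lint, for example flagging any case that has no verification step before you hand it to the agent.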

From a sentence to a verdict

  1. Write a #Case. Plain-English steps in scenario.md, no selectors.
  2. Agent reads the page. Resolves elements from the accessibility tree.
  3. Runs in a real browser. Chromium, Firefox, or WebKit via Playwright.
  4. Writes a verdict. Pass / fail plus assertions to latest.json.

Build your own matrix in four steps

A generic matrix ranks tools for an average team that does not exist. Yours should rank them for your constraints. Here is the fastest way to make one that actually decides.

  1. Write your constraints as the first column. Budget ceiling, must-run-in-our-CI, who maintains tests, compliance needs. These become weighted rows. If a row does not change your decision, delete it.
  2. Shortlist three tools, not ten. A ten-tool grid is a reading exercise. Pick the three that survive your hardest constraint (usually budget or ownership) and compare those seriously.
  3. Score the cost-predicting rows, then weight them. Authoring format, run location, who fixes broken tests, exit cost. Weight the row that hurts most for your team, often maintenance, at double.
  4. Run a one-hour spike on the top two. Point each at one real flow in your app. A free tool like Assrt or Playwright can be trialed in an afternoon; the spike beats any cell in the grid.
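
The scoring in step 3 is simple enough to sketch in Python. Everything here is illustrative: the tool names are placeholders, the 1 to 5 scores are made up to show the mechanics, and the only rule taken from the steps above is that the maintenance row counts double.

```python
# Weights per cost-predicting row; maintenance is doubled as in step 3.
WEIGHTS = {"authoring": 1, "run_location": 1, "maintenance": 2, "exit_cost": 1}

# Hypothetical 1-5 scores (5 is best) for three placeholder tools.
SCORES = {
    "Tool A": {"authoring": 3, "run_location": 5, "maintenance": 2, "exit_cost": 5},
    "Tool B": {"authoring": 4, "run_location": 3, "maintenance": 5, "exit_cost": 2},
    "Tool C": {"authoring": 5, "run_location": 5, "maintenance": 3, "exit_cost": 5},
}


def weighted_total(scores, weights):
    """Sum each row's score multiplied by that row's weight."""
    return sum(weights[row] * scores[row] for row in weights)


def rank(all_scores, weights):
    """Tool names ordered from highest weighted total to lowest."""
    return sorted(
        all_scores,
        key=lambda name: weighted_total(all_scores[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank(SCORES, WEIGHTS):
        print(name, weighted_total(SCORES[name], WEIGHTS))
```

With these made-up numbers, Tool A beats Tool B on an unweighted sum (15 to 14) but falls behind once maintenance is doubled (17 to 19). That flip is the reason to weight the row that hurts most.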

Where Assrt loses, honestly

If your real constraint is that nobody on the team will ever write or maintain tests, a managed service like QA Wolf sells you labor, not software. That is a legitimate trade if the budget exists; an open-source tool cannot staff itself.

If you are a large organization whose gating requirement is compliance paperwork, SSO, role-based access, and a vendor security review, a commercial platform with that machinery will clear procurement faster than a self-hosted open-source tool will.

And if you have a mature, hand-tuned Playwright or Cypress suite that your team already maintains comfortably, there is no urgency to change authoring formats. The matrix is for teams deciding, not teams already settled. For the deeper artifact-by-artifact view, see our Playwright tools comparison and open-source testing tools comparison.


Questions teams ask before they commit

Frequently asked questions

What is a test automation tools comparison matrix?

It is a grid that puts candidate tools in rows and decision criteria in columns so you can compare them at a glance. The useful version scores criteria that predict long-term cost (how you author tests, where they run, who maintains them when the UI changes, and what it costs to leave) rather than counting surface features like supported languages or assertion styles.

Which test automation tool should I pick from the matrix?

No single tool wins every row, so pick by constraint. If your bottleneck is budget and ownership, the free open-source options (Playwright, Selenium, Cypress, Assrt) keep tests in your repo and run in your CI. If your bottleneck is having nobody to write or maintain tests, a managed service like QA Wolf trades money for that labor. If you want AI to draft scenarios but still want to keep and edit the result, Assrt authors plain-English #Case blocks you own.

Why score authoring format instead of supported languages?

Supported languages tell you whether you can start. Authoring format tells you who can maintain the suite a year later. A tool that writes code needs an engineer to edit it; a recorder needs you to re-record on every change; proprietary YAML needs the vendor's CLI to run at all. Assrt's authoring format is a natural-language #Case Markdown block, which means a non-engineer can read and edit a scenario and an AI agent executes it.

What does Assrt generate, and where does it live?

Assrt scenarios are Markdown saved to /tmp/assrt/scenario.md, structured as #Case blocks (for example, '#Case 1: Log in with valid credentials' followed by step-by-step instructions). The file is editable in place and auto-syncs. The assrt_test tool runs those cases in a real Playwright-driven browser and writes results to /tmp/assrt/results/latest.json. Because the scenarios are plain text in your project, leaving Assrt costs nothing.

How much does QA Wolf cost compared to open-source tools?

QA Wolf does not publish official pricing. Third-party reports put it at roughly $8,000 per month for around 200 tests, scaling with test volume, with median annual contracts in the five- to six-figure range. Playwright, Selenium, Cypress's core runner, and Assrt are free; their cost is the engineering time to write and maintain tests, which is exactly the work a managed service takes off your hands.
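
To put that reported figure next to your own numbers, a rough Python sketch of the arithmetic follows. The $8,000 and 200-test inputs are the third-party figures quoted above, not official pricing; the $100 hourly rate and the breakeven_hours_per_month helper are assumptions for illustration only.

```python
MONTHLY_FEE_USD = 8_000  # third-party reported figure, not official pricing
TESTS_COVERED = 200      # approximate test count in the same reports

annual_fee = MONTHLY_FEE_USD * 12                  # 96_000 per year
per_test_month = MONTHLY_FEE_USD / TESTS_COVERED   # 40.0 per test per month


def breakeven_hours_per_month(monthly_fee, hourly_rate):
    """Engineering hours per month that cost the same as the service fee."""
    return monthly_fee / hourly_rate


if __name__ == "__main__":
    # Hypothetical loaded engineering cost of $100/hour (an assumption).
    print(annual_fee, per_test_month, breakeven_hours_per_month(MONTHLY_FEE_USD, 100))
```

Under that assumed rate, the fee equals 80 engineering hours a month. If your team would spend less than that writing and maintaining the same suite, the free tools come out cheaper; if more, the service does.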

Is proprietary YAML a problem in a comparison matrix?

It is the exit-cost row most matrices omit. Momentic stores tests as momentic.config.yaml in your git repo, which looks portable, but the YAML only runs through Momentic's CLI and platform. If you leave, the files do not execute anywhere else. Code-based tools (Playwright, Selenium, Cypress) and plain-English scenarios (Assrt) do not have that lock-in.

Can one matrix cover both AI tools and classic frameworks?

Yes, if the columns are framed around outcomes rather than mechanisms. 'Where does it run' and 'what does leaving cost' apply equally to Selenium and to an AI agent. The mistake is adding AI-only rows (model used, prompt format) that classic tools can't fill, which fragments the grid into two half-empty tables.
