An e2e testing framework is two decisions, not one

Matthew Diakonov

An end-to-end testing framework is the software that drives a real browser through your app's complete user journeys (signup, login, checkout) and asserts the result at each step. In 2026 the code-first leaders are Playwright, Cypress, and Selenium, with Playwright the default for most new web projects.

That sentence is the answer every other guide gives. It is also only half the framework. The half nobody names is what your tests are made of: the artifact the framework leaves behind in your repo. That half is the one that actually breaks.

Direct answer

An e2e testing framework drives a real browser through full user flows and asserts the outcome. Pick Playwright for new web work (multi-browser, auto-waiting, parallel by default), Cypress for its debugger-first developer experience, or Selenium for the widest language and browser reach. Then make the second decision the others skip: what your tests are written as, because the artifact, not the runner, is what you will spend the next year maintaining.

The runner, and the artifact

Every framework decomposes into two parts. The runner is the engine that opens the browser, clicks, types, waits, and reports. The artifact is what the runner executes: the file that lives in your repo and describes the test. When people compare e2e frameworks, they compare runners: startup speed, browser coverage, parallelism, the debugger. Those are real differences, and they matter on day one.

The artifact is what matters on day two hundred. A test written today says: click button.btn-primary. Six months later a designer renames the class, re-wraps the button in a new container, or swaps a data-testid. Nothing about the app is broken. The button still works. But the test fails, because the artifact pinned a selector that was correct the day it was written and is wrong now. That is where most flakiness comes from, and it is a property of the artifact, not the runner.
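To make that failure concrete, here is a minimal sketch in TypeScript. It uses a toy page model rather than a real runner: the `PageElement` type and the `findByClass` helper are hypothetical stand-ins, not Playwright or Cypress APIs, and the class name `btn-cta` is invented for the example.

```typescript
// Toy model of a rendered page. PageElement and findByClass are
// illustrative stand-ins, not part of any real runner's API.
interface PageElement {
  tag: string;
  classes: string[];
  label: string;
}

// Mimics a class-based selector such as "button.btn-primary".
function findByClass(page: PageElement[], tag: string, cls: string): PageElement | undefined {
  for (const el of page) {
    if (el.tag === tag && el.classes.indexOf(cls) !== -1) return el;
  }
  return undefined;
}

// Day one: the selector matches the button it was written against.
const dayOne: PageElement[] = [
  { tag: "button", classes: ["btn", "btn-primary"], label: "Sign up" },
];

// Day two hundred: a designer renamed the class. The button still works for users.
const dayTwoHundred: PageElement[] = [
  { tag: "button", classes: ["btn", "btn-cta"], label: "Sign up" },
];

const matchedOnDayOne = findByClass(dayOne, "button", "btn-primary") !== undefined;
const matchedOnDayTwoHundred = findByClass(dayTwoHundred, "button", "btn-primary") !== undefined;

console.log(matchedOnDayOne);        // true
console.log(matchedOnDayTwoHundred); // false: the test fails, the app did not
```

The lookup function is identical in both calls. Only the page changed, which is the sense in which the breakage belongs to the artifact rather than the runner.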

So a complete answer to "which e2e testing framework" is two answers: which runner, and what does the runner leave behind. The first question has good, well-known answers. The second one is where the interesting design space in 2026 actually is.

The 2026 landscape, read along both axes

Here is the same set of tools you will see in any ranked list, except split into the two columns that matter: the runner it drives and the artifact it produces.

| Framework | Runner | Artifact |
| --- | --- | --- |
| Playwright. Default for new web projects. Auto-waiting, parallel by default. | Chromium, Firefox, WebKit from one API | TypeScript .spec.ts with selectors and assertions |
| Cypress. Loved for its time-travel debugger and DX. Narrower browser reach. | Chromium-family and Firefox, in-browser | JavaScript spec with chained commands |
| Selenium. Maximum language and browser coverage. Most setup and upkeep. | Every major browser via WebDriver | Code in Java, Python, C#, JS, and more |
| Assrt (on Playwright). MIT, open-source, @m13v/assrt. The runner stays standard; the artifact is the difference. | Real Playwright engine, driven by an AI agent | Plain-English #Case Markdown, re-derived live each run |

Playwright, Cypress, and Selenium differ most on the runner axis and converge on the artifact axis: all three leave a code file with selectors baked in. The interesting move is to keep a proven runner and change the artifact.

What changing the artifact actually looks like

Assrt is an open-source framework that keeps the Playwright runner and replaces the artifact. Instead of a compiled .spec.ts with selectors frozen into it, the test you commit is a plain-English #Case block stored at /tmp/assrt/scenario.md and checked into your repo as Markdown. This is the whole anchor of the idea, so here is exactly what that file looks like:

scenario.md

There is no selector in that file. There is nothing to go stale. When the suite runs, the agent re-snapshots the page's accessibility tree before every action and resolves "the password field" or "submit" by role and label against the page as it exists right now. That is what "self-healing" means in practice: not a plugin that patches broken selectors after the fact, but an artifact that never hard-coded a selector to break.
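The idea of resolving an element by role and label against a fresh snapshot can be sketched as follows. This is a toy model under stated assumptions, not Assrt's implementation: `AxNode`, `resolveByRole`, and `act` are hypothetical names, the tree is hand-written rather than read from a browser, and the exact, case-insensitive name match is a simplification.

```typescript
// Toy accessibility tree. AxNode, resolveByRole, and act are illustrative
// assumptions, not Assrt's or Playwright's actual data structures.
interface AxNode {
  role: string;
  name: string;
  children: AxNode[];
}

// Depth-first search for the first node with the given role and accessible name.
function resolveByRole(node: AxNode, role: string, name: string): AxNode | undefined {
  if (node.role === role && node.name.toLowerCase() === name.toLowerCase()) {
    return node;
  }
  for (const child of node.children) {
    const hit = resolveByRole(child, role, name);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Takes a snapshot function, not a snapshot: the tree is re-read before every action.
function act(takeSnapshot: () => AxNode, role: string, name: string): string {
  const target = resolveByRole(takeSnapshot(), role, name);
  return target !== undefined ? "resolved" : "not found";
}

const beforeRedesign: AxNode = {
  role: "form", name: "Sign up", children: [
    { role: "textbox", name: "Password", children: [] },
    { role: "button", name: "Submit", children: [] },
  ],
};

// After the redesign the button sits inside a new wrapper; role and label are unchanged.
const afterRedesign: AxNode = {
  role: "form", name: "Sign up", children: [
    { role: "textbox", name: "Password", children: [] },
    { role: "group", name: "Actions", children: [
      { role: "button", name: "Submit", children: [] },
    ] },
  ],
};

const resultBefore = act(() => beforeRedesign, "button", "submit");
const resultAfter = act(() => afterRedesign, "button", "submit");
console.log(resultBefore, resultAfter); // resolved resolved
```

In hand-written Playwright the closest equivalent is a role-based locator such as `page.getByRole('button', { name: 'Submit' })`, which also survives a class rename or a re-wrap. The difference described here is that the role and label are looked up from plain-English intent on every run instead of being written into the spec.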

The generator and the corrector both emit this exact #Case grammar. The system prompts that define it live in assrt-mcp/src/mcp/server.ts and run on Claude Haiku, so the file you read is the file that runs. You can verify all of this: the package is @m13v/assrt on npm, the source is open under the MIT license, and it is free.

For a deeper look at why the artifact decides your maintenance bill, see test maintenance vs generation.

A layer on the runner, not a replacement for it

The reason this matters for "zero lock-in" is structural. Assrt does not ask you to trust a new closed runner. The thing actually executing in your CI is Playwright, the same engine you would pick anyway. Assrt sits on top of it: plain-English intent goes in, the standard Playwright engine drives the browser, and a readable run report comes out.

Plain English in, standard Playwright underneath

Your app URL + #Case Markdown → Assrt agent → Real browser run → Pass/fail report

If you ever want to leave, you have not adopted a proprietary format you cannot export. You have Markdown intent files and a standard Playwright dependency. That is what zero vendor lock-in means when you read it at the level of the artifact rather than the marketing.

How to actually choose

  1. Pick the runner first, and keep it boring. For new web work that is Playwright. If your team already lives in Cypress and is happy, that is a fine answer too. The runner choice is low-regret; you are unlikely to be wrong in a way that hurts.
  2. Then decide what your tests are made of. If your UI is frozen and tests rarely change, hand-written or codegen'd specs are cheapest to run forever. If your UI churns, and most shipping products' UIs churn, the maintenance cost of selector-bound specs is the line item that quietly grows.
  3. Match the artifact to the churn. Keep your stable flows as compiled specs. Move the flows that generate the most maintenance pull requests, the ones a button rename keeps breaking, to an artifact that resolves elements live. You do not have to choose one artifact for the whole suite.
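Step 3 can be sketched as a per-flow decision rather than a suite-wide one. Everything in this TypeScript example is an assumption made for illustration: the `selectorFixPRs` metric, the threshold of 3, and the flow names and counts are invented, and a real team would pick its own signal for churn.

```typescript
type Artifact = "compiled spec" | "live-resolved #Case";

interface FlowStats {
  name: string;
  // Hypothetical metric: maintenance PRs last quarter that only fixed selectors.
  selectorFixPRs: number;
}

// Flows at or above the (assumed) threshold move to the live-resolved artifact.
function chooseArtifact(flow: FlowStats, churnThreshold: number = 3): Artifact {
  return flow.selectorFixPRs >= churnThreshold ? "live-resolved #Case" : "compiled spec";
}

// Invented sample data for three flows in one suite.
const flows: FlowStats[] = [
  { name: "login", selectorFixPRs: 0 },
  { name: "checkout", selectorFixPRs: 7 },
  { name: "onboarding wizard", selectorFixPRs: 4 },
];

const plan = flows.map((flow) => flow.name + " -> " + chooseArtifact(flow));
console.log(plan.join("\n"));
// login -> compiled spec
// checkout -> live-resolved #Case
// onboarding wizard -> live-resolved #Case
```

The point of the sketch is the shape of the result: one suite, two artifacts, assigned flow by flow.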

Where the compiled spec still wins

This is not an argument that selector-bound specs are obsolete. A compiled .spec.ts is fully deterministic and costs nothing per run because it makes no model calls, so for a critical flow that almost never changes it is the cheaper, more predictable choice. Re-deriving a test from the live page every run spends tokens, and that is a real cost you should weigh honestly. The reframe is not "artifact A beats artifact B." It is that the artifact is a separate decision from the runner, the decision most framework comparisons never surface, and the one that determines whether your suite is an asset or a tax a year from now.
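The trade-off can be put into rough arithmetic. The sketch below is a back-of-the-envelope model, not a pricing guide: every number is invented, the per-run model cost is a placeholder rather than a measured Assrt figure, and it assumes for simplicity that a live-resolved flow needs no selector upkeep and that browser and CI time cost the same either way.

```typescript
interface FlowCosts {
  runsPerMonth: number;
  modelCostPerRunCents: number;     // paid only by the live-resolved artifact
  maintenanceHoursPerMonth: number; // selector upkeep for the compiled spec
  engineerHourlyRateCents: number;
}

// Monthly cost of each artifact under the simplifying assumptions above.
function monthlyCostCents(c: FlowCosts): { compiledSpec: number; liveResolved: number } {
  return {
    compiledSpec: c.maintenanceHoursPerMonth * c.engineerHourlyRateCents,
    liveResolved: c.runsPerMonth * c.modelCostPerRunCents,
  };
}

function cheaperArtifact(c: FlowCosts): "compiled spec" | "live-resolved" {
  const cost = monthlyCostCents(c);
  return cost.compiledSpec <= cost.liveResolved ? "compiled spec" : "live-resolved";
}

// A frozen flow: 15 minutes of upkeep a month (all figures hypothetical).
const frozenFlow: FlowCosts = {
  runsPerMonth: 600,
  modelCostPerRunCents: 5,
  maintenanceHoursPerMonth: 0.25,
  engineerHourlyRateCents: 10000,
};

// A churning flow: same run volume, four hours of selector fixes a month.
const churningFlow: FlowCosts = {
  runsPerMonth: 600,
  modelCostPerRunCents: 5,
  maintenanceHoursPerMonth: 4,
  engineerHourlyRateCents: 10000,
};

console.log(cheaperArtifact(frozenFlow));   // compiled spec  (2500 vs 3000 cents)
console.log(cheaperArtifact(churningFlow)); // live-resolved  (40000 vs 3000 cents)
```

With these made-up inputs the frozen flow stays cheaper as a compiled spec and the churning flow does not. Changing any input moves the break-even point, which is why the decision is worth making per flow.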

Frequently asked questions

What is an e2e testing framework?

An end-to-end (e2e) testing framework is the software that drives a real browser through your application's complete user journeys (signup, login, checkout) and asserts the outcome at each step. Unlike unit or integration tests, which exercise one function or one module in isolation, an e2e test exercises the whole stack the way a person would: clicking real buttons, filling real forms, waiting on real network calls. In 2026 the dominant code-first frameworks for the web are Playwright, Cypress, and Selenium, with Playwright the default choice for most new projects because of its speed, auto-waiting, and multi-engine browser support.

Which e2e testing framework should I use in 2026?

For a new web project, Playwright is the safe default: it drives Chromium, Firefox, and WebKit from one API, auto-waits for elements, and runs tests in parallel out of the box. Cypress is still excellent for teams who value its time-travel debugger and all-in-one developer experience, though it covers fewer browsers than Playwright. Selenium remains the right answer when you need a language or browser that the newer tools do not cover, or when you are maintaining a large existing suite. The runner, though, is only half the decision. The other half is what your tests are made of, which is the part this page is about.

What is the difference between an e2e framework and an e2e platform?

A framework is a code library you assemble into a suite yourself: you write the tests, wire up CI, and own the maintenance. Playwright, Cypress, and Selenium are frameworks. A platform is a hosted product that bundles authoring, an execution grid, dashboards, and often a managed-service team that writes the tests for you. Managed platforms remove the upfront work but add cost and lock-in: pricing for a fully managed service like QA Wolf is quoted per scope and, by third-party estimates, runs into the tens of thousands of dollars per year. The framework route is cheaper and portable but puts maintenance on you, which is exactly why the artifact your framework produces matters so much.

Why do e2e tests get flaky, and is that the framework's fault?

Most flakiness traces back to selectors, not the runner. A test written six months ago says click the element matching button.btn-primary, and then a designer renames the class, re-wraps the button, or swaps a data-testid. The runner does exactly what it was told and fails. The runner is rarely the problem; the brittle, hard-coded locator baked into the test artifact is. This is why a framework comparison that only weighs runner features is incomplete: the thing that determines whether your suite survives a UI change is how the test resolves elements, which lives in the artifact, not the runner.

What does Assrt produce, and is it still real Playwright?

Assrt runs on Playwright. It drives a real Chromium process through the Playwright bridge, so it is not a separate runner you have to trust. What is different is the artifact. Instead of compiling a TypeScript .spec.ts with selectors baked in, Assrt's test is a plain-English #Case block stored at /tmp/assrt/scenario.md and meant to be committed to your repo as Markdown. On each run the agent re-snapshots the page's accessibility tree and resolves elements by role and label, so a renamed class or re-wrapped button is not a maintenance event. It is open-source under the MIT license, shipped as @m13v/assrt, with no proprietary YAML and no cloud you are locked into.

Does running tests as plain English cost money per run?

Yes, and it is an honest trade. Because Assrt re-derives the executable steps from the live page each run, every run spends Claude Haiku tokens to drive the agent. A compiled Playwright spec spends nothing per run because it is static. Assrt earns its keep when your UI changes often enough that maintenance, not execution, is your real cost center. If a flow is frozen and almost never changes, a hand-written or codegen'd .spec.ts is cheaper to run forever. The two approaches are not mutually exclusive; many teams keep stable flows as compiled specs and move the churn-heavy flows to #Case files.

Can I add Assrt to an existing Playwright or Cypress suite?

Yes. Because Assrt sits on the Playwright engine, a scenario.md file can live next to a Playwright project with no conflict, and you can keep your existing Cypress or Selenium suites untouched while you trial it. There is no migration to commit to up front: you point it at a URL with npx @m13v/assrt run --url https://your-app.com, look at the #Case files it produces, and keep whatever earns its place. Nothing about adopting it forces you to abandon the framework you already run.

Assrt. Open-source AI testing framework
© 2026 Assrt. MIT License.
