Self-healing test tools, compared by who owns the code
Every vendor says their tool heals broken tests. Very few tell you what you actually own when your contract ends. We compared seven tools on the thing that decides whether you can walk away: do you get real test code, or a proprietary blob?
| Tool | Output format | Healing strategy | Open source | Self-hosted | Price |
|---|---|---|---|---|---|
| Assrt | Standard Playwright (.spec.ts) | Accessibility-tree repair loop | Yes (MIT) | Yes | $0 |
| Testim | Proprietary steps, exported JS | Smart Locators (DOM heuristics) | No | No | ~$450/mo + seats |
| Mabl | Proprietary visual flows | Auto-heal with ML confidence score | No | No | ~$2.5K/mo team tier |
| Functionize | Proprietary test archive | ML self-healing (cloud only) | No | No | Quote only (enterprise) |
| Momentic | Proprietary YAML steps | LLM-assisted step replay | No | No | Quote only |
| QA Wolf | Playwright, managed by their team | Humans fix flakes as a service | No | No | ~$7.5K/mo (managed) |
| Healenium | Selenium wrapper | Selector distance algorithm | Yes (Apache-2.0) | Yes | $0 |
The table groups these tools into three camps. Commercial cloud suites (Testim, Mabl, Functionize, Momentic) give you a polished recorder and a managed runner, but the test artifact lives in their database in a format only their runner understands. Managed services (QA Wolf) hand you real Playwright, but the healing is a human team on retainer, and the price reflects that. Open-source tools (Assrt, Healenium) give you code you keep, with no contract to cancel.
The rest of this page walks each tool through the same four questions: what does it generate, how does it heal, what does it cost, and what happens when you leave.
1. Testim
Testim, now part of Tricentis, was one of the first vendors to ship self-healing at scale. Its Smart Locators score dozens of attributes per element (id, text, CSS path, position, neighbor context) and swap in a new selector when the primary one fails. In practice, Smart Locators work well for small DOM changes and struggle when a whole layout is reshuffled.
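A rough sketch of attribute-scored healing in general, to make the idea concrete. This is not Testim's implementation: the `ElementSnapshot` shape, the weights, and the threshold are all illustrative assumptions.

```typescript
// Hypothetical snapshot of the attributes recorded for one element.
interface ElementSnapshot {
  id?: string;
  text?: string;
  cssPath?: string;
  tag?: string;
}

// Illustrative integer weights (assumed, not taken from any vendor).
const ATTRIBUTE_WEIGHTS: Record<keyof ElementSnapshot, number> = {
  id: 4,
  text: 3,
  cssPath: 2,
  tag: 1,
};

// Weighted fraction of recorded attributes that the candidate still matches.
function matchScore(recorded: ElementSnapshot, candidate: ElementSnapshot): number {
  let matched = 0;
  let total = 0;
  for (const key of Object.keys(ATTRIBUTE_WEIGHTS) as (keyof ElementSnapshot)[]) {
    if (recorded[key] === undefined) continue; // attribute was never recorded
    total += ATTRIBUTE_WEIGHTS[key];
    if (recorded[key] === candidate[key]) matched += ATTRIBUTE_WEIGHTS[key];
  }
  return total === 0 ? 0 : matched / total;
}

// Return the best-scoring candidate, or null when nothing clears the threshold.
function healLocator(
  recorded: ElementSnapshot,
  candidates: ElementSnapshot[],
  threshold = 0.5,
): ElementSnapshot | null {
  let best: ElementSnapshot | null = null;
  let bestScore = 0;
  for (const candidate of candidates) {
    const score = matchScore(recorded, candidate);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return bestScore >= threshold ? best : null;
}
```

With these assumed weights, an element whose `id` changed but whose text, path, and tag survived still scores 0.6 and heals. An element that kept only its tag scores 0.1 and does not, which is the reshuffled-layout failure mode described above.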
The recorder is smooth and the Root Cause Analysis panel is the best in the commercial set. The catch is the output. Testim steps live in a proprietary JSON tree that only the Testim runner can execute. There is a "Code Mode" that exports JavaScript, but the exported code calls back into the Testim SDK and loses the visual editor round-trip. You can leave, but you cannot take the tests with you in a form that runs anywhere else.
Good for: teams that want a polished UI recorder and have budget for seat-based pricing.
Bad for: engineers who want their tests to live in Git next to the app code.
2. Mabl
Mabl markets itself on "intelligent test automation" and its auto-heal feature attaches a confidence score to every healed step, which is a nice touch. The recorder is no-code, the runner is fully managed, and the API testing module is genuinely good. Pricing is per-run, with team tiers starting around $2,500 per month once you cross the free trial volume.
Mabl tests are visual flows stored in Mabl's cloud. There is no meaningful local representation. If you cancel, you keep screenshots and a CSV of runs; you do not keep a test suite you can execute anywhere. Teams that pick Mabl usually pick it because nobody on the team wants to write code, and that is a valid trade-off until the bill gets reviewed.
Good for: QA-led teams with no engineering capacity who need a hosted solution.
Bad for: anyone who needs tests in CI next to the app, or who wants offline runs.
3. Functionize
Functionize pitches an ML model that "learns your app" and heals tests across releases. The closest thing to a public description of the algorithm is a blend of computer vision on screenshots and DOM diffing. It runs in their cloud only. There is no self-hosted build, no CLI, and no exported test you can run elsewhere.
Pricing is quote-only, which in practice puts it in the same bracket as Mabl and up. The self-healing is effective for the narrow case it was trained on (click-heavy web flows with stable content), and less so for apps with heavy canvas rendering or frequent full-page redesigns.
Good for: enterprise buyers with procurement budgets and long sales cycles.
Bad for: teams that want to evaluate the tool against their real app before signing.
4. Momentic
Momentic is the newest entrant here and the most aggressive about LLM-driven healing. When a step fails, Momentic re-asks the model to locate the target element using a fresh screenshot and a natural-language description. That approach handles surprisingly large UI changes, but it means every run makes inference calls, and the cost shows up in both latency and the monthly bill.
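To see how inference calls add up, here is a back-of-envelope estimator. Every input is a hypothetical placeholder, including the per-call price; none of these figures come from Momentic, so substitute numbers from your own suite and your own quote.

```typescript
// Hypothetical description of a test suite; all fields are assumptions to fill in.
interface SuiteProfile {
  tests: number;         // number of tests in the suite
  stepsPerTest: number;  // average steps per test
  runsPerDay: number;    // full-suite runs per day (CI pushes, schedules)
  modelCallRate: number; // fraction of steps that trigger a model call, 0..1
}

// Estimated model calls per month, rounded to a whole number of calls.
function monthlyInferenceCalls(profile: SuiteProfile, daysPerMonth = 30): number {
  const stepsPerMonth =
    profile.tests * profile.stepsPerTest * profile.runsPerDay * daysPerMonth;
  return Math.round(stepsPerMonth * profile.modelCallRate);
}

// Estimated monthly spend given an assumed price per model call (USD).
function monthlyInferenceCost(profile: SuiteProfile, costPerCallUsd: number): number {
  return monthlyInferenceCalls(profile) * costPerCallUsd;
}
```

For example, a 200-test suite with 5 steps per test, 20 runs a day, and a model call on 5% of steps works out to 30,000 calls a month. The useful part is the shape of the formula: the bill scales with run frequency, not with how often the UI actually changes.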
Tests are stored as Momentic YAML. The YAML cannot run outside Momentic's runner, which is the exact lock-in pattern Testim and Mabl also use. Pricing is quote-only.
What you write: Momentic YAML
```yaml
# momentic test.yaml
steps:
  - action: navigate
    url: https://example.com/login
  - action: type
    target: "email input"
    value: user@example.com
  - action: type
    target: "password input"
    value: hunter2
  - action: click
    target: "sign in button"
  - action: assert
    condition: "dashboard visible"
```
The equivalent Assrt input is a plain-English case. Assrt turns it into a standard Playwright spec, commits the spec to your repo, and re-heals the spec on every run using the same accessibility tree the browser exposes to a screen reader.
Good for: teams that want LLM-grade healing and can accept the YAML format as the trade-off.
Bad for: teams that refuse to ship tests they cannot run locally.
5. QA Wolf
QA Wolf is not a self-healing tool in the algorithmic sense. It is a managed service: QA Wolf engineers write your tests in Playwright, run them on their infrastructure, and re-fix them when the UI changes. The self-healing here is human hands. For teams that want a green suite and have the budget, this works. Public pricing starts around $7,500 per month for the lowest tier and climbs from there.
The tests themselves are real Playwright, and you can mirror them into your own repo. If you cancel, you keep the code. The lock-in is softer than Testim or Mabl, but the cost is at a different order of magnitude, and you depend on a team you do not manage.
Good for: well-funded scale-ups that want zero QA headcount and are willing to pay for a managed SLA.
Bad for: anyone who wants control over which tests exist, how they run, or how they are triaged.
6. Healenium
Healenium is the original open-source self-healing layer. It sits in front of Selenium, records every locator, and when one breaks, it computes the closest surviving element using a tree-edit distance algorithm. It is battle-tested, Apache-licensed, and self-hosted.
The honest trade-off is that Healenium is a Selenium tool. If your stack is already Playwright, the wiring does not transfer cleanly. Healing quality is also bounded by the algorithm: small DOM drift is handled well, AI-grade rewrites of whole sections are not.
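The "closest surviving element" idea can be sketched with a plain edit distance. This is a simplified stand-in, not Healenium's actual algorithm: it runs Levenshtein distance over each element's ancestor path rather than a true tree comparison, and the `DomNodeRecord` shape and `maxDistance` cutoff are assumptions.

```typescript
// Levenshtein distance between two token sequences (insert, delete, substitute all cost 1).
function editDistance(a: string[], b: string[]): number {
  // row[j] holds the distance between the first i tokens of a and the first j tokens of b.
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const substitution = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + substitution);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Hypothetical record of an element: a working selector plus its ancestor path.
interface DomNodeRecord {
  selector: string;
  path: string[]; // e.g. ['html', 'body', 'form', 'button.btn-primary']
}

// Pick the current element whose path is nearest to the broken one, within a cutoff.
function closestSurvivor(
  broken: DomNodeRecord,
  current: DomNodeRecord[],
  maxDistance = 2,
): DomNodeRecord | null {
  let best: DomNodeRecord | null = null;
  let bestDistance = Infinity;
  for (const node of current) {
    const distance = editDistance(broken.path, node.path);
    if (distance < bestDistance) {
      best = node;
      bestDistance = distance;
    }
  }
  return bestDistance <= maxDistance ? best : null;
}
```

A renamed class is one substitution away, so it heals. A section that was rebuilt from scratch ends up beyond any sensible cutoff, which is why distance-based healing is bounded to small DOM drift.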
Good for: teams with an existing Selenium suite who want healing bolted on without buying a cloud product.
Bad for: greenfield projects on modern runners.
7. Assrt
Assrt is the answer to the question "what if healing happened at the accessibility-tree level and the output was a real Playwright file?" You write scenarios in plain English, Assrt drives a real Chromium via Playwright, and it writes a standard spec file to disk. When the UI changes, Assrt re-runs the scenario against the new accessibility tree, identifies the new targets semantically (role, name, landmark), and updates the spec in place.
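Semantic targeting by role, name, and landmark can be sketched as a search over a simplified accessibility tree. The `A11yNode` and `SemanticTarget` shapes below are assumptions for illustration and do not describe Assrt's internals; the landmark role names are the standard ARIA ones.

```typescript
// Hypothetical, simplified accessibility-tree node.
interface A11yNode {
  role: string;
  name: string;
  children?: A11yNode[];
}

// What a step is looking for, expressed semantically rather than by selector.
interface SemanticTarget {
  role: string;
  name: string;
  landmark?: string; // role of an enclosing landmark, e.g. 'form' or 'navigation'
}

// ARIA landmark roles.
const LANDMARK_ROLES = new Set([
  'banner', 'navigation', 'main', 'form', 'contentinfo', 'complementary', 'search', 'region',
]);

// Case- and whitespace-insensitive comparison key for accessible names.
function normalizeName(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, ' ');
}

// Depth-first search for nodes matching role and name inside the requested landmark.
function findTargets(root: A11yNode, target: SemanticTarget): A11yNode[] {
  const matches: A11yNode[] = [];
  const visit = (node: A11yNode, enclosing: string[]): void => {
    const inLandmark = target.landmark === undefined || enclosing.includes(target.landmark);
    if (
      inLandmark &&
      node.role === target.role &&
      normalizeName(node.name) === normalizeName(target.name)
    ) {
      matches.push(node);
    }
    const next = LANDMARK_ROLES.has(node.role) ? [...enclosing, node.role] : enclosing;
    for (const child of node.children ?? []) visit(child, next);
  };
  visit(root, []);
  return matches;
}
```

CSS classes and ids never appear in this tree, so a restyle leaves the match untouched. Only a change to the role, the accessible name, or the enclosing landmark can break it.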
The contrast is easiest to see against a hand-written spec for the same sign-in flow, which pins every element to a CSS selector:
Hand-written spec (CSS selectors)
```typescript
import { test, expect } from '@playwright/test';

test('user can sign in and reach dashboard', async ({ page }) => {
  await page.goto('https://example.com/login');
  // Each locator below depends on an id or class name that a restyle can change.
  await page.locator('#email-input-v2').fill('user@example.com');
  await page.locator('#password-field').fill('hunter2');
  await page.locator('.btn-primary.submit').click();
  await expect(page).toHaveURL(/dashboard/);
  await expect(page.locator('.user-menu')).toBeVisible();
});
```
The difference is not line count. It is that the Assrt-authored spec targets elements the way a screen reader targets them, so when a designer changes .btn-primary to .cta-button, the test does not notice. When a button's label actually changes from "Sign in" to "Log in," Assrt re-runs the scenario, finds the new name, and rewrites that line of the spec.
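In Playwright, role-based targeting means locators such as `getByRole` and `getByLabel` instead of CSS selectors. The sketch below renders such a spec from a list of semantic steps. The `SemanticStep` shape and `renderSpec` helper are hypothetical, the "Email" and "Password" labels are assumed, and none of this is Assrt's actual output format; `getByRole`, `getByLabel`, and `toHaveURL` are standard Playwright APIs.

```typescript
// Hypothetical semantic step: what to do, described by role, name, or label.
type SemanticStep =
  | { kind: 'goto'; url: string }
  | { kind: 'fill'; label: string; value: string }
  | { kind: 'click'; role: string; name: string }
  | { kind: 'expectUrl'; pattern: string };

// Render a value as a double-quoted, escaped string literal.
function q(value: string): string {
  return JSON.stringify(value);
}

// Turn one semantic step into a line of Playwright source.
function stepToSource(step: SemanticStep): string {
  switch (step.kind) {
    case 'goto':
      return `await page.goto(${q(step.url)});`;
    case 'fill':
      return `await page.getByLabel(${q(step.label)}).fill(${q(step.value)});`;
    case 'click':
      return `await page.getByRole(${q(step.role)}, { name: ${q(step.name)} }).click();`;
    case 'expectUrl':
      return `await expect(page).toHaveURL(new RegExp(${q(step.pattern)}));`;
  }
}

// Assemble a complete spec file as text.
function renderSpec(title: string, steps: SemanticStep[]): string {
  return [
    `import { test, expect } from '@playwright/test';`,
    ``,
    `test(${q(title)}, async ({ page }) => {`,
    ...steps.map((step) => `  ${stepToSource(step)}`),
    `});`,
  ].join('\n');
}

// The click step renders as:
//   await page.getByRole("button", { name: "Sign in" }).click();
const signInSpec = renderSpec('user can sign in and reach dashboard', [
  { kind: 'goto', url: 'https://example.com/login' },
  { kind: 'fill', label: 'Email', value: 'user@example.com' },      // label text is assumed
  { kind: 'fill', label: 'Password', value: 'hunter2' },            // label text is assumed
  { kind: 'click', role: 'button', name: 'Sign in' },
  { kind: 'expectUrl', pattern: 'dashboard' },
]);
```

No class name or id appears anywhere in the rendered file, which is why a restyle cannot break it. A rename from "Sign in" to "Log in" changes exactly one string in one line.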
Assrt is MIT licensed. Running it costs $0. You can self-host the MCP server, point Claude or Cursor at it, and never send your app to a third-party cloud. The tests live in your Git repo next to the code they test, and if Assrt disappeared tomorrow, the specs would keep running under vanilla Playwright.
Good for: engineering teams that want AI healing without giving up code ownership.
Bad for: teams with no engineering capacity who need a turnkey dashboard and managed runs.
How to decide
Three questions will narrow the field fast. First, can your team read and edit a Playwright file? If yes, the whole proprietary-format class (Testim, Mabl, Functionize, Momentic) costs you ownership you do not need to give up, and the open-source options (Assrt, Healenium) cover the same ground for free. If no, you are choosing between Mabl and QA Wolf, and the difference is whether you want a tool (Mabl) or a team (QA Wolf).
Second, where does your data live? Any tool that runs tests in a vendor cloud sees your app's pages and form fields. If you have a compliance story (healthcare, finance, regulated data), the self-hosted row collapses to Assrt or Healenium.
Third, what is the healing budget? Cloud LLM healing is more flexible than heuristic healing, but every re-heal is an inference call on someone's bill. Assrt runs the heal loop locally against your own model credits, so you can cap the spend. The heuristics in Testim and Healenium add no per-heal inference cost, but they miss larger changes. Pick the trade-off honestly.
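The first two questions can be written down as a small filter over the comparison table. The field values below are transcribed from the table and the paragraphs above; the `worksWithoutEngineers` flag and the `shortlist` function are an illustrative encoding of this article's recommendation, not an objective ranking.

```typescript
interface ToolRow {
  tool: string;
  openSource: boolean;            // "Open source" column
  selfHosted: boolean;            // "Self-hosted" column
  worksWithoutEngineers: boolean; // assumed flag: the two options named for teams that cannot edit Playwright
}

const TOOLS: ToolRow[] = [
  { tool: 'Assrt', openSource: true, selfHosted: true, worksWithoutEngineers: false },
  { tool: 'Testim', openSource: false, selfHosted: false, worksWithoutEngineers: false },
  { tool: 'Mabl', openSource: false, selfHosted: false, worksWithoutEngineers: true },
  { tool: 'Functionize', openSource: false, selfHosted: false, worksWithoutEngineers: false },
  { tool: 'Momentic', openSource: false, selfHosted: false, worksWithoutEngineers: false },
  { tool: 'QA Wolf', openSource: false, selfHosted: false, worksWithoutEngineers: true },
  { tool: 'Healenium', openSource: true, selfHosted: true, worksWithoutEngineers: false },
];

interface TeamNeeds {
  canEditPlaywright: boolean; // question one
  needsSelfHosted: boolean;   // question two (compliance, regulated data)
}

// Apply question one to choose the pool, then question two to filter it.
function shortlist(needs: TeamNeeds): string[] {
  const pool = needs.canEditPlaywright
    ? TOOLS.filter((row) => row.openSource)
    : TOOLS.filter((row) => row.worksWithoutEngineers);
  return pool
    .filter((row) => !needs.needsSelfHosted || row.selfHosted)
    .map((row) => row.tool);
}
```

One combination returns an empty list: a team that cannot edit Playwright and also needs self-hosting has no option in this table, so one of the two constraints has to give.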
The ownership test
The single question that separates real tools from rental subscriptions is: if the vendor shut down tomorrow, could you still run the tests you wrote today? Run through the list. Testim: no. Mabl: no. Functionize: no. Momentic: no. QA Wolf: yes, because the artifacts are Playwright. Healenium: yes, because it is open source and runs on your own infrastructure. Assrt: yes, because the specs are standard Playwright that run under the open-source Playwright runner with zero Assrt dependencies at runtime.
Self-healing is a feature, not a product. The product is the test suite you own. Pick a tool that leaves you with one.