A cross browser testing tool, judged by where the engine choice lives

Every page that ranks for this term sells you a feature checklist: parallel runs, real-device cloud, retry logic, AI assists. The harder question never gets asked. Of the three layers in the stack (engine, runner, plan), at which one does the engine choice physically happen? For a cloud grid it lives in a JSON config you POST to someone else's server. For a code library it lives in a function call in your repo. For Assrt it lives in a TypeScript array literal at line 296 of browser.ts. That difference is the whole differentiator.

Matthew Diakonov
11 min read
Direct answer, verified 2026-05-11

A cross browser testing tool drives one or more real browser engines (Chromium, Firefox, WebKit, sometimes the Chromium-based Microsoft Edge) so the same test plan can run against each engine and engine-specific UI bugs surface before real users hit them. The category splits into three: cloud grids (BrowserStack, Sauce Labs, LambdaTest) that rent you remote engines; code libraries (Playwright, Cypress, Selenium) that drive engines locally; and AI agent layers (Assrt) that turn plain-text plans into Playwright runs and emit standard Playwright code.

Verified against playwright.dev/docs/browsers (engine list and command-line flags) and the Assrt MCP source at github.com/assrt-ai/assrt-mcp.

The three categories, with concrete examples

Listicles tend to mix these together as if they were comparable products. They are not. They are different layers of the stack packaged into different commercial shapes, and the right one for you depends on which problem you are trying to solve.

Category 1

Cloud grids: rent the engine

BrowserStack, Sauce Labs, LambdaTest, TestGrid. The engine binary runs on the vendor's machine. You point a remote WebDriver or remote Playwright session at their endpoint and pay per parallel session. Strength: zero install and a long matrix of historical browser versions. Weakness: the engine is not yours, the recording is not on your disk, and the canonical test code lives in whatever SDK they shipped.
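To make the "engine on someone else's machine" point concrete, here is a minimal sketch of what driving a cloud grid from Playwright typically looks like. The endpoint URL and the capability payload are placeholders, not any particular vendor's real API; each grid documents its own encoding.

// Illustrative only: the wss:// endpoint and the capability object are
// placeholders; every grid vendor documents its own format.
import { chromium } from 'playwright';

async function main() {
  const caps = encodeURIComponent(JSON.stringify({
    browser: 'chrome',            // which engine the vendor should launch
    os: 'Windows', os_version: '11',
  }));

  const browser = await chromium.connect(
    `wss://grid.example-vendor.com/playwright?caps=${caps}`,
  );
  const page = await browser.newPage();
  await page.goto('https://your-app.com');
  // The engine binary, and therefore the engine choice, lives on the
  // vendor's machine; this file only holds a URL and a JSON blob.
  await browser.close();
}

main();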

Category 2

Code libraries: drive the engine locally

Playwright, Cypress, Selenium. The engine binary lives in your node_modules or a system package. You write a TypeScript or Python file, run it locally or in CI, and inspect the resulting screenshots and traces on your own disk. Strength: full control, zero per-run vendor fees, standard files you can grep and git-blame. Weakness: you write and maintain every locator yourself, and the bill for that maintenance comes due about six months after the suite first goes green.
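For contrast, the code-library shape in its entirety: a minimal Playwright test file (names and URL are illustrative) that lives in your repo and runs with npx playwright test.

// tests/checkout.spec.ts -- illustrative names and URL
import { test, expect } from '@playwright/test';

test('next button advances the checkout', async ({ page }) => {
  await page.goto('https://your-app.com/checkout');
  // The locator below is the thing you write and maintain by hand,
  // and the thing that can drift between engines.
  await page.getByRole('button', { name: /next/i }).click();
  await expect(page.getByRole('heading', { name: /payment/i })).toBeVisible();
});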

Category 3

AI agent layers: drive the library from a plan

Assrt sits on top of Playwright MCP. You write a plain-Markdown scenario in /tmp/assrt/scenario.md and the agent resolves each step against the accessibility tree that the engine returns. Strength: same plan runs against any engine Playwright supports, because the plan contains no locator strings. Weakness: every step costs a Claude Haiku call, and the agent loop is harder to debug than a plain test file when something genuinely goes wrong.

The engines underneath the category

Every commercial cross-browser tool ultimately calls into a small number of upstream engine projects. The vendor wrapping is what you pay for; the rendering is shared. Here is the actual surface that Playwright (and therefore Assrt and most modern Playwright wrappers) exposes.

[Figure: Playwright MCP at the center of a ring, with the four engines it can launch around it: Chromium, Firefox, WebKit, MS Edge.]

The four values in that ring are the exact strings the Playwright MCP CLI accepts after --browser: chrome, firefox, webkit, msedge. Run npx @playwright/mcp@latest --help to confirm; the help text lists the same four.

The literal line where the engine choice is made

This is the part most articles cannot do, because their products do not have one. In Assrt, the line is in src/core/browser.ts at line 296, the args array that is passed to the Playwright MCP stdio subprocess. Add two strings and the entire pipeline runs on a different engine.

// /Users/matthewdi/assrt-mcp/src/core/browser.ts, line 296
const args = [
  cliPath,
  "--viewport-size", "1600x900",
  "--output-mode",   "file",
  "--output-dir",    outputDir,
  "--caps",          "devtools",
  // adding the two strings below makes the entire stack
  // run on Firefox. swap "firefox" for "webkit" or "msedge"
  // to pick a different engine. defaults to Chromium today.
  // "--browser", "firefox",
];
this.transport = new StdioClientTransport({
  command: process.execPath, // node binary
  args,                      // the line above is the engine choice
  stderr: "pipe",
  env: transportEnv,
});

The array is read top to bottom and handed to StdioClientTransport, which spawns the Playwright MCP CLI as a child process. Playwright MCP, in turn, hands the engine binary to Playwright core, which launches a Chromium, Firefox, WebKit, or msedge process. That whole chain is the "cross-browser" surface, and the array literal is the only line in Assrt that determines which engine starts. You cannot point at the equivalent line on a closed cross-browser platform, because the equivalent line is on someone else's server.

What is not yet shipped

The Assrt CLI (src/cli.ts) does not yet expose a top-level --browser flag. Today the documented CLI surface is --headed, --isolated, --keep-open, --extension, and the matching extension token. To run against Firefox or WebKit right now you either edit the args array in browser.ts:296 (the one-line change shown above) or, if you are using Assrt as an MCP server from a coding agent, instruct the agent to spawn a Playwright MCP child with the engine you want. This is worth stating plainly, because a cross-browser claim is only worth as much as the access you actually have.

What actually happens when a test runs on a second engine

Cross-browser bugs come from two places. One is genuine rendering disagreement, where the three engines paint a pixel layout differently because of how they interpret the same CSS. That is rare on a modern web app. The much more common one is locator drift: a query you wrote against one engine's DOM tree resolves to a different node on another engine's tree. The before-after below is what that looks like in a Playwright suite versus an Assrt-style plan.

The same step, two ways of writing it

Before, in a Playwright suite: the same locator string on three engines resolves to three different nodes after a minor UI change.

  • Source: await page.getByRole('button', { name: /next/i }).click()
  • Chromium: matches the primary CTA. Test passes.
  • Firefox: matches a hidden 'Next month' calendar button that came back from a layout shift. Test passes, but it clicks the wrong thing.
  • WebKit: role 'button' resolves differently on an unlabeled icon next to the CTA. Test fails with a strict mode violation.

After, in an Assrt-style plan: the step is the plain-text line "Click the Next button". There is no locator string to drift; on each engine the agent takes a fresh snapshot and resolves that description against the current accessibility tree, so the target is re-chosen per run instead of baked into the file.

Neither approach is universally better. The Playwright locator is faster and free; the agent loop costs a model call per step but collapses three engine branches into one file. The honest recommendation: use a code library if your locators are stable and your team enjoys writing engine-defensive queries, use an agent layer if your locators are not stable and you would rather pay pennies in tokens than hours in maintenance.
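For teams staying in the code-library lane, "engine-defensive" usually means narrowing the query until only one node can ever satisfy it on any engine. A minimal sketch of the same step hardened that way; the checkout form name is a hypothetical example.

// Scope the query to a named region so the hidden calendar button and the
// unlabeled icon can no longer match, then assert visibility before
// clicking. The 'checkout' form name and URL are hypothetical.
import { test, expect } from '@playwright/test';

test('next button, engine-defensive version', async ({ page }) => {
  await page.goto('https://your-app.com/checkout');
  const next = page
    .getByRole('form', { name: /checkout/i })
    .getByRole('button', { name: /next/i });
  await expect(next).toBeVisible();
  await next.click();
});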

How an Assrt run actually flows across engines

Every step of the plan flows through the same four parties, no matter which engine you picked. Reading the sequence below makes two things clear: there is no engine-specific code path in the orchestration, and every snapshot is fresh against the current engine.

One step, four actors, three possible engines

  • scenario.md → Agent: step "Click the Next button"
  • Agent → Playwright MCP: tool browser_snapshot
  • Playwright MCP → Engine: fetch accessibility tree
  • Engine → Playwright MCP: tree with fresh refs (e1..eN)
  • Playwright MCP → Agent: snapshot ready (with refs)
  • Agent: match 'Next button' → ref=e7
  • Agent → Playwright MCP: tool browser_click(ref=e7)
  • Playwright MCP → Engine: dispatch click on e7
  • Engine → Playwright MCP: DOM updated
  • Playwright MCP → Agent: click acknowledged
  • Agent → scenario.md: step complete; advance plan
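A minimal sketch of those two tool calls made directly against Playwright MCP with the MCP TypeScript SDK. Assrt wires this up internally; the tool names (browser_navigate, browser_snapshot, browser_click) and the element/ref arguments are the ones described elsewhere on this page, and the ref value 'e7' stands in for whatever the snapshot actually returns.

// Sketch only: drive Playwright MCP by hand to see the same message flow.
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

async function main() {
  const transport = new StdioClientTransport({
    command: 'npx',
    args: ['@playwright/mcp@latest', '--browser', 'firefox'], // any of the four engines
  });
  const client = new Client({ name: 'example-client', version: '0.0.0' });
  await client.connect(transport);

  // Navigate first so the snapshot has a page to describe.
  await client.callTool({
    name: 'browser_navigate',
    arguments: { url: 'https://your-app.com' },
  });

  // 1. Fresh accessibility snapshot from whichever engine is running.
  await client.callTool({ name: 'browser_snapshot', arguments: {} });

  // 2. Click by description plus live ref; 'e7' is illustrative and only
  //    exists after the snapshot above has returned it.
  await client.callTool({
    name: 'browser_click',
    arguments: { element: 'Next button', ref: 'e7' },
  });
}

main();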

What to look for when picking a cross browser testing tool

The category is older than most engineers realize, and most feature checklists rank by the wrong axis. The five questions below come up in every honest evaluation, in roughly this order.

Five questions that actually separate the tools

  • Does the test code survive outside the vendor? A file you can git-blame and run with someone else's tooling beats a row in a vendor database.
  • Are the run artifacts on your disk after a run? A video file, a screenshots directory, an event log on local disk beats a hosted dashboard with view-only access.
  • Is the engine an unmodified upstream Chromium, Firefox, or WebKit, or a custom vendor build? Upstream means your Playwright reproduction works the same way as their pipeline.
  • Where does the engine choice physically live? An array literal you can read beats a JSON config sent to someone else's server.
  • What happens when the same step runs on a second engine? Locator drift versus accessibility-tree re-resolution is the structural fork in the road.

Want help picking the right cross-browser shape for your stack?

Bring your current suite (or the absence of one) and we will walk through which of the three categories fits the team you actually have.

Frequently asked questions

What is a cross browser testing tool, in one sentence?

A tool that drives one or more real browser engines (Chromium, Firefox, WebKit, sometimes the Chromium-based Edge) so the same test plan can be run against each engine and engine-specific UI bugs surface before users hit them in production. The category splits cleanly into three: cloud grids that rent you parallel engines on remote machines (BrowserStack, Sauce Labs, LambdaTest), code libraries you program against locally (Playwright, Cypress, Selenium), and AI agent layers that drive one of those libraries from a Markdown plan and emit standard test code (Assrt is in this third group). Every one of them ultimately calls into the same three open-source engine projects, so the differentiator is what survives outside the vendor: the test code, the run artifacts, and the engine binary.

Where exactly in Assrt source is the engine choice made?

/Users/matthewdi/assrt-mcp/src/core/browser.ts, line 296. It is a TypeScript array literal passed to a stdio subprocess: const args = [cliPath, '--viewport-size', '1600x900', '--output-mode', 'file', '--output-dir', outputDir, '--caps', 'devtools']. The Playwright MCP CLI (the cliPath in that array) accepts a '--browser <name>' flag where name is one of chrome, firefox, webkit, msedge. Adding the two strings '--browser', 'firefox' to that array literal is the entire change required to make the same agent loop, the same plan format, and the same recording pipeline run on Gecko instead of Blink. There is no second config file, no per-engine adapter, no separate runner. You can verify this with `npx @playwright/mcp@latest --help` from inside /Users/matthewdi/assrt-mcp.

Is the Assrt CLI exposing --browser as a flag right now?

Not yet. The current CLI (src/cli.ts in @m13v/assrt) exposes --headed, --isolated, --keep-open, --extension, and --extension-token. It does not yet have a top-level --browser flag, so the launch defaults to whatever Playwright MCP defaults to, which is chrome. To run against Firefox or WebKit today you either edit the args array in browser.ts:296 yourself (the literal one-line change) or, if you are using Assrt via the MCP server interface from an agent, you can ask the agent to spawn a custom Playwright MCP instance with the engine you want. This is the boring honesty that a category-page audit demands: cross-engine support is structurally present, the first-class UX is not yet shipped.

Why split this category by 'where the engine choice lives' rather than by features?

Because every cross browser testing tool sits on top of the same three engine projects. Chromium is maintained upstream by Google, Firefox by Mozilla, WebKit by Apple. Speed, parallelism, screenshots, video, retry logic, and reporting are commodity features that every vendor in the space implements; their feature checklists are 80 percent overlapping. The non-overlapping part is structural: in a cloud grid, the engine binary is on someone else's machine and the configuration is a JSON object sent over HTTPS. In a code library, the engine binary is in your node_modules and the configuration is a function call. In an AI agent layer that wraps the library, the engine binary is still in your node_modules and the configuration is an array literal. Those are real, durable differences. The feature comparisons are not.

If my real bottleneck is flake on the second engine, does any cross browser tool actually help?

Most cross-browser flake on web apps is not engine-level rendering disagreement; it is locator drift. A CSS selector like '[data-testid=submit]' resolves to the same DOM node on Chromium and Firefox most of the time, but a Playwright locator like `page.getByRole('button', { name: /next/i })` can land on a different node when WebKit returns a different accessibility role default for an unlabeled icon button. Cloud grids do not help with this; they just give you the second engine to fail on. Code libraries make it your problem to write engine-defensive locators. The third category (AI agent layers that resolve the target per snapshot from the accessibility tree) shifts the responsibility from your test file to the engine itself: every engine answers 'which element matches this description right now' and you click that one. The trade is that every step costs a model call.

What is the cheapest path to actually testing my web app on all three engines this week?

If you already have a Playwright suite, add a `projects` array to playwright.config.ts with three entries: chromium, firefox, webkit, using the devices preset for each. Run `npx playwright test --project=firefox` and `--project=webkit` once and watch what breaks. Triage by class of failure: rendering disagreement (rare, needs a CSS fix), missing API (occasional, needs a polyfill or capability check), locator drift (common, needs a more robust query). If you do not have a suite yet, the cheaper path is to write 3-5 plain-Markdown scenarios in /tmp/assrt/scenario.md and run them with `npx @m13v/assrt run --url https://your-app.com --plan ...`. Both paths fit inside a single afternoon for a small app. The expensive path is to pay a SaaS cross-browser grid before you have any tests at all.
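The three-project config described above, as a minimal sketch of a playwright.config.ts; the device presets are the standard ones Playwright ships.

// playwright.config.ts -- the cheapest route to all three engines
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox',  use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit',   use: { ...devices['Desktop Safari'] } },
  ],
});

// Then: npx playwright test --project=firefox
//       npx playwright test --project=webkit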

Why is Assrt free when QA Wolf and Mabl charge thousands per month?

Because the production cost of the tool is the model API tokens, which the user pays for directly, plus a thin layer of Node.js code that runs locally on their machine and is MIT-licensed at github.com/assrt-ai/assrt-mcp. There is no fleet of cloud browser instances that someone has to keep warm, no parallel session pool that has to be capacity-planned, and no hosted dashboard whose uptime someone is paid to defend. The optional cloud at app.assrt.ai stores scenarios for shareable URLs, but the CLI runs the full pipeline against a local Playwright MCP subprocess; the canonical artifacts (events.json, screenshots/, video/recording.webm, player.html, generated Playwright files) all land on your filesystem. The product is a small wrapper around the open source primitives the user could assemble themselves; the value is that the assembly is already done and inspectable.

What survives if Assrt as a company goes away tomorrow?

The MIT-licensed source at github.com/assrt-ai/assrt-mcp keeps working with npm cached versions. The generated Playwright files in your repo keep running under `npx playwright test` because they are plain TypeScript Playwright code, not a proprietary DSL. The scenario.md files keep being readable Markdown that any future tool can re-parse. The events.json files keep being structured JSON. The video/recording.webm files keep playing in any browser. The only Assrt-specific thing that disappears is the cloud at app.assrt.ai (for shareable scenario URLs), and that surface is opt-in. Compare to a cloud-grid vendor: if BrowserStack closes, the test files you wrote for their SDK keep being SDK calls that no longer have a server to connect to, and your historical run records are stored on infrastructure you do not control.

What about mobile browser testing? Does this category cover real iPhones?

Not in the strict sense. Cross-browser usually means desktop Chromium, Firefox, and WebKit (which is the engine Safari uses on macOS and the only engine iOS Safari can use on iPhone, by App Store policy). Running tests against the actual Safari binary on actual iPhone hardware is a separate category called real-device cloud testing, dominated by BrowserStack App Live, Sauce Labs Real Device Cloud, and AWS Device Farm. Playwright's WebKit engine renders pages with the same WebKit version Safari ships, so most iOS Safari bugs do reproduce in headless WebKit; the exceptions tend to be touch gestures, viewport quirks, and biometric APIs. Assrt, like every Playwright-based tool, inherits this trade.
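If you want WebKit runs to look more like an iPhone without buying real-device time, Playwright's device presets emulate the viewport, user agent, and touch support. This is emulation on desktop WebKit, not the real Safari binary on real hardware; the preset name below is one of the presets Playwright ships.

// Added to the projects array in playwright.config.ts: emulated iPhone on
// Playwright's WebKit. Touch gestures, viewport quirks, and biometric APIs
// remain the gap that only a real-device cloud closes.
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'mobile-safari', use: { ...devices['iPhone 13'] } },
  ],
});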

If I read only one file in the Assrt source to understand the cross-browser story, which one?

/Users/matthewdi/assrt-mcp/src/core/agent.ts, the TOOLS array starting at line 16. There are 18 tools the agent can call: navigate, snapshot, click, type_text, select_option, scroll, press_key, wait, screenshot, evaluate, create_temp_email, wait_for_verification_code, check_email_inbox, assert, complete_scenario, suggest_improvement, http_request, wait_for_stable. None of them accept a field named selector, xpath, locator, or testid. The click tool takes 'element' (a human-readable description) and 'ref' (a live accessibility-tree ID like 'e5'). That ref is invalid the moment the next snapshot is taken; it is not a persisted target. Reading 200 lines tells you why the same plan format can run against any engine: there is no schema in which an engine-specific locator could be stored.
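A hedged TypeScript sketch of what that shape implies; this is an illustration of the schema as described above, not a copy of the actual agent.ts source.

// Illustrative shape only, inferred from the description above.
// The click tool's input carries a human description and a live
// accessibility-tree ref; there is no field where a CSS selector,
// XPath, or test ID could even be stored.
interface ClickToolInput {
  element: string; // e.g. "the Next button in the checkout form"
  ref: string;     // e.g. "e7" -- valid only until the next snapshot
}

// A persisted locator string has nowhere to live, which is the structural
// reason one plan file can run against any engine.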
