Tauri Playwright e2e tests: the dev-loop shortcut nobody writes about

Every guide you find on Tauri + Playwright tells you the same thing: fight WebDriver, add a plugin, bridge a socket to the system webview. That is the right answer for testing the built .app bundle. It is the wrong answer for the loop you live in every day, where you run cargo tauri dev and need to know the button still works before you push. For that loop, the dev server on port 1420 is a regular URL and you can test it with regular Chromium. This page shows you how.

Assrt team
Wraps @playwright/mcp 0.0.70 under the hood
Headless by default, video on demand
Plain Markdown #Case, zero vendor lock-in

The two layers of Tauri testing, and which one you are in

Before you pick a tool, be honest about what you are testing. There are two distinct layers and they need different answers. Most posts on the internet only answer the second one.

Layer 1: frontend during dev

The UI while cargo tauri dev is running. Vite is serving your React / SvelteKit / Leptos / Solid at http://localhost:1420. The Rust backend is attached, but the web layer is fully reachable from any browser on your machine. This is where 80% of your UI bugs live. This is also the loop you run 50 times a day.

Layer 2: the built bundle

The .app / .exe / .AppImage running inside its system webview. WKWebView on macOS, WebKitGTK on Linux, WebView2 on Windows. This is where tauri-driver, tauri-plugin-playwright, tauri-pilot, and tauri-remote-ui live. Use them for your release smoke tests, not your inner loop.

What Assrt covers

Layer 1 only, by design. You point Assrt at the dev URL, write #Case scenarios in plain Markdown, and get a real Playwright run with video. For Layer 2, keep your existing tauri-driver pipeline; the two compose.

Why this split matters

Running a Layer 2 test suite on every commit is slow and fragile. Running a Layer 1 suite on every save is cheap. The teams that ship stable Tauri apps run both, but on different cadences — fast Layer 1 on every pull request, slower Layer 2 nightly or on release branches.

The anchor fact: port 1420 is all you need

The create-tauri-app template has been pinning the Vite dev server to port 1420 since Tauri v1. The config lives in two files: the frontend framework's vite.config.ts (server: { port: 1420, strictPort: true }) and tauri.conf.json (build.devUrl: "http://localhost:1420"). The strictPort: true bit matters for test isolation: Vite refuses to fall back to 1421 if 1420 is occupied, so you never get silent port drift between the two terminals you have open.
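To make the two settings concrete, here is a small TypeScript sketch. The object shapes mirror the fragments quoted above. The helper name devUrlMatchesVitePort is hypothetical and is not part of Vite or Tauri; it only checks the invariant that strictPort protects, namely that build.devUrl and the Vite server agree on the port.

```typescript
// Shapes mirror the config fragments quoted above; nothing is read from disk.
interface ViteServerOptions {
  port: number;
  strictPort: boolean;
}

interface TauriBuildConfig {
  devUrl: string;
}

// vite.config.ts -> server
const viteServer: ViteServerOptions = { port: 1420, strictPort: true };

// tauri.conf.json -> build
const tauriBuild: TauriBuildConfig = { devUrl: "http://localhost:1420" };

// Hypothetical helper: true when devUrl names an explicit port equal to the Vite port.
function devUrlMatchesVitePort(
  server: ViteServerOptions,
  build: TauriBuildConfig,
): boolean {
  const match = /^https?:\/\/[^/:]+:(\d+)/.exec(build.devUrl);
  return match !== null && Number(match[1]) === server.port;
}
```

A check like this could run at the top of a test script, so a mismatched port fails fast instead of producing a confusing preflight timeout.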

Assrt accepts any URL via --url. It does not inspect the protocol, the port, or the framework. If the URL answers within 8 seconds, the run starts.

preflight check in src/core/agent.ts line 518
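A preflight of this kind can be sketched in a few lines of TypeScript. This is not Assrt's implementation, only an illustration of the rule "any answer within the deadline is good enough". The names answersWithin and Probe are hypothetical, and the probe is passed in so the sketch does not depend on a particular HTTP client.

```typescript
// Minimal response shape; a real fetch() response satisfies it.
type Probe = (url: string) => Promise<{ status: number }>;

// Hypothetical helper: resolves true if the probe answers before the deadline,
// false on a network error or a timeout. Any HTTP status counts as an answer.
async function answersWithin(
  url: string,
  probe: Probe,
  timeoutMs: number = 8000,
): Promise<boolean> {
  let cancel = (): void => {};
  const deadline = new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    cancel = () => clearTimeout(timer);
  });
  const attempt = probe(url).then(
    () => true, // some response arrived, even a 404 or a 500
    () => false, // connection refused, DNS failure, and so on
  );
  try {
    return await Promise.race([attempt, deadline]);
  } finally {
    cancel(); // do not leave the timer keeping the process alive
  }
}
```

On Node 18+ or in a browser, calling answersWithin("http://localhost:1420", (u) => fetch(u)) would probe the real dev server with the 8-second default.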

What actually happens when you run Assrt against a Tauri dev server

[Diagram: cargo tauri dev, your #Case.md, Claude Haiku, the Assrt runner, localhost:1420, window.__TAURI_INTERNALS__, video.webm]

Tauri v2 default dev port: 1420. Preflight timeout: 8 seconds. Plugins or Rust code required: 0.

The invoke shim is the whole trick

Your Tauri frontend calls Rust via invoke("cmd_name", args). Under the hood, that hits window.__TAURI_INTERNALS__.invoke. In a real Tauri webview, that global is injected by the Tauri runtime. In a regular Chromium session on port 1420 it is undefined, because the page is served by Vite rather than loaded inside a Tauri webview. Setting withGlobalTauri: true in tauri.conf.json does not change that: it only exposes the API as window.__TAURI__ inside the webview. So you stub the global yourself. Stubbing is also the better path for most tests: it gives you full control over Rust-side responses and keeps your tests fast, because you never wait for a real Rust round trip.

stub-invoke.js (executed inside the browser via browser_evaluate)
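A minimal sketch of such a stub, written here as typed TypeScript so the shape is explicit. The helper name installInvokeStub and the canned greeting are assumptions made for illustration; the only detail taken from Tauri is the global name __TAURI_INTERNALS__ and its invoke(cmd, args) method.

```typescript
type InvokeArgs = Record<string, unknown> | undefined;
type Handler = (args: InvokeArgs) => unknown;

// Hypothetical helper: installs a stubbed __TAURI_INTERNALS__.invoke on `target`.
// In a page the target would be `window`; a plain object works for trying it out.
function installInvokeStub(
  target: Record<string, unknown>,
  handlers: Record<string, Handler>,
): void {
  target["__TAURI_INTERNALS__"] = {
    invoke: async (cmd: string, args?: Record<string, unknown>): Promise<unknown> => {
      const handler = handlers[cmd];
      if (handler === undefined) {
        // Fail loudly so a missing stub shows up as a red test, not a silent pass.
        throw new Error(`unstubbed command: ${cmd}`);
      }
      return handler(args);
    },
  };
}

// Example: a `greet` command answered without any Rust round trip.
// The response text is made up for this sketch.
const fakeWindow: Record<string, unknown> = {};
installInvokeStub(fakeWindow, {
  greet: (args) => `Hello, ${String(args?.["name"])}!`,
});
```

Throwing on unknown commands is a deliberate choice: the first time the UI calls something you have not stubbed, the test goes red and names the command.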

You do not run that file from disk. You put its logic into a #Case step, Assrt's agent reads the step, and the agent executes the JavaScript via the browser_evaluate Playwright MCP tool (wrapped at src/core/browser.ts:665). Here is what a two-case plan looks like in practice:

tests/greet.md
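As an illustration only, a two-case plan for the greet flow might read like the text below. It is held in a TypeScript constant purely so the snippet is self-contained. The heading syntax (#Case N: title), the step wording, and the helper caseTitles are all assumptions, not taken from Assrt's documentation.

```typescript
// Illustrative plan text; heading syntax and step wording are assumptions.
const greetPlan: string = `#Case 1: Greet with a stubbed backend
1. Open the app at the dev URL.
2. Stub the greet command to return "Hello, Taylor!".
3. Type Taylor into the name field.
4. Click the Greet button.
5. Verify the page shows "Hello, Taylor!".

#Case 2: Greet a second name
1. Open the app at the dev URL.
2. Stub the greet command to return "Hello, Sam!".
3. Type Sam into the name field.
4. Click the Greet button.
5. Verify the page shows "Hello, Sam!".
`;

// Hypothetical helper: list the case titles, e.g. to sanity-check a plan before a run.
function caseTitles(plan: string): string[] {
  return plan
    .split("\n")
    .filter((line) => line.startsWith("#Case"))
    .map((line) => line.replace(/^#Case\s*\d*:?\s*/, ""));
}
```

Saved as tests/greet.md, the text inside the template literal is what you would pass via --plan-file. Each case follows the same pattern: set up state first, interact, then assert something visible.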

Plugin-based setup vs. the Assrt loop

The contrast here is not a fair fight. The listing below is how you would set up tauri-plugin-playwright to test the built bundle. The Assrt equivalent for the dev server is one CLI command and one Markdown file. They solve different problems, but the shape of each is worth staring at before you decide which you actually need for a given test.

What you actually write

// tauri-plugin-playwright route: 4 files, 2 package managers
// src-tauri/Cargo.toml
[dependencies]
tauri-plugin-playwright = "0.2"

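// src-tauri/src/main.rs
fn main() {
  tauri::Builder::default()
    .plugin(tauri_plugin_playwright::init())
    // generate_context!() reads tauri.conf.json at compile time
    .run(tauri::generate_context!())
    .expect("error while running tauri application");
}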

// playwright.config.ts
export default defineConfig({
  projects: [{ name: "tauri", use: {
    // custom launcher that talks to the socket bridge
    connectOverCDP: "ws://127.0.0.1:9222",
  }}],
});

// tests/app.spec.ts
import { test, expect } from "@playwright/test";
test("greet", async ({ page }) => {
  await page.goto("tauri://localhost");
  await page.getByRole("textbox").fill("Taylor");
  await page.getByRole("button", {
    name: /greet/i,
  }).click();
  await expect(page.getByText(/hello/i)).toBeVisible();
});

The four-step loop

1. Start your Tauri dev server

Run cargo tauri dev (or pnpm tauri dev) in one terminal. This brings up the Rust side plus the Vite server on port 1420. If you are only testing the UI, you can skip the Rust bring-up and just run pnpm dev — it's faster to boot and Assrt does not care.

2. Write a #Case in plain Markdown

Create tests/greet.md with one or more #Case blocks. Each case gets a heading and numbered steps. The first step usually sets up the state (navigate to a URL, stub invoke, seed localStorage). The rest of the steps interact with the UI. The last step asserts visible behavior.

3. Run Assrt against the dev URL

Run:

npx @assrt-ai/assrt run --url http://localhost:1420 --plan-file tests/greet.md --video

The --video flag is optional but worth it for your first few runs; it opens a player so you can see exactly what the agent clicked.

4. Commit the Markdown, not the video

The .md file lives in your repo like any other test. The /tmp/assrt/<runId>/ artifacts are per-run; they do not belong in source control. If you want to share a run with a reviewer, set ASSRT_AUTH_TOKEN and the CLI will print a shareable URL in the --json output.

How this stacks up against the usual Tauri e2e tools

Again — this is not a replacement for tauri-driver, tauri-plugin-playwright, or TestDriver.ai. It is a complement for the layer those tools do not focus on. The table below is specifically about the Layer 1 use case.

| Feature | tauri-driver + plugin | Assrt on :1420 |
| --- | --- | --- |
| Target | Built .app bundle, system webview | Vite dev URL (localhost:1420) |
| Setup | Rust plugin + Cargo dep + config file + spec file | One CLI command, one Markdown file |
| Authoring | TypeScript with page.getByRole(...) and .click() | Plain Markdown #Case, sentence per step |
| Rust backend | Always real (plugin ships with the build) | Stub invoke() inline via browser_evaluate, or run real |
| Video of every run | Add your own trace viewer setup | Yes, --video opens a player automatically |
| Speed per case | Tens of seconds to minutes; rebuild + launch | Seconds; no bundle rebuild |
| Best fit | Release smoke tests, OS-integration flows | Inner loop, PR checks, exploratory tests |

What the shim actually has to cover

The @tauri-apps/api surface you are most likely to hit in UI code is small. You rarely need to stub more than these calls for a dev-loop test:

MINIMAL INVOKE SHIM CHECKLIST

  • invoke(cmd, args) — the one you actually call from components
  • transformCallback(cb) — needed for event listeners; identity function is enough
  • __TAURI_INTERNALS__.plugins (if your UI reads version info)
  • Window/WebviewWindow close/show/hide (only if UI tests the window chrome, which at Layer 1 usually means: don't test it here)
  • convertFileSrc (only if you render local file:// URLs; otherwise skip)

A pragmatic rule

Start with invoke and transformCallback as the only two things in your shim. Add more stubs only when a test actually fails because something is missing. The shim should grow from red-test pressure, not from speculative completeness. Every stubbed function is a chance for the test to lie about real behavior, so keep the shim small on purpose.
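One way to let red tests drive the shim is to record every command the UI asked for that had no stub. The sketch below assumes that approach. The factory name createShim and the missing list are hypothetical, and transformCallback is the identity function that the checklist above suggests is enough.

```typescript
type Handler = (args?: Record<string, unknown>) => unknown;

interface Shim {
  invoke: (cmd: string, args?: Record<string, unknown>) => Promise<unknown>;
  transformCallback: <T>(cb: T) => T;
  missing: string[]; // commands the UI called that had no stub
}

// Hypothetical factory: only invoke and transformCallback, plus a log of gaps.
function createShim(handlers: Record<string, Handler>): Shim {
  const missing: string[] = [];
  return {
    missing,
    transformCallback: (cb) => cb, // identity, per the checklist above
    invoke: async (cmd, args) => {
      const handler = handlers[cmd];
      if (handler === undefined) {
        missing.push(cmd); // the list of what to stub next
        throw new Error(`unstubbed command: ${cmd}`);
      }
      return handler(args);
    },
  };
}
```

After a failing run, the missing list shows which commands to stub next, so the shim only grows when a test actually needed it.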

When to bail out of Layer 1 and reach for a webview driver

There are real limits to what you can verify at Layer 1. If your test requires any of the following, the dev-server approach is wrong and you should reach for tauri-driver or tauri-plugin-playwright instead:

  • Native file dialogs from dialog.open / dialog.save (Chromium will show its own dialog, not the OS dialog)
  • System tray behavior, menu bar items, global shortcuts
  • OS-specific webview quirks: WKWebView font rendering, WebView2 CSP edge cases, WebKitGTK input method handling
  • Updater flows, auto-launch at login, deep-link registration
  • Anything that cares about the actual window chrome (decorations, resize handles, full-screen affordances)

What gets saved where

Scenarios and runs live in predictable places on disk. This matters for CI caching, for .gitignore, and for debugging why a flaky test just went red. The layout is declared at src/core/scenario-files.ts:16-20:

  • /tmp/assrt/scenario.md — the plan text, your #Case blocks, plain Markdown
  • /tmp/assrt/scenario.json — UUID, URL, name, tags; stable scenarioId
  • /tmp/assrt/results/latest.json — most recent run summary; what the MCP tool returns
  • /tmp/assrt/<runId>/video.webm — recorded run; served via the local video player
  • /tmp/assrt/<runId>/events.json — every Playwright MCP tool call the agent made
  • ~/.assrt/browser-profile/ — persistent Chromium profile (cookies, localStorage survive across runs)
  • ~/.assrt/extension-token — saved Chrome extension bridge token; written once on first use
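If a CI script needs these locations, for example to upload a failing run's video, a small path helper keeps them in one place. The paths below are copied from the list above; the names runArtifacts and sharedFiles are hypothetical.

```typescript
// Paths copied from the list above; the helper names are hypothetical.
const ASSRT_TMP = "/tmp/assrt";

interface RunArtifacts {
  video: string;
  events: string;
}

// Per-run artifacts live under /tmp/assrt/<runId>/.
function runArtifacts(runId: string): RunArtifacts {
  const dir = `${ASSRT_TMP}/${runId}`;
  return { video: `${dir}/video.webm`, events: `${dir}/events.json` };
}

// Files shared across runs.
const sharedFiles = {
  plan: `${ASSRT_TMP}/scenario.md`,
  metadata: `${ASSRT_TMP}/scenario.json`,
  latestResult: `${ASSRT_TMP}/results/latest.json`,
};
```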

If your test relies on a logged-in state, the persistent profile is the thing. Log in once (with --headed so you can see the flow) and every subsequent assrt run reuses the cookie jar in ~/.assrt/browser-profile. For CI, pass --isolated instead so each job gets a clean in-memory profile.

Frequently asked questions

Why does this page say I can skip tauri-driver? Don't I need WebDriver for Tauri?

You need tauri-driver (or tauri-plugin-playwright, or a socket bridge) when you want to test the built .app bundle running in its system webview — WKWebView on macOS, WebKitGTK on Linux, WebView2 on Windows. That covers the last 20% of your testing: install flows, window chrome, native file dialogs, menu bars, tray icons, OS permissions. The first 80% of your UI bugs are not about any of that. They are forms not validating, buttons not invoking the right Rust command, routes not loading the right state, tables not sorting. For all of that, the frontend running under `cargo tauri dev` is served by Vite (or SvelteKit, or Next.js) on a regular HTTP port, and you can drive it with regular Chromium. Assrt does that by default. Nothing stops you from ALSO adding tauri-driver for your bundle-level smoke tests; they are different layers.

How does Assrt handle `invoke()` calls when the Rust backend is not attached?

Two options.

Option 1 (recommended for speed): shim `window.__TAURI_INTERNALS__.invoke` from inside a `#Case` step. Assrt's `browser_evaluate` tool (wrapped at src/core/browser.ts lines 665-670) lets the agent execute arbitrary JavaScript on the page, so a step like 'Stub the `get_user` command to return {id: 1, name: "Test"}' will have the agent define `window.__TAURI_INTERNALS__ = { invoke: async (cmd, args) => { if (cmd === "get_user") return {id: 1, name: "Test"}; throw new Error("unstubbed"); } }`.

Option 2 (slower but real): run `cargo tauri dev` so the Rust backend is attached, and point Assrt at the same dev URL. The caveat is that `window.__TAURI_INTERNALS__` is injected only inside the Tauri webview. Setting `withGlobalTauri: true` in tauri.conf.json exposes the API as `window.__TAURI__` in that webview; it does not inject anything into a separate Chromium, so real invoke calls only go through if something bridges the browser session to the running app. Most teams use Option 1 for the fast inner loop and Option 2 for a nightly run.

What port does `cargo tauri dev` actually run on?

Port 1420 by default. That number comes from the `create-tauri-app` template: `vite.config.ts` sets `server: { port: 1420, strictPort: true }` and `tauri.conf.json` sets `build.devUrl: "http://localhost:1420"`. `strictPort: true` matters: Vite will refuse to fall back to 1421 if 1420 is occupied, which prevents silent port drift when you run `cargo tauri dev` and `assrt run` in two terminals. If you changed the template port, Assrt does not care; pass whatever URL you configured via `--url http://localhost:<yourport>`. The only constraint is that the URL must return some HTTP response (2xx, 3xx, 4xx, or 5xx all count) within the 8-second preflight timeout, per the check at src/core/agent.ts line 518.

Where does the test video go, and how do I share it with a reviewer?

Run with `--video` and Assrt starts a local video server on a random 127.0.0.1 port, records the run as a .webm in `/tmp/assrt/<runId>/`, and opens a player in your browser at `http://127.0.0.1:<port>/player.html?dir=<encoded-path>`. The player supports seeking via HTTP Range requests. For sharing beyond your machine, Assrt syncs scenarios and runs to app.assrt.ai (set `ASSRT_AUTH_TOKEN` first) and returns a shareable URL in the JSON output when you pass `--json`. The file watcher at src/core/scenario-files.ts line 90 syncs edits to Firestore with a 1-second debounce, so edits you make to /tmp/assrt/scenario.md show up on the shared URL automatically.

Can I run this in GitHub Actions or another CI runner?

Yes. In CI you run `cargo tauri dev` in one background step (or, more efficiently, just run your frontend dev server directly — `pnpm dev` — since you are only testing the web layer) and then `npx @assrt-ai/assrt run --url http://localhost:1420 --plan-file tests.md --json` in the next step. The `--json` flag emits a structured report to stdout; the non-zero exit code on failure flips the job red. Set `ANTHROPIC_API_KEY` in repo secrets so the Haiku-driven agent can run. One CI gotcha: the browser runs headless by default (line 349 of browser.ts splices `--headless` into the args when `headed` is false), so you do not need a display server like Xvfb. Another gotcha: skip `--extension` in CI. Extension mode expects a running Chrome with the Playwright MCP bridge installed, which is not a thing on a fresh runner.

What LLM runs the test steps, and does that mean my tests are non-deterministic?

By default `claude-haiku-4-5-20251001`, defined at src/core/agent.ts line 9. You can override with `--model`. Non-determinism is real but bounded: the model decides which UI element matches a step like 'click Sign In', but the click itself is a deterministic Playwright action against a concrete DOM node. In practice, flakiness shows up when your `#Case` is vague ('fill out the form') rather than specific ('type test@example.com into the email field, then type password123 into the password field, then click Submit'). If you want determinism on a specific selector, you can write it into the step: 'click the button with data-testid="invoke-greet"'. The agent will lock onto that selector. This is why the `#Case` format encourages sentence-per-step granularity: it controls the degree of freedom the model has.

How is this different from just writing `@playwright/test` specs by hand?

It is not, in one sense — Assrt wraps the same `@playwright/mcp` (pinned at 0.0.70 in package.json) that Microsoft maintains, and every browser action is real Playwright under the hood. The difference is the authoring surface. `@playwright/test` makes you write `page.getByRole('button', { name: /sign in/i }).click()` and then maintain those selectors as the UI shifts. `#Case` lets you write 'click Sign In', keep the test in a Markdown file, and get a video of every run. For a Tauri dev loop where the UI is changing three times an hour, that tradeoff is usually the right one. For a stable flagship flow you run 10,000 times in CI, hand-written Playwright is still probably cheaper per run.

What does the Assrt scenario file actually look like on disk after I run a test?

Three files in /tmp/assrt/: `scenario.md` holds the plan text (your `#Case` blocks), `scenario.json` holds the metadata (a UUID, the URL, the name, the tags), and `results/latest.json` holds the most recent run's summary. Per-run artifacts go into /tmp/assrt/<runId>/: a video.webm, screenshots, and an events.json with every MCP tool call made by the agent. This layout is declared at src/core/scenario-files.ts lines 16-20. You can commit scenario.md into your repo alongside your code; it is plain Markdown, so it diffs cleanly in pull requests. The scenarioId from scenario.json lets you re-run the same plan later via `assrt_test` with `{scenarioId: "<uuid>"}` instead of pasting the plan text again.
