Smoke testing deep dive

Smoke tests for critical paths: how to know the path actually finished

Every smoke-testing guide gives you the same ten-flow checklist: login, checkout, signup, search, logout. That part is easy. The hard part, the part that makes smoke suites rot, is knowing when a critical path has actually finished so the assertion does not fire into a half-rendered page. This guide covers the MutationObserver technique Assrt uses, the exact open-source code that runs it, and why it beats page.waitForSelector and networkidle on async-heavy flows.

Matthew Diakonov, Written with AI

Published April 18, 2026 · 9 min read

Sourced from @m13v/assrt agent.ts:941-994 (wait_for_stable implementation)

Works with streaming LLM responses, SSE feeds, and skeleton-to-content transitions

Open-source and self-hosted. No $7.5K per month cloud. The test artifact is yours to keep.

Smoke tests that know when the path finished

DOM-mutation stability, not a fixed sleep

MutationObserver on document.body

Poll every 500ms until the DOM goes quiet

Default 2s of stability, 30s max timeout

Works with streaming, skeletons, SSE

Real Playwright code out, no YAML


“Page stabilized after 3.4s (47 total mutations)”

@m13v/assrt run log, checkout happy path

That is the line Assrt writes after it has watched your critical path finish. It means the MutationObserver saw 47 DOM mutations, then nothing for 2 consecutive seconds, and exited after 3.4 seconds total. The next assertion fires against a settled page. That is the difference between a green smoke run and a flaky one.

The ten critical paths you actually need

Pick ten. Not eleven. If one item on this list does not apply to your product, swap in the next most painful flow you maintain. The instinct to add more is the same instinct that got your suite to 600 flaky specs.

Your smoke suite, in priority order

Login plus password reset. If this breaks, the rest is unreachable.
The single CRUD flow that earns revenue. Create, edit, delete, re-read.
Checkout or payment, including the card-declined branch, not just the happy path.
Signup, because churn at signup is invisible without a smoke test.
Role or plan switching, if you have tiers or multi-user teams.
The one admin action you always triple-check before running in production.
A report or export, because it touches real data at scale and fails quietly.
The primary inbound webhook, because a silent failure means lost money.
A search or filter page, because these degrade as data grows.
Logout that actually clears session state, because auth bugs hide here.

What almost every smoke-test guide skips

Read the top ten Google results for this keyword and you will see the same advice: keep it fast, run on every build, cover login and checkout, bake it into CI. All true, none of it wrong. But none of those posts answer the question that determines whether your smoke suite is still green in three months:

How does the test know the critical path is finished so the next assertion is fair?

Three answers are common. Two of them are the reason your suite flakes.

Sleep 2 seconds. Fine on fast runs, wrong on slow ones. The first CI agent to hit a cold lambda fails. You bump it to 5. Now your 10-test suite is at least a 50-second suite, and a real 6-second stall still flakes.
Wait for a selector. Better, but only tells you one element appeared. A checkout confirmation card can mount while the account balance is still reloading. Assertions on the balance fire mid-update.
Wait for the DOM to stop changing. This is what Assrt does. It adapts to your app's real speed, handles streaming and skeletons, and survives a UI redesign because it is not coupled to any selector.

The exact code that runs `wait_for_stable`

The whole mechanism is small. No ML model, no proprietary wait heuristic: a MutationObserver on document.body, a counter, a 500ms poll loop, and a stability clock.

Source file: @m13v/assrt/src/core/agent.ts

Defaults: stable_seconds = 2, timeout_seconds = 30. Caps: stable_seconds maxes at 10, timeout_seconds maxes at 60. For AI-response flows that stream tokens for longer, the agent will request higher values per step. It also cleans up after itself, deleting window.__assrt_observer and window.__assrt_mutations so nothing leaks into the next step.
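
For orientation, the loop described above can be sketched roughly as follows. This is a reconstruction from the description on this page, not the verbatim contents of agent.ts. The names PageLike, StabilityState, observe, isStable and waitForStable are hypothetical, and so is the wording of the timeout message. It assumes a page object with a Playwright-style evaluate method.

```typescript
// Stand-in for the evaluate() method on a Playwright Page, so the sketch
// type-checks without Playwright installed. (Hypothetical name.)
interface PageLike {
  evaluate<R>(fn: () => R): Promise<R>;
}

interface StabilityState {
  lastCount: number;    // mutation count seen on the previous poll
  lastChangeAt: number; // clock reading (ms) when the count last moved
}

// Fold one poll reading into the state: a changed count restarts the clock.
function observe(state: StabilityState, count: number, nowMs: number): StabilityState {
  return count === state.lastCount ? state : { lastCount: count, lastChangeAt: nowMs };
}

// The page counts as stable once the count has been flat for stableMs.
function isStable(state: StabilityState, nowMs: number, stableMs: number): boolean {
  return nowMs - state.lastChangeAt >= stableMs;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForStable(page: PageLike, stableSeconds = 2, timeoutSeconds = 30): Promise<string> {
  // Browser side: count every mutation under document.body.
  await page.evaluate(() => {
    const w = globalThis as any; // this is `window` inside the page
    w.__assrt_mutations = 0;
    w.__assrt_observer = new w.MutationObserver((records: unknown[]) => {
      w.__assrt_mutations += records.length;
    });
    w.__assrt_observer.observe(w.document.body, {
      childList: true,
      subtree: true,
      characterData: true,
    });
  });

  const start = Date.now();
  let state: StabilityState = { lastCount: 0, lastChangeAt: start };
  let stable = false;
  try {
    while (Date.now() - start < timeoutSeconds * 1000) {
      await sleep(500); // poll interval
      const count = await page.evaluate(() => (globalThis as any).__assrt_mutations as number);
      const now = Date.now();
      state = observe(state, count, now);
      if (isStable(state, now, stableSeconds * 1000)) {
        stable = true;
        break;
      }
    }
  } finally {
    // Clean up so nothing leaks into the next step.
    await page.evaluate(() => {
      const w = globalThis as any;
      if (w.__assrt_observer) w.__assrt_observer.disconnect();
      delete w.__assrt_observer;
      delete w.__assrt_mutations;
    });
  }

  const elapsed = ((Date.now() - start) / 1000).toFixed(1);
  return stable
    ? `Page stabilized after ${elapsed}s (${state.lastCount} total mutations)`
    : `Timed out after ${elapsed}s (${state.lastCount} mutations so far)`; // hypothetical wording
}
```

The decision logic lives in observe and isStable, which are pure and can be unit-tested without a browser. One limitation of this sketch: if the step triggers a full navigation, the injected globals disappear with the old document, and that case is not handled here.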

DOM-mutation stability versus the two other strategies

The middle column is how most smoke suites wait today. The right column is what you get from wait_for_stable.

Feature	Fixed sleep / waitForSelector / networkidle	DOM-mutation stability (Assrt)
Detects when async streaming finished	No, fires as soon as it reaches the timeout	Yes, stability timer resets on every DOM mutation
Handles skeleton-to-content transition	Only if you know the final selector to wait on	Yes, mutation count keeps incrementing until skeleton unmounts
Survives a UI redesign without editing the test	No, a new selector path means a new test file	Yes, the observer watches the whole body, not a selector
Works with streaming LLM / SSE responses	Usually times out or fires early	Tokens mutate text nodes, timer resets, fires when the stream stops
Faster on fast runs, slower on slow runs (adaptive)	No, fixed by the timeout value you hard-coded	Yes, exits the moment the DOM goes quiet
Does not hang on long-lived WebSocket / analytics pings	networkidle may never settle, so the wait runs until it times out	Unaffected, only the DOM matters
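
To make the comparison concrete, here is a small virtual-time model. It is a sketch with made-up numbers, not a measurement. The timeline imitates a checkout where a spinner mounts, a confirmation card appears at 1.8s, and a balance refresh lands later. Timeline, StableOptions, stabilityFiresAt and assertionTimes are hypothetical names, and the polling model assumes the mutation count starts at 0 when the observer is attached.

```typescript
interface Timeline {
  mutationTimesMs: number[]; // when each DOM mutation lands, ascending
  selectorAppearsMs: number; // when the element a selector wait targets mounts
}

interface StableOptions {
  pollMs: number;
  stableMs: number;
  timeoutMs: number;
}

// DOM stability: sample the mutation count every pollMs and exit once the
// count has not changed for stableMs, or when the timeout is reached.
function stabilityFiresAt(t: Timeline, o: StableOptions): number {
  let lastCount = 0;
  let lastChangeAt = 0;
  for (let now = o.pollMs; now <= o.timeoutMs; now += o.pollMs) {
    const count = t.mutationTimesMs.filter((ms) => ms <= now).length;
    if (count !== lastCount) {
      lastCount = count;
      lastChangeAt = now; // any change restarts the stability clock
    } else if (now - lastChangeAt >= o.stableMs) {
      return now;
    }
  }
  return o.timeoutMs;
}

// When would the next assertion fire under each strategy?
function assertionTimes(t: Timeline, sleepMs: number, o: StableOptions) {
  return {
    settledAt: Math.max(...t.mutationTimesMs), // last real DOM change
    fixedSleep: sleepMs,                       // fires regardless of page state
    selectorWait: t.selectorAppearsMs,         // fires when one element mounts
    domStability: stabilityFiresAt(t, o),
  };
}

// Spinner at 100ms, confirmation card at 1800ms, balance refresh at 2600ms.
const checkout: Timeline = {
  mutationTimesMs: [100, 1800, 1850, 2600],
  selectorAppearsMs: 1800,
};
const defaults: StableOptions = { pollMs: 500, stableMs: 2000, timeoutMs: 30000 };

const times = assertionTimes(checkout, 2000, defaults);
// times.settledAt    -> 2600
// times.fixedSleep   -> 2000 (fires before the balance refresh)
// times.selectorWait -> 1800 (fires before the balance refresh)
// times.domStability -> 5000 (fires after the page has settled)
```

In this model the sleep and the selector wait both fire before the last mutation at 2600ms, while the stability wait pays roughly the 2 seconds of quiet plus polling granularity and fires after it.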

Write a critical-path smoke test in plain Markdown

A Case block per path. The agent handles selectors and timing. Nowhere in this file do you write [data-testid="pay-button"] or a sleep. The agent picks up the ref from the live accessibility tree on each run.

smoke-checkout.scenario.md

What happens end to end on one run

Case to agent to MCP tool call to Playwright to real .spec.ts. Five steps, no DSL in the middle, nothing Assrt-specific in the output file.

How a smoke test for a critical path flows through Assrt

The run, step by step

Write the flow as a #Case in Markdown

One markdown block per critical path. Natural English. No YAML, no selectors. The agent reads the accessibility tree at runtime, so you never commit a CSS path that will break in six weeks.

Agent drives @playwright/mcp

Assrt spawns the official @playwright/mcp server as a subprocess and calls its tools. Clicks, fills, navigations all resolve from the live a11y snapshot. Each tool call emits the corresponding Playwright line into the run trace.

wait_for_stable catches the async tail

After any state-changing click, the agent calls wait_for_stable. A MutationObserver is injected, the poll loop runs every 500ms, and the wait exits after 2s of quiet or at the 30s timeout. The next step fires once the critical path has actually rendered, not when the first selector appears.

Assertions fire against a settled DOM

The next assert_visible or assert_text runs against a page that has stopped changing. No flakes from half-rendered skeletons. No false greens where a loading spinner matched your assertion text because the real content had not mounted yet.

Real Playwright code lands in the trace

Every step writes a Playwright line to /tmp/assrt/results/latest.json. Copy those lines into tests/smoke.spec.ts and the test runs under vanilla Playwright. Self-hosted, open-source, zero lock-in. If you uninstall Assrt tomorrow, the suite keeps working.

One real run, one real log

Here is the checkout smoke test from above, executed. Note the Page stabilized after 3.4s (47 total mutations) line. That is the moment the DOM went quiet and the assertion became safe to fire.

assrt-smoke-run.log

By the numbers

2 default seconds of DOM quiet before exit

30 default seconds of max wait per step

500 millisecond poll interval

0 proprietary YAML in the output .spec.ts

10 critical paths is enough for most products. Ten, not fifty.

60 seconds is the hard cap per step. Covers streaming AI responses without stalling CI.

0 dollars per month. Open-source, self-hosted, MIT on GitHub.

Run your first ten-path smoke suite today

npx @m13v/assrt test https://your-app.com --file smoke.scenario.md. The agent handles selectors, wait_for_stable handles the async tail, and the artifact in /tmp/assrt/results/latest.json is a real Playwright .spec.ts you own forever.

APIs and tools this page touches

MutationObserver, document.body, childList + subtree + characterData, page.waitForSelector(), page.waitForLoadState('networkidle'), @playwright/mcp, browser_snapshot, browser_click, wait_for_stable, assert_visible, @m13v/assrt, accessibility tree refs

Questions developers actually ask about smoke-testing critical paths

What is a smoke test for a critical path, in plain terms?

A smoke test is a yes-or-no check that a critical path still works end to end in a real browser. Critical paths are the flows where failure effectively takes the product down: login, the primary CRUD flow that earns revenue, checkout or payment (including the card-declined branch), signup, role-switching, the most dangerous admin action, a report or export, the main inbound webhook, a search or filter, and a logout that clears session. Ten is usually enough. The goal is fast, deterministic feedback on every build and every 30 minutes against production, not full coverage.

Why do smoke tests flake on critical paths even when the flow works?

Because the assertion fires before the critical path has actually finished. Checkout shows a spinner for 1.8s, then a confirmation card, then an async balance refresh. A test that sleeps 2s passes sometimes and fails sometimes. A test that waits for the confirmation text passes, but the balance assertion on the next line fires mid-update. The technique that fixes this is not a longer sleep. It is waiting for the DOM itself to stop changing, which is what Assrt does via a MutationObserver injected into the page.

How does Assrt's wait_for_stable actually work?

The agent injects a MutationObserver onto document.body with childList, subtree, and characterData enabled, stored on window.__assrt_observer. Every 500ms it reads window.__assrt_mutations. When the count has been flat for stable_seconds (default 2, max 10), it breaks. It times out at timeout_seconds (default 30, max 60). On exit it disconnects the observer and returns a line like 'Page stabilized after 3.4s (47 total mutations)'. Source: @m13v/assrt, src/core/agent.ts:941-994.

Why is this better than page.waitForSelector or a fixed timeout?

Fixed timeouts trade flakiness for speed: too short and the test flakes, too long and your suite is slow. waitForSelector is better, but it only tells you one element showed up, not that the page is done. A checkout confirmation card can appear while the balance is still reloading. MutationObserver stability waits for the whole critical path to finish rendering. Tests pass faster on fast runs and wait longer on slow runs. They also catch the case where the page mounted a shell and then stalled: the DOM goes quiet, the wait exits, and the next assertion fails against the empty shell.

Does this generate real Playwright code I can keep, or a proprietary format?

Real Playwright. Assrt drives the official @playwright/mcp server. Each action the agent takes (click, fill, navigate, wait_for_stable) gets emitted as a TypeScript Playwright line into the run trace. The artifact in /tmp/assrt/results/latest.json is copy-pasteable into a *.spec.ts file. You own it. If you stop using Assrt, your smoke suite still runs under vanilla Playwright. There is no YAML layer, no vendor DSL, no $7.5K/month cloud bill.

How do I actually run smoke tests for my 10 critical paths?

npx @m13v/assrt test https://your-app.com --case 'log in, see the dashboard, log out'. Write each critical path as a short markdown #Case block (one per flow). Run the whole suite on every PR against a preview deploy (block merge on failure) and every 30 minutes against production as a synthetic (page on failure). The agent figures out selectors from the accessibility tree on each run, so a UI redesign does not break the tests.

What if my critical path depends on an LLM response or a webhook that takes 10+ seconds?

Set stable_seconds higher. wait_for_stable accepts up to 10 seconds of quiet and 60 seconds of total wait. For AI chat responses, 5 seconds of stability is usually right: tokens stream in with DOM mutations, then stop cleanly. For webhook-dependent confirmation screens, you may want to combine wait_for_stable with an explicit assertion on the confirmation text or an API poll. The mutation approach handles streaming, progressive rendering, and skeleton-to-content transitions automatically.
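
The defaults and caps quoted on this page can be expressed as a small clamp. This is a sketch only: WaitOptions and resolveWaitOptions are hypothetical names, and this page does not say how Assrt treats zero or negative values, so the version below applies the upper caps and nothing else.

```typescript
interface WaitOptions {
  stableSeconds: number;  // seconds of DOM quiet required before exit
  timeoutSeconds: number; // maximum total wait for the step
}

// Values as stated on this page: defaults 2 / 30, caps 10 / 60.
const DEFAULTS: WaitOptions = { stableSeconds: 2, timeoutSeconds: 30 };
const CAPS: WaitOptions = { stableSeconds: 10, timeoutSeconds: 60 };

// Fill in missing values from the defaults, then apply the upper caps.
function resolveWaitOptions(requested: Partial<WaitOptions> = {}): WaitOptions {
  return {
    stableSeconds: Math.min(requested.stableSeconds ?? DEFAULTS.stableSeconds, CAPS.stableSeconds),
    timeoutSeconds: Math.min(requested.timeoutSeconds ?? DEFAULTS.timeoutSeconds, CAPS.timeoutSeconds),
  };
}

// A streaming chat step asks for more quiet time and keeps the default budget.
const chatStep = resolveWaitOptions({ stableSeconds: 5 });
// chatStep -> { stableSeconds: 5, timeoutSeconds: 30 }

// Requests above the caps are pulled back to 10 and 60.
const greedyStep = resolveWaitOptions({ stableSeconds: 25, timeoutSeconds: 120 });
// greedyStep -> { stableSeconds: 10, timeoutSeconds: 60 }
```

Because the caps bound every step at 60 seconds, a flow that needs longer than that is a candidate for the explicit confirmation-text assertion or API poll mentioned above.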

How does this interact with service workers, websockets, or streaming responses?

Service workers do not mutate the document body, so their background work is invisible to the observer, which is correct for UI smoke tests. Open websockets that push messages rendered into the DOM do register as mutations, so an SSE or WS-driven feed will keep resetting the stability timer until the feed quiets down. Background fetches that never touch the DOM do not affect it either. If you need to assert something that never renders (a background task success), use wait_for_stable for the visible flow, then a direct API check after.

Is this different from Playwright's waitForLoadState('networkidle')?

Yes. networkidle waits until there have been no network connections for at least 500ms, and Playwright's own docs now discourage it for testing. It can also stall until timeout on apps with long-lived websockets, analytics pings, or poll loops. DOM-mutation stability is network-agnostic: it watches what the user actually sees. A page can be networkidle with a half-rendered skeleton, or network-chatty with a complete UI that has stopped changing. For smoke tests of critical paths, DOM stability is the signal you want.