Playwright headless vs headed rendering: what actually differs in 2026

Most pages on this topic were written between 2019 and 2022, when headless Chromium ran on a separate code path with no compositor and paint output that genuinely diverged from headed. That gap closed. Since Playwright 1.42 in February 2024, the default headless mode is Chromium's --headless=new, which uses the same renderer, compositor, and GPU pipeline as a headed window. Most of what is written about the rendering difference is now outdated. The choice has not disappeared; it has narrowed.

Direct answer (verified 2026-05-13)

Headless runs Chromium with no visible window; headed shows a real Chromium window. Since Playwright 1.42 (February 2024), headless mode uses Chromium's --headless=new flag, so the two modes share the same renderer and the rendering pipeline is functionally identical for most tests. The remaining differences are debugger visibility, video recording cost, and four specific test categories: visual regression, video element behavior, canvas or WebGL, and GPU-accelerated animations. Authoritative reference: playwright.dev launch options and the Chromium new headless mode announcement.

Matthew Diakonov, Written with AI

Published May 13, 2026. 8 min read.


Playwright 1.42 (Feb 2024) made --headless=new the default for Chromium; the renderer is shared with headed

Assrt's agent reads the accessibility tree before every action, so paint differences do not affect element resolution

Four specific categories still show real headless/headed differences: visual regression, video, canvas/WebGL, animation

What changed when Playwright switched to `--headless=new`

The old headless mode (now reachable as --headless=old) was a separate Chromium binary path: no compositor, no GPU process, a stripped-down paint stack designed for server use cases where nobody was going to look at the output. That mode rendered text with a different font fallback chain, declined some media APIs entirely, and used software-only canvas. If your test exercised any of those, the headless run could diverge from the headed run for real reasons, not just timing.

The new headless mode is the same Chromium binary booted without a visible window. The compositor still runs. The GPU process still runs (on hardware that has a GPU). The paint output is the same code path that produces what you see in headed. Playwright started passing --headless=new by default in 1.42 (February 2024), and Chromium itself has been slowly removing the old code path since. The practical effect is that most tests that used to flake on the headless versus headed axis no longer do, because the renderer they targeted has been unified.

Most of the rendering-difference guides on the internet predate this change. They will tell you that headless paints text differently (true in 2020, mostly not in 2026), that headless ignores GPU acceleration (true in 2020, not in 2026 on a runner with drivers), and that headless and headed produce different screenshot bytes (true in 2020, hit-and-miss in 2026 depending on the four categories below). Treat the older advice as historical, not current.

The four cases where a real difference still shows up

With the renderer unified, the surface area of legitimate headless versus headed disagreement collapses to four categories. If your test does not exercise any of them, you should be running headless by default and treating headed as a debugger.

Visual regression. If your baseline image was captured headed and the runner re-shoots headless on a different OS with different system fonts, you can get false diffs from font fallback, anti-aliasing, and subpixel positioning. The right fix is to bake the runner image (a single Docker image with one font set, one rendering config) and capture the baseline against that same image. Not headless versus headed; runner versus runner.
Video element behavior. Autoplay policy, picture-in-picture, media session APIs, and EME (encrypted media) can behave differently in headless because no real audio output device is attached. If your test asserts on a video that plays, pauses, and mutes itself, run it headed or pass the relevant --autoplay-policy=no-user-gesture-required flag.
Canvas and WebGL. On a CI runner without GPU drivers, Chromium falls back to SwiftShader, Google's software GL implementation. SwiftShader is correct but byte-different from hardware-accelerated rendering. If you are running pixel comparisons against canvas or WebGL output, expect drift across runner generations even within headless.
GPU-accelerated animations. CSS transforms, opacity transitions, and filters that the compositor can run on the GPU will flush frames at different times depending on whether a real GPU is present. A test that asserts on an in-flight transition (rare, but it happens) can see a different intermediate state in headless on CPU compositing than in headed on GPU.
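
For the pixel-comparison cases above, it can help to measure how far two captures drift instead of asserting byte equality. The sketch below is one way to do that, not something taken from Assrt or Playwright. It assumes both screenshots have already been decoded to raw RGBA buffers of the same size (the decoding step is not shown), and the function name and tolerance parameter are hypothetical.

```typescript
// Fraction of pixels whose RGBA channels differ by more than `tolerance`.
// Assumption: both buffers are raw RGBA (4 bytes per pixel) with identical dimensions.
function pixelDiffRatio(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be same-size RGBA data");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;
  let differing = 0;
  for (let i = 0; i < a.length; i += 4) {
    for (let ch = 0; ch < 4; ch++) {
      if (Math.abs(a[i + ch] - b[i + ch]) > tolerance) {
        differing++;
        break; // count each pixel at most once
      }
    }
  }
  return differing / pixels;
}

// Two-pixel example: the red channel of the first pixel is off by one.
const hardware = new Uint8Array([255, 0, 0, 255, 0, 255, 0, 255]);
const software = new Uint8Array([254, 0, 0, 255, 0, 255, 0, 255]);
console.log(pixelDiffRatio(hardware, software));    // 0.5
console.log(pixelDiffRatio(hardware, software, 1)); // 0
```

A ratio that stays near zero across runner generations suggests the drift is rasterizer noise; a ratio that jumps usually points at fonts or the runner image.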


“The agent picks elements off the accessibility tree, not off pixels. The rendering pipeline is invisible to the action layer.”

The Assrt design note that motivated headless as the default

Why an AI agent does not care which mode you picked

Most of the headless versus headed discourse assumes the test author is writing CSS selectors against the DOM and asserting on rendered state. In that world, paint matters: if the locator falls back to a layout-based heuristic, or the assertion compares a rendered region to a baseline, the renderer is part of the test. In an accessibility-tree-driven pipeline, the locator step never reads pixels. The agent calls a snapshot primitive, gets back a tree of interactive nodes annotated with refs, and picks one by label. The browser's rendering pipeline never enters the loop.
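
To make the label-resolution idea concrete, here is a minimal sketch of picking a node by role and accessible name. The SnapshotNode shape, the ref values, and the findRef helper are all assumptions made for illustration; the real snapshot format returned by @playwright/mcp may look different.

```typescript
// Hypothetical shape of one node in an accessibility snapshot.
interface SnapshotNode {
  ref: string;   // per-snapshot handle for the node
  role: string;  // ARIA role, for example "button"
  name: string;  // accessible name
  children?: SnapshotNode[];
}

// Depth-first search for the first node matching role and accessible name.
// Nothing here reads pixels or layout, so paint differences cannot change the result.
function findRef(root: SnapshotNode, role: string, name: string): string | undefined {
  const wanted = name.trim().toLowerCase();
  if (root.role === role && root.name.trim().toLowerCase() === wanted) {
    return root.ref;
  }
  for (const child of root.children ?? []) {
    const hit = findRef(child, role, name);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

const snapshot: SnapshotNode = {
  ref: "e1",
  role: "main",
  name: "",
  children: [
    { ref: "e2", role: "textbox", name: "Email" },
    { ref: "e3", role: "button", name: "Sign in" },
  ],
};

console.log(findRef(snapshot, "button", "sign in")); // "e3"
```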

The diagram below is the loop Assrt runs for one click step, traced through the actual code paths. The four boxes are the same whether the underlying Chromium was started with --headless or without it. The only line of code in the entire stack that branches on the mode is browser.ts line 366, which decides whether to insert --headless into the argv passed to @playwright/mcp. Nothing downstream knows or cares.

[Diagram: one click step, mode-agnostic]

Anchor fact: the exact line that flips the mode in Assrt

A common reaction to claims like "the mode does not matter" is to assume marketing hand-waving. Here is the part that is not. Assrt spawns @playwright/mcp as a child process over stdio. The argv it assembles starts at browser.ts line 296: [cliPath, "--viewport-size", "1600x900", "--output-mode", "file", "--output-dir", outputDir, "--caps", "devtools"]. Headless versus headed is decided seventy lines later, at line 366:

// /Users/matthewdi/assrt-mcp/src/core/browser.ts:363-368
// --headless conflicts with --cdp-endpoint (and --extension), so only apply
// it in the local-launch path.
if (!extension && !cdpEndpoint) {
  if (!headed) args.splice(1, 0, "--headless");
  console.error(`[browser] launch mode: ${headed ? "headed" : "headless"}`);
}

That is the entire mode decision. One conditional, one argv splice, one log line. Everything else in the agent loop (the snapshot primitive at agent.ts line 956, the action primitives, the video recording pipeline, the per-step screenshot capture) calls into the same @playwright/mcp interface in both modes. The CLI surface is equally minimal: --headed on the command line, or ASSRT_HEADED=1 in the env, or headed: true on the MCP call. Default is headless.
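
Put together, the argv assembly can be sketched as a pure function. This is a reconstruction from the two excerpts quoted above, not the actual browser.ts code: the LaunchChoice type and buildMcpArgs name are hypothetical, and how the real code passes extension or CDP settings is not shown in the excerpts, so the sketch leaves that out.

```typescript
// Hypothetical options type; field names mirror the variables visible in the excerpt.
interface LaunchChoice {
  cliPath: string;
  outputDir: string;
  headed: boolean;
  extension?: boolean;
  cdpEndpoint?: string;
}

function buildMcpArgs(c: LaunchChoice): string[] {
  // Base argv as quoted from browser.ts line 296.
  const args = [
    c.cliPath,
    "--viewport-size", "1600x900",
    "--output-mode", "file",
    "--output-dir", c.outputDir,
    "--caps", "devtools",
  ];
  // --headless is only applied on the local-launch path.
  if (!c.extension && !c.cdpEndpoint) {
    if (!c.headed) args.splice(1, 0, "--headless");
  }
  return args;
}

const headlessArgs = buildMcpArgs({ cliPath: "cli.js", outputDir: "/tmp/out", headed: false });
const headedArgs = buildMcpArgs({ cliPath: "cli.js", outputDir: "/tmp/out", headed: true });
console.log(headlessArgs.slice(0, 2));          // [ "cli.js", "--headless" ]
console.log(headedArgs.includes("--headless")); // false
```

The two arrays differ by exactly one element, which is the whole point: everything after the flag is identical in both modes.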

The reason this matters for the broader question is that it bounds how much the mode can possibly affect a test. If the only line that differs is which flag goes into the child process argv, and the child process is the same binary in both cases, the only differences you can see downstream are differences Chromium itself produces from that flag. Those differences are the four categories listed above, and in 2026 they are smaller than the rest of the internet implies.

The two eras of this question, side by side

A useful way to see how the advice has moved is to look at the same question framed in 2020 and again in 2026. The components of the answer are the same; the weights are not.

Headless vs headed: 2020 advice and 2026 reality

The 2020 advice: headless and headed Chromium run on separate code paths. Headless has no compositor, software paint only, and a different font fallback chain. Run headed locally to debug, run headless on CI for speed, and expect screenshot drift between the two. Visual regression tests require careful baseline pinning.

Separate Chromium code path for headless
No GPU compositor in headless
Font rendering visibly different
Screenshot drift expected

How to set the mode in real code

Plain Playwright takes headless: true | false as a launch option; separately, you can pass channel: "chrome" to use installed Chrome instead of the bundled Chromium. The relevant API surface is browserType.launch() (for example chromium.launch()) for a fresh browser and browserType.launchPersistentContext() for a profile-backed run. The default of headless: true maps to --headless=new; explicitly passing args: ["--headless=old"] opts you back into the legacy mode if you have a specific reason to reproduce 2020-era behavior.
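
If you use the Playwright test runner, the same choice can live in the config file. The fragment below is a sketch under a few assumptions: the project names are made up, and which headless implementation a given Playwright version selects for headless: true is worth confirming against the release notes for the version you have installed.

```typescript
// playwright.config.ts
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  use: {
    // true is already the default; it is spelled out here to make the choice explicit.
    headless: true,
  },
  projects: [
    {
      // Hypothetical project name: the default CI path.
      name: "chromium-headless",
      use: { ...devices["Desktop Chrome"] },
    },
    {
      // Same tests in a visible window, for local debugging.
      name: "chromium-headed",
      use: { ...devices["Desktop Chrome"], headless: false },
    },
    {
      // Installed Chrome instead of the bundled Chromium.
      name: "chrome-installed",
      use: { ...devices["Desktop Chrome"], channel: "chrome" },
    },
  ],
});
```

With a config like this, npx playwright test --project=chromium-headed runs the headed project, and the runner's --headed flag does the same for a one-off run without touching the config.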

With @playwright/mcp (which is what an AI-agent stack like Assrt calls), the surface is even smaller. The MCP CLI accepts --headless as a command-line flag. Pass it in for headless; omit it for headed. Assrt's wrapper around that, as described above, is --headed on the CLI or ASSRT_HEADED=1 in the env. The viewport is locked at 1600x900 in both modes so layout-based tests do not have to be re-baselined when you toggle.

“We spent two days hunting a 'headless paints differently' bug that turned out to be a 2-second waitForTimeout on a slow CI runner. The mode had nothing to do with it.”

An engineer who later admitted it

QA lead, anonymized

A practical decision rule

You can compress the whole question into three lines. Default to headless. Run headed locally when you want to watch the agent step through a scenario, or when one of the four categories above is the target of your test. Treat any flake that looks mode-dependent as a timing or font or runner-image problem first, and only suspect the renderer after you have ruled those out. In a year of running Assrt against real apps, the count of bugs we ultimately traced to a headless versus headed difference is small enough that the rule has held for almost every run.
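
The rule is small enough to write down as code. This is only an illustration of the paragraph above; the type names, the recommendMode helper, and the triageOrder list are hypothetical and not part of Assrt or Playwright.

```typescript
type Category = "visual-regression" | "video" | "canvas-webgl" | "gpu-animation";

interface TestProfile {
  targets: Category[];      // which of the four categories the test asserts on
  watchingLocally: boolean; // someone wants to watch the run step by step
}

// Default to headless; go headed only to watch, or when a category is the target.
function recommendMode(p: TestProfile): "headless" | "headed" {
  if (p.watchingLocally) return "headed";
  if (p.targets.length > 0) return "headed";
  return "headless";
}

// Order in which to investigate a flake that looks mode-dependent.
const triageOrder = ["timing", "fonts", "runner image", "renderer"] as const;

console.log(recommendMode({ targets: [], watchingLocally: false }));        // "headless"
console.log(recommendMode({ targets: ["video"], watchingLocally: false })); // "headed"
```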

Want to see an accessibility-tree-driven Playwright run on your own app?

Thirty minutes. Bring a flow you currently test headed because you do not trust the headless version. We will show you the run on both modes, side by side, with the artifacts on disk.

Frequently asked questions

Is there a real rendering difference between headless and headed in Playwright today?

Much smaller than the 2020-era guides suggest. Since Playwright 1.42 (released February 2024) the default headless mode for Chromium uses Chromium's --headless=new flag, which boots the same compositor, the same GPU process, and the same paint pipeline as headed Chromium. The old --headless=old mode (no compositor, software-only paint, separate code path for some features) is still reachable as a flag but is not the default. For most modern web apps the rendered output is byte-identical or close to it. Where it is not byte-identical, the reasons are font fallback differences (no installed system fonts in CI), missing GPU vendor drivers on a CI runner, and a handful of media corner cases (autoplay, encrypted media, screen capture APIs).

If the renderer is the same, why does my test pass headed and fail headless on CI?

Three causes in roughly this order. First, your test depends on a real GPU and your CI runner does not have one, so requestAnimationFrame timing differs and a transition you assumed was complete is still in flight. Second, your test pulls a system font that exists on your laptop and not on the Linux image (Apple SF Pro, Segoe UI, custom corporate fonts), so the text metrics differ and an element you queried by layout has shifted by a pixel. Third, your test was written with a timing-based wait that worked on your fast machine and breaks on a contended CI runner; this has nothing to do with headless versus headed and would also flake on a headed CI run with the same slow runner. The first two are real headless/CI gaps; the third is the test, not the mode.

Does Assrt change anything I should know about for the headless vs headed question?

Two things. One, Assrt's agent does not look at pixels by default. Before every action it calls a snapshot() primitive that returns the page's accessibility tree with each interactive node annotated with a fresh ref. Picking the right button is a label-resolution problem, not a pixel-comparison problem, so the mode of the underlying browser is invisible to the action layer. Two, when you do need headed, Assrt exposes it as the --headed CLI flag or the ASSRT_HEADED=1 env var. The viewport is locked at 1600x900 in both modes (browser.ts line 296), and the same @playwright/mcp child process handles both. The only line that differs between the two modes is whether --headless is inserted into the argv at browser.ts line 366.

When should I still run headed instead of headless?

Four cases. Visual regression tests where you are comparing rendered pixels against a baseline; if the baseline was captured headed and the test runs headless, font fallback or GPU driver differences can produce false diffs. Video element tests where autoplay, picture-in-picture, or media session controls behave differently in headless. Canvas and WebGL tests where the GPU code path may switch to SwiftShader software rendering in headless on CI runners without GPU drivers; the output is correct but slower and pixel-different. Animation tests that assert on an in-flight transition; if you assert before the GPU compositor has flushed a frame, headless on CPU compositing can show a different intermediate state than headed on GPU. For everything else, headless is the default and headed is a debugging convenience.

What about Firefox and WebKit? Does the same --headless=new story apply?

No. The --headless=new story is Chromium-specific. Firefox runs the same Gecko engine in both modes; the headless versus headed difference there has always been smaller than Chromium's old gap. WebKit on Playwright uses Apple's WebKit framework wrapped in a Playwright launcher; headless mode disables window creation but the WebCore renderer is the same code path. For Firefox and WebKit, you can mostly treat the two modes as equivalent for rendering purposes and pick based on debugger visibility.

How does headed mode change debugger ergonomics for AI-agent-driven tests?

Headed gives you three things you do not get headless. One, a live Chromium window you can interrupt with the devtools shortcut and inspect mid-run. Two, a real cursor that moves when the agent calls hover or click; in headless the cursor is virtual and you only see its effect in the next snapshot. Three, the ability to record the screen with your OS tools rather than relying on the runner's video output. Assrt records video in both modes via @playwright/mcp's recordVideo support, so the video-recording benefit of headed is smaller than it used to be. The argument for headed is mainly the live-inspect path when you want to pause an agent run and look around.

Is headed slower than headless on the same machine?

Slightly. The window-management overhead (compositing to a real GLX surface, frame presentation to the X server or Wayland compositor) is a single-digit-percent tax on most CI runners. On your laptop the headed run also competes with your other windows for the GPU, which can amplify the gap. On modern Linux CI without a real display server, headed runs through Xvfb and the gap shrinks again because Xvfb is a software display that does not present frames. The clean version of the answer: assume a 5 to 15 percent overhead for headed; do not pick the mode on perf grounds unless you are running tens of thousands of tests.
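
To see why the overhead only matters at scale, here is the arithmetic on a made-up workload. The 10,000 tests at 3 seconds each, run serially, are assumed numbers chosen for illustration, and headedOverheadSeconds is a hypothetical helper.

```typescript
// Extra wall-clock seconds for a headed run, given a headless baseline.
// Taking the overhead as an integer percent keeps the arithmetic exact for whole-second inputs.
function headedOverheadSeconds(baselineSeconds: number, overheadPercent: number): number {
  return Math.round((baselineSeconds * overheadPercent) / 100);
}

// Assumed workload: 10,000 tests at 3 s each, run serially.
const baselineSeconds = 10000 * 3;                           // 30000 s
const lowExtra = headedOverheadSeconds(baselineSeconds, 5);   // 1500 s
const highExtra = headedOverheadSeconds(baselineSeconds, 15); // 4500 s
console.log(`${lowExtra / 60} to ${highExtra / 60} extra minutes`); // "25 to 75 extra minutes"
```

On a suite a hundredth of that size the same range is 15 to 45 seconds, which is why the mode is rarely worth choosing on performance grounds.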

Does headless still get detected by bot-fingerprinting more than headed?

On a vanilla Playwright launch, yes, a little. Headless Chromium reports a user agent string that includes 'HeadlessChrome' and exposes a few JS-level signals that bot-detection services key off of (navigator.webdriver, missing window.chrome, default screen dimensions, no plugins array). Most of these are fixable with launch options or stealth shims. The shorter answer in 2026 is that any production-grade bot detection looks at behavioral signals (mouse curves, timing, IP reputation) more than at the HeadlessChrome user agent. If you are testing your own app, this does not apply at all. If you are scraping a third-party site that hates you, headless versus headed is the wrong axis; the right one is whether the site sees a Cloudflare or DataDome challenge, which is decided long before your browser starts rendering.
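
As a rough illustration of the JS-level signals listed above, the sketch below collects them into a list. It is not how any real detection service scores traffic; the BrowserSignals shape and headlessHints name are hypothetical, and the user agent strings are illustrative examples rather than captured values.

```typescript
// Hypothetical plain-object view of values a page script could read.
interface BrowserSignals {
  userAgent: string;
  webdriver: boolean;       // navigator.webdriver
  hasWindowChrome: boolean; // whether window.chrome is defined
  pluginCount: number;      // navigator.plugins.length
}

// Returns a human-readable hint for each signal that looks automated.
function headlessHints(s: BrowserSignals): string[] {
  const hints: string[] = [];
  if (s.userAgent.includes("HeadlessChrome")) hints.push("user agent contains HeadlessChrome");
  if (s.webdriver) hints.push("navigator.webdriver is true");
  if (!s.hasWindowChrome) hints.push("window.chrome is missing");
  if (s.pluginCount === 0) hints.push("plugins array is empty");
  return hints;
}

const vanillaHeadless: BrowserSignals = {
  userAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36",
  webdriver: true,
  hasWindowChrome: false,
  pluginCount: 0,
};
const ordinaryDesktop: BrowserSignals = {
  userAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  webdriver: false,
  hasWindowChrome: true,
  pluginCount: 5,
};

console.log(headlessHints(vanillaHeadless).length); // 4
console.log(headlessHints(ordinaryDesktop).length); // 0
```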

What is the exact line in Assrt's source that decides headless versus headed?

Two lines. browser.ts line 366 reads `if (!headed) args.splice(1, 0, "--headless");`, which inserts the literal --headless flag into the argv passed to @playwright/mcp's cli.js. browser.ts line 367 logs the mode with `[browser] launch mode: ${headed ? "headed" : "headless"}`. The default value of `headed` is false in cli.ts line 113 (`headed: !!args.headed || process.env.ASSRT_HEADED === "1"`). Everything downstream, from the agent's snapshot-and-action loop to the per-step screenshot capture, runs against the same browser interface regardless of which mode the argv produced. That is the load-bearing version of why mode rarely affects test correctness in this pipeline.
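
The default-resolution expression quoted from cli.ts can be exercised on its own. The resolveHeaded wrapper below is a hypothetical helper written for this example; only the boolean expression inside it comes from the quoted line.

```typescript
// Mirrors the quoted expression: headed: !!args.headed || process.env.ASSRT_HEADED === "1"
function resolveHeaded(
  args: { headed?: boolean },
  env: Record<string, string | undefined>,
): boolean {
  return !!args.headed || env.ASSRT_HEADED === "1";
}

console.log(resolveHeaded({}, {}));                       // false
console.log(resolveHeaded({ headed: true }, {}));         // true
console.log(resolveHeaded({}, { ASSRT_HEADED: "1" }));    // true
console.log(resolveHeaded({}, { ASSRT_HEADED: "true" })); // false
```

The last case is the one worth remembering: as written, only the exact string "1" enables headed through the environment.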

Does CDP attach or extension mode bypass the headless versus headed choice?

Yes. Both shortcut around it. CDP attach mode (set ASSRT_CDP_ENDPOINT in your env) tells @playwright/mcp to connect to an already-running Chromium over the CDP wire, so the spawning Playwright process never decides on a mode; the Chromium that is already up is whatever you launched it as. Extension mode (`--extension` or `extension: true` in the MCP call) connects the agent to your own running Chrome via the Playwright extension, which is by definition a headed window. browser.ts lines 363-368 short-circuit the --headless argv insertion in both of those modes; the comment on line 363 says exactly this.

