Visual regression testing with built-in automation that lives inside your coding agent, not your CI.
Every other result for this keyword wires visual regression into GitHub Actions: you push, CI pulls baselines from S3, a pixelmatch worker diffs the PNGs, and a red check arrives on your PR six minutes later. Assrt inverts that. On npx assrt setup, an 8-line Bash script lands at ~/.claude/hooks/assrt-qa-reminder.sh, registers itself as a PostToolUse hook against the Bash matcher in ~/.claude/settings.json, and starts firing on every git commit or git push the agent runs. The visual regression check happens in the same tool loop that wrote the code.
The premise every top-5 result skips
"Built-in automation" does not have to mean a YAML file in .github/workflows. It can mean a hook in ~/.claude/settings.json that fires inside the agent writing the code.
That inversion is the whole page. Everything below is how Assrt implements it, verified by line number, and what it feels like when the agent catches a regression before you hit push.
What "built-in" means when the automation is a shell hook
Most articles on this keyword treat "built-in" as built into the test runner: Playwright ships with toHaveScreenshot(), Cypress has cy.screenshot(), BackstopJS has reference + test commands. Assrt takes the phrase one layer higher: built into the coding agent's runtime. The moment the agent runs git commit, Claude Code fires a PostToolUse event, Assrt's hook script reads the command from stdin, and emits a JSON payload back into the conversation. The agent reads that payload on its very next turn and decides whether to call assrt_test.
The script is eight lines, and that is the whole implementation. The Bash matcher in settings.json wires it up, and Claude Code guarantees it runs after every Bash tool call. The script is defined at assrt-mcp/src/cli.ts lines 203-212 in the source and copied verbatim to your home directory by setupAssrt() at line 246.
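The installed hook is a Bash script, but its decision logic is small enough to mirror in TypeScript for illustration. The sketch below assumes the stdin payload carries the command at tool_input.command and that the output uses the hookSpecificOutput shape this page describes. The function name buildHookOutput is hypothetical, and the reminder text is the truncated quote from this page, not the full string.

```typescript
// Illustrative mirror of the hook's logic. The real hook is a Bash script;
// buildHookOutput is a hypothetical name used only for this sketch.

// Pattern quoted in the article.
const COMMIT_OR_PUSH = /git (commit|push)/;

// Shape of the JSON the hook prints when the pattern matches.
interface HookOutput {
  hookSpecificOutput: {
    hookEventName: "PostToolUse";
    additionalContext: string;
  };
}

// Takes the raw stdin payload and returns the JSON line to print,
// or null when the command is not a commit or push.
function buildHookOutput(stdin: string): string | null {
  let command = "";
  try {
    const payload = JSON.parse(stdin);
    // Same field the Bash script is described as reading: .tool_input.command
    command = String(payload?.tool_input?.command ?? "");
  } catch {
    return null; // Unparseable input: stay silent.
  }
  if (!COMMIT_OR_PUSH.test(command)) return null;

  const output: HookOutput = {
    hookSpecificOutput: {
      hookEventName: "PostToolUse",
      // Truncated exactly as quoted in the article; the real string is longer.
      additionalContext:
        "A git commit/push was just made. If the committed changes affect anything user-facing (UI, routes, forms, APIs), run assrt_test against the local dev server...",
    },
  };
  return JSON.stringify(output);
}
```

Returning null for everything else is what keeps the hook quiet on ordinary Bash calls: it is invoked on all of them but only speaks on a commit or push.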
The commit commands the hook matches
- git commit
- git push
- git commit --amend
- git push --force-with-lease
- git commit -m
- git push origin main
The regex git (commit|push) matches as a substring anywhere in the command, so amends, force-pushes, and message-flagged commits all fire the same PostToolUse payload. The hook intentionally ignores git status, git diff, and other read-only traffic.
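To see what an unanchored substring match means in practice, here is a small TypeScript check of the quoted pattern against a few commands. The helper name firesHook and the sample list are made up for this sketch, and the JavaScript regex is assumed to behave like the grep pattern in the installed script.

```typescript
// Same pattern the article quotes, tested as an unanchored substring match.
const HOOK_PATTERN = /git (commit|push)/;

// Hypothetical helper: true when the hook would emit its reminder.
function firesHook(command: string): boolean {
  return HOOK_PATTERN.test(command);
}

const samples: string[] = [
  "git commit --amend",
  "git push --force-with-lease",
  "cd app && git push origin main", // matches mid-command
  "git status",
  "git diff",
  'echo "git commit"', // also matches: the text is there, the commit is not
];

for (const sample of samples) {
  console.log(`${firesHook(sample) ? "fires " : "silent"}  ${sample}`);
}
```

The last sample shows the trade-off of a substring match: any command that merely contains the text will also fire the reminder. Since the hook only injects context and the agent decides what to do with it, an occasional false positive costs one ignored reminder.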
Where the hook actually lands on disk
The installer is boring on purpose. Three files. No daemons, no launchd plists, no node background process. You can remove Assrt by deleting three paths. The shape below is exactly what a fresh npx assrt setup leaves behind.
The MCP server registration is a separate one-shot that runs inside the same setup function. It is how the assrt_test tool becomes available in every Claude Code project without a per-repo config.
Everything the installer writes, in one bento
Six items: five paths on your disk plus one MCP registration. No database, no cloud account. Tearing the whole thing out is a three-line rm and a settings edit.
~/.claude/hooks/assrt-qa-reminder.sh
The 8-line Bash hook. Runs on every Bash tool call, greps 'git (commit|push)' out of the tool_input, emits a PostToolUse additionalContext JSON on match. Written by cli.ts line 246 with mode 0o755.
~/.claude/settings.json
PostToolUse array gets an entry with matcher 'Bash' pointing at the hook script. Idempotent: re-runs detect existing entries by the substring 'assrt-qa-reminder' and skip.
~/.claude/CLAUDE.md
Appended with a 'QA Testing (Assrt)' block whose first line reads: 'CRITICAL: You MUST run assrt_test after ANY user-facing change.' This is the global-memory half of the automation: the agent reads it at every conversation start.
MCP server registered at user scope
A one-shot 'claude mcp add-json assrt ... --scope user' registers the stdio MCP server, making assrt_test, assrt_plan, assrt_diagnose, assrt_analyze_video available everywhere. No per-project config needed.
~/.assrt/browser-profile/
Playwright persistent context directory. Set at browser.ts line 313. Cookies and auth survive between visual regression runs, which is what lets you run the same plan against a logged-in dashboard on every commit without re-logging in.
~/.assrt/installed
Marker file dropped at cli.ts line 447 so the npm postinstall does not re-run setup on every npx invocation. Delete it if you want to reinstall.
“The entire built-in automation layer is an 8-line Bash script registered against a 'Bash' matcher in ~/.claude/settings.json. That script, plus a single additionalContext string, is how a commit inside Claude Code turns into a visual regression run on your dev server.”
assrt-mcp/src/cli.ts:203-287
End to end: from a commit command to a TestAssertion
The lifecycle has five actors and no server. Every arrow below is a local tool call or a stdio write. Nothing crosses the network until assrt_test reaches out to Claude Haiku 4.5 to judge the captured JPEG.
Commit → hook → agent → assrt_test → result
The fan-out from a single hook fire
One commit, one JSON blob, many outputs. The hub is assrt_test; the inputs are the three environmental pieces it needs; the outputs are the artifacts you keep on disk and review later.
PostToolUse hook → assrt_test → your artifacts
The transcript, the way you actually see it
Below is a real-looking slice of what the Claude Code transcript prints when the hook fires and the agent responds. The key line is the PostToolUse additionalContext received marker; that is the injection point becoming part of the agent's next turn.
Built-in automation in the agent vs built-in automation in CI
I am not claiming agent-local automation replaces CI for every team. It does not. CI still owns the post-merge gate for main, the scheduled cross-browser matrix, and the release verification against production URLs. What the agent-local hook does is collapse the inner loop. The table below is the honest breakdown.
| Feature | Percy / BackstopJS / Applitools in CI | Assrt (agent-local built-in automation) |
|---|---|---|
| Where the automation runs | GitHub Actions / GitLab CI runner, on every push | Claude Code runtime, on every git commit |
| Who triggers the test | The CI engine, post-push | A PostToolUse hook in ~/.claude/settings.json |
| Latency from code change to feedback | Minutes (push, queue, build, diff worker, comment) | Seconds (hook fires, agent runs assrt_test locally) |
| Where the feedback lands | PR comment, check mark, screenshot gallery | The agent's own next turn, in the terminal |
| Cost to wire up | Workflow YAML, secrets, baseline storage | One npx command writes the hook and MCP registration |
| Pricing at scale | Around $7,500 / month on tier-3 AI platforms | $0, open-source, self-hosted |
| Artifacts kept on your disk | Vendor cloud; baseline PNGs mirrored to git at best | /tmp/assrt/<runId>/ with WebM, PNGs, and JSON results |
| Good for | Merge-gate checks, scheduled cross-browser runs | Inner-loop feedback before the push even happens |
The exact pipeline, rendered as frames
Reading this as a sequence helps: each frame is a step in the built-in automation, from the keystroke that fires the hook to the JSON Claude reads at the end of the run.
One commit, five frames
You type git commit
The built-in automation, by the numbers
8 lines of Bash. One regex. Four MCP tools. Zero YAML files to maintain. That is the whole automation surface.
How the hook scopes itself, in one checklist
The whole scoping model is seven bullets long. Read them and you know exactly when the hook will and will not fire.
What triggers the built-in automation
- Fires on the Bash matcher, so every Bash tool invocation is inspected
- Only emits the JSON blob when the regex 'git (commit|push)' matches
- Emission uses hookSpecificOutput.hookEventName = 'PostToolUse'
- The additionalContext string is literal plain English, no templating
- Installed globally in ~/.claude, not per project, so it covers every repo
- Idempotent: setup detects existing 'assrt-qa-reminder' entries and skips
- Migrates away from legacy 'assrt-post-commit' entries on upgrade
Get the hook live in four steps
From nothing installed to a visual regression fired by your next commit. No config file. No baseline directory. No CI pipeline changes.
From zero to a firing hook
1. npx assrt setup. Runs setupAssrt() from cli.ts. Writes the hook, updates settings.json, registers the MCP server, appends to CLAUDE.md.
2. Restart Claude Code. Required for the new PostToolUse entry and MCP server registration to take effect.
3. Write a scenario.md in your repo. One or two imperative #Case blocks describing what your key pages should look like. No YAML, plain English.
4. Commit anything user-facing. The hook fires, the agent reads additionalContext, calls assrt_test, and reports pass or fail before you push.
Why agent-local automation fits visual regression specifically
Visual regressions are the class of bug that survives unit tests and survives integration tests, because the pixels are the output. They die to a human reviewing a screenshot or to an agent reasoning about one. Agent-local automation closes that loop at the exact moment a developer is still in the thought of the change. Six minutes of CI latency is enough for context to vanish; a one-turn agent callback is not. That is the part of the SERP every tutorial misses. They assume the automation belongs to a pipeline, not to a conversation.
Want this hook firing on your repo by tomorrow?
Book 20 minutes with the Assrt team. We'll walk through installing the PostToolUse hook against your stack and wiring the first scenario.md against your staging dev server.
Built-in visual regression automation: concrete answers
What does 'built-in automation' actually mean for visual regression testing in Assrt?
It means the automation layer does not live in a YAML file inside .github/workflows. It lives in your local Claude Code runtime. When you run npx assrt setup, the CLI writes a file at ~/.claude/hooks/assrt-qa-reminder.sh, registers it as a PostToolUse hook against the Bash matcher in ~/.claude/settings.json, and appends a 'CRITICAL: You MUST run assrt_test after ANY user-facing change' block to ~/.claude/CLAUDE.md. The source is assrt-mcp/src/cli.ts lines 214-308. After that, every Bash tool call the agent makes is inspected; if it matches the regex 'git (commit|push)', a JSON blob is emitted that Claude Code folds into the agent's next turn as additional context. The agent then runs the visual regression itself. No CI pipeline was involved at any point.
Show me the exact shell script that fires on a commit.
It is eight lines, defined as the QA_REMINDER_HOOK constant at assrt-mcp/src/cli.ts lines 203-212. It reads stdin as JSON, extracts .tool_input.command with jq, greps for 'git (commit|push)', and if matched prints '{"hookSpecificOutput":{"hookEventName":"PostToolUse","additionalContext":"A git commit/push was just made. If the committed changes affect anything user-facing (UI, routes, forms, APIs), run assrt_test against the local dev server..."}}'. That additionalContext string is the injection point. It becomes part of the agent's next prompt and the model decides to call assrt_test from there.
How is this different from a CI-triggered visual regression test?
A CI visual regression run happens after you have already pushed. The feedback arrives as a red check on a PR, minutes later, usually on a machine you cannot easily debug. A built-in automation hook fires as soon as the agent's git commit returns, typically before anything has been pushed, inside the same agent that just edited the code, on your own machine, with access to your dev server. The agent sees a failing TestAssertion and can re-open the file it just edited in the same turn. You shorten the loop from 'CI fails, you context-switch back' to 'the agent that wrote the bug also caught it'. Both modes are legitimate. Assrt supports running the same plan in CI too, via the same CLI.
Does the hook actually run tests on its own, or does it just remind the agent?
It injects additional context; it does not execute assrt_test directly. The reason is subtle. Running a visual regression test takes 5-30 seconds per case and uses the agent's own tool quota, so firing it on every Bash call without the agent's awareness would be both slow and confusing in the transcript. The hook is precise: it matches only the commit/push surface, then it tells the agent 'you just committed user-facing changes, you should run a visual regression'. The agent then calls the assrt_test MCP tool, which captures a video at 1600x900 via @playwright/mcp and attaches each step's JPEG screenshot to Claude Haiku 4.5 for semantic verification.
What gets installed on my machine when I run npx assrt setup?
Three things, written by setupAssrt() in assrt-mcp/src/cli.ts lines 214-308. One: the Assrt MCP server is registered globally via 'claude mcp add-json assrt ... --scope user', making the assrt_test, assrt_plan, assrt_diagnose, and assrt_analyze_video tools available in every Claude Code project. Two: the Bash script above is written to ~/.claude/hooks/assrt-qa-reminder.sh with mode 0o755, and a PostToolUse entry with {matcher: 'Bash', hooks: [{type: 'command', command: hookPath}]} is appended to ~/.claude/settings.json. Three: a 'QA Testing (Assrt)' section is appended to ~/.claude/CLAUDE.md. A marker file is dropped at ~/.assrt/installed so subsequent postinstalls are idempotent.
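The idempotent settings update can be sketched as a pure function over the parsed settings object. This is a minimal illustration, not the code in setupAssrt(): the function name addQaReminderHook and the type names are hypothetical, the entry shape is the one quoted above, and the hooks.PostToolUse nesting is assumed from the description of settings.json on this page.

```typescript
// Hypothetical types for the slice of ~/.claude/settings.json this sketch touches.
interface HookCommand {
  type: "command";
  command: string;
}

interface HookEntry {
  matcher: string;
  hooks: HookCommand[];
}

interface ClaudeSettings {
  hooks?: { PostToolUse?: HookEntry[] };
}

// Returns settings with the Assrt entry present exactly once.
// Detection follows the article: look for the substring 'assrt-qa-reminder'.
function addQaReminderHook(settings: ClaudeSettings, hookPath: string): ClaudeSettings {
  const existing: HookEntry[] = settings.hooks?.PostToolUse ?? [];

  const alreadyInstalled = existing.some((entry) =>
    entry.hooks.some((h) => h.command.indexOf("assrt-qa-reminder") !== -1)
  );
  if (alreadyInstalled) return settings; // Re-run: nothing to do.

  const entry: HookEntry = {
    matcher: "Bash",
    hooks: [{ type: "command", command: hookPath }],
  };

  // Copy rather than mutate, so the caller decides when to write to disk.
  return {
    ...settings,
    hooks: { ...settings.hooks, PostToolUse: existing.concat([entry]) },
  };
}
```

Calling it twice with the same path leaves a single PostToolUse entry, which is the property that makes re-running setup safe.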
Does the hook see commits I make in other repos, or only in the current project?
It is global. Because it lives in ~/.claude/settings.json and not in the project's .claude/settings.json, it fires on any git commit/push the agent runs in any repo on this machine. The matcher is 'Bash' with no project scoping. This is the right default for built-in automation: you want visual regression to be the baseline behavior for every product you ship, not a per-repo opt-in. If you want to scope it to one repo, delete the global entry and add the same PostToolUse block to the project-level settings.json.
What if the commit does not touch UI? Does the hook still suggest a visual regression?
Yes. The hook has no understanding of the diff. It matches on 'git (commit|push)' in the command string and emits the same additionalContext every time. The decision of whether to actually call assrt_test is delegated to the agent, which sees the diff, the recent tool transcript, and the hook's message. The additionalContext explicitly scopes its suggestion to 'if the committed changes affect anything user-facing (UI, routes, forms, APIs)'. That wording is at cli.ts line 210. A well-behaved agent reads it, checks the diff, and either runs the test or acknowledges that the commit was a config bump and skips.
What does assrt_test actually do when the hook prompts the agent to call it?
assrt_test launches a real Chromium browser via Playwright MCP at 1600x900 (browser.ts line 291), runs the plan from scenario.md, records the run as WebM, captures a JPEG after every visual action (click, type_text, scroll, press_key, navigate, select_option), attaches each JPEG to Claude Haiku 4.5 so the model can semantically verify the frame, and returns a structured result: a pass/fail boolean, a TestAssertion[] with description/passed/evidence for each check, a videoPlayerUrl, and paths to the PNG forensic artifacts in /tmp/assrt/<runId>/screenshots/. The agent reads that JSON and either reports success, or if a case failed, calls assrt_diagnose with the runId to get a suggested fix.
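A rough TypeScript model of that result and the branch the agent takes on it may help. Only description, passed, evidence, and videoPlayerUrl are named on this page; the top-level field names runId, passed, and assertions, the NextStep type, and the nextStep function are assumptions made for this sketch.

```typescript
// One check from the run, with the three fields the article names.
interface TestAssertion {
  description: string;
  passed: boolean;
  evidence: string;
}

// Assumed result shape: runId, passed, and assertions are hypothetical
// field names; videoPlayerUrl is the name used in the article.
interface AssrtTestResult {
  runId: string;
  passed: boolean;
  assertions: TestAssertion[];
  videoPlayerUrl: string;
}

// What the agent does next: report success, or diagnose with the runId.
type NextStep =
  | { action: "report_success" }
  | { action: "diagnose"; runId: string; failures: string[] };

function nextStep(result: AssrtTestResult): NextStep {
  const failed = result.assertions.filter((a) => !a.passed);
  if (result.passed && failed.length === 0) {
    return { action: "report_success" };
  }
  return {
    action: "diagnose",
    runId: result.runId,
    // Keep the evidence string next to each failed description.
    failures: failed.map((a) => `${a.description}: ${a.evidence}`),
  };
}
```

Carrying the evidence string alongside each failure is the useful part: it gives the agent something concrete to act on when it re-opens the file it just edited.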
Can the hook trigger a regression check on other tools' output, like npm test or tsc?
The matcher in the installed settings is 'Bash', so the hook receives every Bash tool invocation. Inside the script, the grep filter is intentionally narrow to 'git (commit|push)'. You can fork the script at ~/.claude/hooks/assrt-qa-reminder.sh and broaden the regex to include things like 'npm run build', 'pnpm deploy', or a specific pre-push step. The shape of the emitted JSON is stable: hookSpecificOutput.hookEventName is 'PostToolUse' and additionalContext is a free-text string. Anything you write there becomes part of the agent's next prompt.
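As a sketch of what broadening looks like, the TypeScript below builds one alternation from the installed pattern plus the extra triggers mentioned above. The names TRIGGERS, BROADENED, and shouldRemind are hypothetical, and in a forked Bash script the equivalent would be a single extended regex passed to grep, assuming the script uses an extended-regex grep.

```typescript
// First entry is the installed pattern; the rest are the examples from the text.
const TRIGGERS: string[] = ["git (commit|push)", "npm run build", "pnpm deploy"];

// Join the alternatives into one unanchored pattern.
const BROADENED = new RegExp(TRIGGERS.join("|"));

// Hypothetical helper: true when the broadened hook would emit its reminder.
function shouldRemind(command: string): boolean {
  return BROADENED.test(command);
}

console.log(shouldRemind("pnpm deploy --prod")); // true
console.log(shouldRemind("npm test"));           // false
```

Each added trigger means more reminders landing in the agent's context, so it is worth keeping the list to commands that actually change what users see.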
Is this actually open-source, or does the MCP server phone home?
Open-source, self-hosted, zero cloud dependency. The MCP server is published as @assrt-ai/assrt on npm and runs entirely as a local stdio process. The browser profile is persisted at ~/.assrt/browser-profile so your auth state survives between runs, and nothing leaves the machine except the outbound LLM call to Anthropic or Google for the semantic verification step. The comparable tier-3 AI testing platforms charge around $7,500 a month at scale and keep scenarios, diffs, and screenshots in their cloud. Every file Assrt touches lives on your disk: ~/.claude/hooks, ~/.claude/settings.json, ~/.claude/CLAUDE.md, ~/.assrt/browser-profile, /tmp/assrt/<runId>/.