Negative Assertions
Your AI's try/catch is why the test passes. Write the test that fails when the fallback fires.
The reason your green CI is lying is that Claude wrapped your fetch in a catch-all and returned []. The page renders a valid empty state. Nothing throws. The test goes green. Users see a blank list. You need an assertion that specifically fails when the defensive path activates, not when the page crashes.
The failure mode nobody writes tests for
Every post-mortem you read about AI-generated code eventually lands on the same sentence: the try/catch was too wide. The linter did not care. The typechecker did not care. The 40-line unit test suite did not care, because the unit test mocked the fetch so the catch branch never even ran. The code review missed it because to a human it looked like reasonable hardening. Then a real API call failed in staging and the cart total rendered as $0.00, because the AI returned an empty list as the “safe default”.
The standard advice in every blog post is the same: let exceptions propagate. This is correct and also useless when you are shipping five PRs a day from an agent that was literally trained to add defensive wrappers. You need a test contract the agent has to satisfy before the code leaves your machine. A contract that says: the fallback banner does not render, the empty state does not render, the total is not zero. Not a hope. A hard fail.
The same function with and without the silent fallback
```typescript
// What Claude wrote. Lint is happy. Typecheck is happy.
// The "no errors" test is happy. The user sees nothing.
export async function getProducts() {
  try {
    const res = await fetch(API_URL + "/products");
    if (!res.ok) return [];
    return res.json();
  } catch (err) {
    console.error(err);
    return []; // <-- silent fallback. tests go green. prod is empty.
  }
}
```
- Lint passes
- Typecheck passes
- Regular 'page loaded' test passes
- Real API failure returns [] to the UI silently
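For contrast, here is a sketch of the same function without the silent fallback. The error message text and the local `API_URL` constant are assumptions; the point is that a non-OK response or a thrown fetch error propagates instead of becoming `[]`:

```typescript
// Contrast: let the failure propagate so a scenario can fail loudly.
// API_URL is a placeholder here; in the snippet above it comes from config.
const API_URL = "http://localhost:3000";

export async function getProducts(): Promise<unknown[]> {
  const res = await fetch(API_URL + "/products");
  if (!res.ok) {
    // No [] default: a 503 here becomes a visible error, not an empty list.
    throw new Error(`GET /products failed with status ${res.status}`);
  }
  return res.json() as Promise<unknown[]>;
}
```

With this shape, the empty-state UI can only render when the API genuinely returned an empty list, which is exactly the distinction the negative criteria below lean on.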
The anchor: the exact prompt that makes this work
The mechanism is seven lines of code in the open-source assrt-mcp repo. When you pass passCriteria to assrt_test, it is injected into the browser agent's system prompt as a MANDATORY verification section. The agent is told, in the same language you would tell a human QA, that the scenario fails the moment any listed condition is violated. This is the entire unlock.
Because this is just injected prompt text, you write the criteria in plain English. The canonical example documented in the MCP tool schema at src/mcp/server.ts:343 is literally “Error toast does NOT appear”. Negative assertions are first-class. That is the whole point.
What the passCriteria list looks like
Each entry in the list is a condition the defensive fallback path would silently violate. If the real API returns garbage and the catch branch covers it, at least one of these flips false and the scenario fails. This is not prose for a human; it is the verification contract the agent evaluates during the run.
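As a sketch, a seeded list in code form might look like the following. The exact strings are assumptions you would adapt to your own UI copy; the only requirement is that each one names a state that must not exist at the end of the flow:

```typescript
// Hypothetical seed list. Every entry is a state that must NOT exist
// when the scenario finishes; any violation fails the run.
const passCriteria: string[] = [
  '"No items found" does NOT appear when the API key is valid',
  'The page does NOT contain "Something went wrong"',
  'The cart total is NOT displayed as $0.00 on /checkout',
  'The page does NOT redirect to /offline, /error, or /maintenance',
];
```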
How the pieces connect
Three inputs, one gate, one decision. The agent that edited the code authors the passCriteria in the same turn, hands it to the Assrt MCP tool, and reads the result back before it moves on. No handoff. No YAML. No separate CI wait.
Inputs, the gate, and the decision
The MCP call in one paste
This is the entire integration. If the agent is allowed to call assrt_test via MCP, it is also allowed to author the passCriteria. In most setups you seed a starter list once, then the agent extends it every time it touches a new surface.
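As a sketch, the call payload might look like this. Only `assrt_test` and `passCriteria` are documented above; the `url` and `scenario` field names are assumptions, so check the tool schema in your installed version:

```json
{
  "tool": "assrt_test",
  "arguments": {
    "url": "http://localhost:3000/products",
    "scenario": "Load the products page and verify the list renders real data",
    "passCriteria": [
      "\"No items found\" does NOT appear when the API key is valid",
      "The page does NOT contain \"Something went wrong\""
    ]
  }
}
```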
What a #Case file looks like alongside it
Scenarios describe the user flow. passCriteria describes the states that must not exist at the end of that flow. Together they describe a happy-path outcome the fallback cannot fake.
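A minimal file in that shape might look like this. Only the `#Case N:` prefix is documented; the rest of the layout is an assumption, and the flows are placeholders for your own:

```text
#Case 1: Products list renders real data
Open /products with a valid API key.
Wait for the list to finish loading.

#Case 2: Checkout shows the real cart total
Add one product to the cart and open /checkout.
Confirm the total reflects the added product.
```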
What an actual run looks like
In one run against a dev server where the products endpoint returned a 503 and the catch branch silently returned [], the standard assertion set would have passed. The passCriteria entry failed on "No items found does NOT appear".
Positive assertions vs. negative criteria
Most test frameworks are optimized for positive assertions: proving that a thing is. AI defensive code tricks positive assertions because the fallback is a valid thing. Negative criteria are where the leverage is. You are not asking the agent to prove the page loaded. You are asking it to prove the fallback did not.
The assertion shape that actually catches silent fallbacks
| Feature | Regular positive assertions | passCriteria (Assrt) |
|---|---|---|
| Detects empty-state fallback from catch branch | No. 'page loaded, no errors' passes on empty state | Yes. 'No items found does NOT appear' fails the scenario |
| Detects generic 'Something went wrong' banner | No. The banner has no error-boundary throw | Yes. 'page does NOT contain Something went wrong' |
| Detects $0.00 cart from fallback default | No. 0 is a valid number | Yes. 'cart total is NOT $0.00' on checkout |
| Detects silent redirect to /offline | Sometimes. Depends on the URL assertion | Yes. 'does NOT redirect to /offline' is explicit |
| Prompt-level contract, not just a post-run assert | No. Asserts evaluated after the agent is done | Yes. Injected as MANDATORY into the agent's system prompt |
| Open source, self-hosted, in-repo tests | Varies | Yes. assrt-mcp, plain text, your repo |
A checklist you can copy
If you are adding passCriteria to a project for the first time, these are the negative conditions worth seeding before you let the agent extend them. They cover the failure modes that a catch-all try/catch loves to hide.
Seed list of negative criteria
- The string 'Something went wrong' does NOT appear on any rendered page.
- The string 'Temporarily unavailable' does NOT appear on any rendered page.
- A generic empty-state headline does NOT appear when the route should have data.
- The cart total is NOT displayed as $0.00 on /checkout.
- The page does NOT redirect to /offline, /error, or /maintenance.
- The page does NOT render a toast with severity=error.
- The page does NOT render an empty skeleton for longer than 4 seconds.
- The body does NOT contain the literal word 'fallback'.
Wiring it up in four steps
This is a one-time setup per machine. After it runs, the agent has the MCP tool available and a global reminder to use it whenever it touches user-facing code. passCriteria becomes a field the agent knows to fill in.
Install the MCP server
Run npx @assrt-ai/assrt setup. This registers the assrt MCP server globally with Claude Code, writes the PostToolUse hook that nudges the agent to run assrt_test after commits, and appends a QA section to your global CLAUDE.md.
Check in a #Case file with the scenarios
Drop tests/fallback-guards.txt into the repo. Describe user flows in plain English. Each #Case runs in a real Playwright browser. The agent that edited the code extends this file in the same turn.
Seed the passCriteria list
Keep the list of negative conditions in the same file (or a sibling file the agent passes as --pass-criteria-file). Start with the copy-paste list above. The agent will grow it each time it touches a new surface.
Gate the build on failedCount
assrt run --json writes a TestReport to stdout. jq '.failedCount' > 0 is the only condition you need. Run it in a pre-push git hook, a GitHub Actions step, or a Vercel build hook.
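As a sketch, a GitHub Actions gate might look like this. The step layout and the report filename are assumptions; `assrt run --json` and the `failedCount` field are as documented above:

```yaml
- name: Gate on Assrt failures
  run: |
    assrt run --json > report.json
    # The entire gate: any failed scenario breaks the build.
    test "$(jq '.failedCount' report.json)" -eq 0
```

The same two lines work unchanged in a pre-push git hook or a Vercel build command.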
The one-line mental model
A defensive fallback test is a test that says: “if the catch branch runs, the scenario FAILS.”
Every other framing leaks. Positive assertions go green on empty states. Error-boundary tests miss silent empty-array defaults. Network mocks bypass the exact layer the AI wrapped in try/catch. The only framing that survives contact with a 50-messages-a-day agent is a negative contract, injected into the agent's own prompt, evaluated by a real browser.
Coverage at a glance
Most teams who adopt this pattern end up with a small list of negative criteria that covers the majority of silent-fallback failure modes. The concrete number varies; the shape does not.
Surfaces where this pattern pays rent
Any surface where the user sees a list, a number, or a status is a surface where the defensive fallback has somewhere plausible to hide. Seed the negative criteria against the noun the user cares about.
Want the seed passCriteria list for your repo?
Hop on a 20-minute call. We will walk through your code and leave you with a concrete list of negative criteria your agent can extend from day one.
Book a call →
Frequently asked questions
Why does a passing test miss AI defensive fallback code?
Because the fallback is the success path from the test's point of view. If the real API call fails and Claude's try/catch returns {items: []} or a mocked response, the page renders a valid-looking empty state. Your assertion of 'page loaded, no errors' is satisfied. The green check is lying. You need an assertion that specifically fails when the fallback path activates, not when the page crashes.
What is passCriteria in Assrt and how is it different from a regular assertion?
passCriteria is a free-text field on assrt_test that the agent treats as a MANDATORY verification contract. It injects into the agent's system prompt as '## Pass Criteria (MANDATORY). The test MUST verify ALL of the following conditions. Mark the scenario as FAILED if any condition is not met.' This is the exact text at src/core/agent.ts:670-672 of the open-source assrt-mcp. Unlike a regular assertion that the agent writes after exploring the page, passCriteria is a pre-committed negative contract you supply BEFORE the run starts.
What does a negative criterion look like in practice?
The canonical example shipped in the MCP tool schema (src/mcp/server.ts:343) is literally 'Error toast does NOT appear'. More useful fallback-catching criteria: 'A generic empty state message like No items found does NOT appear when the API key is valid', 'The fallback banner with text Temporarily unavailable does NOT appear', 'The cart total is NOT $0.00', 'The page does NOT redirect to /offline'. Each forces the agent to affirmatively prove the defensive path did not fire.
How is this different from just mocking the network layer in a unit test?
Unit tests with mocked network calls prove the happy path of the real implementation. They cannot prove the AI-inserted try/catch did not convert a 500 into a silent empty array at runtime. Assrt runs a real Playwright browser against your real dev server, which calls the real backend. If the fallback fires in production-shaped conditions, the scenario fails. Mocks can't catch this because mocks bypass the layer where the defensive fallback lives.
Do I have to write Playwright code to get this?
No. You write plain-language scenarios in a #Case N: format, and a short list of passCriteria. Assrt uses Playwright under the hood (real browser, real network, real screenshots) but you never touch selectors or wait conditions. The same AI agent that wrote the suspect fallback can extend the scenario file in the same turn. There is no proprietary YAML.
Is it open source and self-hosted?
Yes. assrt-mcp and the CLI are open source, run locally, and require no cloud account. You install with npx @assrt-ai/assrt setup. The artifact uploader is opt-in. Tests live in your repo as plain text files. Vendor lock-in is zero.
Can I pipe this into my existing CI?
Yes. assrt run --json writes a TestReport (url, scenarios[], totalDuration, passedCount, failedCount, generatedAt) to stdout. A single jq .failedCount check is a complete gate. Run it in GitHub Actions, Vercel build hooks, or a pre-push git hook. failedCount > 0 fails the build; there is nothing else to learn.
What kinds of fallback code does this actually catch?
Catch-all try/catch around fetch that returns [] or null. Next.js route handlers that swallow thrown errors and return {error: 'Something went wrong'}. React error boundaries that render a generic 'Temporarily unavailable' banner. useEffect hooks that silently set loading=false on rejection. Server components that return a default empty object when the DB call throws. All of those render a page that looks fine to a passing green test but is broken from the user's point of view.
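One of those shapes, sketched as a Next.js-style route handler. The `fetchInventory` helper and the response body are hypothetical; the pattern to notice is that a thrown error becomes a 200 response with an error-shaped body, which no status-code check will ever flag:

```typescript
// Hypothetical data helper that fails the way a real DB call might.
const fetchInventory = async (): Promise<unknown[]> => {
  throw new Error("db down");
};

// The swallowing handler: the client gets HTTP 200 either way,
// so only a content-level negative criterion can catch the failure.
export async function GET(): Promise<Response> {
  try {
    const data = await fetchInventory();
    return Response.json(data);
  } catch {
    return Response.json({ error: "Something went wrong" });
  }
}
```

A criterion like "the page does NOT contain Something went wrong" fails this handler the moment the catch branch runs.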
Open source. Self-hosted. No account required.