Negative Assertions

Your AI's try/catch is why the test passes. Write the test that fails when the fallback fires.

The reason your green CI is lying is that Claude wrapped your fetch in a catch-all and returned []. The page renders a valid empty state. Nothing throws. The test goes green. Users see a blank list. You need an assertion that specifically fails when the defensive path activates, not when the page crashes.

Matthew Diakonov
8 min read
4.9 from developers running Assrt locally
  • Plain-text passCriteria becomes a MANDATORY contract in the agent prompt
  • Real Playwright browser, real network, real fallback detection
  • Open source. Tests live in your repo. Zero vendor lock-in.

The failure mode nobody writes tests for

Every post-mortem you read about AI-generated code eventually lands on the same sentence: the try/catch was too wide. The linter did not care. The typechecker did not care. The 40-line unit test suite did not care, because the unit test mocked the fetch so the catch branch never even ran. The code review missed it because to a human it looked like reasonable hardening. Then a real API call failed in staging and the cart total rendered as $0.00, because the AI returned an empty list as the “safe default”.

The standard advice in every blog post is the same: let exceptions propagate. This is correct and also useless when you are shipping five PRs a day from an agent that was literally trained to add defensive wrappers. You need a test contract the agent has to satisfy before the code leaves your machine. A contract that says: the fallback banner does not render, the empty state does not render, the total is not zero. Not a hope. A hard fail.

The same function with and without the silent fallback

// What Claude wrote. Lint is happy. Typecheck is happy.
// The "no errors" test is happy. The user sees nothing.
export async function getProducts() {
  try {
    const res = await fetch(API_URL + "/products");
    if (!res.ok) return [];
    return res.json();
  } catch (err) {
    console.error(err);
    return []; // <-- silent fallback. tests go green. prod is empty.
  }
}

  • Lint passes
  • Typecheck passes
  • Regular 'page loaded' test passes
  • Real API failure returns [] to the UI silently
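For contrast, here is a minimal sketch of the same function with the fallback removed. The URL and the `ensureOk` helper name are illustrative, not from the assrt codebase; the point is the shape: a non-ok response throws instead of returning a "safe" empty list, so the failure propagates to the caller, the error boundary, and the test.

```typescript
// Sketch: the fallback removed. ensureOk turns a bad response into a
// thrown Error instead of a "safe" empty list.
export function ensureOk(res: { ok: boolean; status: number }): void {
  if (!res.ok) {
    throw new Error(`GET /products failed with ${res.status}`);
  }
}

export async function getProducts(): Promise<unknown[]> {
  const res = await fetch("http://localhost:3000/api/products");
  ensureOk(res); // no catch, no return [] -- failures propagate to the caller
  return res.json();
}
```

With this shape, a 503 from the backend fails loudly in every environment, which is exactly the behavior a negative criterion can latch onto.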

The exact prompt that makes this work

The mechanism is seven lines of code in the open-source assrt-mcp repo. When you pass passCriteria to assrt_test, it is injected into the browser agent's system prompt as a MANDATORY verification section. The agent is told, in the same language you would tell a human QA, that the scenario fails the moment any listed condition is violated. This is the entire unlock.

src/core/agent.ts
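The injection can be sketched like this. This is a paraphrase, not the repo's actual source; the MANDATORY wording is quoted from the FAQ below, and the function name is made up for illustration.

```typescript
// Sketch of the injection: passCriteria becomes a MANDATORY section
// appended to the browser agent's system prompt.
export function buildPassCriteriaSection(passCriteria: string[]): string {
  if (passCriteria.length === 0) return "";
  return [
    "## Pass Criteria (MANDATORY)",
    "The test MUST verify ALL of the following conditions.",
    "Mark the scenario as FAILED if any condition is not met.",
    ...passCriteria.map((c) => `- ${c}`),
  ].join("\n");
}
```

That is the entire trick: the criteria are not evaluated after the run by a separate assertion engine, they are part of the agent's instructions while it drives the browser.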

Because this is just injected prompt text, you write the criteria in plain English. The canonical example documented in the MCP tool schema at src/mcp/server.ts:343 is literally “Error toast does NOT appear”. Negative assertions are first-class. That is the whole point.

What the passCriteria list looks like

Every entry below is a condition the defensive fallback path would silently violate. If the real API returns garbage and the catch branch covers it, at least one of these flips false and the scenario fails. This is not prose for a human; it is the verification contract the agent evaluates during the run.

passCriteria (passed to assrt_test)
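A plausible shape for that list, assembled from the criteria this article uses elsewhere. Assrt accepts plain English, so the exact phrasing is yours; these lines are illustrative.

```text
- Error toast does NOT appear
- "No items found" does NOT appear while the API key is valid
- The fallback banner "Temporarily unavailable" does NOT appear
- The cart total is NOT $0.00 on /checkout
- The page does NOT redirect to /offline
```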

How the pieces connect

Three inputs, one gate, one decision. The agent that edited the code authors the passCriteria in the same turn, hands it to the Assrt MCP tool, and reads the result back before it moves on. No handoff. No YAML. No separate CI wait.

Inputs, the gate, and the decision

AI-edited code + #Case scenarios + passCriteria list → assrt_test → Scenario PASS (agent moves on) or Scenario FAIL (agent re-edits and re-runs)

The MCP call in one paste

This is the entire integration. If the agent is allowed to call assrt_test via MCP, it is also allowed to author the passCriteria. In most setups you seed a starter list once, then the agent extends it every time it touches a new surface.

agent tool call
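Sketched as a raw MCP tool call. The argument names here (url, scenario, passCriteria) are assumptions drawn from this article, not a verified schema; check the tool definition in src/mcp/server.ts for the real field names.

```json
{
  "tool": "assrt_test",
  "arguments": {
    "url": "http://localhost:3000",
    "scenario": "Open /products, wait for the list, add the first item to the cart",
    "passCriteria": [
      "Error toast does NOT appear",
      "'No items found' does NOT appear",
      "The cart total is NOT $0.00"
    ]
  }
}
```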

What a #Case file looks like alongside it

Scenarios describe the user flow. passCriteria describes the states that must not exist at the end of that flow. Together they describe a happy-path outcome the fallback cannot fake.

tests/fallback-guards.txt
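A minimal sketch of such a file, assuming only the "#Case N:" convention described in the FAQ below; the flows and the in-file passCriteria placement are illustrative.

```text
#Case 1: Products list survives a bad backend day
Open the home page, click Products, wait for the list to settle.

#Case 2: Checkout math is real
Add two items to the cart, open /checkout, read the total.

passCriteria:
- "No items found" does NOT appear on /products
- The cart total is NOT $0.00 on /checkout
- The page does NOT redirect to /offline
```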

What an actual run looks like

Here is a run against a dev server where the products endpoint returned a 503 and the catch branch silently returned []. The standard assertion set would have passed. The passCriteria entry failed on “No items found does NOT appear”.

npx assrt run --url http://localhost:3000 --plan-file tests/fallback-guards.txt
  • 7 lines of assrt-mcp code power the contract
  • 0 Playwright selectors you author
  • 1 field on assrt_test (passCriteria)
  • 0 vendor lock-in

Positive assertions vs. negative criteria

Most test frameworks are optimized for positive assertions: proving that a thing is. AI defensive code tricks positive assertions because the fallback is a valid thing. Negative criteria are where the leverage is. You are not asking the agent to prove the page loaded. You are asking it to prove the fallback did not.

The assertion shape that actually catches silent fallbacks

Feature | Regular positive assertions | passCriteria (Assrt)
Detects empty-state fallback from catch branch | No. 'page loaded, no errors' passes on empty state | Yes. 'No items found does NOT appear' fails the scenario
Detects generic 'Something went wrong' banner | No. The banner has no error-boundary throw | Yes. 'page does NOT contain Something went wrong'
Detects $0.00 cart from fallback default | No. 0 is a valid number | Yes. 'cart total is NOT $0.00' on checkout
Detects silent redirect to /offline | Sometimes. Depends on the URL assertion | Yes. 'does NOT redirect to /offline' is explicit
Prompt-level contract, not just a post-run assert | No. Asserts evaluated after the agent is done | Yes. Injected as MANDATORY into the agent's system prompt
Open source, self-hosted, in-repo tests | Varies | Yes. assrt-mcp, plain text, your repo

A checklist you can copy

If you are adding passCriteria to a project for the first time, these are the negative conditions worth seeding before you let the agent extend them. They cover the failure modes that a catch-all try/catch loves to hide.

Seed list of negative criteria

  • The string 'Something went wrong' does NOT appear on any rendered page.
  • The string 'Temporarily unavailable' does NOT appear on any rendered page.
  • A generic empty-state headline does NOT appear when the route should have data.
  • The cart total is NOT displayed as $0.00 on /checkout.
  • The page does NOT redirect to /offline, /error, or /maintenance.
  • The page does NOT render a toast with severity=error.
  • The page does NOT render an empty skeleton for longer than 4 seconds.
  • The body does NOT contain the literal word 'fallback'.

Wiring it up in four steps

This is a one-time setup per machine. After it runs, the agent has the MCP tool available and a global reminder to use it whenever it touches user-facing code. passCriteria becomes a field the agent knows to fill in.

1

Install the MCP server

Run npx @assrt-ai/assrt setup. It registers the assrt MCP server globally with Claude Code, writes the PostToolUse hook that nudges the agent to run assrt_test after commits, and appends a QA section to your global CLAUDE.md.

2

Check in a #Case file with the scenarios

Drop tests/fallback-guards.txt into the repo. Describe user flows in plain English. Each #Case runs in a real Playwright browser. The agent that edited the code extends this file in the same turn.

3

Seed the passCriteria list

Keep the list of negative conditions in the same file (or a sibling file the agent passes as --pass-criteria-file). Start with the copy-paste list above. The agent will grow it each time it touches a new surface.

4

Gate the build on failedCount

assrt run --json writes a TestReport to stdout. Failing the build when jq '.failedCount' reports anything above zero is the only condition you need. Run it in a pre-push git hook, a GitHub Actions step, or a Vercel build hook.
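The gate fits in a few lines. This sketch assumes the TestReport fields named in this article (failedCount, passedCount); the assrt invocation itself is shown only as a comment, since it needs your dev server running.

```typescript
// Sketch of a CI gate on the TestReport that `assrt run --json` prints.
interface TestReport {
  passedCount: number;
  failedCount: number;
}

export function gate(reportJson: string): void {
  const report = JSON.parse(reportJson) as TestReport;
  if (report.failedCount > 0) {
    throw new Error(`assrt: ${report.failedCount} scenario(s) failed`);
  }
}

// In CI (pre-push hook, GitHub Actions step, Vercel build hook):
//   npx assrt run --url http://localhost:3000 \
//     --plan-file tests/fallback-guards.txt --json | node gate.js
```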

The one-line mental model

A defensive fallback test is a test that says: “if the catch branch runs, the scenario FAILS.”

Every other framing leaks. Positive assertions go green on empty states. Error-boundary tests miss silent empty-array defaults. Network mocks bypass the exact layer the AI wrapped in try/catch. The only framing that survives contact with a 50-messages-a-day agent is a negative contract, injected into the agent's own prompt, evaluated by a real browser.

Coverage at a glance

Most teams who adopt this pattern end up with a small list of negative criteria that covers the majority of silent-fallback failure modes. The concrete number varies; the shape does not.

  • 8 seed negative criteria (the copy-paste list above)
  • a handful of #Case scenarios to start
  • 1 field on assrt_test (passCriteria)
  • $0 per month, self-hosted

Surfaces where this pattern pays rent

Any surface where the user sees a list, a number, or a status is a surface where the defensive fallback has somewhere plausible to hide. Seed the negative criteria against the noun the user cares about.

  • product lists / search results
  • cart totals / checkout math
  • auth / session restore
  • dashboard metrics
  • feature flags / plan tiers
  • file uploads
  • payment confirmations
  • notification toasts
  • webhook-driven state

Want the seed passCriteria list for your repo?

Hop on a 20-minute call. We will walk through your code and leave you with a concrete list of negative criteria your agent can extend from day one.

Book a call

Frequently asked questions

Why does a passing test miss AI defensive fallback code?

Because the fallback is the success path from the test's point of view. If the real API call fails and Claude's try/catch returns {items: []} or a mocked response, the page renders a valid-looking empty state. Your assertion of 'page loaded, no errors' is satisfied. The green check is lying. You need an assertion that specifically fails when the fallback path activates, not when the page crashes.

What is passCriteria in Assrt and how is it different from a regular assertion?

passCriteria is a free-text field on assrt_test that the agent treats as a MANDATORY verification contract. It injects into the agent's system prompt as '## Pass Criteria (MANDATORY). The test MUST verify ALL of the following conditions. Mark the scenario as FAILED if any condition is not met.' This is the exact text at src/core/agent.ts:670-672 of the open-source assrt-mcp. Unlike a regular assertion that the agent writes after exploring the page, passCriteria is a pre-committed negative contract you supply BEFORE the run starts.

What does a negative criterion look like in practice?

The canonical example shipped in the MCP tool schema (src/mcp/server.ts:343) is literally 'Error toast does NOT appear'. More useful fallback-catching criteria: 'A generic empty state message like No items found does NOT appear when the API key is valid', 'The fallback banner with text Temporarily unavailable does NOT appear', 'The cart total is NOT $0.00', 'The page does NOT redirect to /offline'. Each forces the agent to affirmatively prove the defensive path did not fire.

How is this different from just mocking the network layer in a unit test?

Unit tests with mocked network calls prove the happy path of the real implementation. They cannot prove the AI-inserted try/catch did not convert a 500 into a silent empty array at runtime. Assrt runs a real Playwright browser against your real dev server, which calls the real backend. If the fallback fires in production-shaped conditions, the scenario fails. Mocks can't catch this because mocks bypass the layer where the defensive fallback lives.

Do I have to write Playwright code to get this?

No. You write plain-language scenarios in a #Case N: format, and a short list of passCriteria. Assrt uses Playwright under the hood (real browser, real network, real screenshots) but you never touch selectors or wait conditions. The same AI agent that wrote the suspect fallback can extend the scenario file in the same turn. There is no proprietary YAML.

Is it open source and self-hosted?

Yes. assrt-mcp and the CLI are open source, run locally, and require no cloud account. You install with npx @assrt-ai/assrt setup. The artifact uploader is opt-in. Tests live in your repo as plain text files. Vendor lock-in is zero.

Can I pipe this into my existing CI?

Yes. assrt run --json writes a TestReport (url, scenarios[], totalDuration, passedCount, failedCount, generatedAt) to stdout. A single jq .failedCount check is a complete gate. Run it in GitHub Actions, Vercel build hooks, or a pre-push git hook. failedCount > 0 fails the build; there is nothing else to learn.

What kinds of fallback code does this actually catch?

Catch-all try/catch around fetch that returns [] or null. Next.js route handlers that swallow thrown errors and return {error: 'Something went wrong'}. React error boundaries that render a generic 'Temporarily unavailable' banner. useEffect hooks that silently set loading=false on rejection. Server components that return a default empty object when the DB call throws. All of those render a page that looks fine to a passing green test but is broken from the user's point of view.