E2E vs integration testing, in one #Case

Splitting tests by layer is a tooling artifact. One agent, both surfaces.

The conventional split is clean: integration tests hit the backend, E2E tests drive the browser, never the two shall meet in a single file. That partition exists because most runners only give you one of the two surfaces. Assrt declares browser actions and HTTP requests as peer tools the agent invokes from inside the same #Case. You watch the form submit, then poll the webhook, then assert the payload, all in one assertion stream. This page shows the exact lines in src/core/agent.ts where that becomes real.

Matthew Diakonov · 11 min read

From one #Case, both layers
  • http_request tool defined at src/core/agent.ts:171-184, a peer of click and assert
  • 'External API Verification' system prompt at agent.ts:243-247 wires the cross-layer pattern
  • 30s fetch timeout and a 4000-char response window at agent.ts:925-955
  • {{VAR}} substitution at agent.ts:377-381 keeps tokens out of plans
  • scenarioPassed at agent.ts:901 collapses browser + API asserts to one bit
  • MIT licensed, plans are plain markdown, no proprietary YAML

Why the split exists in the first place

Integration tests stop short of the browser because the runners that catch most of them (Vitest, Jest, pytest) cannot drive a browser. E2E tests stop short of internals because the runners that drive a browser (Playwright, Cypress) treat raw HTTP as a fixture, not as a peer of click. That is why the two kinds of tests live in different files. The split is an artifact of who owns the loop, not a fact about the bugs you are trying to catch.

A real bug in a contact form does not respect that boundary. The form submit goes through the browser, the webhook fires through a server, the proof is in Slack. Catching that bug from a browser-only test means trusting the response toast. Catching it from an API-only test means trusting that nothing in the UI is quietly broken. The bug actually lives across the seam, which is exactly where conventional suites do not look.


Where the boundary disappears in the source

The single load-bearing fact is that http_request is declared in the same TOOLS array as click, type_text, and assert. Nothing in the runtime distinguishes a browser tool from an API tool; the LLM picks whichever fits the next step. That is why a plan can move between layers without a fixture or a context switch.
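To make "peer tools" concrete, here is a minimal sketch of the idea in TypeScript. The tool names match the article; the shapes, the stub bodies, and the invoke helper are assumptions for illustration, not the agent.ts implementation:

```typescript
// Sketch: browser actions and raw HTTP share one TOOLS array, so nothing
// in the loop knows about "layers". Stubs stand in for real handlers.
type Tool = {
  name: string;
  description: string;
  run: (args: Record<string, unknown>) => Promise<string>;
};

const TOOLS: Tool[] = [
  {
    name: "click",
    description: "Click an element by accessibility ref",
    run: async (args) => `clicked ${args.ref}`, // stub
  },
  {
    name: "http_request",
    description: "Call an external API and return status + body",
    run: async (args) => `GET ${args.url} -> 200`, // stub
  },
  {
    name: "assert",
    description: "Record a pass/fail assertion with evidence",
    run: async (args) => `assert ${args.passed}`, // stub
  },
];

// The agent picks a tool by name each turn; the layer is a property of
// the tool, not of the test.
async function invoke(name: string, args: Record<string, unknown>): Promise<string> {
  const tool = TOOLS.find((t) => t.name === name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.run(args);
}
```

A plan step that says "click Send" and the next step that says "poll the Slack API" both resolve through the same `invoke` path.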

src/core/agent.ts:171-184 (http_request tool definition)

"When testing integrations (Telegram, Slack, GitHub, etc.): use http_request to call external APIs. This lets you verify that actions in the web app produced the expected external effect."

— src/core/agent.ts:243-247 (External API Verification system prompt block)

The TestAgent picks the next tool, regardless of layer

Browser actions, accessibility-tree snapshots, in-page JS, and external HTTP calls all flow through one agent into one assertion stream. The diagram below is the actual data flow at runtime, not a marketing abstraction.

[Diagram: one TestAgent, multiple peer tools, one assertion stream. click, type_text, snapshot, evaluate, and http_request all route through the TestAgent; its assert calls roll up through scenarioPassed into complete_scenario.]

A cross-layer #Case, written in markdown

The plan below is a single #Case that proves the contact form works end-to-end and the webhook fired. It uses click and type_text for the browser side, http_request for the Slack side, and assert twice. There is no fixture, no second context, no setup hook.

/tmp/assrt/scenario.md
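A sketch of what such a plan could look like. The #Case heading and the {{VAR}} tokens follow the conventions described in this article; the step wording, the field names, and the SLACK_CHANNEL variable are illustrative, not copied from a real plan:

```markdown
#Case: contact form posts to Slack

1. Open {{BASE_URL}}/contact
2. Type "Ada" into the name field and "hello from the test" into the message field
3. Click "Send"
4. Assert the "Thanks" toast is visible
5. http_request GET https://slack.com/api/conversations.history?channel={{SLACK_CHANNEL}}
   with header Authorization: Bearer {{SLACK_TOKEN}}
6. Assert the newest message contains "hello from the test"
```

Steps 3-4 are browser tools, steps 5-6 are HTTP plus assert; nothing in the file marks the transition.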

The same coverage in Playwright + APIRequestContext

The Playwright equivalent works, but it pays for the layer split: a separate APIRequestContext fixture, a separate auth setup, a separate dispose(), and an expect() per side that the spec author has to remember to write. The Assrt version is the same idea expressed once.

tests/contact-form.spec.ts (Playwright + APIRequestContext)
/tmp/assrt/scenario.md (Assrt cross-layer #Case)

What it looks like when the agent runs it

The terminal stream below is what shows up in Claude Code when a cross-layer #Case executes. Browser steps and HTTP steps interleave naturally; the agent decides per-turn which tool to call. There is no per-layer routing log.

claude-code: assrt_test

The execution flow, message by message

Lifelines for the four actors that participate in a cross-layer #Case. The TestAgent is the only place the plan sees; the browser and the Slack API are siblings under it.

Cross-layer execution: browser and API as peer downstreams

  • #Case 1 → TestAgent: click 'Send'
  • TestAgent → Browser (Playwright MCP): click ref=e23
  • Browser → TestAgent: snapshot shows 'Thanks' visible
  • TestAgent: assert: toast ok
  • TestAgent → Slack API: http_request GET /conversations.history (fetch with bearer)
  • Slack API → TestAgent: 200 + messages[]
  • TestAgent: assert: webhook payload matches

How both layers collapse to one passed bit

The assert tool is what closes the loop. Whether the assertion came from inspecting the DOM or from inspecting an API response, it writes into the same scalar. One passed:false flips the whole #Case to failed. There is no per-layer reconciliation in the report.

src/core/agent.ts:893-903 (assert runtime)

Five things that follow from peer tools


1. Plan reads as one narrative

A #Case alternates browser steps and HTTP steps in the order a human would: click the button, then check that the side effect actually landed. No 'browser block, then API block,' no fixture switch.


2. Agent picks the next tool from the same TOOLS array

agent.ts:16-196 declares click, type_text, snapshot, evaluate, http_request, and assert as peer entries. The LLM picks one per turn based on what the plan says next. There is no per-layer routing logic; the layer is a property of the tool, not the test.


3. Bearer tokens and signed payloads pass inline

http_request takes headers as a key-value object and body as a JSON string (agent.ts:171-184). The agent at agent.ts:925-955 calls fetch with those values, returns status plus up to 4000 chars of response. {{SLACK_TOKEN}} substitution lets you keep secrets out of the plan.


4. assert collapses both layers to one bit

The assert tool (agent.ts:893-903) takes description, passed, evidence and writes to scenarioPassed. A failed UI check and a failed API check feed the same scalar. There is no 'this side passed but that side failed': the #Case passed or it did not.


5. complete_scenario emits a structured run report

Each #Case ends with complete_scenario. The run report stores assertions as { description, passed, evidence } so a CI consumer can see exactly which layer produced which evidence. You get one stream of evidence, not two reports to reconcile.
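That report shape can be sketched as follows. The { description, passed, evidence } field names come from the article; the surrounding types and the completeScenario helper are assumptions, not the agent.ts code:

```typescript
// Sketch of the run report described above: each assertion keeps its own
// evidence, while the scenario verdict collapses to a single bit.
type Assertion = { description: string; passed: boolean; evidence: string };

type RunReport = { scenario: string; passed: boolean; assertions: Assertion[] };

function completeScenario(scenario: string, assertions: Assertion[]): RunReport {
  // One failed assert, from either layer, fails the whole #Case.
  return { scenario, passed: assertions.every((a) => a.passed), assertions };
}
```

A CI consumer reads `passed` for the verdict and the assertions array to see which layer produced which evidence.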

The mechanics

Six concrete pieces of the runtime that together make the cross-layer pattern possible. Each one points at a line of open-source code at github.com/assrt-ai/assrt-mcp.

http_request is declared next to click

TOOLS at agent.ts:16-196 lists every action the agent can take. http_request is index 17, click is index 3, assert is index 14. Nothing in the system separates them by layer. The agent picks whichever fits the next step in the plan.

30s timeout, 4000-char response window

agent.ts:925-955 issues fetch with a 30-second AbortController and truncates the body to 4000 chars. Long bodies print '...(truncated)' so the LLM does not blow context on a 1MB response. Status line is always returned.
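A minimal sketch of that fetch pattern, assuming Node 18+ (global fetch). The 30-second timeout, the 4000-char window, and the '...(truncated)' suffix come from the article; the helper names are mine, not the agent.ts code:

```typescript
// Sketch: abort the request after 30s, cap the body the LLM sees at
// 4000 chars, and always return the status line first.
const TIMEOUT_MS = 30_000;
const MAX_CHARS = 4000;

function truncate(body: string): string {
  return body.length > MAX_CHARS
    ? body.slice(0, MAX_CHARS) + "...(truncated)"
    : body;
}

async function httpRequest(url: string, init: RequestInit = {}): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), TIMEOUT_MS);
  try {
    const res = await fetch(url, { ...init, signal: controller.signal });
    const body = await res.text();
    // Status line is returned even when the body is truncated.
    return `${res.status} ${res.statusText}\n${truncate(body)}`;
  } finally {
    clearTimeout(timer);
  }
}
```

The cap matters because a 1MB webhook payload would otherwise consume the agent's context window in one turn.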

Variables for tokens and signing keys

Plans support {{VAR_NAME}} substitution at agent.ts:377-381. Pass SLACK_TOKEN, GITHUB_PAT, signing keys as variables on the assrt_test call; the runtime swaps them in before the agent ever sees the plan text.
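A sketch of what that substitution could look like; the regex and function name are assumptions for illustration, not the agent.ts implementation:

```typescript
// Sketch: swap {{VAR_NAME}} tokens for values before the agent sees the
// plan text, so secrets never appear in the stored markdown.
function substituteVars(plan: string, vars: Record<string, string>): string {
  return plan.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    name in vars ? vars[name] : match, // unknown tokens stay as-is
  );
}
```

Unknown tokens are left intact here so a typo in the plan surfaces as a visible literal rather than an empty string.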

No new context, no dispose()

Playwright requires a separate APIRequestContext.newContext() for non-browser HTTP and a matching dispose() at the end. Assrt's http_request lives in the same execution context as the browser tools. The plan stays linear.

scenarioPassed is one bit

agent.ts:901: `if (!passed) scenarioPassed = false`. Browser-side asserts and API-side asserts both write to it. There is no per-layer pass/fail split a downstream consumer would have to reconcile.

Open source, MIT, your tests are markdown

The plan is plain markdown stored anywhere you want. The runner is MIT-licensed at github.com/assrt-ai/assrt-mcp. If Assrt disappears tomorrow, your plans still describe the test; you bring your own runner.

What you get when the boundary collapses

Practical consequences of writing one cross-layer scenario instead of two single-layer ones.

Effects of writing one #Case across both layers

  • Single source of truth per behavior: one #Case covers the form submit and the webhook fire, no sibling directory to hunt.
  • Drift between mock and prod is gone: the HTTP call hits the real Slack API, not a mock, so upstream schema changes surface as scenario failures.
  • Failure messages name the layer: each failed assert stores description plus evidence, so reports show 'form passed, webhook missing,' not just red.
  • Claude Code or Cursor can author them: plans are markdown, so an IDE agent writes a new #Case from the diff with no DSL to learn.

By the numbers, pulled from the runtime

Four constants you are working against, all sourced from the agent code, none invented.

30s
http_request fetch timeout (agent.ts:938-942)
4000 chars
response window before truncation
1 bit
scenarioPassed across both layers
0 YAML
plans are plain markdown

Cross-layer #Case vs. Playwright + APIRequestContext

Both approaches can verify a contact form posts to Slack. The Playwright route maintains two contexts and joins them in your spec author's head. The Assrt route keeps one agent, one plan, one assertion stream.

| Feature | Playwright + APIRequestContext | Assrt |
| --- | --- | --- |
| Surface a single test can touch | Browser only, or API only, per file | Browser + external HTTP, peer tools |
| How auth flows between layers | Two separate contexts, two auth setups | Bearer in http_request headers, set inline |
| Where the assertion lives | expect() per layer, joined by spec author | assert tool, one stream, one passed bit |
| Failure attribution | Stack trace plus your own logging | Which layer the failed assert came from |
| Drift between mock and prod | Mock asserts contract, prod can diverge | Real API hit, real response inspected |
| Authoring surface | TS spec + APIRequestContext fixture | Markdown #Case, no fixtures, no dispose() |
| What you keep when you leave | Vendor YAML or proprietary DSL | Plain markdown plans, MIT-licensed runner |

When you should still keep the layer split

For pure-function units, the cross-layer pattern is overkill. A reducer, a SQL query plan, or a date parser does not need a browser, and Vitest or pytest will run hundreds of those in milliseconds. Assrt is not a unit-test framework and should not be used like one.

The cross-layer pattern earns its keep wherever the bug-surface straddles the seam: form submits with side effects, OAuth callbacks, webhook integrations, billing flows, anything where a frontend assertion would be misleading without a backend verification, and a backend assertion would be misleading without proof the user could trigger it. Those tests are the ones that used to live in two files. Now they live in one.

Walk through a cross-layer #Case on your own app

Bring a webhook-driven flow you already have a flaky test for. We will write the cross-layer #Case live and run it against your staging URL.

Frequently asked questions

What is the textbook difference between E2E and integration testing?

Integration tests verify two or more units talk to each other correctly: a service against a real database, an API handler against its serializer, two modules sharing a contract. They stop short of driving a real browser. E2E tests drive a real browser through the full user-visible stack: the DOM, network requests, third-party scripts, authentication. The conventional advice is to write many integration tests and few E2E tests because the latter are slow, flaky, and expensive per assertion.

Why are these two kinds of tests usually written in different files?

Because most runners only give you one of the two surfaces. Playwright drives a browser; APIRequestContext is a fixture you opt into, and the convention is one or the other per spec. Cypress has cy.request, but it is treated as setup, not as a peer of cy.visit. Vitest and Jest run in Node and never see a browser. The split between integration and E2E is therefore a tooling artifact, not a fact about the bugs you are trying to catch.

What does Assrt actually do differently?

Assrt declares browser actions and HTTP requests as peer tools the LLM agent calls in any order from inside one scenario. The TOOLS array at src/core/agent.ts:16-196 lists click, type_text, snapshot, evaluate, assert, and http_request side by side. The agent driving a #Case picks whichever tool fits the next step. A single scenario can fill out a form in the browser, then call an external API to verify the side effect, then return to the browser to confirm the UI updated.

Where is this wired in the source code?

Two places. The http_request tool definition is at agent.ts:171-184, with parameters for url, method, headers, and body. The system prompt at agent.ts:243-247 includes a section titled 'External API Verification' that tells the agent: 'When testing integrations (Telegram, Slack, GitHub, etc.): 1. Use http_request to call external APIs (e.g. poll Telegram Bot API for messages). 2. This lets you verify that actions in the web app produced the expected external effect.' The example given is calling https://api.telegram.org/bot<token>/getUpdates after a Telegram-connect flow. That is the entire integration-plus-E2E pattern in one block of prompt.

Is this just a cy.request equivalent?

No, the mental model is different. cy.request is a fixture that runs outside the browser and returns to the test runner; it is treated as out-of-band setup. http_request in Assrt is a tool the agent invokes mid-scenario the same way it calls click. There is no mental break, no 'now we go out of the browser, now we come back.' The agent reads the accessibility tree, sees the form was submitted, calls http_request to check the API, sees the response, then snapshots the DOM again to confirm the UI reflected the change. Each step is one tool call against a unified context.

What does that mean for how I structure my test suite?

You stop maintaining two suites. The integration-test directory and the E2E-test directory collapse into a single directory of #Case markdown files. A scenario named 'webhook fires after settings save' covers what would have been one webhook integration test plus one settings-save E2E test, with the cross-layer assertion that they agree. You write fewer scenarios, each catches more.

When does the boundary still matter?

When the unit under test has no UI surface. A pure function, a serializer, a SQL query: those should stay in Vitest or pytest, run in milliseconds, and never touch the browser. Assrt does not replace that layer. Where it changes the calculus is everything that involves a UI plus a service, which is most of the bug-surface in a real product.

Does the http_request tool support auth headers and bodies?

Yes. The tool schema at agent.ts:171-184 takes url, method (GET/POST/PUT/DELETE), headers as a key-value object, and body as a JSON string. The runtime at agent.ts:925-955 issues the request via fetch with a 30-second timeout, returns the status line and up to 4000 chars of the response, and truncates the rest. The agent can read that response and decide what to do next. Bearer tokens, signed payloads, and webhook signatures all work; you pass them as part of the plan.
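To make the schema concrete, here is a hedged sketch of the arguments such an http_request call might carry for the Slack check. The field names (url, method, headers, body) come from the schema described above; the endpoint, the SLACK_CHANNEL variable, and the token placeholder are illustrative:

```json
{
  "url": "https://slack.com/api/conversations.history?channel={{SLACK_CHANNEL}}",
  "method": "GET",
  "headers": {
    "Authorization": "Bearer {{SLACK_TOKEN}}"
  }
}
```

The {{...}} tokens are substituted by the runtime before the request is issued, so the stored plan never contains the live token.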

Can the agent fail the scenario based on the API response?

Yes, and this is the point. After http_request returns, the agent calls assert with description, passed, and evidence. The runtime at agent.ts:893-903 records the assertion: a passed:false flips scenarioPassed for the whole #Case. The browser side and the API side feed into the same pass/fail bit. If the form submitted but the webhook never fired, or the webhook fired with the wrong payload, the scenario fails with a structured assertion you can read in the run report.

How is this different from API-mocking middleware in Playwright or MSW?

Mocking proves your frontend behaves correctly when the backend behaves correctly. http_request proves your backend actually behaved correctly. The two kinds of tests catch different bugs. Mocking can hide a real-world failure: the frontend passes against the mock and breaks against production. The cross-layer scenario in Assrt observes both sides of the boundary in one run, so 'the form said it worked' and 'the side effect happened' are tied together in a single assertion stream. You still mock in unit tests where mocking is the right move; you stop mocking in scenarios that need to catch full-stack drift.

assrt · Open-source AI testing framework
© 2026 Assrt. MIT License.