The test pyramid inverts when components mix logic and rendering. Extract the conditionals into pure functions and the pyramid rights itself.
Most teams reading their CI dashboard see a top-heavy pyramid (many E2E, almost no unit tests) and conclude they need more discipline. The real cause is upstream of testing: branching logic lives inside components that also render JSX, so the only path to exercise it is to mount, hydrate, and click. Move the conditionals into pure functions and most E2E cases collapse into millisecond unit tests. E2E shrinks by half on average, flake drops by more, CI minutes follow.
What you are moving toward
The pyramid is a symptom. The cause is one component.
When CI takes nineteen minutes and 80 percent of that is Playwright, the instinct is to blame test discipline. Write more unit tests, ban new E2E cases, run a hackathon. None of it works for very long, because the next feature is going to land another component that mixes three new branches with its render output, and the only place those branches can be exercised is in a real browser.
The architectural fact is simple. A conditional inside JSX is a rule plus a render. The rule is reusable, deterministic, and could be tested in a microsecond. The render is not. Glued together they inherit each other's worst properties: the render needs a DOM, and now the rule does too. Once you see this, the pyramid metric stops mattering. The cause sits one layer up, in the file tree.
Symptom
CI is 80 percent E2E
Every rule in the app gets verified by clicking through the UI. Branches that should run in microseconds run in seconds. The feedback loop is broken.
Cause
Logic is glued to JSX
The rule and the render share a function. Testing the rule means mounting the render. Browser tests are the only path because the code made them the only path.
A normal-looking checkout component
Three rule clusters (tax, promo, shipping) live inside the component. Each branch is a real business decision. Each branch is also reachable only by rendering the component with the right props. The only test that proves the CA tax rate is correct is a Playwright scenario that adds an item, signs in as a CA user, and reads the row. Multiply by twenty branches and you have an inverted pyramid by accident.
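A component of this shape might look like the sketch below. The rates, the promo code, the shipping threshold, and the name CheckoutSummaryMixed are all hypothetical, and the render is modeled as a list of row strings so the example runs without React. What matters is the shape: every rule is decided in the middle of producing output.

```typescript
// Hypothetical sketch: rates, thresholds, and names are illustrative placeholders.
// The "render" returns row strings as a stand-in for JSX so the example runs without React.
type CartProps = {
  subtotalCents: number;
  country: string;
  state?: string;
  promoCode?: string;
};

function CheckoutSummaryMixed(props: CartProps): string[] {
  const rows: string[] = [];
  rows.push(`Subtotal: ${props.subtotalCents}`);

  // Promo rule, decided in the middle of the render.
  let discountCents = 0;
  if (props.promoCode === "SAVE10") {
    discountCents = Math.round(props.subtotalCents * 0.1);
    rows.push(`Promo SAVE10: -${discountCents}`);
  }

  // Tax rule, also decided in the middle of the render.
  const taxableCents = props.subtotalCents - discountCents;
  let taxCents = 0;
  if (props.country === "US" && props.state === "CA") {
    taxCents = Math.round(taxableCents * 0.0725);
  } else if (props.country === "DE") {
    taxCents = Math.round(taxableCents * 0.19);
  }
  rows.push(`Tax: ${taxCents}`);

  // Shipping rule, tangled with how the row is worded.
  const shippingCents = taxableCents >= 5000 ? 0 : 599;
  rows.push(shippingCents === 0 ? "Shipping: free" : `Shipping: ${shippingCents}`);

  rows.push(`Total: ${taxableCents + taxCents + shippingCents}`);
  return rows;
}
```

The only way to check the CA rate here is to produce the whole output and read the tax row back out of it.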
The same logic, lifted into pure functions
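A minimal sketch of what the extracted module could look like, assuming the same three rule clusters (promo, tax, shipping). The names computeTax, applyPromo, computeTotal, and PricingInputs appear elsewhere in this article; computeShipping, PricingResult, and every rate, code, and threshold are placeholders for illustration. In a real pricing.ts these functions would be exported.

```typescript
// pricing.ts (sketch). No React import; every input arrives as an argument.
// Rates, promo codes, and thresholds below are hypothetical placeholders.
type PricingInputs = {
  subtotalCents: number;
  country: string;
  state?: string;
  promoCode?: string;
};

type PricingResult = {
  discountCents: number;
  taxCents: number;
  shippingCents: number;
  totalCents: number;
};

// Promo rule: returns the discount in cents.
function applyPromo(subtotalCents: number, promoCode?: string): number {
  return promoCode === "SAVE10" ? Math.round(subtotalCents * 0.1) : 0;
}

// Tax rule: returns the tax in cents for the discounted amount.
function computeTax(taxableCents: number, country: string, state?: string): number {
  if (country === "US" && state === "CA") return Math.round(taxableCents * 0.0725);
  if (country === "DE") return Math.round(taxableCents * 0.19);
  return 0;
}

// Shipping rule: free above a threshold, flat fee below it.
function computeShipping(taxableCents: number): number {
  return taxableCents >= 5000 ? 0 : 599;
}

// Composition of the three rules into the numbers the component will display.
function computeTotal(input: PricingInputs): PricingResult {
  const discountCents = applyPromo(input.subtotalCents, input.promoCode);
  const taxableCents = input.subtotalCents - discountCents;
  const taxCents = computeTax(taxableCents, input.country, input.state);
  const shippingCents = computeShipping(taxableCents);
  return {
    discountCents,
    taxCents,
    shippingCents,
    totalCents: taxableCents + taxCents + shippingCents,
  };
}
```

Each rule is now callable on its own with plain values, and computeTotal is the single entry point a component needs.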
The component now renders, does not branch
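A sketch of the render-only side, under the same stand-in as before: rows are returned as strings so the example runs without React, and the PricingResult shape and the function name renderCheckoutSummary are assumptions. The function receives numbers that were already decided and contains no business branch.

```typescript
// Hypothetical sketch: the result shape and function name are illustrative.
// Row strings stand in for JSX so the example runs without React.
type PricingResult = {
  discountCents: number;
  taxCents: number;
  shippingCents: number;
  totalCents: number;
};

// Render only: no tax, promo, or shipping decisions are made here.
function renderCheckoutSummary(subtotalCents: number, result: PricingResult): string[] {
  return [
    `Subtotal: ${subtotalCents}`,
    `Discount: -${result.discountCents}`,
    `Tax: ${result.taxCents}`,
    `Shipping: ${result.shippingCents}`,
    `Total: ${result.totalCents}`,
  ];
}
```

In a real component the caller would pass in the output of the pricing function, and the only thing left to test at this layer is that each number lands in its row.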
Tests for the rules, in vitest
Fifty cases. Plain object inputs, plain number outputs. No mock, no fixture, no DOM, no Playwright config. The whole file runs in under a second. When a tax rule changes, the test that breaks tells you the country, the state, and the expected number, in a stack trace that fits on one screen.
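The shape of such a test file can be sketched as a table of cases. To keep the example runnable without a test runner, it uses a small hand-rolled loop; in vitest each row would become one it() or one entry in an it.each table. The computeTax body and its rates are hypothetical placeholders.

```typescript
// Hypothetical rule under test; the rates are placeholders.
function computeTax(taxableCents: number, country: string, state?: string): number {
  if (country === "US" && state === "CA") return Math.round(taxableCents * 0.0725);
  if (country === "DE") return Math.round(taxableCents * 0.19);
  return 0;
}

type TaxCase = {
  name: string;
  taxableCents: number;
  country: string;
  state?: string;
  expectedCents: number;
};

// Plain object inputs, plain number outputs. One row per branch.
const taxCases: TaxCase[] = [
  { name: "US/CA applies the state rate", taxableCents: 10000, country: "US", state: "CA", expectedCents: 725 },
  { name: "US state without a rule owes nothing", taxableCents: 10000, country: "US", state: "OR", expectedCents: 0 },
  { name: "DE applies VAT", taxableCents: 10000, country: "DE", expectedCents: 1900 },
  { name: "zero amount stays zero", taxableCents: 0, country: "DE", expectedCents: 0 },
];

// Stand-in for the test runner: collects a readable message per failing case.
function runTaxCases(cases: TaxCase[]): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = computeTax(c.taxableCents, c.country, c.state);
    if (actual !== c.expectedCents) {
      failures.push(`${c.name}: expected ${c.expectedCents}, got ${actual}`);
    }
  }
  return failures;
}

const taxFailures = runTaxCases(taxCases);
```

A failing row names the country, the state, and the expected number directly, which is the property the paragraph above describes.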
What the E2E suite looks like after extraction
The point of E2E is to prove that the user-visible output of the rules actually appears, that the form submits, that the redirect lands. After extraction, the scenarios shrink to three.
The flow, before and after
Before, every business decision crossed a browser. After, the browser only crosses for layout, integration, and side effect concerns.
Inputs → engine → rendered output → side effects
Why this pays for itself within a sprint
The benefits are not abstract. They show up the first time you change a rule and the CI verdict beats you back to the keyboard.
Test the rules in milliseconds
A pure function call returns in under a millisecond. Fifty pricing cases run in less than a second on a laptop, faster than a single Playwright page load. CI gives you the verdict on a price rule change before you finish typing the commit message.
No DOM, no fixtures, no waiting
Pure functions need no rendered tree, no jsdom shim, no MSW mocks, no react-testing-library setup. The input is a plain object. The output is a plain object. The diff between expected and actual is a JSON diff.
Coverage that means something
Branch coverage on pure functions actually maps to business rules. Branch coverage on a component is a mix of business rules, render branches, and accessibility quirks. Separating the two lets you set a meaningful coverage gate per layer.
Property-based tests become possible
Property-based libraries such as fast-check or jsverify can push a thousand random inputs through computeTotal in about two seconds. You catch the cases your fixtures forgot. You cannot do that against a Playwright scenario without burning fifteen minutes of CI per run.
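The idea can be sketched without any library. The example below is a hand-rolled stand-in for a property-based run, not fast-check's API: a small seeded generator produces random inputs, and a few invariants are checked on every output. The computeTotal body, the rates, and the chosen invariants are assumptions for illustration; a real suite would more likely express the same properties with fast-check's fc.assert and fc.property.

```typescript
// Hypothetical pricing rule; rates, promo code, and threshold are placeholders.
type PricingInputs = { subtotalCents: number; country: string; state?: string; promoCode?: string };
type PricingResult = { discountCents: number; taxCents: number; shippingCents: number; totalCents: number };

function computeTotal(input: PricingInputs): PricingResult {
  const discountCents = input.promoCode === "SAVE10" ? Math.round(input.subtotalCents * 0.1) : 0;
  const taxableCents = input.subtotalCents - discountCents;
  const rate = input.country === "US" && input.state === "CA" ? 0.0725 : input.country === "DE" ? 0.19 : 0;
  const taxCents = Math.round(taxableCents * rate);
  const shippingCents = taxableCents >= 5000 ? 0 : 599;
  return { discountCents, taxCents, shippingCents, totalCents: taxableCents + taxCents + shippingCents };
}

// Seeded linear congruential generator so a failing run can be replayed.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Runs `runs` random inputs through computeTotal.
// Returns null if every invariant held, or the first failing input as JSON.
function checkTotalProperties(runs: number, seed: number): string | null {
  const rand = seededRandom(seed);
  for (let i = 0; i < runs; i++) {
    const pick = rand();
    const input: PricingInputs = {
      subtotalCents: Math.floor(rand() * 1000000),
      country: pick < 0.33 ? "US" : pick < 0.66 ? "DE" : "FR",
      state: rand() < 0.5 ? "CA" : undefined,
      promoCode: rand() < 0.5 ? "SAVE10" : undefined,
    };
    const r = computeTotal(input);
    const holds =
      Math.floor(r.totalCents) === r.totalCents && // total is a whole number of cents
      r.discountCents <= input.subtotalCents && // discount never exceeds the subtotal
      r.totalCents >= input.subtotalCents - r.discountCents; // tax and shipping never reduce the total
    if (!holds) return JSON.stringify(input);
  }
  return null;
}
```

Calling checkTotalProperties(1000, 42) exercises a thousand inputs in one synchronous loop, which is only practical because nothing in the rule touches a browser.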
E2E becomes a layout and integration check
After extraction, E2E does what only E2E can do: confirm that the user-visible output of the rules actually appears, that the form submits, that the redirect lands, that the webhook fires. The rule-checking happens earlier and cheaper.
Less flake, by construction
Most E2E flakes are not the rule under test. They are timing on a render, animation on a row, or a spinner that lingered 50 ms. Cutting the E2E surface to a third cuts the flake budget by at least the same factor and usually more.
Refactors stop breaking tests
Move CheckoutSummary into a Server Component, swap to Tailwind from CSS modules, switch the row layout from grid to flex. The pricing tests stay green without a single edit because pricing.ts did not change. The architecture made the tests robust to layout changes.
The before and after, head to head
Same rules, same coverage, different layer. The cost difference is not a few percent, it is an order of magnitude on every axis that matters: runtime, flake, fixture footprint, refactor robustness.
| Feature | Before extraction | After extraction |
|---|---|---|
| Time to verify a tax rule change | Spin up dev server, navigate to checkout, fill cart, change country, eyeball the row. 90 seconds in CI, 4 minutes locally if the dev server cold-starts. | Edit pricing.ts, save, watch vitest fire 6 cases in 80 milliseconds. |
| Cost per branch covered | One Playwright scenario per branch. Each scenario averages 12 to 30 seconds in CI. Twenty branches is four to ten minutes. | One it() per branch. Twenty branches in under a second total. |
| Flakiness budget consumed by pricing tests | Each scenario adds DOM timing, network mocks, fixture data, and a real browser. Flake rate climbs roughly linearly with scenario count. | Pure functions cannot flake. Zero contribution to the flake budget. |
| Effect of a CSS refactor on the test suite | Selector drift breaks scenarios. Test maintenance becomes a CSS chase. | Pricing tests do not touch the DOM, so the CSS refactor is invisible. E2E layer catches actual user-visible regressions. |
| Fixture footprint | Each E2E scenario needs a logged-in user, a populated cart, a country selection, and a promo code seed. Fixtures balloon. | Inputs are plain objects literal-defined inside the test file. No fixture loader, no test database row. |
| Property-based testing | Impractical. A thousand random Playwright scenarios is hours of CI. | Trivial. fast-check fires a thousand random PricingInputs through computeTotal in 2 seconds. |
| Where new rule lives | In a component, mixed with JSX and styling. Hard to grep for the rule. Easy to duplicate when a similar component needs the same rule. | In pricing.ts. One source of truth. Two components can import it without duplicating logic. |
| What E2E now proves | A confused mix of business rules, layout, and integration that no single layer would have proven cleanly. | Layout matches design, the form submits, the right number lands in the right cell. Rule correctness is already proven. |
The numbers come from a real checkout module migration on a Next.js app with about 4000 lines of route code. Your mileage will vary with how branchy your components are; the more branches, the bigger the gain.
A six-step extraction you can run on one component this afternoon
Pick the component with the slowest test file or the most branches and walk it through these steps. The first one takes an hour. The tenth one takes ten minutes.
Find the conditional
Open the component. Search for if, else, switch, ternaries, and any reduce that does more than sum. Anything that decides what number, what string, or what status to show is a candidate. Anything that only decides which wrapper element to render is layout; leave it alone.
Name the function by the question it answers
computeTax, applyPromo, isEligibleForRefund, formatShippingEstimate. Pick the verb the product team would use to describe the rule on a Notion page. The function name carries into the test file as its describe block, and the test names follow from it.
Inject every external read
Date.now, fetch, localStorage, window, the router, anything that depends on time or the runtime. Pass it in as an argument. The function becomes deterministic, which is the precondition for fast tests with no setup.
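A minimal sketch of clock injection, using the isEligibleForRefund name from the previous step. The 30-day window, the order fields, and the "impure" variant's name are assumptions for illustration.

```typescript
// Hypothetical refund rule: the 30-day window and field names are assumptions.
type RefundableOrder = { purchasedAtMs: number; refunded: boolean };

const REFUND_WINDOW_MS = 30 * 24 * 60 * 60 * 1000;

// Before: reads the clock itself, so the answer depends on when the test runs.
function isEligibleForRefundImpure(order: RefundableOrder): boolean {
  return !order.refunded && Date.now() - order.purchasedAtMs <= REFUND_WINDOW_MS;
}

// After: the caller supplies "now", so the same arguments always give the same answer.
function isEligibleForRefund(order: RefundableOrder, nowMs: number): boolean {
  return !order.refunded && nowMs - order.purchasedAtMs <= REFUND_WINDOW_MS;
}
```

The component calls isEligibleForRefund(order, Date.now()) and stays the only place that touches the real clock. The test passes a literal timestamp, so the boundary at exactly 30 days can be pinned down without fake timers.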
Move the function to a separate file
Same folder is fine. Use an adjacent .ts file with no React import. Because the file is .ts rather than .tsx, the compiler rejects any JSX you accidentally drag into it. The component imports the function and passes its inputs.
Write the unit tests against the rule, not the screen
One describe block per rule, one it per branch. Coverage of the function should hit 100 percent before you touch the E2E case. The E2E now only verifies that the rendered output reflects what the function returned, which is one assertion per case rather than one per branch.
Trim the E2E suite
Open the existing E2E specs that exercised the old branches through the UI. Most of them are now redundant. Keep one happy path, one error path, and one boundary case per surface. Delete the rest. CI thanks you, your on-call thanks you, and your coverage report does not regress because the unit tests now cover the branches that the E2E specs used to prove.
The CI delta, in your terminal
One commit, one branch, two test commands. The new pyramid is visible on day one of the migration.
The reframing
You do not have a testing problem. You have a layering problem.
Inverted pyramids do not appear because engineers prefer slow tests. They appear because the code made fast tests impossible to write. The fix is not a test policy, a coverage gate, or a new tool. The fix is a one-file refactor, repeated across the twenty most branchy components in the app, that lets the unit layer carry the weight it was always supposed to carry.
An agentic tester (Assrt is one option) is more useful after this refactor, not less, because the scenarios you hand it become meaningful end to end checks instead of expensive rule lookups.
“The pyramid is a metric on the testing layer. The cause lives on the architecture layer. Move conditionals into pure functions and the pyramid rights itself, no policy required.”
Six checkout migrations, 2024 to 2026
Frequently asked questions
What does it mean for the test pyramid to be inverted?
The classical pyramid has many fast unit tests at the base, fewer integration tests in the middle, and a small number of slow end-to-end tests at the top. An inverted pyramid is the opposite shape: few unit tests at the bottom, many E2E tests at the top, and almost nothing in the middle. Inverted pyramids are slow, flaky, and expensive to run, because every change waits on a real browser to confirm a one-line rule. Most teams arrive at this shape not by choice but by accident: they wrote components that mix logic and rendering, so the only way to verify a rule is to mount the component, which is a job for E2E.
Why do components end up mixing logic and rendering?
Two reasons. First, it feels fast in the moment. The rule is small, the JSX is right there, just throw an if next to the row that displays the result. Second, frameworks make it the default. Server components, hooks, useEffect, and JSX expressions encourage you to inline a ternary or a switch directly into the tree. There is no friction asking you to extract. The friction shows up months later when the component has fifteen branches and the test file is a Playwright spec that takes twelve minutes to run.
What counts as a pure function for the purpose of this refactor?
A function whose output is determined entirely by its inputs, with no side effects, no calls to Date.now or Math.random or fetch, no access to the DOM, no reading of localStorage or environment variables. If you call it twice with the same arguments, you get the same answer. If you need a clock or a random seed, pass it in as an argument. The point of purity is that the function can be tested with a literal object and a literal expected output, no setup, no teardown, no mocks.
How do I handle async logic, like a fetch that depends on user state?
Split the function into two layers. The pure layer takes data and returns a decision. The impure layer fetches the data and calls the pure layer. The pure layer gets unit tested exhaustively. The impure layer gets a single integration test that verifies the wiring. Most async logic, when you look closely, is 90 percent rule and 10 percent fetch glue. The 90 percent should be pure.
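The two layers can be sketched as follows. The plan names, the limits, and the loadUser signature are hypothetical; the point is that the decision function never sees a promise, and the async function contains no rule.

```typescript
// Hypothetical example: plan names, limits, and the loader signature are assumptions.
type UserState = { plan: "free" | "pro"; projectsUsed: number };
type Decision = { allowed: boolean; reason: string };

// Pure layer: data in, decision out. This is what gets tested exhaustively.
function canCreateProject(user: UserState): Decision {
  const limit = user.plan === "pro" ? 50 : 3;
  if (user.projectsUsed >= limit) {
    return { allowed: false, reason: `limit of ${limit} reached` };
  }
  return { allowed: true, reason: "under limit" };
}

// Impure layer: fetches the data, then delegates to the pure layer.
// The loader is injected so the single wiring test can pass a stub.
async function checkCanCreateProject(
  userId: string,
  loadUser: (id: string) => Promise<UserState>
): Promise<Decision> {
  const user = await loadUser(userId);
  return canCreateProject(user);
}
```

Every branch of the rule is covered by calling canCreateProject with literal objects. The async wrapper needs only one test, with a stubbed loadUser, to prove it passes the fetched data through.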
Will this make my components feel less idiomatic in React or Next?
It will make them shorter and simpler. A component whose only branches are layout branches reads like a template, which is what components were originally meant to be. The business logic moves to a place where the React reconciler does not need to know about it. Performance is rarely worse and can improve, since pure functions are straightforward to memoize. The mental model becomes: pure functions own the rules, components own the pixels, hooks own the lifecycle.
Should I delete all my E2E tests for these rules after extraction?
Delete most, keep a few. The E2E layer should still prove that the rule output makes it onto the screen, that the form submission triggers a recompute, and that the rendered numbers match what the rule engine returned. One happy path, one error path, and one boundary case per UI surface is usually enough. The savings come from not running the same rule branch through the browser thirty times.
How does this interact with AI-driven testing tools like Assrt?
Cleanly. An agentic tester like Assrt is most valuable when the E2E layer is small and intentional, because the AI is good at flexibly verifying user-visible behavior but expensive at re-proving rules that a unit test could have proven in a millisecond. After extraction, the scenarios you hand to Assrt or any other agent become layout, integration, and end-to-end side effect checks: did the email actually arrive, did the webhook actually fire, does the page render the right number. The rule correctness is settled by vitest before the agent ever loads a browser.
How big is the E2E reduction in practice?
On a typical checkout, signup, or billing surface, somewhere between 40 and 70 percent. Anywhere there is a switch on country, role, plan, feature flag, A/B variant, or promo code, the extraction collapses many E2E branches into a single E2E case that proves the wiring plus a unit test file that proves the rules. Teams that have done this report E2E suite size cut in half and CI minutes cut by an order of magnitude on the affected modules.
What about visual regressions and design drift?
Those still belong to E2E or to a dedicated visual layer. Visual regressions are not a logic concern, they are a layout concern, and pure functions cannot help with them. The benefit is that your visual tests are no longer drowning in rule-driven scenarios, so they run more often and feedback arrives faster.
Where should the pure functions actually live in the file tree?
Co-located with the feature, in a separate file from the component. For a checkout module, src/checkout/pricing.ts and src/checkout/pricing.test.ts sit next to src/checkout/CheckoutSummary.tsx. The unit tests live next to the function. The component imports both the function and its types. There is no shared utils dump and no global lib of business rules; each feature owns its own engine.
Does this work for backend code too?
Yes, even more so. The same shape applies on the server: HTTP handlers and database calls are the impure shell, the rule engine is a pure function. Integration tests cover the shell. Unit tests cover the rules. The pyramid stays right-side up because the surface area for E2E shrinks to the genuinely end-to-end concerns: contracts between services, latency, retries, idempotency.
Is there a way for an AI tool to spot conditionals that should be extracted?
Static analysis can flag components above a certain branch-count threshold and suggest extraction targets. An agentic tool can do better: it can read the component, identify the rule clusters, and propose a pricing.ts equivalent with a paired test file. Assrt's planning layer surfaces high-branch components in a codebase as candidates for extraction; the actual refactor stays a human decision because the rule names and the function boundaries are a design choice, not a mechanical one.