Run Browser Tests After Every Deploy: The Automation Guide
Your TypeScript compiles. The linter is green. Your unit tests pass. Claude Code helped you build the feature in two hours. You deploy with confidence. And then someone messages you: the checkout button is hidden behind a modal on mobile. The gap between “static analysis clean” and “actually works for users” is what automated browser testing closes.
“Type checks pass, linter happy, unit tests green. The checkout button is hidden behind a modal on mobile. No static analysis tool catches this. A browser test catches it in seconds.”
1. Why Static Analysis Cannot Catch Real Browser Bugs
Modern frontend tooling is genuinely impressive at catching a specific category of bugs: bugs that exist in the code before it runs. TypeScript catches type mismatches. ESLint catches patterns known to cause problems. Bundlers catch missing imports. These tools work at the code level, before the code encounters a real browser, real users, or real data.
The bugs that escape all of these tools are bugs that only exist when code runs in a browser. The DOM is a runtime artifact, not a static structure. Your code describes what should happen. The browser interprets that description through a pipeline of HTML parsing, CSS cascade resolution, layout calculation, compositing, and event delegation. Bugs can emerge at any stage of that pipeline, and none of them show up in a TypeScript type check.
Consider the checkout button example. A developer adds a promotional banner that appears above the checkout area on mobile viewports. The banner uses position: sticky with a z-index value, and it creates a new stacking context. The checkout button, which previously worked perfectly, now sits beneath the banner in the stacking order on screens narrower than 480px. The button is rendered. It is in the DOM. It passes TypeScript. ESLint does not know that z-index stacking contexts work this way. The developer's desktop test environment never shows the 480px breakpoint. Users on iPhone SE see a button they cannot click.
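A browser test exercises exactly the conditions that expose this bug: a narrow viewport and a real click. The sketch below is illustrative, not prescriptive; the `/checkout` route and the "Checkout" button label are assumptions you would adapt to your own app.

```typescript
// Sketch of a Playwright test for the stacking-context bug described above.
// The /checkout route and "Checkout" label are illustrative assumptions.
import { test, expect } from "@playwright/test";

test("checkout button is clickable on a narrow viewport", async ({ page }) => {
  // Reproduce the iPhone SE-class screen width where the bug appears.
  await page.setViewportSize({ width: 375, height: 667 });
  await page.goto("/checkout");

  const button = page.getByRole("button", { name: "Checkout" });
  await expect(button).toBeVisible();

  // With trial: true, Playwright runs its actionability checks (visible,
  // stable, receives pointer events) without performing the click, so the
  // test fails if the sticky banner intercepts the pointer.
  await button.click({ trial: true });
});
```

The "receives events" actionability check is the key: a button that is rendered but covered by a higher stacking context passes a visibility assertion yet fails the click.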
AI-assisted development accelerates the creation of this category of bug. When you accept AI-generated code for a component, you may not fully understand every CSS interaction it introduces. The AI is confident. The type checker is satisfied. The browser disagrees with both of them. The only way to know is to run the code in a real browser and verify that users can actually complete their tasks.
2. What Automated Browser Tests Actually Catch
Browser tests that run in real browsers (via Playwright, which controls Chromium, Firefox, and WebKit) catch a category of bugs that is essentially invisible to any other tool. Understanding this category explains why browser testing is not redundant with the static analysis you already run.
Layout and stacking context bugs. Elements positioned incorrectly due to CSS cascade interactions, z-index stacking contexts created by transforms or filters, overflow hidden cutting off interactive elements. These only exist in a rendered layout, which only a browser can compute.
Mobile viewport regressions. Features that work at desktop widths but break at mobile viewport sizes. Navigation menus that overflow their containers, CTAs that become inaccessible due to a fixed footer, forms that render off-screen on small displays. Playwright can run tests at multiple viewport sizes with a single configuration, catching these regressions automatically.
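One way to get that multi-viewport coverage is Playwright's projects feature, which runs the same suite once per configuration. A minimal config sketch, assuming a desktop size plus one of Playwright's built-in device presets:

```typescript
// playwright.config.ts — run every test at desktop and mobile viewports.
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    { name: "desktop", use: { viewport: { width: 1280, height: 720 } } },
    // Playwright ships device presets (viewport, user agent, touch support)
    // for common phones; "iPhone SE" is one of them.
    { name: "mobile", use: { ...devices["iPhone SE"] } },
  ],
});
```

With this in place, `npx playwright test` runs each test twice, and a failure report tells you which viewport broke.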
Third-party script interference. Your analytics, support chat, or cookie consent banner injects elements that interact with your layout. These scripts often load asynchronously and only appear after the page has been live for a few seconds. Browser tests that wait for the full page load state catch interference that tests against your dev environment (which may block these scripts) miss.
JavaScript runtime errors. Code that throws in the browser console but not in unit tests, because the error depends on the browser environment, on a DOM API not available in jsdom, or on timing of async operations that behave differently in a real browser than in a test runner.
Integration failures across components. Unit tests test components in isolation. Browser tests test them working together. A state management issue that causes component A to pass stale props to component B only shows up when both components are rendered in the same browser session.
3. Setting Up Browser Tests to Run After Every Deploy
The architecture for post-deploy browser testing is straightforward. Your CI pipeline waits for the deployment to complete, receives the deployment URL (preview URL for PRs, staging URL for merges to main), and runs the browser test suite against that URL. This tests the actual deployed artifact rather than a local development server.
For Vercel deployments, the deployment URL is available as an environment variable in GitHub Actions after the Vercel deployment step completes. The pattern is: deploy to Vercel, capture the deployment URL, run Playwright with that URL as the base URL. Netlify, Render, and Railway follow similar patterns with their own environment variables.
The Playwright configuration for testing against a deployment URL uses the webServer option when testing locally and environment variable injection when testing against a deployed URL. A single Playwright config file can handle both cases: if PLAYWRIGHT_TEST_BASE_URL is set, use it; otherwise, start a local dev server. This makes the same test suite runnable both locally and in CI without configuration changes.
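A sketch of that dual-mode config follows. The `npm run dev` command and port 3000 are assumptions; substitute your own dev server details.

```typescript
// playwright.config.ts — one config for local runs and deployed-URL runs.
import { defineConfig } from "@playwright/test";

// If CI exported a deployment URL, test against it; otherwise run locally.
const deployedURL = process.env.PLAYWRIGHT_TEST_BASE_URL;

export default defineConfig({
  use: { baseURL: deployedURL ?? "http://localhost:3000" },
  // Only start a local dev server when no deployed URL was provided.
  webServer: deployedURL
    ? undefined
    : {
        command: "npm run dev", // assumption: adjust to your app
        url: "http://localhost:3000",
        reuseExistingServer: true,
      },
});
```

In CI, the pipeline sets `PLAYWRIGHT_TEST_BASE_URL` to the preview or staging URL before invoking `npx playwright test`; locally, developers run the same command with no environment setup.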
Authentication is the most common complication in CI browser testing. If your critical flows require a logged-in user, you need a way to authenticate in CI. Playwright supports storage state files that capture browser cookies and localStorage after authentication, which can be reused across tests without repeating the login flow for each test. Store the authentication state file in CI cache to avoid authenticating on every run.
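The storage state pattern can be sketched as a global setup script that logs in once and saves the resulting browser state. The login route, field labels, and post-login URL below are illustrative assumptions:

```typescript
// global-setup.ts — authenticate once, save cookies and localStorage.
// The /login route, field labels, and dashboard URL are assumptions.
import { chromium, type FullConfig } from "@playwright/test";

export default async function globalSetup(config: FullConfig) {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto(process.env.PLAYWRIGHT_TEST_BASE_URL + "/login");
  await page.getByLabel("Email").fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel("Password").fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole("button", { name: "Sign in" }).click();
  await page.waitForURL("**/dashboard"); // assumption: post-login route

  // Persist the authenticated state; tests reuse it via
  // use: { storageState: "playwright/.auth/user.json" } in the config.
  await page.context().storageState({ path: "playwright/.auth/user.json" });
  await browser.close();
}
```

Register the script with `globalSetup` in the Playwright config, and cache the generated state file in CI so most runs skip the login entirely.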
For teams using Claude Code or similar AI coding assistants, the deployment-triggered test run serves as an independent verification layer. The AI writes code, the code deploys, and the browser tests verify that what the AI built actually works in a real browser. This is exactly the kind of automated safety net that makes AI-assisted development sustainable at scale.
4. Generating Tests Without Writing Them From Scratch
The biggest friction in browser testing is writing the tests. A meaningful test suite for a moderately complex application takes weeks to write from scratch. For developers already working fast with AI assistance, adding weeks of test writing is not realistic. This is where automated test generation changes the calculus.
Playwright includes a code generation tool (npx playwright codegen) that records browser interactions and generates Playwright test code from them. This works well for flows you already know: you navigate through the flow manually while the recorder watches and produces test code. The limitation is that you have to manually navigate every flow you want to test, which is time-consuming for large applications.
AI-powered test discovery tools take this further by crawling the application automatically. Tools like Assrt navigate through your deployed application, identify interactive elements and flows, and generate test scenarios without requiring manual navigation. The output is standard Playwright test code, not a proprietary format, so the tests commit to your repository and run in any CI pipeline. Assrt is open-source and free, and its self-healing selectors reduce maintenance when the UI changes.
The workflow that works well for AI-assisted developers: use automated generation to produce an initial test suite, review the generated tests to add assertions for the specific behaviors you care about, commit the tests alongside the feature code. When the next feature ships, re-run generation on the updated application to pick up any new flows, review the diff to see what changed, and commit the updates. This keeps the test suite in sync with the application without a dedicated test-writing sprint.
5. Keeping Tests Fast Enough for Every Feature Change
The objection to running browser tests after every deploy is usually speed. A comprehensive browser test suite takes 10 to 30 minutes. If you are deploying multiple times per day, waiting 30 minutes after every deploy is not practical. There are well-established strategies for making this work.
Parallelization is the biggest speed lever. Playwright supports test sharding, which splits the test suite across multiple CI workers. A 20-minute suite running on 4 workers takes 5 to 7 minutes. GitHub Actions supports matrix builds that can shard Playwright tests with a few lines of configuration. Most CI platforms support a similar approach. The additional cost in CI minutes is real but small compared to the engineering time saved by catching bugs before production.
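The sharding invocation itself is a single flag; each CI worker runs one slice of the suite:

```shell
# Run shard 1 of 4; a CI matrix runs the other three shards in parallel.
# In a GitHub Actions matrix with shard values 1..4, the step becomes:
#   npx playwright test --shard=${{ matrix.shard }}/4
npx playwright test --shard=1/4
```

Playwright balances the shards across test files, so no extra bookkeeping is needed when tests are added or removed.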
Tiered test execution is the other key strategy. Separate tests into tiers based on criticality. Tier one covers critical flows (authentication, primary value delivery, billing) and runs on every deploy, taking 3 to 5 minutes. Tier two covers secondary flows and runs on merges to the main branch. Tier three runs nightly. Every deploy gets meaningful browser verification from tier one. The full suite runs at controlled points in the development cycle.
Smoke tests specifically designed for deployment verification are a refinement of the tier one concept. A smoke test suite covers 5 to 10 critical user interactions: can the page load, can a user authenticate, can they access the core feature, can they complete a transaction. These tests take 2 to 3 minutes and provide high confidence that the deployment did not break anything fundamental. Full coverage comes from the broader suite that runs less frequently.
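One common way to carve out the smoke tier is tag-based filtering: tag the critical tests and run only those on deploy. A sketch, assuming hypothetical routes and labels for a generic app:

```typescript
// smoke.spec.ts — a minimal deployment-verification tier.
// Routes and link labels are illustrative assumptions.
import { test, expect } from "@playwright/test";

test("@smoke home page loads", async ({ page }) => {
  await page.goto("/");
  await expect(page).toHaveTitle(/.+/); // any non-empty title
});

test("@smoke user can reach the core feature", async ({ page }) => {
  await page.goto("/");
  await page.getByRole("link", { name: "Dashboard" }).click();
  await expect(page).toHaveURL(/dashboard/);
});
```

The deploy pipeline runs `npx playwright test --grep @smoke` for the fast tier; the nightly job runs the full suite by omitting the flag.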
6. Integrating Browser Testing Into an AI-Assisted Dev Workflow
AI-assisted development changes how testing fits into the development workflow. When code generation is fast, the friction in the pipeline moves to verification. The developer who can verify AI-generated code quickly and reliably ships more confidently than the developer who either skips verification or spends days writing tests manually.
The practical integration looks like this. You use an AI assistant to build a feature. The code looks right. You deploy to a preview environment. Automated browser tests run against the preview URL. You get a result: either the tests pass (the feature works in a real browser), or they catch a regression (something broke). If they catch something, you fix it before merging. If they pass, you merge with confidence.
The key is that the browser tests are running against the actual deployment, not against a mocked environment. This is especially important for AI-generated code because the AI may have generated something that works in unit tests but fails due to environment differences, third-party integrations, or browser-specific behavior it did not account for.
For developers who use Claude Code specifically, this workflow is directly relevant. Claude Code can generate code quickly and accurately, but the generated code still needs browser verification. Adding a post-deploy test step to the workflow catches the class of bugs that AI coding assistants generate most frequently: CSS layout issues, mobile viewport problems, and integration failures that only show up when the full application stack is running in a real browser.
The developer who catches bugs in CI before production reaches users is the developer who ships fast without breaking things. That is the reputation that leads to more interesting work, more responsibility, and better career outcomes. Browser testing after every deploy is not overhead. It is the mechanism that makes sustainable velocity possible.