Testing Guide
Playwright Beyond the Basics: Isolation, Stable Locators, and Behavior-Focused Tests
Playwright is easy to pick up, but the real learning curve starts when your test suite grows. Isolation, stable selectors, and behavior-focused structure are what separate a proof of concept from a production-grade test suite.
“Once we locked down isolation and switched to role-based locators, our CI flake rate dropped to near zero.”
1. Why Test Isolation Matters More Than You Think
Every Playwright tutorial starts with writing a single test file that navigates to a page, clicks around, and asserts something. That works fine in isolation. The trouble begins when you have 200 tests sharing a browser instance, a database, or a session cookie. One test logs in as an admin, and the next test unexpectedly inherits that session. Suddenly your "add to cart" test is failing because it's running as an admin user who doesn't see the public storefront.
True isolation means each test starts with a clean slate, and Playwright's architecture makes this easier than most frameworks. The Playwright Test runner gives every test a fresh BrowserContext by default, which is essentially a private browsing session with its own cookies, storage, and cache. Contexts are cheap to create inside a single browser process, so unlike Selenium, where you would typically manage a separate WebDriver session for each isolated test, the overhead is minimal.
The practical pattern looks like this: rely on Playwright's built-in page and context fixtures, which create a new context and page for every test, rather than sharing one page across tests. If your tests depend on a logged-in state, log in once during a global setup step, save the result to a storageState file, and load that file into each context. This gives you authentication without coupling tests together.
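As a rough illustration of the global setup step described above, the sketch below logs in once and saves the session to a storageState file. The login URL path, the field labels, the "Sign in" button name, the post-login URL, and the TEST_USER_EMAIL and TEST_USER_PASSWORD environment variables are all assumptions for illustration, not details of any particular app.

```typescript
// global-setup.ts (a sketch; labels, paths, and env var names are hypothetical)
import { chromium, type FullConfig } from '@playwright/test';

async function globalSetup(config: FullConfig): Promise<void> {
  // Reuse the baseURL and storageState path declared under `use` in the config.
  const { baseURL, storageState } = config.projects[0].use;

  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Log in once through the UI.
  await page.goto(`${baseURL}/login`);
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL ?? '');
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD ?? '');
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('**/dashboard');

  // Persist cookies and local storage so every test context can start logged in.
  await page.context().storageState({ path: storageState as string });
  await browser.close();
}

export default globalSetup;
```

To use it, point the globalSetup option in playwright.config.ts at this file and set use.baseURL and use.storageState (for example to a JSON file in a git-ignored folder). Each test still gets its own context; it just starts with the saved cookies already loaded.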
Database and API State
Browser-level isolation only solves half the problem. If two tests both create a user named "testuser@example.com" and your database enforces uniqueness, the second test will fail. The solution is to either seed unique data per test (using timestamps or UUIDs in test data) or to reset state between tests with API calls or database transactions.
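One way to seed unique data per test is a small helper that mixes a timestamp, a per-process counter, and a random suffix. The helper name uniqueEmail and the address format are assumptions for illustration; a UUID would serve the same purpose.

```typescript
// Counter that distinguishes calls made within the same millisecond in one process.
let sequence = 0;

/**
 * Builds an email address that is very unlikely to collide with another one,
 * even when several parallel workers generate addresses at the same time.
 */
function uniqueEmail(prefix: string = 'testuser'): string {
  sequence += 1;
  const stamp = Date.now().toString(36); // compact timestamp
  const random = Math.random().toString(36).slice(2, 8); // short random suffix
  return `${prefix}+${stamp}-${sequence}-${random}@example.com`;
}
```

A test would then call uniqueEmail() when filling in a sign-up form instead of hard-coding "testuser@example.com", so a uniqueness constraint in the database no longer ties tests together.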
Some teams use Playwright's beforeEach hook to hit an API endpoint that resets the test environment. Others wrap each test in a database transaction and roll it back after the test completes. Both approaches work. The key principle is that no test should depend on the outcome of any other test, and no test should leave behind state that could affect another.
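The reset-endpoint approach might look like the sketch below. The /api/test/reset endpoint, the /cart page, and the empty-cart message are hypothetical; the sketch also assumes baseURL is set in the Playwright config so relative URLs resolve.

```typescript
import { test, expect } from '@playwright/test';

// Runs before every test in this file, using Playwright's built-in `request` fixture.
test.beforeEach(async ({ request }) => {
  // Hypothetical test-only endpoint that restores a known baseline state.
  const response = await request.post('/api/test/reset');
  expect(response.ok()).toBeTruthy();
});

test('new visitor sees an empty cart', async ({ page }) => {
  await page.goto('/cart');
  await expect(page.getByText('Your cart is empty')).toBeVisible();
});
```

Note that a global reset like this is only safe if tests hitting the same environment do not run in parallel; with parallel workers, per-test unique data or per-test transactions tend to be the safer choice.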
2. Choosing Stable Locators That Survive Refactors
The fastest way to create a brittle test suite is to rely on CSS selectors tied to your component library's internal class names. When your team upgrades Tailwind or swaps out a UI library, every test that selects .bg-blue-500.rounded-lg.px-4 breaks overnight. The fix isn't to write more specific CSS selectors. It's to stop using implementation details as locators entirely.
The Locator Hierarchy
Playwright's documentation recommends a clear priority order for locators, and following it will save you significant maintenance time:
Role-based locators come first. Using page.getByRole('button', { name: 'Submit' }) targets the semantic meaning of an element, not its visual appearance. If your team changes the button from a <button> to a styled <a> tag with the button role, the test still passes. Role-based locators also double as a lightweight accessibility check: they fail when an interactive element lacks a proper role or accessible name.
Text-based locators are next. page.getByText('Add to cart') is stable as long as the user-facing text stays the same, which it typically does even through major refactors. The caveat is internationalization: if your app supports multiple languages, text locators will break when the locale changes. The same applies to role-based locators that filter by name, since the accessible name is usually the translated text. For i18n apps, either run the suite under one fixed locale or use data attributes instead.
Data-testid attributes are your fallback. When an element has no meaningful role or visible text (like an icon-only button or a complex composite component), adding data-testid="checkout-summary" gives tests a stable hook without coupling to implementation. Some teams strip data-testid from production builds for cleaner HTML, which is a reasonable tradeoff as long as the tests run against a build that still includes the attributes.
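The three locator tiers above can be sketched in a single test. The product URL, the confirmation text, and the "1 item" summary content are assumptions for illustration; the button name and the test ID come from the examples above.

```typescript
import { test, expect } from '@playwright/test';

test('shopper can add a product to the cart', async ({ page }) => {
  await page.goto('/products/example-product'); // hypothetical URL

  // 1. Role-based: targets the element by its role and accessible name.
  await page.getByRole('button', { name: 'Add to cart' }).click();

  // 2. Text-based: targets user-facing copy.
  await expect(page.getByText('Added to cart')).toBeVisible();

  // 3. Test ID fallback: matches data-testid="checkout-summary" by default.
  await expect(page.getByTestId('checkout-summary')).toContainText('1 item');
});
```

None of these locators mention a class name or DOM position, so a restyle or a change of component library should leave the test untouched.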
Tools like Assrt, an open-source AI-powered test automation framework, can help here by auto-discovering test scenarios and generating Playwright tests with self-healing selectors. When a locator breaks due to a UI change, the framework detects it and suggests a fix. This can be especially useful on large codebases where manually updating hundreds of selectors after a refactor is impractical.
3. Structuring Tests Around User Behavior
A common anti-pattern is writing tests that mirror the DOM structure of your app. You end up with tests like "renders the header component," "renders the sidebar," and "renders the footer." These tests tell you almost nothing about whether the application works for users. They just confirm that React (or Vue, or Svelte) is rendering components, which the framework itself already guarantees.
Better tests describe user journeys. Instead of "renders the login form," write a test called "user can log in with valid credentials." Instead of "renders the product list," write "user can search for a product and add it to cart." Each test should start with an action a real person would take and end with an outcome that person would care about.
The Arrange-Act-Assert Pattern
Keep each test focused on a single behavior using the Arrange-Act-Assert structure. Arrange sets up the preconditions (navigate to the page, seed data, log in). Act performs the user action (click a button, fill a form, drag an element). Assert verifies the outcome (a success message appears, the URL changes, an item appears in a list).
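A minimal sketch of the structure, using the login example from earlier. The /login and /dashboard paths, the field labels, and the button and heading names are assumptions about a hypothetical app.

```typescript
import { test, expect } from '@playwright/test';

test('user can log in with valid credentials', async ({ page }) => {
  // Arrange: put the app in the starting state.
  await page.goto('/login');

  // Act: perform the one behavior under test.
  await page.getByLabel('Email').fill('user@test.com');
  await page.getByLabel('Password').fill('password123');
  await page.getByRole('button', { name: 'Log in' }).click();

  // Assert: verify the outcome the user cares about.
  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
```

The comments are optional, but keeping the three phases visually separate makes it easier to spot a test that has drifted into covering several behaviors.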
Resist the temptation to chain multiple behaviors into a single test. A test that logs in, searches for a product, adds it to cart, checks out, and verifies the confirmation email is testing five things at once. When it fails, you have no idea which step broke. Split it into focused tests that each verify one behavior. The slight increase in setup time (logging in for each test) is worth the debugging clarity.
4. Page Object Models: When and How to Use Them
The Page Object Model (POM) is a design pattern where you create a class for each page or major component in your app. The class encapsulates all the locators and interactions for that page, so your tests read like high-level descriptions of user behavior rather than low-level DOM manipulation.
For example, a LoginPage class might expose methods like login(username, password) and getErrorMessage(). Your test calls loginPage.login('user@test.com', 'password123') without caring about which input fields exist or what their selectors are. When the login form changes (say, from email/password to a magic link flow), you update the POM class in one place instead of 40 test files.
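A minimal sketch of such a class follows. The method names login and getErrorMessage come from the description above; the /login path, the field labels, the button name, and the use of an alert role for the error are assumptions about a hypothetical form.

```typescript
import type { Locator, Page } from '@playwright/test';

export class LoginPage {
  private readonly page: Page;
  private readonly emailInput: Locator;
  private readonly passwordInput: Locator;
  private readonly submitButton: Locator;
  private readonly errorAlert: Locator;

  constructor(page: Page) {
    this.page = page;
    // All locators live here, so a form change means editing one class.
    this.emailInput = page.getByLabel('Email');
    this.passwordInput = page.getByLabel('Password');
    this.submitButton = page.getByRole('button', { name: 'Log in' });
    this.errorAlert = page.getByRole('alert');
  }

  async goto(): Promise<void> {
    await this.page.goto('/login');
  }

  async login(username: string, password: string): Promise<void> {
    await this.emailInput.fill(username);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }

  async getErrorMessage(): Promise<string | null> {
    return this.errorAlert.textContent();
  }
}
```

A test would create it with new LoginPage(page), call goto() and login(...), and then assert on the result, without ever mentioning a selector.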
When POMs Become Overhead
Not every page needs a Page Object. If a page is simple and only referenced in one or two tests, creating a class for it adds indirection without saving maintenance effort. The sweet spot is pages or components that appear in many tests and have locators that change frequently. Authentication flows, navigation menus, and checkout forms are classic POM candidates. A static "About Us" page is probably not.
Playwright's fixture system complements traditional POMs. You can create custom fixtures that provide pre-configured page objects, making them available to any test that needs them without manual instantiation. This approach integrates naturally with Playwright's built-in test runner and avoids the boilerplate of constructing page objects at the top of every test.
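A sketch of a fixture that hands tests a ready-to-use page object is shown below. The compact LoginPage class, its labels, and the /login path are hypothetical; in a real suite the class and the extended test object would usually live in their own modules rather than in the spec file.

```typescript
import { test as base, expect, type Page } from '@playwright/test';

// Compact, hypothetical page object used only for this sketch.
class LoginPage {
  private readonly page: Page;

  constructor(page: Page) {
    this.page = page;
  }

  async goto(): Promise<void> {
    await this.page.goto('/login');
  }

  async login(username: string, password: string): Promise<void> {
    await this.page.getByLabel('Email').fill(username);
    await this.page.getByLabel('Password').fill(password);
    await this.page.getByRole('button', { name: 'Log in' }).click();
  }
}

type Fixtures = { loginPage: LoginPage };

// Extend the built-in test object with a `loginPage` fixture.
const test = base.extend<Fixtures>({
  loginPage: async ({ page }, use) => {
    const loginPage = new LoginPage(page);
    await loginPage.goto(); // setup runs before the test body
    await use(loginPage); // the test runs while `use` is pending
  },
});

test('user sees an error for a wrong password', async ({ loginPage, page }) => {
  await loginPage.login('user@test.com', 'wrong-password');
  await expect(page.getByRole('alert')).toBeVisible();
});
```

Because the fixture builds on the built-in page fixture, each test still receives its own isolated context; only tests that name loginPage in their arguments pay for its setup.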
5. Scaling Your Test Suite Without the Bloat
As your test suite grows from 50 to 500 tests, execution time and maintenance cost become real concerns. Playwright supports parallel execution out of the box, running tests across multiple workers. But parallelism only works well if your tests are truly isolated. This is where the investment in proper isolation (from Section 1) pays off.
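For reference, a minimal configuration sketch that turns on parallelism is shown below. The testDir value and the worker count of 4 on CI are arbitrary assumptions; tune them to your repository and CI machines.

```typescript
// playwright.config.ts (a sketch; values are illustrative)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  // Run tests within a single file in parallel too, not just separate files.
  fullyParallel: true,
  // Fixed worker count on CI; undefined falls back to Playwright's default locally.
  workers: process.env.CI ? 4 : undefined,
});
```

With fullyParallel enabled, any hidden dependency between tests in the same file tends to surface quickly, which makes it a useful check on how isolated the suite really is.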
Organize tests by feature, not by type. Instead of folders like tests/e2e/, tests/smoke/, and tests/regression/, try tests/auth/, tests/checkout/, and tests/search/. Feature folders make it obvious where new tests belong and which tests to run when a specific feature changes. Use Playwright's --grep flag or project configurations to run subsets of tests in CI.
Use tags to control what runs where. Mark critical paths with a @smoke tag so you can run a fast smoke suite on every commit and the full suite on merges to main. Playwright's tag-based filtering makes this straightforward.
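Tagging might look like the sketch below. It assumes a Playwright version that supports the tag option in the test details object (1.42 or later); on older versions the tag is usually written into the test title instead. The page paths, labels, and button names are hypothetical.

```typescript
import { test, expect } from '@playwright/test';

// Critical path: included in the fast smoke run.
test('user can log in with valid credentials', { tag: '@smoke' }, async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('user@test.com');
  await page.getByLabel('Password').fill('password123');
  await page.getByRole('button', { name: 'Log in' }).click();
  await expect(page).toHaveURL(/\/dashboard/);
});

// Untagged: only runs as part of the full suite.
test('user can update notification preferences', async ({ page }) => {
  await page.goto('/settings/notifications');
  await page.getByRole('checkbox', { name: 'Email me weekly summaries' }).check();
  await page.getByRole('button', { name: 'Save' }).click();
  await expect(page.getByText('Preferences saved')).toBeVisible();
});
```

Running npx playwright test --grep @smoke on every commit would then execute only the first test, while a plain npx playwright test on merges to main runs both.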
For teams looking to reduce the overhead of writing and maintaining tests, frameworks like Assrt can auto-discover test scenarios from your application and generate Playwright test code with stable selectors. Combined with visual regression testing, this approach can catch UI regressions that traditional assertion-based tests miss entirely. Since Assrt is free and open-source, it's worth evaluating alongside your existing Playwright setup.
The bottom line: Playwright gives you excellent primitives for writing reliable UI tests. The challenge is using those primitives well. Invest in isolation, pick stable locators, structure tests around behavior, and keep your suite organized as it grows. These practices will save you far more time than any individual test you write.