Test Automation Tutorial: From Zero to a Green Suite in One Afternoon

This tutorial walks you through building a real automated test suite from scratch. You will install Playwright, write your first test with accessible locators, run it locally, wire it into CI, and learn how to generate the rest of your coverage with AI. Every example is runnable. Every tool is free and open source. Zero vendor lock-in.

“74% of engineering teams cite slow or flaky test suites as the primary obstacle to faster releases. Tutorials that stop at hello-world leave readers stuck exactly where the real problems begin.”

Stack Overflow Developer Survey, 2025

The Test Automation Loop You Will Build

1. What Test Automation Actually Is

Test automation is the practice of replacing repetitive manual verification with code that drives your application the way a user would, then asserts on the result. Every time a human would click a button and check that the next screen shows the right number, an automated test can do the same thing in seconds, deterministically, across every browser and viewport your users actually use.

Most tutorials spend their first thousand words on testing theory. This one will not. You will need three mental models and nothing else: a test runner (the thing that executes your tests), a browser driver (the thing that controls the browser), and a locator strategy (the thing that tells the driver which element to interact with). Playwright bundles all three into a single npm package, which is why this tutorial starts there.

The Pieces of Any Test Automation Stack

Runner: collects and runs tests
Driver: controls the browser
Locators: find elements reliably
Assertions: verify expected state
Reporter: outputs pass/fail results and traces

What a Good Test Automation Suite Delivers

Runs on every commit without human supervision
Catches real user-facing regressions, not internal refactors
Produces deterministic pass/fail, not flaky timeouts
Stays green through normal UI evolution
Exports trace files a human can replay to debug failures
Runs in under 10 minutes end to end

2. Choose Your Stack (and Why Playwright)

Before you install anything, you need to pick a framework. There are dozens, but for a 2026 test automation tutorial the honest answer is Playwright. It supports Chromium, Firefox, and WebKit out of the box. It auto-waits for elements, which eliminates most of the race conditions that make older frameworks flaky. It has a first-class trace viewer. And it is free, open source, and maintained by Microsoft.

Cypress is a reasonable alternative if you do not need WebKit coverage (its WebKit support is still experimental) and you never need to drive multiple tabs at once. Selenium is still around, but its architecture shows its age and its locator APIs are significantly more brittle than Playwright's. For a tutorial that teaches patterns you can use for the next five years, Playwright is the pragmatic choice.

Capability	Playwright	Cypress	Selenium
Chromium, Firefox, WebKit	All three	Chromium and Firefox; WebKit experimental	All three
Auto-waiting for elements	Built-in	Built-in	Manual waits
Multi-tab and iframe support	Native	Limited	Native
Trace viewer for debugging	Yes	Paid add-on	No
API testing in-process	APIRequestContext	cy.request	Separate tool
License	Apache 2.0	MIT (core)	Apache 2.0

3. Install Playwright and Set Up Your Project

Open a terminal in your project root. You need Node.js 18 or later. If you are starting fresh, create a new directory first. Then run the Playwright init command, npm init playwright@latest, which installs the test runner, downloads the browser binaries, and writes a baseline config file.

Playwright Install Session

The init command creates four things worth knowing about: playwright.config.ts (runner configuration), tests/example.spec.ts (a starter test), a .github/workflows/playwright.yml file for CI (if you accept the GitHub Actions prompt), and a tests-examples directory you can delete. The config file is where you will later configure cross-browser projects, retries, and the base URL for your app.

playwright.config.ts
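A minimal sketch of what that config might look like once edited, consistent with the behavior described in the rest of this tutorial (three browser projects, retries in CI only, traces on first retry, a dev server started by the runner). The port 3000 and the npm run dev command are assumptions; substitute whatever your app actually uses.

```typescript
// playwright.config.ts (sketch; port and dev command are assumptions)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  // Fail the run if a stray test.only was committed (CI only).
  forbidOnly: !!process.env.CI,
  // Retry failed tests twice in CI, never locally.
  retries: process.env.CI ? 2 : 0,
  reporter: 'html',
  use: {
    // Lets tests call page.goto('/') instead of repeating the origin.
    baseURL: 'http://localhost:3000',
    // Record a trace only when a failed test is retried.
    trace: 'on-first-retry',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
  webServer: {
    // Assumed dev command; the runner waits for the URL before testing.
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});
```

Keeping retries and forbidOnly keyed off process.env.CI means local runs fail fast while CI runs tolerate one-off infrastructure hiccups.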

4. Write Your First Test

Delete the example test and create a new file at tests/homepage.spec.ts. The anatomy of every Playwright test is the same: import test and expect, declare a test block with a description, use the injected page fixture to drive a browser, and assert on the result.

tests/homepage.spec.ts
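A first test along these lines could look like the sketch below. It assumes a baseURL is set in playwright.config.ts; the page title "My App", the "Pricing" link, and the /pricing route are hypothetical stand-ins for your own app's copy and routes.

```typescript
// tests/homepage.spec.ts (sketch; title, link name, and route are hypothetical)
import { test, expect } from '@playwright/test';

test('homepage loads and links to pricing', async ({ page }) => {
  // A relative URL resolves against use.baseURL from the config.
  await page.goto('/');

  // Web-first assertions retry until they pass or time out.
  await expect(page).toHaveTitle(/My App/);
  await expect(page.getByRole('heading', { level: 1 })).toBeVisible();

  // Find the link by role and accessible name, not by CSS class.
  await page.getByRole('link', { name: 'Pricing' }).click();
  await expect(page).toHaveURL(/\/pricing/);
});
```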

Run it with npx playwright test. The runner starts your dev server via the webServer config, launches Chromium, Firefox, and WebKit in parallel, executes the test in each, and produces an HTML report. Your first test suite is live.

First Test Run

5. Locators: The Heart of Resilient Automation

Most flaky tests come from bad locators, not bad code. The difference between a test suite that stays green for a year and one that breaks every sprint is almost entirely locator discipline. The rule is simple: prefer locators that target semantics (role, label, text) over locators that target implementation details (CSS classes, DOM hierarchy, auto-generated IDs).

Brittle Selectors vs Resilient Locators

// Fragile: breaks when markup changes
await page.click('.btn.btn-primary.mt-4');
await page.click('div > div:nth-child(3) > button');
await page.click('#user-menu-dropdown-trigger-v2');
await page.locator('xpath=//div[@class="card"]//button[2]').click();

// Any refactor breaks these.
// Class rename? Broken.
// Extra wrapper div? Broken.
// New ID suffix from a build tool? Broken.

Playwright provides a clear locator hierarchy. Reach for each one in order until you find one that works. The higher you land on this list, the more resilient your test will be to UI refactoring.

Locator Priority Order

getByRole (buttons, links, headings, textboxes)
getByLabel (form fields associated with <label>)
getByPlaceholder (fallback for unlabeled inputs)
getByText (static copy that rarely changes)
getByTestId (explicit test IDs for ambiguous elements)
CSS or XPath (last resort, never for interactive elements)
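To make the priority order concrete, here is a sketch that uses one locator from each of the first five tiers on a hypothetical settings page. Every route, label, placeholder, text string, and test ID below is invented for illustration; only the locator methods themselves are Playwright's.

```typescript
// Sketch: one example per locator tier (all names and routes are hypothetical)
import { test, expect } from '@playwright/test';

test('settings page using semantic locators', async ({ page }) => {
  await page.goto('/settings');

  // Tier 2, getByLabel: a form field tied to its <label>.
  await page.getByLabel('Display name').fill('Ada Lovelace');

  // Tier 3, getByPlaceholder: fallback for an unlabeled input.
  await page.getByPlaceholder('Search settings').fill('billing');

  // Tier 4, getByText: static copy that rarely changes.
  await expect(page.getByText('Changes are saved automatically')).toBeVisible();

  // Tier 5, getByTestId: disambiguate one card among several,
  // then drop back to a role locator inside it.
  const proCard = page.getByTestId('plan-card-pro');
  await expect(proCard.getByRole('button', { name: 'Upgrade' })).toBeEnabled();

  // Tier 1, getByRole: role plus accessible name survives class renames,
  // extra wrapper divs, and regenerated IDs.
  await page.getByRole('button', { name: 'Save changes' }).click();
});
```

None of these lines depends on a class name, a DOM position, or a generated ID, so the refactors that break the fragile selectors shown earlier would leave them untouched.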

6. Real Scenarios: Login, CRUD, and Payments

A homepage smoke test proves the toolchain works. Real coverage starts when you test the flows users actually care about. Below are three scenarios that map directly to the critical paths of most SaaS apps. Each one is a runnable Playwright test you can adapt to your own application.

Authentication: Email and Password Login

Difficulty: Straightforward

tests/auth.spec.ts
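A sketch of what the login scenario might look like. The /login and /dashboard routes, the field labels, the "Sign in" button, and the error message are assumptions about your app; the demo credentials are placeholders and would normally come from environment variables rather than source code.

```typescript
// tests/auth.spec.ts (sketch; routes, labels, and messages are assumptions)
import { test, expect } from '@playwright/test';

test.describe('email and password login', () => {
  test('valid credentials land on the dashboard', async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('demo@example.com');
    await page.getByLabel('Password').fill('pass1234');
    await page.getByRole('button', { name: 'Sign in' }).click();

    // toHaveURL retries until the redirect completes; no fixed sleep needed.
    await expect(page).toHaveURL(/\/dashboard/);
  });

  test('wrong password shows an error and stays on login', async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('demo@example.com');
    await page.getByLabel('Password').fill('not-the-password');
    await page.getByRole('button', { name: 'Sign in' }).click();

    // Assumes the app renders the error in an element with role="alert".
    await expect(page.getByRole('alert')).toContainText('Invalid email or password');
    await expect(page).toHaveURL(/\/login/);
  });
});
```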

CRUD: Create, Read, and Delete a Record

Difficulty: Moderate

tests/projects-crud.spec.ts
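One way the CRUD scenario could be written is sketched below. It assumes the test starts already signed in (for example through a saved storage state), that projects are listed in a table, and that deletion is confirmed in a dialog. The route, button names, and field label are hypothetical.

```typescript
// tests/projects-crud.spec.ts (sketch; route and UI names are hypothetical)
import { test, expect } from '@playwright/test';

test('create, read, and delete a project', async ({ page }, testInfo) => {
  // A unique name keeps parallel browser projects and retries from colliding.
  const name = `Smoke ${testInfo.project.name} ${Date.now()}`;

  await page.goto('/projects');

  // Create
  await page.getByRole('button', { name: 'New project' }).click();
  await page.getByLabel('Project name').fill(name);
  await page.getByRole('button', { name: 'Create' }).click();

  // Read: the new record shows up as a table row containing its name.
  const row = page.getByRole('row', { name });
  await expect(row).toBeVisible();

  // Delete: click the row's own button, then confirm in the dialog.
  await row.getByRole('button', { name: 'Delete' }).click();
  await page.getByRole('dialog').getByRole('button', { name: 'Delete' }).click();

  // The row should be gone; toHaveCount(0) retries until it is.
  await expect(row).toHaveCount(0);
});
```

Because the test deletes what it creates, it leaves no residue behind for the next run to trip over.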

Payment: Stripe Checkout Happy Path

Difficulty: Complex

tests/stripe-checkout.spec.ts

Notice that every interactive element in these tests is found by getByRole or getByLabel. That is not an accident. Accessibility-first locators also happen to be the most refactor-resistant locators, which is why teams that care about a11y usually end up with the least flaky test suites too.

7. Debugging Failures With Trace Viewer

Every test you write will eventually fail in CI, and when it does you need to know why without rerunning it locally. This is where Playwright's trace viewer changes the game. A trace file captures a full recording of the test: every DOM snapshot, every network request, every console message, every action timing. You can open one locally and scrub backward and forward through the test like a video.

The config you wrote in section 3 already enables traces on first retry. When a test fails in CI and is retried, Playwright records the retry and writes a trace.zip under test-results/. Upload it as a build artifact, download it, and open it with npx playwright show-trace trace.zip.

Debugging a Failed Test

Three debugging commands cover most real failures. Memorize them.

debug-cheatsheet.sh

8. Running the Suite in CI

A test suite that only runs on your laptop catches nothing. The real value of automation is a suite that runs on every commit, blocks broken pull requests, and sends the author a trace they can click into. The GitHub Actions workflow below is the minimum viable CI setup. Adapt it for GitLab CI, CircleCI, or Jenkins with the same three logical steps: install, run, upload artifacts.

.github/workflows/playwright.yml

Two details matter most here. The --with-deps flag on the install command ensures the runner downloads the correct browser system libraries for the Ubuntu image, which saves you from cryptic failures about missing libnss3. The upload-artifact step runs on always() so the HTML report is available for passing runs too, not just failures. When a test starts flaking, you will want the historical reports.

CI Setup Checklist

Run npx playwright install --with-deps before tests
Set retries: 2 in playwright.config.ts for CI only
Upload playwright-report as an always-artifact
Upload test-results (traces) on failure only
Shard across CI runners (--shard) and use workers within each runner for parallel execution
Fail the build if a stray test.only is committed (forbidOnly: true)

9. Scaling Coverage With AI Generation

Once you have five tests you feel good about, the honest question is how you get to fifty without spending a month on it. Manual test authoring costs two to four hours per scenario when you include writing, debugging, and stabilizing. For a solo developer or a lean team, that math does not work.
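The arithmetic behind that claim can be sketched in a few lines. The two-to-four-hour range comes from the paragraph above; the 40-hour work week is an assumption added here to convert hours into weeks.

```typescript
// Back-of-envelope authoring cost for growing a suite from 5 to 50 tests.
interface HourRange {
  low: number;
  high: number;
}

function authoringHours(newTests: number, hoursPerTest: HourRange): HourRange {
  return {
    low: newTests * hoursPerTest.low,
    high: newTests * hoursPerTest.high,
  };
}

const HOURS_PER_WEEK = 40; // assumption: one person, full time on tests

const extraTests = 50 - 5; // from five tests to fifty
const hours = authoringHours(extraTests, { low: 2, high: 4 });
const weeks: HourRange = {
  low: hours.low / HOURS_PER_WEEK,
  high: hours.high / HOURS_PER_WEEK,
};

console.log(`${hours.low}-${hours.high} hours, about ${weeks.low}-${weeks.high} weeks`);
// prints "90-180 hours, about 2.25-4.5 weeks"
```

At the upper end that is indeed about a month of one person's time spent on nothing but test authoring.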

This is where AI test generation closes the gap. Assrt is an open-source tool that points a headless browser at your running app, builds an interaction graph from the accessibility tree, and generates standard Playwright .spec.ts files for every flow it discovers. The output is real code you can read, edit, commit, and run with npx playwright test. Unlike proprietary platforms that charge up to $7,500 per month and lock your tests behind their cloud runner, Assrt is free and self-hosted, and the generated tests are yours to keep forever.

Proprietary YAML vs Real Playwright Code

# What you get from most AI testing platforms:
# proprietary YAML locked to their cloud runner.
name: login_test
tags: [smoke, auth]
steps:
  - visit: /login
  - fill:
      selector: "#email"
      value: "demo@example.com"
  - fill:
      selector: "#password"
      value: "pass1234"
  - click:
      text: "Sign in"
  - assert:
      url_matches: "/dashboard"

# Cancel subscription = tests stop running.
# Estimated lock-in cost over 3 years: $270,000.

Generating a Full Suite With Assrt

The workflow that scales: write the critical path tests by hand so you learn the patterns, then use generation to cover everything else. Review the generated tests with the same care you would give a pull request from a junior engineer. Commit the ones that look right. Fix the ones that do not. Your coverage grows by an order of magnitude in a single afternoon, and the suite stays entirely under your control because every file is a plain TypeScript spec that lives in your repository.

10. FAQ

Do I need to know TypeScript to follow this tutorial?

Basic JavaScript is enough. Playwright's API is small and the tests in this tutorial use only a dozen functions. If you have written a fetch call, you can write a Playwright test. The TypeScript types are there to help you autocomplete locators, not to get in your way.

How long should a good test suite take to run?

For a medium-sized SaaS app, aim for under 10 minutes end to end in CI with parallel sharding. Smoke tests on every commit should run in 2 to 3 minutes. If your suite is slower than that, parallelize with Playwright's workers, shard across multiple CI runners, and move any expensive setup into shared fixtures.
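A rough way to see how workers and shards move you toward that 10-minute target is the estimate below. It is a deliberately simplified model: it assumes every test takes the same time and ignores setup, browser startup, and uneven shard balance. The suite size and average duration are hypothetical numbers.

```typescript
// Simplified wall-clock estimate: tests are spread evenly over all parallel slots.
function estimateMinutes(
  tests: number,
  avgSecondsPerTest: number,
  workersPerRunner: number,
  runners: number,
): number {
  const parallelSlots = workersPerRunner * runners;
  // Each "round" runs one test per slot; the last round may be partly empty.
  const rounds = Math.ceil(tests / parallelSlots);
  return (rounds * avgSecondsPerTest) / 60;
}

// Hypothetical suite: 300 tests averaging 20 seconds each.
const serial = estimateMinutes(300, 20, 1, 1);
const fourWorkers = estimateMinutes(300, 20, 4, 1);
const fourWorkersThreeShards = estimateMinutes(300, 20, 4, 3);

console.log({ serial, fourWorkers, fourWorkersThreeShards });
```

In this model the serial run takes 100 minutes, four workers bring it to 25, and adding three shards brings it under 10, which is why both levers usually matter for a suite of this size.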

Should I use Page Object Model?

For small suites, no. Write tests flat and duplicate locators until the duplication actually hurts. When you have 30+ tests that share a locator, extract it into a helper. Premature POM abstraction is one of the fastest ways to make a test suite harder to read and harder to maintain.

What if my tests are flaky?

Flakiness is almost always caused by one of three things: bad locators (use getByRole instead of CSS selectors), missing auto-wait (use toBeVisible and toHaveURL instead of fixed timeouts), or shared test state (use storage state files and fresh browser contexts per test). Fix all three and your flake rate drops toward zero.
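To illustrate why a retrying assertion beats a fixed timeout, here is a simplified sketch of the poll-until-deadline idea. It is not Playwright's implementation; pollUntil and the simulated element are hypothetical names used only to show the mechanism that assertions like toBeVisible rely on conceptually.

```typescript
// Sketch of retry-until-timeout: check repeatedly, return as soon as the
// condition holds, and give up only when the deadline passes.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) return true; // success: no time wasted
    if (Date.now() >= deadline) return false; // timed out
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated element that becomes "visible" on the third check.
let checks = 0;
const becomesVisible = (): boolean => ++checks >= 3;

pollUntil(becomesVisible, 1000, 10).then((ok) => {
  console.log(ok ? `visible after ${checks} checks` : 'timed out');
});
```

A fixed sleep is either too short (the test flakes) or too long (the suite crawls). Polling returns the moment the condition is true and still fails clearly when it never becomes true.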

Is Playwright free for commercial use?

Yes. Playwright is released under the Apache 2.0 license and is free for any commercial use. Microsoft maintains it and ships a new version roughly every six weeks. No seat licenses, no CI minute limits, no vendor portal to log into.

How does Assrt compare to commercial AI testing platforms?

Commercial platforms like QA Wolf, Mabl, and Testim charge $300 to $7,500 per month and lock your tests into proprietary formats. Assrt is open source and free, runs entirely on your own infrastructure, and emits standard Playwright code you can version in Git and run anywhere. If you uninstall Assrt, every generated test keeps working. Tests are yours to keep.

Generate your first suite with AI

Point Assrt at your running app and let it write the tests for you. Real Playwright code, open source, free, and self-hosted.
