Performance Guide

Parallel Test Execution: Run Your Test Suite 10x Faster

By Pavel Borji · Founder @ Assrt

A complete guide to parallelizing your test suite for maximum speed. Covers Playwright workers, CI/CD sharding, cross-browser strategies, and the architecture patterns that make tests safe to run concurrently.

Much faster

Teams that parallelize across workers and CI shards consistently see a dramatic reduction in test suite execution time compared to sequential runs.

1. Why Sequential Testing Fails at Scale

When your test suite is small, sequential execution works fine. Twenty tests running one after another might take three minutes. Developers barely notice the wait. But test suites grow. Features get added. Edge cases get covered. Before long, you have 500 tests and the suite takes 45 minutes to complete.

A 45-minute test suite creates a cascade of problems. Developers stop running tests locally because they cannot afford to wait. Pull request feedback loops stretch to an hour. Engineers start batching changes into larger PRs to avoid triggering multiple long CI runs, which makes code review harder and merges riskier. Some teams start skipping E2E tests entirely on certain branches, creating blind spots where bugs slip through.

The math is straightforward. If each E2E test takes an average of 5 seconds (including browser startup, navigation, assertions, and teardown), 500 tests will take roughly 2,500 seconds or about 42 minutes. That is just the tests themselves, not counting CI setup, dependency installation, or build time. With those overhead costs, you are looking at 50 to 60 minutes from push to green.

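The CI bottleneck also has a dollar cost. Longer pipelines consume more compute minutes. On GitHub Actions, a 45-minute run on ubuntu-latest costs roughly $0.36. If your team pushes 50 times per day, that is $18 daily or $540 per month just for E2E test execution. Parallelization on its own mainly shortens wall-clock time: sharded runners are billed individually, so splitting the same 45 minutes of tests across six runners costs about as much as one long job, plus the setup overhead of each runner. The bill shrinks when total compute shrinks, for example through selective test runs and caching. A run that consumes only 5 billed minutes would cost about $60 per month, a saving of $480, while the shorter feedback loop makes developers dramatically more productive.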
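The arithmetic in this section is easy to reproduce for your own suite. The sketch below is illustrative only: the function names are made up for this example, and the $0.008 per-minute rate is simply the rate implied by the $0.36 figure for a 45-minute run, so substitute your provider's actual pricing.

```typescript
// Back-of-the-envelope estimates for a sequential E2E suite.
// Function names are illustrative and not part of any tool.

/** Total sequential run time in minutes. */
function sequentialMinutes(testCount: number, secondsPerTest: number): number {
  return (testCount * secondsPerTest) / 60;
}

/** Monthly CI spend in dollars for a pipeline of a given billed length. */
function monthlyCost(
  billedMinutesPerRun: number,
  runsPerDay: number,
  dollarsPerMinute: number,
  daysPerMonth = 30,
): number {
  return billedMinutesPerRun * dollarsPerMinute * runsPerDay * daysPerMonth;
}

// 500 tests at 5 seconds each, 50 pushes per day, assumed rate of $0.008/min.
const minutesPerRun = sequentialMinutes(500, 5);
const costPerMonth = monthlyCost(45, 50, 0.008);

console.log(`${minutesPerRun.toFixed(1)} min per run, $${costPerMonth.toFixed(2)} per month`);
```

Plugging in your own test count, average duration, and push frequency gives a quick sense of whether the suite has reached the point where parallelization is worth the effort.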

2. Fundamentals of Parallel Execution

Parallel test execution means running multiple tests simultaneously instead of waiting for one to finish before starting the next. There are two primary approaches: process-level parallelism and thread-level parallelism. Each has different tradeoffs for E2E testing.

Process-Level Parallelism

Each test (or group of tests) runs in its own operating system process with its own memory space, browser instance, and execution context. This is the approach Playwright uses with its worker model. Process-level parallelism provides strong isolation because tests cannot accidentally share state. If one test crashes, it does not affect other workers. The downside is higher memory consumption since each process needs its own browser instance.

Thread-Level Parallelism

Multiple tests share a single process but run on separate threads. This uses less memory but introduces the risk of shared state contamination. Thread-level parallelism shows up in some unit testing setups (for example, JUnit 5's parallel execution mode), although many popular runners, including Jest's default workers and pytest-xdist, actually use separate processes. It is generally not suitable for E2E testing. Browser automation requires process-level isolation because browser contexts can leak state in subtle ways.

Test Independence: The Hard Requirement

Parallel execution only works when tests are independent. If Test B depends on a user account created by Test A, running them simultaneously will cause Test B to fail. Every test must be able to run in any order, at any time, alongside any other test. This is not just a parallel execution requirement; it is a fundamental principle of reliable test design that also eliminates flakiness in sequential runs.

Shared State Risks

The most common parallelization bugs come from shared state. Two tests that both write to the same database row. Two tests that both log in as the same user and one logs out mid-test. Two tests that both create an item with the same name and then assert on uniqueness. Identifying and eliminating shared state is the primary engineering challenge of parallelization.
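To see why shared state bites only under concurrency, here is a small synchronous simulation. It is a sketch, not real test code: the "database" is a plain array, and `projectTest` and `runInterleaved` are hypothetical helpers that interleave two tests one step at a time, the way two workers might.

```typescript
// A tiny simulation of two tests interleaving on shared state.
// Everything here is illustrative; no real database or browser is involved.

type Db = string[];
type Step = (db: Db) => void;

/** A test that creates a project, then asserts it is the only one with that name. */
function projectTest(name: string): Step[] {
  return [
    (db) => { db.push(name); },
    (db) => {
      const matches = db.filter((n) => n === name).length;
      if (matches !== 1) throw new Error(`expected 1 "${name}", found ${matches}`);
    },
  ];
}

/** Runs tests round-robin, one step at a time, and reports pass/fail per test. */
function runInterleaved(tests: Step[][], db: Db): boolean[] {
  const passed = tests.map(() => true);
  const maxSteps = Math.max(...tests.map((t) => t.length));
  for (let step = 0; step < maxSteps; step++) {
    tests.forEach((t, i) => {
      if (!passed[i] || step >= t.length) return;
      try { t[step](db); } catch { passed[i] = false; }
    });
  }
  return passed;
}

// Same name on a shared store: both tests see two rows and fail.
const colliding = runInterleaved(
  [projectTest("Test Project"), projectTest("Test Project")], []);

// Unique names: the tests no longer observe each other's data.
const isolated = runInterleaved(
  [projectTest("Test Project a7f3x9"), projectTest("Test Project k2m8p1")], []);

console.log({ colliding, isolated });
```

Run sequentially with cleanup in between, the colliding pair would pass, which is exactly why this class of bug tends to surface only after parallelization is switched on.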

3. Playwright's Parallel Architecture

Playwright has one of the most sophisticated parallel execution engines of any E2E testing framework. Understanding its architecture helps you configure it correctly and squeeze maximum performance from your test suite.

Workers

A worker is an OS process that runs a subset of your tests. By default, Playwright uses half the number of CPU cores as workers. On an 8-core machine, that means 4 workers running tests in parallel. Each worker gets its own browser instance and runs tests sequentially within that worker. You can configure the number of workers explicitly.

playwright.config.ts
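import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Pick ONE form; an object literal cannot repeat the `workers` key.

  // Fixed number of workers on CI, Playwright's default (half the cores) locally
  workers: process.env.CI ? 4 : undefined,

  // Alternative: always use a fixed number of workers
  // workers: 4,

  // Alternative: use a percentage of logical CPU cores
  // workers: '75%',
});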
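To make the three forms of the setting concrete, here is a sketch of how a worker setting could translate into a process count. `resolveWorkers` is a hypothetical helper written for this guide, and its rounding rules are an assumption rather than Playwright's actual implementation.

```typescript
import * as os from "node:os";

// The three shapes the `workers` option can take in the config above.
type WorkerSetting = number | `${number}%` | undefined;

/**
 * Hypothetical helper: turns a worker setting into a concrete process count.
 * The rounding (floor, minimum of 1) is an assumption for illustration.
 */
function resolveWorkers(
  setting: WorkerSetting,
  cpuCount: number = os.cpus().length,
): number {
  if (setting === undefined) {
    // Default described in this guide: half the CPU cores.
    return Math.max(1, Math.floor(cpuCount / 2));
  }
  if (typeof setting === "number") {
    return Math.max(1, Math.floor(setting));
  }
  const percent = parseInt(setting, 10);
  return Math.max(1, Math.floor((cpuCount * percent) / 100));
}

// On an 8-core machine:
console.log(resolveWorkers(4, 8));         // fixed count
console.log(resolveWorkers("75%", 8));     // percentage of cores
console.log(resolveWorkers(undefined, 8)); // default
```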

Sharding

Sharding splits your test suite across multiple machines. While workers parallelize within a single machine, sharding parallelizes across machines. This is essential for CI/CD where you can spin up multiple runners. Playwright's shard syntax divides your test files evenly across the specified number of shards.

Terminal
# Run shard 1 of 4 (each on a different CI runner)
npx playwright test --shard=1/4

# Run shard 2 of 4
npx playwright test --shard=2/4

# Combined: 4 shards x 4 workers = 16 parallel test streams
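The idea behind sharding can be sketched in a few lines. The function below deals test files out round-robin so that every file lands in exactly one shard. It illustrates the concept only; `filesForShard` is a hypothetical helper, and Playwright's real assignment logic may differ.

```typescript
/**
 * Returns the files assigned to shard `index` (1-based) out of `total` shards.
 * Files are sorted first so every runner computes the same split.
 */
function filesForShard(files: string[], index: number, total: number): string[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError(`invalid shard total: ${total}`);
  }
  if (!Number.isInteger(index) || index < 1 || index > total) {
    throw new RangeError(`shard ${index}/${total} is out of range`);
  }
  return [...files].sort().filter((_, i) => i % total === index - 1);
}

// Ten made-up spec files split across four shards.
const specs = Array.from(
  { length: 10 },
  (_, i) => `tests/spec-${String(i).padStart(2, "0")}.spec.ts`,
);

for (let shard = 1; shard <= 4; shard++) {
  console.log(`--shard=${shard}/4 ->`, filesForShard(specs, shard, 4));
}
```

The important property is determinism: each runner computes its slice independently, with no coordination, and the slices never overlap.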

The fullyParallel Mode

By default, Playwright runs tests within the same file sequentially in one worker. The fullyParallel mode changes this behavior so that individual tests within a file can be distributed across different workers. This is faster but requires that tests within the same file have zero dependencies on each other.

playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,
});

// Or opt in per file. In a test file that imports `test` from '@playwright/test':
test.describe.configure({ mode: 'parallel' });

// Force serial execution inside a specific describe block:
test.describe.configure({ mode: 'serial' });

test.describe.configure

This API gives you fine-grained control over parallelism at the describe block level. You can set the entire suite to fullyParallel and then mark specific describe blocks as serial where order matters. For example, a test sequence that creates a user, verifies the user, and then deletes the user needs serial execution. Everything else can run in parallel.

4. Designing Tests for Parallelism

The biggest barrier to parallel execution is not tooling. It is test design. Tests written with implicit sequential dependencies will fail randomly when parallelized. Here are the patterns that make tests parallel-safe.

State Isolation

Each test should create its own state and clean up after itself. Never rely on state created by a previous test. Use Playwright fixtures to set up fresh browser contexts, authenticated sessions, and test data for each test independently.

fixtures.ts
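import { test as base, type Page } from '@playwright/test';
// createTestUser and deleteTestUser are your own helpers; this module path is hypothetical
import { createTestUser, deleteTestUser } from './test-users';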

type TestFixtures = {
  authenticatedPage: Page;
  testUser: { email: string; password: string };
};

export const test = base.extend<TestFixtures>({
  testUser: async ({}, use) => {
    // Create a unique user for this test
    const user = await createTestUser({
      email: `test-${Date.now()}-${Math.random().toString(36).slice(2)}@example.com`,
      password: 'TestPass123!',
    });
    await use(user);
    // Cleanup after test completes
    await deleteTestUser(user.email);
  },

  authenticatedPage: async ({ page, testUser }, use) => {
    await page.goto('/login');
    await page.fill('[name=email]', testUser.email);
    await page.fill('[name=password]', testUser.password);
    await page.click('button[type=submit]');
    await page.waitForURL('/dashboard');
    await use(page);
  },
});

Unique Test Data

Every piece of test data should include a unique identifier. Use timestamps, UUIDs, or random suffixes in usernames, email addresses, project names, and any other data that might conflict. Two tests creating a project called "Test Project" will collide when running in parallel. Two tests creating "Test Project a7f3x9" and "Test Project k2m8p1" will not.
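A pair of small helpers keeps this consistent across a suite. This is a minimal sketch: `uniqueName` and `uniqueEmail` are hypothetical names, and the only real API used is `randomUUID` from Node's built-in `crypto` module.

```typescript
import { randomUUID } from "node:crypto";

/** Appends a short random suffix, e.g. "Test Project 3f9a1c2e". */
function uniqueName(prefix: string): string {
  // 8 hex characters is plenty for a test run; use the full UUID if in doubt.
  return `${prefix} ${randomUUID().slice(0, 8)}`;
}

/** Builds a collision-free email address for a throwaway test user. */
function uniqueEmail(localPart = "test"): string {
  return `${localPart}-${randomUUID()}@example.com`;
}

console.log(uniqueName("Test Project"));
console.log(uniqueEmail());
```

Keeping a readable prefix in front of the random suffix makes leftover data easy to spot and bulk-delete if a cleanup step ever fails.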

Independent Fixtures

Playwright fixtures are the ideal mechanism for parallel-safe setup and teardown. Each worker gets its own fixture instances. Use worker-scoped fixtures for expensive resources (like database connections) and test-scoped fixtures for per-test state (like user accounts). Worker-scoped fixtures are created once per worker and shared across all tests in that worker.

No Shared Database State

If your tests interact with a database, either use a separate database per worker or ensure tests operate on isolated rows. A common pattern is to create a unique tenant or workspace for each test that scopes all data access. This way, test A querying "all items" only sees its own items, not items created by test B running on another worker.
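One way to get a database per worker is to derive its name from the worker's index. The sketch below assumes the index is available through the `TEST_PARALLEL_INDEX` environment variable that Playwright sets for worker processes. `databaseForWorker` and the `app_test` base name are hypothetical.

```typescript
/**
 * Hypothetical helper: maps a worker's parallel index to a dedicated database name.
 * With 4 workers this yields app_test_w0 through app_test_w3.
 */
function databaseForWorker(baseName: string, parallelIndex: number): string {
  if (!Number.isInteger(parallelIndex) || parallelIndex < 0) {
    throw new RangeError(`invalid parallel index: ${parallelIndex}`);
  }
  return `${baseName}_w${parallelIndex}`;
}

// Assumption: falls back to 0 when run outside a Playwright worker.
const parallelIndex = Number(process.env.TEST_PARALLEL_INDEX ?? 0);

console.log(databaseForWorker("app_test", parallelIndex));
```

Because the index stays within the range of configured workers, the set of databases is fixed and can be created once up front rather than on demand.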

5. Local Parallelization

Running tests in parallel locally requires understanding your machine's resource limits. Too many parallel workers will cause tests to slow down or fail due to resource contention. Too few will leave CPU cores idle.

CPU Core Utilization

Each Playwright worker runs a browser instance that consumes CPU for rendering, JavaScript execution, and network handling. A good starting point is to use half your CPU cores for workers. On an 8-core laptop, 4 workers gives each worker access to roughly 2 cores (one for the Node.js process and one for the browser). Monitor CPU usage during test runs and adjust up or down based on observed utilization.

Memory Management

Each Chromium instance uses 150 to 300 MB of RAM depending on the complexity of the pages being tested. Four workers with Chromium will consume 600 MB to 1.2 GB. Add the Node.js process overhead and your application under test, and you need at least 4 GB of free RAM for comfortable 4-worker parallel execution. On machines with 8 GB total RAM, limit workers to 2 or 3 to avoid swapping.

playwright.config.ts
import { defineConfig } from '@playwright/test';
import os from 'os';

const cpus = os.cpus().length;
const totalMemGB = os.totalmem() / (1024 ** 3);

// Scale workers based on available resources, never dropping below 1
const workers = Math.max(
  1,
  Math.min(
    Math.floor(cpus / 2),          // Half of CPU cores
    Math.floor(totalMemGB / 1.5),  // 1.5 GB per worker
    8                              // Cap at 8 workers
  )
);

export default defineConfig({
  workers,
  fullyParallel: true,
  use: {
    // Chromium flags: skip GPU work and avoid the small /dev/shm partition
    launchOptions: {
      args: ['--disable-gpu', '--disable-dev-shm-usage'],
    },
  },
});

Browser Resource Limits

Beyond CPU and memory, browsers compete for GPU access, file descriptors, and network ports. Disable GPU rendering in CI environments where there is no GPU available (it falls back to software rendering and wastes CPU). The --disable-dev-shm-usage flag matters most on containerized Linux runners, where the /dev/shm shared memory partition is often too small for multiple browser instances (Docker defaults to 64 MB). With the flag set, Chromium uses a temporary directory instead.

6. CI/CD Sharding Strategies

Sharding distributes your test suite across multiple CI runners. Combined with per-runner workers, sharding is how you achieve that 10x speedup. The key decisions are how many shards to use and how to collect results back together.

GitHub Actions Matrix Strategy

.github/workflows/test.yml
jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1/6, 2/6, 3/6, 4/6, 5/6, 6/6]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test --shard=${{ matrix.shard }} --reporter=blob
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: blob-report-${{ strategy.job-index }}
          path: blob-report/
          retention-days: 3

  merge-reports:
    needs: e2e
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          pattern: blob-report-*
          merge-multiple: true
          path: all-blob-reports/
      - run: npx playwright merge-reports --reporter=html all-blob-reports/
      - uses: actions/upload-artifact@v4
        with:
          name: html-report
          path: playwright-report/
          retention-days: 14

Shard Count Selection

The optimal shard count depends on your test suite size, individual test duration, and budget. As a rule of thumb, aim for each shard to run 2 to 4 minutes of tests. If your suite takes 24 minutes sequentially, 6 shards bring it down to 4 minutes per shard. Adding more shards has diminishing returns because CI setup overhead (checkout, install, browser download) becomes the bottleneck.
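The diminishing returns are easy to see with numbers. This sketch models wall-clock time as a fixed per-runner overhead plus an even share of the suite. The 2-minute overhead is an assumed figure for illustration, and both function names are hypothetical.

```typescript
/** Wall-clock minutes for one shard: fixed CI overhead plus its share of the suite. */
function wallClockMinutes(
  suiteMinutes: number,
  shards: number,
  overheadMinutes: number,
): number {
  return overheadMinutes + suiteMinutes / shards;
}

/** Smallest shard count that keeps each shard at or under the target test time. */
function shardsForTarget(suiteMinutes: number, targetMinutesPerShard: number): number {
  return Math.max(1, Math.ceil(suiteMinutes / targetMinutesPerShard));
}

const SUITE_MINUTES = 24;
const OVERHEAD_MINUTES = 2; // assumed: checkout, install, browser download

console.log(`shards for a 4-minute target: ${shardsForTarget(SUITE_MINUTES, 4)}`);

for (const shards of [1, 3, 6, 12, 24]) {
  const minutes = wallClockMinutes(SUITE_MINUTES, shards, OVERHEAD_MINUTES);
  console.log(`${shards} shards -> ${minutes} min wall clock`);
}
```

Under these assumptions, going from 6 to 12 shards saves only 2 minutes of wall-clock time while doubling the number of runners that each pay the setup overhead.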

Result Merging

When tests run across multiple shards, each shard produces its own report. Playwright's merge-reports command combines blob reports from all shards into a single unified HTML report. This is essential for understanding overall test health and identifying patterns across the full suite.

Artifact Collection

Use the blob reporter in CI instead of the HTML reporter for individual shards. Blob reports are smaller and designed for merging. Upload them as artifacts from each shard, then download all artifacts in the merge job. Set appropriate retention periods: 3 days for individual shard blobs (they are only needed for merging) and 14 days for the final merged report.

7. Cross-Browser Parallel Testing

Playwright supports Chromium, Firefox, and WebKit (Safari's rendering engine). Running tests across all three browsers simultaneously multiplies your test matrix but provides comprehensive coverage that catches browser-specific rendering and behavior differences.

Configuring Multiple Browser Projects

playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,
  workers: '50%',
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
  ],
});

Chromium, Firefox, and WebKit Simultaneously

With three browser projects defined, Playwright will run your entire test suite three times (once per browser). If you have 100 tests, that is 300 test executions. With fullyParallel enabled and 4 workers, tests from all three browsers are interleaved across the worker pool. This means a Chromium test and a Firefox test can run at the same time on different workers.

Assrt Handles All Three Browsers

Assrt automatically detects your Playwright project configuration and generates tests that work across all configured browsers. When a selector or interaction behaves differently in WebKit compared to Chromium, Assrt adapts the test to handle browser-specific quirks. This eliminates the common problem of tests passing in Chromium but failing in Firefox due to subtle timing or rendering differences.

CI Matrix for Cross-Browser Sharding

.github/workflows/cross-browser.yml
jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        browser: [chromium, firefox, webkit]
        shard: [1/3, 2/3, 3/3]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npx playwright install --with-deps ${{ matrix.browser }}
      - name: Run tests
        run: |
          npx playwright test \
            --project=${{ matrix.browser }} \
            --shard=${{ matrix.shard }} \
            --reporter=blob
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: results-${{ matrix.browser }}-${{ strategy.job-index }}
          path: blob-report/

This matrix creates 9 parallel jobs (3 browsers x 3 shards). Each job installs only the browser it needs, saving download time and disk space. A 500-test suite that takes 45 minutes sequentially on a single browser now takes roughly 5 minutes per shard, assuming each runner can sustain about three workers (45 minutes / 3 shards / 3 workers), and all 9 jobs run simultaneously. Total wall clock time from push to complete cross-browser results: about 7 minutes including CI overhead.
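If it helps to picture what the matrix expands to, the sketch below builds the same cross product in code. `expandMatrix` is a hypothetical helper for illustration; in practice GitHub Actions performs this expansion for you.

```typescript
type Job = { browser: string; shard: string };

/** Builds every browser/shard combination, mirroring a two-axis CI matrix. */
function expandMatrix(browsers: string[], shardCount: number): Job[] {
  return browsers.flatMap((browser) =>
    Array.from({ length: shardCount }, (_, i) => ({
      browser,
      shard: `${i + 1}/${shardCount}`,
    })),
  );
}

const jobs = expandMatrix(["chromium", "firefox", "webkit"], 3);

console.log(`${jobs.length} jobs`);
for (const job of jobs) {
  console.log(`npx playwright test --project=${job.browser} --shard=${job.shard}`);
}
```

The job count grows multiplicatively, so adding a fourth shard here means 12 runners rather than 10. That is worth keeping in mind when you are close to your plan's concurrency limit.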

8. Performance Optimization

Beyond basic parallelization, there are several advanced techniques that squeeze even more speed from your test suite. These optimizations compound with each other and can reduce total execution time by an additional 30 to 50 percent.

Selective Test Runs

Run only the tests affected by your code changes. If you modified the checkout page, there is no need to run tests for the settings page. Tools like nx affected or custom scripts that map source files to test files can determine which tests to run. On large codebases, selective testing can reduce the number of tests from 500 to 20, making parallelization almost unnecessary for individual PRs.
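A custom mapping script can be quite small. The sketch below maps changed source paths to spec files by path prefix and falls back to the full suite when a changed file matches no rule. The rule format, paths, and `selectTests` helper are all hypothetical.

```typescript
type Rule = { sourcePrefix: string; tests: string[] };

/**
 * Picks the spec files affected by a set of changed files.
 * Any change that matches no rule triggers the full suite as a safe fallback.
 */
function selectTests(changedFiles: string[], rules: Rule[], allTests: string[]): string[] {
  const selected = new Set<string>();
  for (const file of changedFiles) {
    const matching = rules.filter((rule) => file.startsWith(rule.sourcePrefix));
    if (matching.length === 0) return [...allTests].sort();
    for (const rule of matching) {
      rule.tests.forEach((t) => selected.add(t));
    }
  }
  return Array.from(selected).sort();
}

// Hypothetical project layout.
const allTests = ["tests/checkout.spec.ts", "tests/login.spec.ts", "tests/settings.spec.ts"];
const rules: Rule[] = [
  { sourcePrefix: "src/checkout/", tests: ["tests/checkout.spec.ts"] },
  { sourcePrefix: "src/settings/", tests: ["tests/settings.spec.ts"] },
  { sourcePrefix: "src/auth/", tests: ["tests/login.spec.ts", "tests/settings.spec.ts"] },
];

console.log(selectTests(["src/checkout/cart.ts"], rules, allTests));
console.log(selectTests(["package.json"], rules, allTests));
```

The fallback is the important design choice: a mapping that silently skips tests for unrecognized files trades speed for blind spots, so running everything is the safer default.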

Test Prioritization

Order tests so that the most likely to fail run first. Tests covering recently changed code, tests with a history of failures, and tests for high-risk areas should execute in the earliest shards. With fail-fast enabled at the shard level, a failure in the first minute saves the remaining 4 minutes of execution. Some CI platforms support test ordering based on historical pass/fail data.
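A simple scoring function is enough to get started with prioritization. This sketch ranks tests by whether they cover changed code and by their historical failure rate. The weighting is an arbitrary assumption, and the `TestHistory` shape and `prioritize` helper are hypothetical.

```typescript
type TestHistory = {
  file: string;
  runs: number;
  failures: number;
  touchesChangedCode: boolean;
};

/** Higher score means run earlier: changed code first, then flaky or failing history. */
function riskScore(t: TestHistory): number {
  const failureRate = t.runs > 0 ? t.failures / t.runs : 0;
  return (t.touchesChangedCode ? 1 : 0) + failureRate;
}

/** Returns test files ordered from most to least likely to fail. */
function prioritize(tests: TestHistory[]): string[] {
  return [...tests]
    .sort((a, b) => riskScore(b) - riskScore(a))
    .map((t) => t.file);
}

// Made-up history for three spec files.
const ordered = prioritize([
  { file: "tests/settings.spec.ts", runs: 10, failures: 0, touchesChangedCode: false },
  { file: "tests/checkout.spec.ts", runs: 10, failures: 1, touchesChangedCode: true },
  { file: "tests/login.spec.ts", runs: 10, failures: 3, touchesChangedCode: false },
]);

console.log(ordered);
```

The ordered list can then be fed to whatever mechanism assigns tests to the earliest shards.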

Caching Strategies

Cache aggressively in CI. Node modules, Playwright browser binaries, and build artifacts should all be cached between runs. A cold Playwright install downloads 300+ MB of browser binaries. Caching them reduces job startup time by 30 to 60 seconds. Use content-addressable caching (keyed on lockfile hash) to ensure caches are invalidated when dependencies change.

.github/workflows/optimized.yml
- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

- name: Install Playwright (cache miss only)
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: npx playwright install --with-deps

- name: Install system deps (cache hit)
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: npx playwright install-deps

Warm-Up Strategies

Repeated setup work is one of the biggest time sinks in E2E testing. Playwright's webServer option starts your application once per test run, and reuseExistingServer lets it attach to a server that is already running (handy during local development) instead of launching a new one. Use globalSetup to perform one-time expensive operations like seeding the database or pre-authenticating users. Store authentication state to disk and load it in each test instead of going through the login flow every time.

playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  globalSetup: './global-setup.ts',
  use: {
    // Reuse auth state across tests
    storageState: './auth-state.json',
  },
  webServer: {
    command: 'npm run start',
    port: 3000,
    reuseExistingServer: !process.env.CI,
    timeout: 30_000,
  },
});

By combining parallel workers, CI sharding, selective test runs, caching, and warm-up strategies, teams can realistically bring E2E execution time from 45 minutes down to around 5 minutes. The investment in parallelization infrastructure tends to pay for itself quickly through faster feedback loops, higher developer productivity, and, where total compute drops, reduced CI costs.
