OTP and magic link testing

OTP and magic link testing: the two parts every SERP skips.

Open any of the top five guides on this keyword and the advice stops at the same two sentences: use a disposable inbox, and regex the code out of the body. None of them shows you the regex. None mentions that a large share of modern sign-in flows use split-digit OTP forms (Clerk, Stripe, Supabase Auth, Shadcn InputOTP) where a normal Playwright .fill loop can silently fail. This page ships both missing parts: the 7-pattern cascade we actually use, and the one-line ClipboardEvent trick that fills all six fields in a single call.

Matthew Diakonov, Written with AI

Published April 20, 2026. 10 min read.

4.8 from Assrt MCP users

7 priority-ordered regexes for OTP code extraction

One ClipboardEvent fills every split-digit input at once

Real Playwright output, self-hosted, zero vendor lock-in

The uncopyable part

Seven regexes, one ClipboardEvent, zero per-field typing.

The hard parts of OTP testing are not the inbox or the assertion. They are the parts no competitor page covers: extracting the code from a real email body that wraps digits in markup, and getting all six digits into a form whose component was written for onPaste, not onChange.

Install npx assrt-mcp


The two parts every guide skips.

Disposable inbox: temp-mail.io v3, no key

Code extraction: 7 regexes in priority order

Split-digit forms: one ClipboardEvent on the parent

Magic links: same browser context, consume once

Real Playwright, self-hosted, zero lock-in


What every top-ten result actually says

MailSlurp, Scalekit, JumpCloud, Auth0 docs, Supabase docs. All five stop at the same high level: fresh inbox per run, poll with a timeout, match the code with a regex, assert the logged-in state. Every one skips the code for the regex and skips the split-digit pattern entirely, and none addresses the magic-link-in-a-second-tab trap that breaks Playwright sessions the moment you click through an email client.

Here are the two parts they skip, with source links into the product so you can read the implementation and copy it into your own tests even if you never install Assrt.

Part 1: the 7-regex cascade for code extraction

Real OTP emails do not consistently say "Your code is 483921". They say "Use this to sign in: 483921", or wrap the code in a styled <strong> with line breaks inside, or quoted-printable-encode it across two lines. A single /\d{6}/ catches most of them but grabs the wrong digits in a long tail of emails (order numbers, unsubscribe IDs, tracking pixels with numeric hashes). Assrt runs seven patterns in priority order, most specific first. Labelled codes win over bare digits.

assrt/src/core/email.ts:100-121

Before the regex runs, the code prefers body_text and falls back to body_html, stripping tags, then decoding &nbsp; and numeric entities, then collapsing whitespace. That pre-step is why the patterns below only have to care about a clean string. The same function also returns the full decoded body (up to 5000 chars) so magic-link URL extraction is a second regex on the same payload.
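A minimal sketch of that pre-step, written from the description above rather than copied from email.ts. The EmailMessage shape, the function names, and the small set of entities handled are assumptions; only the body_text and body_html field names and the 5000-character cap come from this page.

```typescript
// Hypothetical message shape: only body_text and body_html are named in the article.
interface EmailMessage {
  body_text?: string;
  body_html?: string;
}

// Convert a numeric entity to a character, leaving it untouched if out of range.
function fromCodePointSafe(n: number, fallback: string): string {
  return n >= 0 && n <= 0x10ffff ? String.fromCodePoint(n) : fallback;
}

// Decode &nbsp; plus decimal and hex numeric entities. &amp; goes last so that
// "&amp;nbsp;" ends up as the literal text "&nbsp;" instead of a space.
function decodeEntities(s: string): string {
  return s
    .replace(/&nbsp;/gi, " ")
    .replace(/&#x([0-9a-f]+);/gi, (m: string, hex: string) => fromCodePointSafe(parseInt(hex, 16), m))
    .replace(/&#(\d+);/g, (m: string, dec: string) => fromCodePointSafe(parseInt(dec, 10), m))
    .replace(/&amp;/gi, "&");
}

// Prefer plain text; otherwise strip tags from the HTML and decode entities.
// Either way, collapse whitespace and cap the length.
function cleanBody(msg: EmailMessage, maxChars = 5000): string {
  const text = (msg.body_text ?? "").trim();
  const source =
    text !== "" ? text : decodeEntities((msg.body_html ?? "").replace(/<[^>]*>/g, " "));
  return source.replace(/\s+/g, " ").trim().slice(0, maxChars);
}
```

One design choice to be aware of: replacing each tag with a space keeps words apart across <br> and </p>, but it would also split a code whose digits are wrapped in separate tags. Stripping tags also drops href attributes, so a magic link survives only if the URL appears in the visible text.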

7 patterns

“Labelled codes preferred over bare digits so an order number in the subject cannot be mistaken for the OTP.”

assrt/src/core/email.ts:101-109
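The cascade can be sketched in a few lines. This is an illustration of the priority order (labelled code, verification, OTP, PIN, then bare 6-, 4- and 8-digit runs), not the contents of email.ts: the exact regex bodies, the CODE_PATTERNS name, and the extractCode helper are assumptions.

```typescript
// Seven patterns, most specific first. The first one that matches wins.
// The regex bodies are illustrative guesses at "label, then 4 to 8 digits".
const CODE_PATTERNS: RegExp[] = [
  /code[:\s]+(\d{4,8})\b/i,         // 1. "code: 483921"
  /verification[:\s]+(\d{4,8})\b/i, // 2. "verification: 483921"
  /\botp[:\s]+(\d{4,8})\b/i,        // 3. "OTP: 483921"
  /\bpin[:\s]+(\d{4,8})\b/i,        // 4. "PIN: 0042"
  /\b(\d{6})\b/,                    // 5. bare 6-digit run
  /\b(\d{4})\b/,                    // 6. bare 4-digit run
  /\b(\d{8})\b/,                    // 7. bare 8-digit run
];

// Returns the first captured code, or "" when nothing matches.
function extractCode(cleanedBody: string): string {
  for (const pattern of CODE_PATTERNS) {
    const code = pattern.exec(cleanedBody)?.[1];
    if (code) return code;
  }
  return "";
}
```

The point of the ordering shows up with an email like "Order 20260420 confirmed. Your verification code: 483921": the labelled pattern returns 483921 before any bare-digit pattern gets a chance to pick up the 8-digit order number.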

Part 2: the split-digit OTP trick nobody writes about

Many modern sign-in flows use a 4-to-6 input OTP form where every digit gets its own <input maxlength="1">: Clerk, Stripe, Supabase Auth, Shadcn InputOTP, MUI, Chakra PinInput. These components wire their splitter to the onPaste event, not onChange. A naive Playwright loop that calls .fill on each field fights with the component's own focus management and will intermittently land only the first digit.

assrt/src/core/agent.ts:234-236

Three things make this work. First, the ClipboardEvent is dispatched on the parent container, not any individual input, because the component listens up the tree. Second, the event bubbles, so React's synthetic event system delivers it to the handler that owns the splitter. Third, DataTransfer is a real DOM type and React reads from it the same way it would for a user paste. One round trip fills every field. No per-digit typing, no focus sequencing, no hidden blur events between digits.
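Here is a hedged sketch of that dispatch. It builds the browser-side code as a string, which Playwright's page.evaluate accepts as an expression, so the sketch needs neither DOM typings nor a Playwright import. The buildOtpPasteScript name and the default selector are assumptions, as is the idea that the paste handler lives on the direct parent of the first input.

```typescript
// Returns a JavaScript expression to pass to page.evaluate().
// In the browser it finds the first single-character input, builds a
// DataTransfer holding the code, and dispatches one bubbling paste event
// on that input's parent element. Evaluates to true if an event was sent.
function buildOtpPasteScript(
  code: string,
  firstInputSelector = 'input[maxlength="1"]', // assumed selector
): string {
  return `(() => {
    const first = document.querySelector(${JSON.stringify(firstInputSelector)});
    if (!first || !first.parentElement) return false;
    const data = new DataTransfer();
    data.setData("text/plain", ${JSON.stringify(code)});
    first.parentElement.dispatchEvent(
      new ClipboardEvent("paste", { clipboardData: data, bubbles: true, cancelable: true })
    );
    return true;
  })()`;
}

// Usage inside a Playwright test (not executed here):
//   const sent = await page.evaluate(buildOtpPasteScript("483921"));
//   expect(sent).toBe(true);
```

A synthetic paste event does not insert text by itself; it only works because the component's own onPaste handler reads clipboardData and distributes the digits. If a component attaches that handler to the individual inputs instead of a container, dispatching on the first input rather than its parent is the likely adjustment.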

Four numbers worth memorising

Everything on this page reduces to four numbers. Seven is how many patterns the cascade tries. Sixty is the default poll window in seconds. Three is the poll interval. One is the number of evaluate calls needed to fill every digit in a split-input OTP form. Count these against any competing guide on this keyword and see how many are named.

7 regex patterns in the cascade

60-second poll window (default)

3-second poll interval

1 evaluate call to fill all OTP digits

How the whole flow fits together

Three inputs, one agent loop, four outputs. The left side is what the reader controls; the middle is the Assrt runtime (18-tool closed set); the right is what lands in your repo and your CI after a pass.

Inputs, agent loop, outputs

Forms that break a per-field `.fill` loop

Every component library in the chip row below installs its OTP splitter on onPaste. Every one of them is broken by a naive per-field type loop. Every one of them accepts a single ClipboardEvent dispatched on the parent. If your app uses any of these, the trick from the previous section is the difference between a green suite and a flake you chase for a week.

Clerk OTP input

Stripe verification

Supabase Auth OTP

Shadcn InputOTP

MUI OTP input

Chakra PinInput

Auth0 6-digit code

WorkOS OTP

Firebase Auth phone

Descope one-time code

Naive per-field fill vs ClipboardEvent

Left tab: the Playwright code a reasonable engineer writes first. It works against a plain text input, but breaks against every library on the chip row above. Right tab: the two-line evaluate that actually ships with Assrt.

Filling a 6-digit OTP form

    // Typical Playwright attempt against a 6-field OTP form
    for (let i = 0; i < 6; i++) {
      await page
        .locator('input[maxlength="1"]')
        .nth(i)
        .fill(code[i]);
    }
    // Intermittently breaks on:
    // - React-controlled inputs that advance focus on onChange
    // - forms that dispatch onPaste, not onChange
    // - components that debounce 'complete' until a paste event
    // - Shadcn InputOTP, which binds to onPaste only
    //
    // Failure mode: only the first digit lands, or focus flips back
    // to field 0 after each .fill(), and all six end up as one digit.

Six round trips, six chances for a focus race
Fights the component's own onPaste splitter
Fails silently against Shadcn InputOTP and Clerk

What a real run looks like in the terminal

The trace below is an actual OTP scenario run against a local dev server. Note the regex that matched (pattern 1: the labelled code: form), the detection of six split inputs, and the single evaluate call that fills them. Total time, eleven seconds, most of it spent waiting on email delivery.

assrt run (signup + OTP)

What Assrt emits as committable Playwright

The English scenario.md becomes real Playwright code you can commit. No proprietary YAML, no vendor DSL, no per-run account tokens. The extract below is what a graduated spec looks like once the runtime refs are resolved to role+name queries and the evaluate call is inlined.

tests/e2e/otp-signup.spec.ts

Assrt vs every other OTP testing tool

Every row below is something a reader hits inside the first hour of writing OTP tests against a real app. The left column is what the SERP says; the right is what Assrt does.

Feature	Typical OTP testing stack	Assrt (self-hosted)
Disposable inbox	MailSlurp paid plan, keyed to your account	temp-mail.io /api/v3, no key, one HTTP POST
Code extraction	Your own regex, written ad hoc per email template	7-pattern priority cascade in email.ts:101-109
Split-digit OTP form	Not documented; most users type each field by hand	ClipboardEvent on parent, all inputs filled in one evaluate
Magic link in same context	Open link in new tab, lose session, retry logic	Extract URL with regex, page.goto() same context, consume once
HTML email fallback	Regex body_html directly, breaks on <br>, entities	Strip tags, decode &nbsp; / &#...;, collapse whitespace, then regex
Poll behaviour	setTimeout loop you write every time	waitForEmail(timeoutMs=60000, intervalMs=3000) with AbortSignal
Test output	Proprietary YAML or platform DSL locked to the vendor	Real Playwright, your repo, no lock-in
Pricing	From $59/mo (MailSlurp) to $7,500/mo (enterprise runners)	Open source, self-hosted, zero per-run cost

The magic link trap, and how to avoid it

Magic links are a second-order OTP: the code is a URL, and the URL is usually consume-on-GET. Two things go wrong in naive test setups. First, clicking the link inside the email client opens a new browser context and the Playwright session is lost. Second, if the test retries (network flake, slow assertion) the second GET invalidates the session because the token has already been spent. The fix is the same shape as OTP code extraction: the waitForVerificationCode return payload contains body (the first 5000 chars of the cleaned email), you run a URL regex against it, and then call page.goto(url) in the same Playwright context. One visit, one session, no cross-tab loss. Retries have to happen above the waitForVerificationCode call so each retry gets a fresh token in a fresh email.
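The URL step can be sketched as a small pure function. The extractMagicLink name, the accept callback, and the example hosts are hypothetical; the only things taken from this page are that the link is matched with a regex against the cleaned body and then opened with page.goto in the same context. It assumes the URL appears in the visible text of the email.

```typescript
// Returns the first URL in the body that the caller accepts, or null.
// The accept callback keeps logo and unsubscribe links from being
// mistaken for the sign-in link.
function extractMagicLink(body: string, accept: (url: URL) => boolean): string | null {
  const candidates = body.match(/https?:\/\/[^\s"'<>)\]]+/g) ?? [];
  for (const raw of candidates) {
    // Drop sentence punctuation that the greedy match picked up at the end.
    const cleaned = raw.replace(/[.,;:!?]+$/, "");
    let url: URL;
    try {
      url = new URL(cleaned);
    } catch {
      continue; // not a parseable URL, try the next candidate
    }
    if (accept(url)) return cleaned;
  }
  return null;
}

// Usage inside a Playwright test (not executed here; host is an example):
//   const link = extractMagicLink(result.body, (u) => u.hostname === "app.example.com");
//   if (!link) throw new Error("no magic link in email");
//   await page.goto(link); // same context, visited exactly once
```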

What this page is not trying to be

Not a philosophy of passwordless auth. Not a comparison of OTP vs magic link as a UX pattern (Scalekit and JumpCloud already rank for that and are fine). This is the part of the problem that stops being written about two clicks into the SERP: the regex that actually runs against real emails, and the one-line dispatch that gets past every modern OTP form. Copy the code, ignore the product, and your own Playwright tests will get more reliable tonight. Or install Assrt and get it for free on top of the rest of the 18-tool agent.

See your OTP flow test itself, end to end

Twenty minutes. Bring one real signup flow (email OTP, magic link, or split-digit form). We run it live and hand you the scenario.md plus the graduated Playwright spec.

Questions the top OTP testing guides never answer

What is the most common reason OTP tests fail intermittently?

Split-digit OTP forms with one <input maxlength="1"> per character. A plain Playwright .fill() or .type() per field fights with the component's own focus logic: libraries like Shadcn's InputOTP, Clerk, and MUI bind their splitter to the onPaste event, not onChange, so per-field typing lands in field 0 or silently blanks the other inputs. Assrt's system prompt at assrt/src/core/agent.ts:234 explicitly instructs the agent, when it detects more than one input[maxlength="1"], to call evaluate with a DataTransfer and dispatch a single ClipboardEvent on the parent. That matches the event the component was designed to accept, and all six digits land at once.

How does Assrt extract the code from arbitrary email templates?

A 7-pattern priority cascade in assrt/src/core/email.ts (lines 101 to 109). The patterns run in this order: code: followed by 4-8 digits, verification: followed by 4-8 digits, OTP: then digits, PIN: then digits, a bare 6-digit run, a bare 4-digit run, a bare 8-digit run. Before matching, if body_text is empty the code falls back to body_html and strips tags, decodes &nbsp; and numeric entities, and collapses whitespace. The first pattern that matches wins. Labelled codes are preferred over bare digits so an order number in the subject cannot be mistaken for the OTP.

Which disposable email service does Assrt use and does it cost anything?

temp-mail.io, internal v3 endpoint at https://api.internal.temp-mail.io/api/v3. The source of truth is assrt/src/core/email.ts:9. It calls POST /email/new to mint a fresh inbox (returns { email, token }) and GET /email/{address}/messages to poll. There is no API key; there is no account. It is free. Each DisposableEmail in Assrt is its own address, so concurrent scenarios do not collide. A 15-second AbortSignal.timeout guards every fetch, so a temporary temp-mail outage fails fast instead of hanging the run.

How should magic link flows be tested without losing the browser session?

Do not click the link inside the email body: that opens a second browser context and loses the test session. Instead, let waitForVerificationCode return the email (the cleaned text stays on the returned object as body), match the URL with a regex against that body, and pass the URL to page.goto in the same Playwright context. For consume-on-GET links (Auth0 passwordless, Supabase magic link) this is load-bearing: a second GET hits an already-spent token, so the link must be visited exactly once, and any retry has to request a fresh email. Assrt's 7-regex cascade is code-focused, but the returned body field (up to 5000 chars) is what you run your URL regex against.

What is the default poll window and can it be shortened?

Default timeout is 60000 ms, default interval is 3000 ms (email.ts:67 and :82). The wait_for_verification_code tool accepts a timeout_seconds parameter; the interval is fixed at 3 seconds because temp-mail.io rate-limits hard below that. In practice most transactional email providers (Resend, Postmark, SendGrid) deliver in under 8 seconds, so a 60-second window is more than enough; if your provider is slower, raise timeout_seconds to 120 and keep the interval where it is.
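If you are writing the poll loop yourself, it can be as small as the sketch below. The pollUntil name and the check callback are assumptions; only the 60000 ms and 3000 ms defaults and the use of an abort signal come from this page. The signal is typed structurally so a real AbortSignal can be passed in.

```typescript
interface PollOptions {
  timeoutMs?: number;             // total window, 60000 by default
  intervalMs?: number;            // pause between checks, 3000 by default
  signal?: { aborted: boolean };  // an AbortSignal fits this shape
}

// Calls check() until it returns a non-null value, the window closes,
// or the signal is aborted. check() would normally fetch the inbox and
// return the matching message, or null when nothing has arrived yet.
async function pollUntil<T>(
  check: () => Promise<T | null>,
  { timeoutMs = 60_000, intervalMs = 3_000, signal }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (signal?.aborted) throw new Error("polling aborted");
    const result = await check();
    if (result !== null) return result;
    // Stop early if the next sleep would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`nothing arrived within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Raising the window for a slow provider is then a one-argument change, for example pollUntil(check, { timeoutMs: 120_000 }), with the interval left at its default.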

Does Assrt work with phone/SMS OTP as well, or only email?

Only email for OTP code extraction, today. The wait_for_verification_code tool binds to the DisposableEmail created by create_temp_email. For SMS-based OTP, the http_request tool is the escape hatch: point it at your SMS provider's receive-number API (a 5sim virtual number, a Twilio inbound webhook you tail, etc.), regex the code out of the response body, then run the same clipboard-paste evaluate on the split-digit form. The split-digit fix is provider-agnostic; only the upstream delivery changes.

Why not just generate a Playwright spec and commit it, like Codegen?

You can, and that is the point. Assrt's runner speaks real Playwright via Playwright MCP; a passing #Case can be transcribed into a committable .spec.ts file with the same selectors (runtime-resolved refs become role+name queries) and the same evaluate call for OTP. The differentiator vs $7,500/month vendors is that the output is yours: real Playwright, zero YAML, zero account tokens, self-hosted. If a junior on the team finds the Markdown #Case harder to reason about than code, graduate it to a spec file and run both.

What happens if none of the 7 regexes match?

waitForVerificationCode still returns, with code: "" and the first 5000 characters of body (plain text) included. The agent then sees an empty code and, under the normal Error Recovery rules in the system prompt, can re-snapshot, inspect the email body directly via check_email_inbox, and either retry with a wider pattern or report the scenario as failed with the email body as evidence. In CI, a zero-length code is a hard fail, which is what you want: silently passing with a blank OTP is a much worse outcome than a loud failure.
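In a hand-written spec, the same hard-fail rule can be enforced with a small guard before the code is used. This is a sketch: the requireCode name and the VerificationResult type are hypothetical, modelled on the code and body fields described above.

```typescript
// Hypothetical result shape, based on the two fields the article names.
interface VerificationResult {
  code: string;
  body: string;
}

// Throws when no usable code was extracted, attaching the start of the
// email body as evidence so the CI log shows what the regexes saw.
function requireCode(result: VerificationResult): string {
  if (!/^\d{4,8}$/.test(result.code)) {
    const excerpt = JSON.stringify(result.body.slice(0, 200));
    throw new Error(`No OTP code extracted. Email body starts with: ${excerpt}`);
  }
  return result.code;
}
```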