OTP and magic link testing: the two parts every SERP skips.
Open any of the top five guides on this keyword and the advice stops at the same two sentences: use a disposable inbox, and regex the code out of the body. None of them shows you the regex. None of them mentions that half the modern web uses split-digit OTP forms (Clerk, Stripe, Supabase Auth, Shadcn InputOTP), where a normal Playwright .fill() loop fails silently. This page ships both missing parts: the 7-pattern cascade we actually use, and the single ClipboardEvent dispatch that fills all six fields in one call.
The uncopyable part
Seven regexes, one ClipboardEvent, zero per-field typing.
The hard parts of OTP testing are not the inbox or the assertion. They are the parts no competitor page covers: extracting the code from a real email body that wraps digits in markup, and getting all six digits into a form whose component was written for onPaste, not onChange.
What every top-ten result actually says
MailSlurp, Scalekit, JumpCloud, Auth0 docs, Supabase docs. All five stop at the same high level: fresh inbox per run, poll with a timeout, match the code with a regex, assert the logged-in state. Every one skips the code for the regex and skips the split-digit pattern entirely. None of them addresses the magic-link-in-a-second-tab trap that breaks Playwright sessions the moment you click through an email client.
Here are the two parts they skip, with source links into the product so you can read the implementation and copy it into your own tests even if you never install Assrt.
Part 1: the 7-regex cascade for code extraction
Real OTP emails do not consistently say "Your code is 483921". They say "Use this to sign in: 483921", or wrap the code in a styled <strong> with line breaks inside, or quoted-printable-encode it across two lines. A single /\d{6}/ catches most of them, but it also matches a long tail of false positives (order numbers, unsubscribe IDs, tracking pixels with numeric hashes). Assrt runs seven patterns in priority order, most specific first. Labelled codes win over bare digits.
Before the regex runs, the code prefers body_text and falls back to body_html, stripping tags, then decoding HTML entities (including numeric ones), then collapsing whitespace. That pre-step is why the patterns only have to care about a clean string. The same function also returns the full decoded body (up to 5000 chars), so magic-link URL extraction is a second regex on the same payload.
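The pre-step described above could look roughly like this. It is a minimal sketch, not the code in email.ts: the name cleanEmailBody, the small entity table, and the choice to replace each tag with a space are assumptions made for illustration. Only the order of operations and the 5000-character cap come from the text.

```typescript
// Hypothetical helper: the name and the entity table are illustrative assumptions.
const NAMED_ENTITIES: Record<string, string> = {
  "&nbsp;": " ",
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
};

// Guard against out-of-range code points, which would make fromCodePoint throw.
const fromCode = (n: number): string =>
  n >= 0 && n <= 0x10ffff ? String.fromCodePoint(n) : "";

function decodeEntities(text: string): string {
  return text
    // Numeric entities first (decimal, then hex), so "&amp;#57;" is not decoded twice.
    .replace(/&#(\d+);/g, (_m, dec: string) => fromCode(Number(dec)))
    .replace(/&#x([0-9a-f]+);/gi, (_m, hex: string) => fromCode(parseInt(hex, 16)))
    .replace(/&(?:nbsp|amp|lt|gt|quot);/g, (m) => NAMED_ENTITIES[m]);
}

function cleanEmailBody(bodyText: string, bodyHtml: string, maxLen = 5000): string {
  // Prefer the plain-text part; fall back to HTML with tags stripped and entities decoded.
  const raw =
    bodyText.trim() !== ""
      ? bodyText
      : decodeEntities(bodyHtml.replace(/<[^>]*>/g, " "));
  // Collapse every whitespace run (including newlines) to a single space, then cap the length.
  return raw.replace(/\s+/g, " ").trim().slice(0, maxLen);
}
```

Replacing tags with a space keeps adjacent words apart, at the cost of splitting a code whose digits are wrapped in separate tags; replacing with an empty string makes the opposite trade.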
“Labelled codes preferred over bare digits so an order number in the subject cannot be mistaken for the OTP.”
assrt/src/core/email.ts:101-109
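For readers who want the shape of the cascade without opening the repository, here is a hedged sketch. The ordering (four labelled patterns, then bare 6-, 4-, and 8-digit runs) follows the description on this page; the exact regular expressions and the names CODE_PATTERNS and extractCode are assumptions, not a copy of email.ts.

```typescript
// Assumed regexes: the priority order follows the article, the expressions are guesses.
const CODE_PATTERNS: RegExp[] = [
  /code[:\s]+(\d{4,8})\b/i,         // "code: 483921"
  /verification[:\s]+(\d{4,8})\b/i, // "verification: 483921"
  /\botp[:\s]+(\d{4,8})\b/i,        // "OTP: 483921"
  /\bpin[:\s]+(\d{4,8})\b/i,        // "PIN: 4839"
  /\b(\d{6})\b/,                    // bare 6-digit run
  /\b(\d{4})\b/,                    // bare 4-digit run
  /\b(\d{8})\b/,                    // bare 8-digit run
];

// Expects an already-cleaned body (tags stripped, whitespace collapsed).
// Returns "" when nothing matches, so the caller can treat that as a failure.
function extractCode(cleanBody: string): string {
  for (const pattern of CODE_PATTERNS) {
    const match = pattern.exec(cleanBody);
    if (match) return match[1]; // first matching pattern wins
  }
  return "";
}
```

With this ordering, `extractCode("Order 12345678 shipped. Your code: 4839")` resolves through the first labelled pattern and never reaches the bare 8-digit fallback that would have picked up the order number.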
Part 2: the split-digit OTP trick nobody writes about
Half the modern web uses a 4-to-6 input OTP form where every digit gets its own <input maxlength="1">: Clerk, Stripe, Supabase Auth, Shadcn InputOTP, MUI, Chakra PinInput. Every one of those components wires its splitter to the onPaste event, not onChange. A naive Playwright loop that calls .fill() on each field fights with the component's own focus management and will intermittently land only the first digit.
Three things make this work. First, the ClipboardEvent is dispatched on the parent container, not any individual input, because the component listens up the tree. Second, the event bubbles, so React's synthetic event system delivers it to the handler that owns the splitter. Third, DataTransfer is a real DOM type and React reads from it the same way it would for a user paste. One round trip fills every field. No per-digit typing, no focus sequencing, no hidden blur events between digits.
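The three points above can be sketched as follows. This is an illustration under stated assumptions rather than the code Assrt ships: the names pasteIntoOtpContainer, fillSplitOtp, and EvaluatingPage are hypothetical, and the sketch assumes the direct parent of the first input[maxlength="1"] is inside the element that owns the paste handler. EvaluatingPage is a structural stand-in so the block type-checks without importing Playwright.

```typescript
// Stand-in for the one Page method used here (hypothetical interface, not Playwright's type).
interface EvaluatingPage {
  evaluate<R, A>(pageFunction: (arg: A) => R, arg: A): Promise<R>;
}

// Runs inside the browser, so it must not reference anything outside its own body.
// Returns the number of single-character inputs it found.
function pasteIntoOtpContainer(code: string): number {
  const inputs = document.querySelectorAll<HTMLInputElement>('input[maxlength="1"]');
  if (inputs.length === 0) return 0;
  // Assumption: the paste listener sits on this parent or on one of its ancestors.
  const container = inputs[0].parentElement ?? inputs[0];
  const data = new DataTransfer();
  data.setData("text/plain", code);
  // bubbles: true lets the event reach handlers registered higher up the tree.
  container.dispatchEvent(
    new ClipboardEvent("paste", { clipboardData: data, bubbles: true, cancelable: true }),
  );
  return inputs.length;
}

async function fillSplitOtp(page: EvaluatingPage, code: string): Promise<number> {
  // Refuse to paste anything that is not a plain 4-8 digit code.
  if (!/^\d{4,8}$/.test(code)) throw new Error(`unexpected OTP format: "${code}"`);
  return page.evaluate(pasteIntoOtpContainer, code); // one round trip
}
```

In a real Playwright test the equivalent call would be `await page.evaluate(pasteIntoOtpContainer, code)`, followed by an assertion on the field values, since a component that ignores synthetic paste events will leave the inputs empty.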
Four numbers worth memorising
Everything on this page reduces to four numbers. Seven is how many patterns the cascade tries. Sixty is the default poll window in seconds. Three is the poll interval. One is the number of evaluate calls needed to fill every digit in a split-input OTP form. Count these against any competing guide on this keyword and see how many are named.
How the whole flow fits together
Three inputs, one agent loop, four outputs. The left side is what the reader controls; the middle is the Assrt runtime (18-tool closed set); the right is what lands in your repo and your CI after a pass.
Inputs, agent loop, outputs
Forms that break a per-field .fill loop
Every component library named earlier (Clerk, Stripe, Supabase Auth, Shadcn InputOTP, MUI, Chakra PinInput) installs its OTP splitter on onPaste. Every one of them is broken by a naive per-field type loop. Every one of them accepts a single ClipboardEvent dispatched on the parent. If your app uses any of these, the trick from the previous section is the difference between a green suite and a flake you chase for a week.
Naive per-field fill vs ClipboardEvent
First, the Playwright code a reasonable engineer writes first. It works against a plain text input, but breaks against the split-digit libraries listed above. The version that ships with Assrt replaces the loop with a single evaluate call that dispatches a paste event on the parent.
Filling a 6-digit OTP form
```typescript
// Typical Playwright attempt against a 6-field OTP form
for (let i = 0; i < 6; i++) {
  await page
    .locator('input[maxlength="1"]')
    .nth(i)
    .fill(code[i]);
}
// Intermittently breaks on:
// - React-controlled inputs that advance focus on onChange
// - forms that dispatch onPaste, not onChange
// - components that debounce 'complete' until a paste event
// - Shadcn InputOTP, which binds to onPaste only
//
// Failure mode: only the first digit lands, or focus flips back
// to field 0 after each .fill(), and all six end up as one digit.
```
- Six round trips, six chances for a focus race
- Fights the component's own onPaste splitter
- Fails silently against Shadcn InputOTP and Clerk
What a real run looks like in the terminal
The trace below is an actual OTP scenario run against a local dev server. Note the regex that matched (pattern 1, the labelled "code:" form), the detection of six split inputs, and the single evaluate call that fills them. Total time: eleven seconds, most of it spent waiting on email delivery.
What Assrt emits as committable Playwright
The English scenario.md becomes real Playwright code you can commit. No proprietary YAML, no vendor DSL, no per-run account tokens. The extract below is what a graduated spec looks like once the runtime refs are resolved to role+name queries and the evaluate call is inlined.
Assrt vs every other OTP testing tool
Every row below is something a reader hits inside the first hour of writing OTP tests against a real app. The left column is what the SERP says; the right is what Assrt does.
| Feature | Typical OTP testing stack | Assrt (self-hosted) |
|---|---|---|
| Disposable inbox | MailSlurp paid plan, keyed to your account | temp-mail.io /api/v3, no key, one HTTP POST |
| Code extraction | Your own regex, written ad hoc per email template | 7-pattern priority cascade in email.ts:101-109 |
| Split-digit OTP form | Not documented; most users type each field by hand | ClipboardEvent on parent, all inputs filled in one evaluate |
| Magic link in same context | Open link in new tab, lose session, retry logic | Extract URL with regex, page.goto() same context, consume once |
| HTML email fallback | Regex body_html directly, breaks on <br>, entities | Strip tags, decode HTML entities (including numeric &#...;), collapse whitespace, then regex |
| Poll behaviour | setTimeout loop you write every time | waitForEmail(timeoutMs=60000, intervalMs=3000) with AbortSignal |
| Test output | Proprietary YAML or platform DSL locked to the vendor | Real Playwright, your repo, no lock-in |
| Pricing | From $59/mo (MailSlurp) to $7,500/mo (enterprise runners) | Open source, self-hosted, zero per-run cost |
The magic link trap, and how to avoid it
Magic links are a second-order OTP: the code is a URL, and the URL is usually consume-on-GET. Two things go wrong in naive test setups. First, clicking the link inside the email client opens a new browser context and the Playwright session is lost. Second, if the test retries (network flake, slow assertion) the second GET invalidates the session because the token has already been spent. The fix is the same shape as OTP code extraction: the waitForVerificationCode return payload contains body (the first 5000 chars of the cleaned email), you run a URL regex against it, and then call page.goto(url) in the same Playwright context. One visit, one session, no cross-tab loss. Retries have to happen above the waitForVerificationCode call so each retry gets a fresh token in a fresh email.
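A URL extraction step along those lines might look like the sketch below. The function name extractMagicLink, the optional expectedHost filter, and the URL regex itself are assumptions for illustration; the text only says that a URL regex is run against the returned body.

```typescript
// Hypothetical helper: pulls the first http(s) URL out of a cleaned email body,
// optionally restricted to one hostname so footer links are skipped.
function extractMagicLink(body: string, expectedHost?: string): string | null {
  const candidates = body.match(/https?:\/\/[^\s"'<>)\]]+/g) ?? [];
  for (const raw of candidates) {
    // Sentence punctuation often trails a URL in plain-text emails.
    const cleaned = raw.replace(/[.,;:!?]+$/, "");
    let url: URL;
    try {
      url = new URL(cleaned);
    } catch {
      continue; // not a parseable URL, try the next candidate
    }
    if (expectedHost === undefined || url.hostname === expectedHost) {
      return cleaned; // return the original string so the token is untouched
    }
  }
  return null;
}
```

The result is then passed to `page.goto(link)` exactly once, in the same context that requested the email. Passing the application's own hostname as expectedHost is a cheap guard against visiting an unsubscribe or tracking link instead of the sign-in link.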
What this page is not trying to be
Not a philosophy of passwordless auth. Not a comparison of OTP vs magic link as a UX pattern (Scalekit and JumpCloud already rank for that and are fine). This is the part of the problem that stops being written about two clicks into the SERP: the regex that actually runs against real emails, and the one-line dispatch that gets past every modern OTP form. Copy the code, ignore the product, and your own Playwright tests will get more reliable tonight. Or install Assrt and get it for free on top of the rest of the 18-tool agent.
See your OTP flow test itself, end to end
Twenty minutes. Bring one real signup flow (email OTP, magic link, or split-digit form). We run it live and hand you the scenario.md plus the graduated Playwright spec.
Questions the top OTP testing guides never answer
What is the most common reason OTP tests fail intermittently?
Split-digit OTP forms with one <input maxlength="1"> per character. A plain Playwright .fill() or .type() per field fights with the component's own focus logic: libraries like Shadcn's InputOTP, Clerk, and MUI bind their splitter to the onPaste event, not onChange, so per-field typing lands in field 0 or silently blanks the other inputs. Assrt's system prompt at assrt/src/core/agent.ts:234 explicitly instructs the agent, when it detects more than one input[maxlength="1"], to call evaluate with a DataTransfer and dispatch a single ClipboardEvent on the parent. That matches the event the component was designed to accept, and all six digits land at once.
How does Assrt extract the code from arbitrary email templates?
A 7-pattern priority cascade in assrt/src/core/email.ts (lines 101 to 109). The patterns run in this order: code: followed by 4-8 digits, verification: followed by 4-8 digits, OTP: then digits, PIN: then digits, a bare 6-digit run, a bare 4-digit run, a bare 8-digit run. Before matching, if body_text is empty the code falls back to body_html, strips tags, decodes HTML entities (including numeric ones), and collapses whitespace. The first pattern that matches wins. Labelled codes are preferred over bare digits so an order number in the subject cannot be mistaken for the OTP.
Which disposable email service does Assrt use and does it cost anything?
temp-mail.io, internal v3 endpoint at https://api.internal.temp-mail.io/api/v3. The source of truth is assrt/src/core/email.ts:9. It calls POST /email/new to mint a fresh inbox (returns { email, token }) and GET /email/{address}/messages to poll. There is no API key; there is no account. It is free. Each DisposableEmail in Assrt is its own address, so concurrent scenarios do not collide. A 15-second AbortSignal.timeout guards every fetch, so a temporary temp-mail outage fails fast instead of hanging the run.
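The two calls described above could be wrapped like this. It is a sketch built only from what this answer states (base URL, the two paths, the { email, token } response, the 15-second timeout). Everything else is an assumption: the helper names, the FetchLike type, the choice to send no request body, URL-encoding the address, and the body_text/body_html fields on each message. The fetch implementation is injected so the sketch can be exercised without network access.

```typescript
const BASE_URL = "https://api.internal.temp-mail.io/api/v3"; // from the article
const FETCH_TIMEOUT_MS = 15_000; // from the article

interface Inbox { email: string; token: string }
// Assumed message shape, based on the field names mentioned on this page.
interface InboxMessage { body_text?: string; body_html?: string }

// Minimal subset of the fetch signature this sketch needs (hypothetical type).
type FetchLike = (
  url: string,
  init?: { method?: string; signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function requestJson(fetchImpl: FetchLike, path: string, method: "GET" | "POST"): Promise<unknown> {
  const response = await fetchImpl(`${BASE_URL}${path}`, {
    method,
    signal: AbortSignal.timeout(FETCH_TIMEOUT_MS), // fail fast instead of hanging the run
  });
  if (!response.ok) throw new Error(`${method} ${path} failed with HTTP ${response.status}`);
  return response.json();
}

// Assumption: no request body is sent; the article does not describe one.
async function createInbox(fetchImpl: FetchLike): Promise<Inbox> {
  const data = (await requestJson(fetchImpl, "/email/new", "POST")) as Partial<Inbox> | null;
  if (!data || typeof data.email !== "string" || typeof data.token !== "string") {
    throw new Error("unexpected response shape from /email/new");
  }
  return { email: data.email, token: data.token };
}

async function listMessages(fetchImpl: FetchLike, address: string): Promise<InboxMessage[]> {
  const data = await requestJson(fetchImpl, `/email/${encodeURIComponent(address)}/messages`, "GET");
  return Array.isArray(data) ? (data as InboxMessage[]) : [];
}
```

In a Node 18+ test, the global fetch can be passed as fetchImpl; a stub can stand in for it in unit tests.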
How should magic link flows be tested without losing the browser session?
Do not click the link inside the email body: that opens a second browser context and loses the test session. Instead, let waitForVerificationCode return the email, match the URL with a regex against the returned body field (up to 5000 chars), and pass that URL to page.goto in the same Playwright context. For consume-on-GET links (Auth0 passwordless, Supabase magic link) this is load-bearing: a second GET hits a token that has already been spent, so the link must be visited exactly once, and any retry has to request a fresh email with a fresh token. Assrt's 7-regex cascade is code-focused, so the URL regex is yours to supply.
What is the default poll window and can it be shortened?
Default timeout is 60000 ms, default interval is 3000 ms (email.ts:67 and :82). The wait_for_verification_code tool accepts a timeout_seconds parameter; the interval is fixed at 3 seconds because temp-mail.io rate-limits hard below that. In practice most transactional email providers (Resend, Postmark, SendGrid) deliver in under 8 seconds, so a 60-second window is more than enough; if your provider is slower, raise timeout_seconds to 120 and keep the interval where it is.
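A poll loop with those defaults can be sketched as below. The 60000 ms timeout, 3000 ms interval, and AbortSignal support come from this page; the name pollUntil, the options shape, and the convention that fetchOnce returns null for "nothing yet" are assumptions, not the signature of waitForEmail.

```typescript
interface PollOptions {
  timeoutMs?: number;  // default 60000, per the article
  intervalMs?: number; // default 3000, per the article
  signal?: AbortSignal;
}

// Calls fetchOnce until it returns a non-null value, the deadline passes,
// or the caller aborts. Hypothetical helper, not Assrt's implementation.
async function pollUntil<T>(
  fetchOnce: () => Promise<T | null>,
  { timeoutMs = 60_000, intervalMs = 3_000, signal }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (signal?.aborted) throw new Error("polling aborted");
    const result = await fetchOnce();
    if (result !== null) return result;
    // Give up if the next sleep would end after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`no result within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller would pass a function that lists the inbox and returns the first message or null, for example `pollUntil(() => firstMessageOrNull(inbox), { timeoutMs: 120_000 })`, where firstMessageOrNull is whatever inbox lookup the test already has.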
Does Assrt work with phone/SMS OTP as well, or only email?
Only email for OTP code extraction, today. The wait_for_verification_code tool binds to the DisposableEmail created by create_temp_email. For SMS-based OTP, the http_request tool is the escape hatch: point it at your SMS provider's receive-number API (a 5sim virtual number, a Twilio inbound webhook you tail, etc.), regex the code out of the response body, then run the same clipboard-paste evaluate on the split-digit form. The split-digit fix is provider-agnostic; only the upstream delivery changes.
Why not just generate a Playwright spec and commit it, like Codegen?
You can, and that is the point. Assrt's runner speaks real Playwright via Playwright MCP; a passing #Case can be transcribed into a committable .spec.ts file with the same selectors (runtime-resolved refs become role+name queries) and the same evaluate call for OTP. The differentiator vs $7,500/month vendors is that the output is yours: real Playwright, zero YAML, zero account tokens, self-hosted. If a junior on the team finds the Markdown #Case harder to reason about than code, graduate it to a spec file and run both.
What happens if none of the 7 regexes match?
waitForVerificationCode still returns, with code: "" and the first 5000 characters of body (plain text) included. The agent then sees an empty code and, under the normal Error Recovery rules in the system prompt, can re-snapshot, inspect the email body directly via check_email_inbox, and either retry with a wider pattern or report the scenario as failed with the email body as evidence. In CI, a zero-length code is a hard fail, which is what you want: silently passing with a blank OTP is a much worse outcome than a loud failure.