Disambiguation: medical lab test plus engineering analogy

TSH 3rd generation lab test: a high-sensitivity thyroid assay, briefly and honestly

Direct answer (verified 2026-05-10)

A TSH 3rd generation lab test is a high-sensitivity thyroid stimulating hormone immunoassay with a functional sensitivity around 0.01 mIU/L. It is the same analyte that ARUP, Mayo Clinic Labs, Labcorp, and Quest also list as Sensitive TSH, s-TSH, HS-TSH, or Ultrasensitive TSH. The reference range is the standard adult TSH range (about 0.4 to 4.0 mIU/L per the American Thyroid Association). The third generation does not change that range; it lowers the floor of reliable detection so that a fully suppressed TSH can be distinguished from a low-normal one.

Authoritative sources for the clinical answer: ARUP Laboratories test 0070225, Mayo Clinic Labs test 8939, MedlinePlus. This page is published on a software testing site and is not medical advice. For interpretation, talk to your clinician.

Matthew Diakonov
6 min read

Why this page exists on a software testing site

Two different audiences land on this exact phrase. Patients and clinicians arrive looking for the medical assay; the answer above and the linked authorities are for them. Engineers arrive because “generations of testing sensitivity” is a metaphor that shows up in software QA, and the clinical version is the cleanest example of how the metaphor actually works. Rather than write a padded post that pretends to be one or the other, this page tries to do both jobs in the open: a real direct answer for the medical query, then a small section that draws the analogy back to test automation for the engineering reader.

Nothing below is medical advice. The clinical facts are summarized from public materials by ARUP Laboratories, Mayo Clinic Labs, MedlinePlus, the American Thyroid Association, and Cleveland Clinic. Each fact links to its source so the page is checkable.

What changed between TSH generations

The generation number tracks the lowest TSH value the assay can reliably resolve, which dropped roughly tenfold with each generation. Each generation left the upper end alone and pushed the floor down. The original radioimmunoassay could not distinguish a low-normal TSH from a suppressed one because both fell below its limit of reliable detection; the third-generation immunometric assay can.

Feature | 1st generation TSH | 3rd generation TSH
Functional sensitivity (lower limit of reliable detection) | About 1 to 2 mIU/L | About 0.01 mIU/L
Method | Radioimmunoassay (RIA) | Chemiluminescent immunometric assay (ICMA)
Can it distinguish low-normal from suppressed? | No, both read as 'low' | Yes, by roughly two orders of magnitude
Common clinical use | Largely retired for routine work | Subclinical hyperthyroidism, thyroid cancer follow-up
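
For readers who think in code, the difference between the two columns can be sketched as a reporting rule: anything below the assay floor collapses into a single "below the limit" result. This is illustrative arithmetic only, not a clinical tool. The function name, the sample values, and the choice of 1.0 mIU/L as the first-generation floor (the low end of the "about 1 to 2" figure) are assumptions made for the example.

```python
# Illustrative arithmetic only, not a clinical tool.
# Approximate functional sensitivities in mIU/L, taken from the table above.
# 1.0 is an assumed stand-in for the "about 1 to 2" first-generation figure.
FLOOR_MIU_PER_L = {1: 1.0, 3: 0.01}


def reported(true_value: float, generation: int) -> str:
    """Return what an assay of the given generation could report."""
    floor = FLOOR_MIU_PER_L[generation]
    if true_value < floor:
        # Below the floor the assay cannot resolve the value at all.
        return f"<{floor}"
    return f"{true_value}"


low_normal = 0.5   # hypothetical sample inside the adult range
suppressed = 0.02  # hypothetical fully suppressed sample

for gen in (1, 3):
    print(f"gen {gen}: low-normal reads {reported(low_normal, gen)}, "
          f"suppressed reads {reported(suppressed, gen)}")
```

With the first-generation floor both samples report as the same "<1.0" string, which is the "both read as 'low'" cell in the table. With the third-generation floor they report as two distinct numbers.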

For the formal methodology and reference intervals, see the Paloma Health overview and the British Thyroid Foundation thyroid function tests page.

The engineering analogy: generations of test sensitivity

If you arrived here researching software testing, the medical framing is borrowed but the structure is the same. Each generation of test tooling pushes the floor of detectable regressions lower, while the upper end (catching outright crashes) was already solved in generation one. The interesting question, in both fields, is what the floor looks like and whether the new sensitivity is measuring real signal or amplifying noise.
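
One rough way to put a number on "real signal or amplified noise" is to count real bugs caught per false alarm raised. The sketch below is a minimal illustration under stated assumptions: the `SuiteRun` class is hypothetical, and the counts are invented for the example, not measurements from any real project or tool.

```python
from dataclasses import dataclass


@dataclass
class SuiteRun:
    """Outcome of one test suite over some period (hypothetical structure)."""
    name: str
    real_bugs_caught: int
    false_alarms: int

    def bugs_per_false_alarm(self) -> float:
        # A suite with no false alarms has an unbounded ratio.
        if self.false_alarms == 0:
            return float("inf")
        return self.real_bugs_caught / self.false_alarms


# Invented numbers, for illustration only.
runs = [
    SuiteRun("fixed-selector scripts", real_bugs_caught=12, false_alarms=24),
    SuiteRun("accessibility-tree agent", real_bugs_caught=15, false_alarms=3),
]

for run in runs:
    print(f"{run.name}: {run.bugs_per_false_alarm():.1f} real bugs per false alarm")
```

A higher ratio means more of what the suite reports is signal. A tool that catches more bugs but raises proportionally more false alarms has not actually lowered the floor.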

Generations of testing sensitivity, applied to software

  • Generation 1 in software testing was manual click-through QA. It detects the obvious failures and nothing else.
  • Generation 2 was record-and-replay or hand-written Selenium and Playwright scripts. More sensitive, but brittle: a small DOM change reads as a regression.
  • Generation 3 is agentic test execution against an accessibility tree. Picks up real regressions while ignoring cosmetic DOM churn, the same way a 3rd gen TSH assay reads through noise that earlier generations could not.
  • The clinical metric for sensitivity is mIU/L. The engineering metric is 'real bugs caught per false alarm raised'. Different units, same shape: each generation pushes the floor down.
  1. Generation 1: manual QA. A human clicks through the app, reports what they saw. Catches obvious crashes and visible regressions. Very low sensitivity to anything subtle.
  2. Generation 2: scripted automation. Selenium, Cypress, hand-written Playwright. Fixed selectors. More repeatable, but the floor of detectable regressions is set by how much the DOM is allowed to drift before tests start failing for the wrong reasons.
  3. Generation 3: agentic test execution. The agent re-reads the accessibility tree on each step, re-derives targets, and asserts on observable behavior rather than DOM shape. Fewer false positives from cosmetic DOM churn, and a lower floor for detecting real regressions. This is where Assrt sits.
  4. The diminishing-returns curve looks similar in both fields: each TSH generation drops the floor by about an order of magnitude, each testing generation lowers it by a step that is harder to measure precisely, and each new generation also requires more interpretation skill from the person reading the result.
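
The difference between generation 2 and generation 3 in the list above can be sketched with a toy model. This is not Assrt's API or any real framework's API: the page snapshots are plain dictionaries, and `find_by_css` and `find_by_role` are hypothetical helpers standing in for "fixed selector" and "target re-derived from role and accessible name".

```python
from typing import Optional

Element = dict[str, str]

# Toy page snapshots. "css" stands for DOM shape; "role" and "name" stand
# for what an accessibility tree exposes. All values are made up.
PAGE_V1 = [
    {"css": "btn-primary", "role": "button", "name": "Sign in"},
    {"css": "input-email", "role": "textbox", "name": "Email"},
]
# Cosmetic refactor: class names changed, behavior unchanged.
PAGE_V2 = [
    {"css": "button--main", "role": "button", "name": "Sign in"},
    {"css": "field-email", "role": "textbox", "name": "Email"},
]
# Real regression: the sign-in button is gone.
PAGE_V3 = [
    {"css": "field-email", "role": "textbox", "name": "Email"},
]


def find_by_css(page: list[Element], css: str) -> Optional[Element]:
    """Generation 2 style: a selector fixed when the test was written."""
    return next((el for el in page if el["css"] == css), None)


def find_by_role(page: list[Element], role: str, name: str) -> Optional[Element]:
    """Generation 3 style: re-derive the target from role and accessible name."""
    return next(
        (el for el in page if el["role"] == role and el["name"] == name), None
    )


for label, page in [("v1", PAGE_V1), ("v2 cosmetic", PAGE_V2), ("v3 regression", PAGE_V3)]:
    css_ok = find_by_css(page, "btn-primary") is not None
    role_ok = find_by_role(page, "button", "Sign in") is not None
    print(f"{label}: fixed selector found={css_ok}, role+name found={role_ok}")
```

In this toy model the fixed selector fails on both the cosmetic refactor and the real regression, so it cannot tell them apart. The role-and-name lookup passes the cosmetic change and fails only when the button is actually missing.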

What this page is not

It is not a clinical reference. It does not list reference ranges for pediatric or pregnant patients, does not describe drug interactions, and does not interpret a specific lab result. For any of that, the linked authorities are the right destinations, and a clinician who has your chart is the right interpreter. The rest of this site is about software test automation; if you came for thyroid information, treat the links as the page and treat everything else as context.

Building software, not running a lab? Talk through your testing setup

If you found this page through the engineering analogy and want to see how an open-source agent does third-generation testing on a real app, book a short call.

Frequently asked questions

What is a TSH 3rd generation lab test, in one sentence?

It is a high-sensitivity thyroid stimulating hormone immunoassay with a functional sensitivity in the neighborhood of 0.01 mIU/L, sensitive enough to distinguish a low-normal TSH from a fully suppressed TSH. ARUP Laboratories lists the analyte explicitly as 'Thyroid Stimulating Hormone 3rd Generation' (test 0070225) and Mayo Clinic Labs catalogs it under 'Thyroid-Stimulating Hormone-Sensitive (s-TSH), Serum' (test 8939). For the patient-facing explanation, MedlinePlus is the canonical reference. None of those is this page; this page is published on a software testing site and links you out to the authorities for the clinical answer.

Where is the authoritative medical answer? I want a real source, not a blog.

Three places. ARUP Laboratories test directory entry for Thyroid Stimulating Hormone 3rd Generation, at ltd.aruplab.com/Tests/Pub/0070225, lists the methodology, reference range, and reporting units. Mayo Clinic Labs publishes the equivalent under sensitive TSH (s-TSH) at mayocliniclabs.com/test-catalog/overview/8939. MedlinePlus, run by the U.S. National Library of Medicine, has the patient explanation at medlineplus.gov/lab-tests/tsh-thyroid-stimulating-hormone-test/. For interpretation specific to your case, talk to your clinician; this page is not medical advice.

What is the reference range?

The American Thyroid Association considers a TSH between roughly 0.4 and 4.0 mIU/L to fall inside the adult reference interval, with method-specific and population-specific variation. A 3rd generation assay does not change the reference range itself; it changes how reliably you can read values that fall well below 0.4 mIU/L, which matters for monitoring suppressed states (treated thyroid cancer, certain thyroid hormone replacement regimens). Always read the range printed on your specific lab report, because cutoffs vary by analyzer and lab.

Why does the word 'generation' show up at all? What changed between them?

Each generation lowered the floor of the assay by roughly an order of magnitude. First generation (radioimmunoassay) bottomed out around 1 to 2 mIU/L, which is inside the normal range, so a low result was indistinguishable from a fully suppressed result. Second generation (immunoradiometric, then immunometric chemiluminescent) reached around 0.1 mIU/L. Third generation reaches around 0.01 mIU/L. Some labs now offer a fourth generation closer to 0.001 mIU/L. The progression is purely about lowering the limit of reliable measurement.
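
The progression above is close to a power of ten per generation, which can be written out as a small calculation. The formula is an approximation fitted to the round figures in this answer (about 1, 0.1, 0.01, and 0.001 mIU/L), not a published specification, and the function names are made up for the example. It is illustrative arithmetic, not a clinical tool.

```python
# Approximate adult reference interval quoted on this page, in mIU/L.
REF_LOW, REF_HIGH = 0.4, 4.0


def approx_floor(generation: int) -> float:
    """Order-of-magnitude floor in mIU/L: about 1, 0.1, 0.01, 0.001."""
    return 10.0 ** (1 - generation)


def floor_inside_reference_range(generation: int) -> bool:
    """True when the assay floor itself sits inside the normal range."""
    return REF_LOW <= approx_floor(generation) <= REF_HIGH


for gen in range(1, 5):
    print(f"generation {gen}: floor ~{approx_floor(gen):g} mIU/L, "
          f"inside reference range: {floor_inside_reference_range(gen)}")
```

Only the first generation has a floor that lands inside the reference interval, which is why its low results could not be separated from suppressed ones.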

Why is this article on a software testing site?

Honesty: it is here because the phrase 'generations of testing sensitivity' has a useful meaning in both endocrinology and software testing, and readers from both fields land on this page. For the medical question, the authoritative sources are linked at the top of the page and in this FAQ. The other half of the page draws the analogy back to software test design, where the same idea applies: each generation of test automation lowers the floor of what counts as a detectable regression. If you came for the medical answer, you have it. If you came for the engineering analogy, the section titled "The engineering analogy: generations of test sensitivity" is for you.

What does this have to do with Assrt?

Assrt is an open-source test automation framework that sits in the third generation of software testing tools by the same logic above: it reads a live accessibility tree on every step rather than running against pre-compiled selectors, which lowers the floor of what counts as a real regression and ignores cosmetic DOM churn. That is the only connection. Assrt is not a medical product, does not interpret lab results, and has nothing to say about thyroid health. We disambiguate this on purpose because patients and clinicians searching this term deserve an honest pointer to actual medical authorities.

Is a 3rd generation TSH test the same as a 'sensitive TSH' or 'highly sensitive TSH'?

Yes, they are the same analyte under different marketing names. ARUP, Mayo, Labcorp, and Quest variously call it Thyroid Stimulating Hormone 3rd Generation, Sensitive TSH (s-TSH), HS-TSH, Highly Sensitive TSH, Thyroid Stimulating Hormone Ultrasensitive, and Third-Generation TSH. The functional sensitivity (about 0.01 mIU/L) is what the labels are pointing at. If your order says any of those, expect the same range of clinical use.

Are you claiming clinical authority on thyroid testing?

No. We are an open-source software testing project. The clinical content on this page is a short summary of what ARUP, Mayo Clinic Labs, MedlinePlus, the American Thyroid Association, and Cleveland Clinic publish, with links so you can verify each fact at the source. None of this is medical advice. For interpretation, dosing, or follow-up, talk to a clinician who has your full chart.