Test cases in software testing: the 1998 template you keep being taught, and what is quietly replacing it
Open any guide on this topic and you get the same eight fields in the same order: Test Case ID, Description, Preconditions, Steps, Test Data, Expected Result, Actual Result, Status. That row of fields is from a standard that was withdrawn. Here is what a test case actually looks like when an LLM is the executor, with the exact format from a real open-source runner.
Direct answer (verified 2026-05-08)
A test case is a specific, repeatable check that says: given this setup, do these steps, observe this result. Traditionally it is an eight-field record (ID, description, preconditions, steps, test data, expected result, actual result, status) stored in a test management tool.
Increasingly it is a one-or-two line Markdown paragraph an LLM reads and executes against a real browser, with run history stored as JSON next to the plan. Both shapes carry the same information. The second shape is what an agent in the editing loop can actually act on.
The eight-field template comes from IEEE 829, the Standard for Software and System Test Documentation. The 1998 revision was the one every Excel template copied. IEEE 829-2008 was superseded by ISO/IEC/IEEE 29119 and the 829 standard itself has been withdrawn.
“The IEEE 829 record has eight fields. The Assrt prompt that auto-generates the same case caps it at one or two lines of body, four steps maximum.”
agent.ts:259-267, assrt-ai/assrt-mcp
Where the eight-field template came from
The shape every QA bootcamp teaches is not a law of nature. It is the test case specification of IEEE 829, first published in 1983 and last revised in 2008. The fields were designed for a world where test cases were written by dedicated QA staff, executed by hand against a release candidate, and reported on paper to a release board. The standard expected you to be able to hand the document to a stranger who would execute it identically.
That world had different constraints. Releases were rare, the executor was human, traceability mattered to auditors, and the test case lived in a document control system rather than a repository. The eight-field record solved real problems for that shape of work.
Two things have changed since. The standard was withdrawn (829 went inactive after the IEEE balloting cycle and was succeeded by ISO/IEC/IEEE 29119). And the executor is no longer always a human; for an end-to-end web check, the executor is increasingly a coding agent driving a real browser through Playwright. The template did not adapt with that shift.
The same case, in both shapes
To make the contrast concrete, here is the same login check written first as the IEEE 829 eight-field record (this is the template you find on most QA blogs), then as the Markdown plan a coding agent executes through the Assrt MCP server. They carry the same information. They are not the same artifact.
One login check, two shapes
```text
Test Case ID:     TC-AUTH-001
Description:      Verify successful sign-in with a valid existing email address.
Preconditions:    - User has a verified account.
                  - Browser has no existing session.
Test Steps:       1. Navigate to /signin
                  2. Enter email address
                  3. Click "Continue"
                  4. Enter password
                  5. Click "Sign in"
Test Data:        email=test@example.com
                  password=********
Expected Result:  User is redirected to /dashboard within 2s.
Actual Result:    <to be filled at run>
Status:           <Pass / Fail / Blocked>
```
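And the same check as the Markdown plan the runner reads. The body compresses the steps and inlines the data, matching the one-case form shown in the FAQ below:

```markdown
#Case 1: Sign in with valid email
Navigate to /signin, enter test@example.com, click "Continue", enter the
password, click "Sign in", verify the browser lands on /dashboard.
```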
The eight-field record was written for a human reading a document. The Markdown plan was written for an LLM reading a file, and incidentally a human reading the same file in the same repo. The LLM does not need a separate Test Case ID column (the headers in the file already identify it), it does not need a separate Test Data column (data goes inline), and it does not need a Status column (status is the JSON the runner emits next to the plan).
The Markdown form is not a simplification for amateurs. It is a response to a different executor. When the executor is a model, fewer fields make for a more readable case. The fields you remove are not information you lose; they are information that moved.
The exact format, from the source
Here is the prompt the Assrt MCP server hands to the model when it is asked to discover test cases on a fresh page. Reading it tells you more about modern test-case shape than any blog template. From src/core/agent.ts, lines 256 through 267:
DISCOVERY_SYSTEM_PROMPT (verbatim)
```text
You are a QA engineer generating quick test cases for an AI browser agent
that just landed on a new page. The agent can click, type, scroll, and
verify visible text.

## Output Format

#Case 1: [short name]
[1-2 lines: what to click/type and what to verify]

## Rules
- Generate only 1-2 cases
- Each case must be completable in 3-4 actions max
- Reference ACTUAL buttons/links/inputs visible on the page
- Do NOT generate login/signup cases
- Do NOT generate cases about CSS, responsive layout, or performance
```
Every line here is a design decision. One or two lines of body per case stops the model from writing a 12-step Excel macro masquerading as a test. Three to four actions max keeps each case independently runnable, which is what kills test order dependency. Reference actual buttons stops the model from inventing element names that do not exist. The two explicit forbids (no login/signup, no CSS/responsive/performance) are there because those are the cases models love to generate and that always rot first.
Notice what is not in the prompt. There is no Test Case ID field, no Preconditions field, no Test Data field, no Expected Result column. Those have been collapsed. The case header carries the ID. The body carries the steps and the assertion. Preconditions live as state the agent reads from /tmp/assrt/scenario.json. Expected and actual results live as JSON in /tmp/assrt/results/latest.json alongside the plan.
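That metadata file is small. For illustration, here is the kind of content it holds, using the four fields named in the file layout later in this piece; the values here are invented, not taken from the repo:

```json
{
  "id": "signin-flow",
  "name": "Sign in with valid email",
  "url": "http://localhost:3000/signin",
  "updatedAt": "2026-05-08T10:14:22Z"
}
```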
What changed when the executor changed
The QA-team-as-document-reader era is not gone. It still exists in regulated industries and in any product where a human tester is the right tool for the job. What changed is that for ordinary web app E2E checks, a second valid shape now exists: the test case as something an LLM reads and executes. Here is how the two reader perspectives compare.
Who is the test case written for?
Document era. A new QA hire reads a printed test case specification on Monday morning. They have to be able to execute it identically to whoever wrote it last release. Every field exists because losing context kills repeatability.

- Writer expects a human reader
- Eight fields, every column required
- Stored in a test management tool, separate from the code
- Run by hand, status entered by hand
- Audit trail is the file itself

Agent era. A coding agent reads the Markdown plan in the same repo it is editing. It has to be able to execute the case against a real browser and emit a verdict the next tool call can act on. Every field that could move into state or output has moved.

- Writer expects an LLM reader, and incidentally a human
- One or two lines of body per case, three to four actions
- Stored as plain text where the agent can already reach
- Run through Playwright, status emitted as JSON
- Audit trail is the per-run JSON next to the plan
A team that ships a regulated medical device probably stays in the document era for the cases that touch the regulated path. A team shipping a SaaS dashboard with twenty deploys a week probably moves the bulk of its E2E coverage to the agent-era shape and keeps the heavy template only where it earns its weight.
Types of test cases, briefly
Most lists of test case types are flat enumerations of about a dozen overlapping categories. The categories overlap on purpose, because a single E2E check on a sign-up form is functional, integration, regression, and (when it tests a bad email) negative all at once. The split that actually matters in practice is what you are asking the case to prove, not which taxonomy bucket it lands in.
| Type | Modern E2E case | What it is asking to prove |
|---|---|---|
| Functional | The form submits and lands on the right route | One specific step in a user flow does the right thing |
| Integration | After signing up, the welcome email actually arrived | Two or more services agree on a contract under load |
| Regression | The full sign-in flow that shipped last sprint still passes | Behavior that worked yesterday still works today |
| Smoke / sanity | The home page renders, the login form shows, no 500 on the API root | The build is not catastrophically broken |
| Negative | Submitting an invalid email shows an error, not a 500 and a wiped form | The system fails in the way you expect, not in a worse way |
| Acceptance | A new user can sign up and reach the dashboard | The user goal in the story is actually met |
| Performance | The dashboard renders within 2s on a throttled connection | The action stays inside a budget under realistic conditions |
| Visual | Pixel diff against a baseline screenshot stays under threshold | What the user sees has not silently drifted |
When you write a case, the question to ask is not which type it is. The question is what specifically is this case trying to prove and would a passing run actually prove it. Most of the test cases that rot do so because they were never trying to prove anything specific in the first place; the writer was producing rows in a tracker, not assertions.
A practical heuristic for writing one
Whether you write your cases in IEEE 829 form for a regulator or in agent-era Markdown for your runner, the underlying discipline is the same. Start from the user goal, not the field. Pick the smallest sequence of actions a real user would do to reach that goal. Pick one concrete observation that distinguishes success from failure (not "works correctly": a specific element on a specific URL). Stop.
Three rules of thumb keep this honest. First, if the case is longer than five steps, it is two cases. Second, if the case relies on data that another case created, you have a test ordering problem; isolate it. Third, if the case asserts on an implementation detail (an internal class name, a route the user never sees), it will rot the moment that detail moves. Assert on what the user sees.
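To see the third rule concretely, compare two versions of the same case. This is an illustrative pair, not a case from the Assrt repo:

```markdown
#Case 2: Search returns results (will rot)
Type "report" in the search box, verify an element with class
.results-list__item--active appears.

#Case 2: Search returns results (durable)
Type "report" in the search box, verify at least one result row
containing the text "report" is visible.
```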
The Assrt discovery prompt enforces all three rules mechanically. The 3-4 action cap stops the model from writing a five-step case. Each generated case is self-contained and never references state from another case. And the forbid-list (no CSS, no responsive layout, no performance) blocks the implementation-detail traps the model loves the most. The rules are useful even if you never use Assrt; they are good rules for any author.
Where the test cases live
One quiet consequence of the executor changing is that the question "where are the test cases stored" gets a new answer. The eight-field record assumed a test management tool: TestRail, Zephyr, Xray, an Excel sheet on a shared drive, sometimes a dedicated database. The agent-era shape puts the cases in plain text where the agent can already reach.
File layout (verified in scenario-files.ts:16-20)
- /tmp/assrt/scenario.md: the plan, plain Markdown with #Case N: headers. The agent can Read this. If the agent edits it, an fs.watch on the file syncs the change back to the central scenario store within a 1-second debounce.
- /tmp/assrt/scenario.json: metadata (id, name, url, updatedAt). The agent uses this to know which scenario it is acting on across turns.
- /tmp/assrt/results/latest.json: the most recent run, with per-case pass/fail, assertions, error strings, and the path to the recorded video. This is where Expected Result and Actual Result moved.
- /tmp/assrt/results/<runId>.json: per-run history. The audit trail the eight-field template gave you, in a format the agent and CI can both read.
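To make "Expected Result and Actual Result moved into JSON" concrete, here is a sketch of what a latest.json run record could look like, built from the fields described above. The exact key names are assumptions, not the verified schema:

```json
{
  "runId": "2026-05-08-101422",
  "scenario": "signin-flow",
  "cases": [
    {
      "name": "Sign in with valid email",
      "status": "pass",
      "assertions": ["dashboard rendered at /dashboard"],
      "error": null,
      "video": "/tmp/assrt/results/videos/2026-05-08-101422-case-1.webm"
    }
  ]
}
```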
The whole point of the layout is that the cases do not live inside a vendor. The plan is a Markdown file, the runner is MIT-licensed, and the engine under the hood is Playwright. If you stop using Assrt tomorrow, the scenarios are still readable, the videos still play, and the format is still something a human can edit.
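The sync mechanic in the layout above (an fs.watch with a 1-second debounce) is small enough to sketch. This is a minimal illustration of the pattern under stated assumptions, not the project's actual code; syncToStore is a hypothetical callback:

```typescript
import { watch } from "node:fs";

// Watch the plan file and push edits back to the scenario store,
// debounced so a burst of writes produces a single sync.
function watchScenario(path: string, syncToStore: () => void): void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  watch(path, () => {
    clearTimeout(timer);
    timer = setTimeout(syncToStore, 1000); // 1-second debounce
  });
}
```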
What you should still take from the 1998 template
None of this is an argument that the eight-field template was wrong. It is an argument that it was a presentation layer for a specific reader, and that the reader has changed for most teams. The discipline behind the eight fields is still load-bearing:
- Preconditions matter. They moved into the scenario metadata file and the agent's session state, but you still have to think about them, or your cases will be flaky on a clean machine.
- Expected Result is a commitment. If you cannot describe the success state before the run, the run will lie to you. The Markdown body still has to end with a verifiable observation.
- Status has to be machine readable. The pass/fail decision belongs in JSON, not in a column the human fills in. The 829 form had this idea right in spirit; it just put status in the wrong place.
- Traceability is real. For audit-bound work, you still want a stable ID per case and a way to point at runs. In the agent-era shape, the case header gives you the ID and the per-run JSON gives you the trace.
The skill of writing a good test case is not template-shaped. It is the same in both eras: pick a user goal, write the smallest set of actions that can prove it, assert on what the user sees, and put the result somewhere the next reader can act on. The eight-field record was a vehicle for that skill. So is the Markdown plan.
Want this wired into your editing loop?
Twenty minutes, a real screen-share. Bring your repo, walk away with #Case-format scenarios running against your local dev server.
Frequently asked questions
What is a test case in software testing, in one sentence?
A test case is a specific, repeatable check that says given this setup, do these steps, and observe this result. The traditional form is an eight-field record (ID, description, preconditions, steps, test data, expected result, actual result, status) stored in a test management tool. The agent-era form is a one-or-two line Markdown paragraph an LLM can read and execute against a real browser, with the run history stored as JSON next to it.
What are the standard fields of a test case?
The eight fields every blog post lists (Test Case ID, Description, Preconditions, Test Steps, Test Data, Expected Result, Actual Result, Status) come from IEEE 829, the Standard for Software Test Documentation. The 1998 revision is what most templates copied. The 2008 revision was superseded by ISO/IEC/IEEE 29119, and IEEE 829 itself was administratively withdrawn. The fields are still useful as a thinking checklist; treating them as a hard schema is dated.
What are the main types of test cases?
By technique, the common categories are functional, integration, system, regression, smoke, sanity, acceptance, performance, security, accessibility, usability, and exploratory. By outcome, you split into positive (valid input, expected to pass) and negative (invalid input, expected to fail gracefully). The categories overlap on purpose. A single end-to-end check on a sign-up form can be functional, integration, regression, and negative in the same run.
What is the difference between a test case and a test scenario?
A test scenario is the user goal you care about, written at the level a product owner can read ("new user can sign up and reach the dashboard"). A test case is a concrete, executable check inside that scenario, with steps and an expected result. One scenario usually breaks into several cases. In agent-era runners, the line blurs: the scenario file is a Markdown plan with multiple #Case headers, and each header is one case.
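For instance, a scenario file covering the sign-up goal could hold two cases under one plan. The content here is illustrative, not taken from the repo:

```markdown
#Case 1: Sign up with a new email
Click "Get started", type new-user@example.com, click "Continue",
verify the dashboard renders.

#Case 2: Reject an invalid email
Click "Get started", type "not-an-email", click "Continue",
verify an inline validation error is visible.
```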
How do you actually write a test case for a web app today?
Pick one user goal. Write the precondition ("on the home page, logged out"). Write the steps in the user's voice, not the developer's ("click Get started, type test@example.com, click Continue"). Write what you expect to see ("the dashboard renders with the welcome card visible"). Stop. If the case takes more than five steps, split it. The Assrt prompt that auto-generates cases at src/core/agent.ts:259 forces a one-or-two line body for exactly this reason.
Should I still keep test cases in Excel or a TestRail-style tool?
Keep a tool of record if your team needs traceability for audit or compliance. For day-to-day execution, the spreadsheet is dead weight; nobody opens it after the first sprint. The pragmatic shape is: scenarios live as plain text in the repo (or in /tmp/assrt/scenario.md if you are running Assrt), the tool of record holds a link to the latest run, and the run history is JSON the agent or CI can read. The eight-field row is a presentation layer, not a storage layer.
What is a good example of a test case for a login flow?
Plain Markdown form: "#Case 1: Sign in with valid email. Click Sign in, enter test@example.com, click Continue, verify the dashboard renders." That is the entire case. The IEEE 829 form would inflate this to eight fields and a fifteen-row table. The information content is the same. The Markdown form is what the LLM driving the browser actually uses, and what the human reviewer can read on the same screen as the diff.
What makes a test case bad?
Three failure modes. First, it tests an implementation detail (a CSS class name, a query selector) that will rot the moment the UI is refactored. Second, it depends on data that another case created (test order matters, which means flaky). Third, it asserts nothing concrete ("works correctly" is not an assertion). The Assrt discovery prompt explicitly forbids two of these failure modes: it refuses to generate cases about CSS, responsive layout, or performance, and it caps the steps at three to four to keep cases self-contained.
Where can I see a real test case in a real codebase?
Read src/core/agent.ts in the assrt-ai/assrt-mcp repo on GitHub. Lines 256 through 267 are the discovery system prompt that prescribes the format. Line 621 is the parser regex that accepts "Scenario", "Test", or "Case" as the header keyword. The scenario file path (/tmp/assrt/scenario.md) and the run output paths (/tmp/assrt/results/latest.json, /tmp/assrt/results/<runId>.json) are defined in src/core/scenario-files.ts. The whole format is small enough to read in one sitting.
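The header parse is easy to approximate. A sketch of a regex that accepts the three keywords; this is an approximation of the idea, not the verbatim line-621 expression:

```typescript
// Matches headers like "#Case 1: Sign in with valid email",
// also accepting "Scenario" or "Test" as the keyword.
const CASE_HEADER = /^#\s*(?:Scenario|Test|Case)\s*(\d+)\s*:\s*(.+)$/im;

const match = "#Case 1: Sign in with valid email".match(CASE_HEADER);
// match?.[1] === "1", match?.[2] === "Sign in with valid email"
```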
Keep reading
Test coverage during agentic coding: put the runner inside the loop
What test coverage means when a coding agent is editing in a tool loop, and the MCP integration that closes the gap CI cannot.
AI test case generation from requirements
Converting product requirements into executable Playwright cases, while keeping the generated code consistent with the team's existing patterns.
Manual QA test case discovery: a systematic guide
Boundary analysis, equivalence partitioning, decision tables, and risk-based prioritization for the cases an automation pass cannot find.