Structured output: when you need JSON, not prose

Most of what a model produces is for a human to read, and prose is the right shape for that. This page is about the other case: when the output feeds a system — a parser, a Flow, a database write, the next step in a pipeline — and a human never sees it. There the shape is not a stylistic choice; it is a contract. The system on the far side expects exactly these fields, these types, this structure, and it will break on anything else. When that is the job, you do not want prose you regex-scrape for the answer. You want a reliable shape.

This is a reference for getting that shape reliably. It pays off a thread that two prompting gotchas open, fragile format and prose-parsing (see prompting gotchas, gotchas 2 and 7), and it builds directly on the schema discipline already laid out for agents in tools and actions. Where that page covers the tool-calling mechanic in full, this one points at it rather than re-teaching it.

The problem: prose is not a contract

A prompt that says "respond with the fields below" and returns the right shape in ten test runs is not producing that shape reliably — it is producing it so far. Free-form generation is non-deterministic by nature. On the eleventh run the model opens with a friendly preamble, wraps the JSON in a code fence, reorders the fields, or renames one, and the parser that assumed the demo's exact shape chokes on the first input you didn't try. You hardened it with a regex; the model phrased the next one slightly differently and the regex missed. This is the brittle middle layer gotcha 7 names: a hand-built parser fighting a moving target, when the burden of producing a valid shape belongs on the model and the API contract instead.
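A small illustration of that failure, as a sketch rather than a recommendation. The regex below is tuned to the exact shape seen in testing, and the second string shows the kind of drift described above (a preamble, a code fence, reordered fields). Both sample strings and the scrape_queue helper are invented for this example.

```python
import json
import re
from typing import Optional

# Hypothetical scraper, tuned to the exact shape seen in the first test runs.
QUEUE_RE = re.compile(r'^\{"queue": "(\w+)"')


def scrape_queue(text: str) -> Optional[str]:
    """Return the queue name if the text matches the demo shape, else None."""
    match = QUEUE_RE.match(text)
    return match.group(1) if match else None


FENCE = "`" * 3  # built this way so the sample string can contain a code fence

# Invented sample outputs: same answer, different wrapping.
demo_output = '{"queue": "billing", "urgency": "high"}'
drifted_output = (
    "Sure! Here is the classification:\n"
    f"{FENCE}json\n"
    '{"urgency": "high", "queue": "billing"}\n'
    f"{FENCE}"
)

print(scrape_queue(demo_output))     # the shape the regex was written for
print(scrape_queue(drifted_output))  # the drifted shape: the regex finds nothing

try:
    json.loads(drifted_output)
except json.JSONDecodeError as exc:
    # A plain JSON parse fails too, because of the preamble and the fence.
    print("json.loads also fails:", exc.msg)
```

The answer in both strings is identical; only the wrapping changed. That is the sense in which the parser is fighting a moving target.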

The reframe that fixes it is to stop treating the output shape as something you hope for and start treating it as something you specify and enforce. Three paths do that, listed here from the strongest guarantee to the weakest. They are complementary, not a ladder: you pick the one whose guarantee matches what the downstream system actually needs.

Path 1: Structured Outputs — guaranteed schema compliance

When you need valid JSON that conforms to a schema and you cannot tolerate the occasional malformed response, this is the purpose-built path. Structured Outputs is Claude's feature for guaranteed schema adherence: you hand the API a JSON Schema, and constrained decoding restricts the response to output that conforms to it. No preamble, no fence, no missing field, no JSON.parse() surprise. The guarantee is the point: the shape is enforced at generation time, not checked afterward and retried.

It is generally available on current Claude models — Opus 4.8, 4.7, 4.6, and 4.5, Sonnet 4.6 and 4.5, and Haiku 4.5 — through the Claude API. (The feature first shipped behind the beta header structured-outputs-2025-11-13; that header still works for a transition period, but it is no longer required now that the feature is GA.) The feature has two complementary halves: a JSON output format that constrains the whole response to your schema, and strict tool use (strict: true) that enforces the same schema validation on a tool's inputs. Reach for this path when the downstream system needs a hard guarantee — a write that cannot fail on a malformed field, a contract a parser must never have to defend against.

Path 2: Tool calling for structure — the typed arguments are the output

You can also get structure out the side door. Define a tool with a typed input schema and let the model "call" it; the arguments the model fills in are your structured output. You are not actually executing an action — you are using the tool definition as a shape the model has to produce. This is the same schema discipline tools and actions lays out in full, used for a different end: there the typed call is how an agent acts; here it is how you extract a reliable shape.

The lever that does the most work is the same one that page highlights: a typed schema, and an enum in particular, makes invalid output hard to express. For a classification task — route this ticket to one of four queues, tag this message with one of a fixed set of labels — an enum field means the model cannot return a category that does not exist. The constraint lives in the schema, not in a polite instruction the model can drift from.

{
  "name": "classify_ticket",
  "description": "Classify ONE incoming support ticket. Return the queue it routes to and the urgency. Use exactly the allowed values; do not invent new ones.",
  "input_schema": {
    "type": "object",
    "properties": {
      "queue": {
        "type": "string",
        "enum": ["billing", "technical", "account", "other"],
        "description": "Which queue this ticket routes to."
      },
      "urgency": {
        "type": "string",
        "enum": ["low", "normal", "high"],
        "description": "How urgent the ticket is."
      },
      "summary": {
        "type": "string",
        "description": "A one-sentence summary of the ticket."
      }
    },
    "required": ["queue", "urgency", "summary"]
  }
}

The enum fields tell the model exactly which values are allowed, and required tells it that none of the fields may be left out. On its own, a tool schema makes invalid output unlikely rather than impossible. When you pair it with strict tool use (path 1's second half), that schema is enforced, not merely requested, and the two paths meet there. See tools and actions for the full tool-calling mechanic, the safe-tool rules, and where this fits in an agent.
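A rough sketch of how the typed arguments come back to your code. It assumes the Messages API response shape in which a tool call arrives as a content block of type tool_use whose input field holds the arguments, and a tool_choice value that forces the named tool. The response below is a hand-written stand-in, not real model output. The extract_tool_input helper is hypothetical, and both the placement of strict on the tool and the additionalProperties line are assumptions to check against the current API reference.

```python
from typing import Any, Dict

# The tool definition above as a Python dict, with strict mode switched on.
CLASSIFY_TICKET_TOOL: Dict[str, Any] = {
    "name": "classify_ticket",
    "description": (
        "Classify ONE incoming support ticket. Return the queue it routes to "
        "and the urgency. Use exactly the allowed values; do not invent new ones."
    ),
    "strict": True,  # assumption: strict is set on the tool itself
    "input_schema": {
        "type": "object",
        "properties": {
            "queue": {
                "type": "string",
                "enum": ["billing", "technical", "account", "other"],
            },
            "urgency": {"type": "string", "enum": ["low", "normal", "high"]},
            "summary": {"type": "string"},
        },
        "required": ["queue", "urgency", "summary"],
        "additionalProperties": False,  # assumption: expected in strict mode
    },
}

# Forcing the tool means the model answers through it rather than in prose.
TOOL_CHOICE = {"type": "tool", "name": "classify_ticket"}


def extract_tool_input(response: Dict[str, Any], tool_name: str) -> Dict[str, Any]:
    """Return the arguments of the first tool_use block for tool_name."""
    for block in response.get("content", []):
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block["input"]
    raise ValueError(f"response contains no tool_use block for {tool_name!r}")


# Hand-written stand-in for a parsed API response (not real model output).
fake_response = {
    "stop_reason": "tool_use",
    "content": [
        {"type": "text", "text": "Routing this ticket."},
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "classify_ticket",
            "input": {
                "queue": "billing",
                "urgency": "high",
                "summary": "Customer was charged twice for one order.",
            },
        },
    ],
}

args = extract_tool_input(fake_response, "classify_ticket")
print(args["queue"], args["urgency"])
```

Nothing is executed on the strength of the call. The arguments dict is the structured output, and it still goes through validation before anything downstream sees it.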

Path 3: System-prompt instructions plus a precise format spec

When you need flexibility a strict schema does not give — a templated document, an XML shape, a format the JSON-Schema path does not cleanly express — you fall back to instructing the format precisely in the system prompt. This is the least-guaranteed path, so the precision is the whole job: define every element the output must contain, name each field, show the exact structure, and leave nothing to the model's discretion. "Respond in JSON" is not a spec; a worked example of the exact shape, every field labeled, with the rule that nothing precedes or follows it, is. The more precisely you pin the format, the closer this path's reliability gets to the enforced ones — but it never reaches a guarantee, which is exactly why this is the third choice, not the first.

This is also where the discipline from system prompts and instructions lands: a format spec is a hard constraint, so it earns structural salience — its own section, stated once and clearly — rather than a sentence buried mid-paragraph where the model under-weights it.
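One way a precise spec and its matching check might look, as a sketch under stated assumptions. The incident-report domain, the element names, the SYSTEM_PROMPT wording, and the parse_incident helper are all hypothetical. The point is that the spec sits in its own section, shows the exact shape, and is paired with a parser that rejects anything outside it.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical system prompt: the format spec gets its own labeled section.
SYSTEM_PROMPT = """\
You summarize incident reports.

OUTPUT FORMAT (hard constraint)
Return exactly one <incident> element and nothing before or after it.
It must contain exactly these child elements, in this order:
  <severity>  one of: low, medium, high
  <service>   the affected service name
  <summary>   one sentence
Example of the exact shape:
<incident><severity>high</severity><service>checkout</service><summary>Payments failed for 12 minutes.</summary></incident>
"""

EXPECTED_CHILDREN = ["severity", "service", "summary"]
ALLOWED_SEVERITIES = {"low", "medium", "high"}


def parse_incident(text: str) -> Dict[str, str]:
    """Parse the model output; raise ValueError on anything outside the spec."""
    try:
        # Text before or after the single root element makes this fail.
        root = ET.fromstring(text.strip())
    except ET.ParseError as exc:
        raise ValueError(f"not a single well-formed XML element: {exc}") from exc
    if root.tag != "incident":
        raise ValueError(f"unexpected root element: {root.tag!r}")
    tags = [child.tag for child in root]
    if tags != EXPECTED_CHILDREN:
        raise ValueError(f"unexpected child elements: {tags}")
    fields = {child.tag: (child.text or "").strip() for child in root}
    if fields["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity out of range: {fields['severity']!r}")
    if not fields["service"] or not fields["summary"]:
        raise ValueError("service and summary must be non-empty")
    return fields


good = (
    "<incident><severity>high</severity><service>checkout</service>"
    "<summary>Payments failed for 12 minutes.</summary></incident>"
)
print(parse_incident(good))
```

Because this path carries no guarantee, the parser is deliberately strict: a friendly preamble in front of the element is treated as a failure, not quietly trimmed away.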

The discipline regardless of path: validate, then fall back

No path removes the two rules that make structured output safe to depend on, and they hold whether you used the guaranteed feature or the loosest format spec.

Validate every output before you use it. Schema-validate the shape against what the downstream system actually requires — the right fields, the right types, the values in range — before a single byte reaches that system. Even with a feature that guarantees schema conformance, validation is where the model's output meets your business rules, which the schema does not encode: an enum can guarantee a value is one of four queues without guaranteeing it is the correct one for this ticket, and a string field can be schema-valid and still empty. Never trust the shape blind; the check is cheap and the bad write it prevents is not.
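A minimal sketch of that check for the ticket classification used earlier, written with the standard library only. The validate_classification name and the 300-character summary limit are hypothetical; a real pipeline would encode its own business rules, and might use a schema-validation library for the shape checks.

```python
from typing import Any, List

ALLOWED_QUEUES = {"billing", "technical", "account", "other"}
ALLOWED_URGENCIES = {"low", "normal", "high"}
EXPECTED_FIELDS = {"queue", "urgency", "summary"}
MAX_SUMMARY_CHARS = 300  # hypothetical business rule, not part of the schema


def validate_classification(data: Any) -> List[str]:
    """Return a list of problems; an empty list means the output is usable."""
    if not isinstance(data, dict):
        return ["output is not an object"]

    problems: List[str] = []
    present = set(data)

    # Shape checks: the right fields and nothing else.
    missing = EXPECTED_FIELDS - present
    extra = present - EXPECTED_FIELDS
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected fields: {sorted(map(str, extra))}")

    # Type and range checks on each field that is present.
    queue = data.get("queue")
    if "queue" in data and not (isinstance(queue, str) and queue in ALLOWED_QUEUES):
        problems.append(f"queue not allowed: {queue!r}")
    urgency = data.get("urgency")
    if "urgency" in data and not (
        isinstance(urgency, str) and urgency in ALLOWED_URGENCIES
    ):
        problems.append(f"urgency not allowed: {urgency!r}")

    # Business rules the schema does not encode.
    summary = data.get("summary")
    if "summary" in data:
        if not isinstance(summary, str) or not summary.strip():
            problems.append("summary is empty or not a string")
        elif len(summary) > MAX_SUMMARY_CHARS:
            problems.append("summary is too long")

    return problems


schema_valid_but_empty = {"queue": "billing", "urgency": "high", "summary": "   "}
print(validate_classification(schema_valid_but_empty))
```

The sample passes every shape check and still fails validation on its whitespace-only summary, which is the gap between schema conformance and business rules described above.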

Have a deterministic fallback when validation fails. This is principle 8 — non-determinism needs a gate where it is customer-facing. When validation fails, decide deliberately what happens: retry, route to a human, return a known-safe default, or stop. What you never do is let a model failure render as a blank, an error string, or a hallucinated value in the system downstream. A boring deterministic fallback beats an exciting wrong answer every time, and the failure path is something you design and test, not something you discover in production the first time the model returns a shape you didn't plan for. This is also principle 11 — trace it: log the raw output, the validation result, and which branch fired, so a silent shape failure is something you can replay instead of guess at.
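The retry-then-default branch could be sketched roughly as follows. Everything here is illustrative: call_model is a hypothetical callable standing in for whichever path produced the output, is_valid is a compressed stand-in for a fuller validator, and SAFE_DEFAULT is one possible known-safe value (route to a human) rather than a prescribed one.

```python
import json
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("structured_output")

ALLOWED_QUEUES = {"billing", "technical", "account", "other"}
ALLOWED_URGENCIES = {"low", "normal", "high"}

# Hypothetical known-safe default: park the ticket for a human to look at.
SAFE_DEFAULT: Dict[str, Any] = {
    "queue": "other",
    "urgency": "normal",
    "summary": "Unclassified: needs human review.",
    "needs_human_review": True,
}


def is_valid(data: Any) -> bool:
    """Compressed stand-in for the full validation step."""
    return (
        isinstance(data, dict)
        and set(data) == {"queue", "urgency", "summary"}
        and data["queue"] in ALLOWED_QUEUES
        and data["urgency"] in ALLOWED_URGENCIES
        and isinstance(data["summary"], str)
        and bool(data["summary"].strip())
    )


def classify_with_fallback(
    call_model: Callable[[str], str], ticket: str, max_attempts: int = 2
) -> Dict[str, Any]:
    """Return validated model output, or SAFE_DEFAULT after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        raw = call_model(ticket)
        try:
            data = json.loads(raw)
        except (json.JSONDecodeError, TypeError):
            data = None
        ok = is_valid(data)
        # Trace the raw output and the validation result for later replay.
        logger.info("attempt=%d valid=%s raw=%r", attempt, ok, raw)
        if ok:
            logger.info("branch=model_output")
            return {**data, "needs_human_review": False}
    logger.warning("branch=safe_default attempts=%d", max_attempts)
    return dict(SAFE_DEFAULT)


# Usage with a fake model that fails once, then returns a valid shape.
outputs = iter(
    ["not json", '{"queue": "billing", "urgency": "high", "summary": "Charged twice."}']
)
print(classify_with_fallback(lambda _ticket: next(outputs), "I was charged twice"))
```

Each attempt logs the raw output, the validation result, and which branch fired, so the failure path can be replayed. The caller always receives a value the downstream system can safely write, with a flag saying whether a human needs to look at it.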

The discipline, restated

When the output feeds a system, the shape is a contract, and the reliable way to honor a contract is to enforce it rather than hope for it. Reach for the strongest guarantee the job allows: Structured Outputs for a hard schema, a tool schema for typed arguments, a precise format spec only when you need flexibility the schema can't give. Do not reach for prefill (seeding the start of the assistant turn, for example with an opening brace, to steer the format), which the newer models no longer support. Then, on every path, validate the output against your real requirements and have a deterministic fallback ready for when validation fails. Get that right and a downstream system can depend on what the model hands it. Get it wrong (trust prose, skip the validation, leave no fallback) and you have built a pipeline that runs clean for a week and breaks confidently on a Tuesday, on an output that was almost right.

Prompting gotchas — fragile format and prose-parsing, the failures this page fixes (gotchas 2 and 7)
System prompts and instructions — where a format spec earns its salience
What is context engineering — the discipline structured output sits inside
Tools and actions — the full tool-calling and schema mechanic this page builds on
Grounding gotchas — citations as a structured-output discipline on the grounding side
Prompting Style Guide — the bar a prompt clears before it ships
Debugging prompts — isolate the variable, then fix
AI Engineering principles — non-determinism needs a gate (8), trace everything (11)

Reference: