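Structured Output: Getting AI to Return JSON Reliably

This is part 08 of the "AI Deep Dive" series. Previous: Intent Recognition: From Keyword Matching to Semantic Routing | Next: AI Caching Strategies: Exact Cache + Semantic Cache, Cheaper and Faster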

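What you'll get from this article

AI returns free text. Ask it for JSON and it complies most of the time, but every so often one of these happens:

It adds a line like "Sure, here is the result in JSON format:" before the actual JSON
A field is missing from the JSON, or there is an extra comma
A number comes back as a string: "price": "99" instead of "price": 99
It returns a big chunk of Markdown that isn't JSON at all

None of this matters in a demo. In production, one failed JSON.parse() on the frontend leaves the whole page blank.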
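The failure modes listed above are easy to reproduce offline. The sketch below feeds four hypothetical model outputs (invented for illustration, not captured from a real API call) through json.loads plus a type check on price, and records how each one fails.

```python
import json

# Hypothetical model outputs, one per failure mode described above
samples = {
    "preamble": 'Sure, here is the result in JSON format:\n{"name": "Nike running shoes", "price": 800}',
    "trailing_comma": '{"name": "Nike running shoes", "price": 800,}',
    "string_number": '{"name": "Nike running shoes", "price": "800"}',
    "markdown": "## Product\n- **Name**: Nike running shoes",
}


def classify(raw: str) -> str:
    """Return 'ok', 'invalid_json', or 'wrong_type' for one raw model output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "invalid_json"
    # Valid JSON is not enough: price must be a real number, not a string
    if not isinstance(data, dict) or not isinstance(data.get("price"), (int, float)):
        return "wrong_type"
    return "ok"


report = {label: classify(raw) for label, raw in samples.items()}
print(report)
```

Three of the four samples never get past the parser, and the fourth parses but carries the wrong type, which is why the rest of this article treats parsing and validation as separate steps.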

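This article covers three ways to get structured data out of an AI reliably, plus a validation and fault-tolerance layer:

JSON Mode: the simplest, tell the AI to "return JSON only"
Structured Outputs: give the AI a JSON Schema and force the output to match it
Function Calling: define a function signature and let the AI fill in the arguments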

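Approach 1: JSON Mode

The lowest-barrier approach. It uses the response_format parameter that OpenAI introduced in late 2023. Note that JSON mode requires the word "JSON" to appear somewhere in the messages, which the system prompt below takes care of: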

import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": 'Extract product information from the user description. Return JSON in this format: {"name": "product name", "price": number, "category": "category"}',
        },
        {"role": "user", "content": "I want to buy a pair of Nike running shoes, budget 800 yuan"},
    ],
    response_format={"type": "json_object"},
    temperature=0,
)

result = json.loads(response.choices[0].message.content)
# e.g. {"name": "Nike running shoes", "price": 800, "category": "sneakers"}

JS/TS version

import OpenAI from 'openai';

const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    {
      role: 'system',
      content: 'Extract product information from the user description. Return JSON: {"name": "product name", "price": number, "category": "category"}',
    },
    { role: 'user', content: 'I want to buy a pair of Nike running shoes, budget 800 yuan' },
  ],
  response_format: { type: 'json_object' },
  temperature: 0,
});

const result = JSON.parse(response.choices[0].message.content!);

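Problems with JSON Mode

No schema guarantee: the AI does return JSON, but field names may differ from what you expect (product_name instead of name, for example)
No type guarantee: you ask for a number and may get a string
Nested structures go wrong easily: objects inside arrays inside objects, and the output drifts further with each level

JSON Mode only guarantees that the output is valid JSON. It does not guarantee that the JSON has the structure you want.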
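To make the gap between "valid JSON" and "the JSON you wanted" concrete, here is a minimal shape check using only the standard library. The EXPECTED mapping and the shape_errors helper are names invented for this sketch, and the raw string is a hypothetical JSON Mode reply.

```python
import json

# Expected top-level fields and their Python types (assumed for this sketch)
EXPECTED = {"name": str, "price": (int, float), "category": str}


def shape_errors(data: dict, expected: dict) -> list[str]:
    """Compare a parsed object against the expected field names and types."""
    errors = []
    for key, typ in expected.items():
        if key not in data:
            errors.append(f"missing field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(data[key], bool) or not isinstance(data[key], typ):
            errors.append(f"wrong type for {key}: {type(data[key]).__name__}")
    for key in sorted(data.keys() - expected.keys()):
        errors.append(f"unexpected field: {key}")
    return errors


# Perfectly valid JSON, yet unusable by code that expects name/price/category
raw = '{"product_name": "Nike running shoes", "price": "800", "category": "sneakers"}'
errors = shape_errors(json.loads(raw), EXPECTED)
print(errors)
```

json.loads accepts this reply without complaint. The renamed field and the stringified price only show up once the shape is checked.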

Approach 2: Structured Outputs (recommended)

Structured Outputs, released by OpenAI in 2024, lets you pass a JSON Schema that the AI's output must strictly follow:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract product information from the user description."},
        {"role": "user", "content": "I want to buy a pair of Nike running shoes, budget 800 yuan"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_info",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Product name"},
                    "price": {"type": "number", "description": "Price in yuan"},
                    "category": {
                        "type": "string",
                        "enum": ["clothing", "sneakers", "electronics", "food", "other"],
                    },
                    "tags": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Product tags",
                    },
                },
                # strict mode: every property must be listed in required,
                # and additionalProperties must be False
                "required": ["name", "price", "category", "tags"],
                "additionalProperties": False,
            },
        },
    },
    temperature=0,
)

result = json.loads(response.choices[0].message.content)
# e.g. {"name": "Nike running shoes", "price": 800, "category": "sneakers", "tags": ["Nike", "running", "sports"]}
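The schema above lists every property in required and sets additionalProperties to False, which is what strict mode expects of each object. A small lint like the one below can catch violations before a request is sent. It is a sketch that checks only these two rules, does not follow $ref, and uses a helper name (strict_violations) invented here.

```python
def strict_violations(schema: dict, path: str = "$") -> list[str]:
    """List places where an object schema breaks the two strict-mode rules."""
    problems = []
    props = schema.get("properties") or {}
    if schema.get("type") == "object":
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
    # Recurse into nested objects and array items
    for name, sub in props.items():
        problems += strict_violations(sub, f"{path}.{name}")
    if isinstance(schema.get("items"), dict):
        problems += strict_violations(schema["items"], f"{path}[]")
    return problems


good = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
    "required": ["name", "price"],
    "additionalProperties": False,
}
bad = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name"],
}

good_problems = strict_violations(good)
bad_problems = strict_violations(bad)
print(bad_problems)
```

Running a check like this in a unit test is cheaper than discovering the problem as a rejected API request.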

Defining the schema with Pydantic (more elegant)

Writing JSON Schema by hand is painful. Define a Pydantic model and generate the schema from it:

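from pydantic import BaseModel, Field

class ProductInfo(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(description="Price in yuan")
    category: str = Field(description="Category")
    tags: list[str] = Field(description="Product tags", default_factory=list)


def to_strict_schema(schema: dict) -> dict:
    """Adapt a Pydantic-generated schema to strict mode, in place.

    strict: True requires every object to list all of its properties in
    `required` and to set additionalProperties to False. A plain Pydantic
    schema omits additionalProperties and leaves fields with defaults out
    of `required`. `default` is dropped too, on the assumption that strict
    mode does not accept it.
    """
    schema.pop("default", None)
    props = schema.get("properties")
    if isinstance(props, dict):
        schema["required"] = list(props)
        schema["additionalProperties"] = False
        for sub in props.values():
            to_strict_schema(sub)
    if isinstance(schema.get("items"), dict):
        to_strict_schema(schema["items"])
    # Nested models live under $defs; optional fields show up as anyOf
    for sub in list(schema.get("$defs", {}).values()) + list(schema.get("anyOf", [])):
        to_strict_schema(sub)
    return schema


def structured_extract(text: str, schema_model: type[BaseModel]) -> BaseModel:
    schema = to_strict_schema(schema_model.model_json_schema())

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract structured information from the user description."},
            {"role": "user", "content": text},
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": schema_model.__name__,
                "strict": True,
                "schema": schema,
            },
        },
        temperature=0,
    )

    data = json.loads(response.choices[0].message.content)
    return schema_model.model_validate(data)


product = structured_extract("I want to buy a pair of Nike running shoes, budget 800 yuan", ProductInfo)
print(product.name)      # e.g. "Nike running shoes"
print(product.price)     # e.g. 800.0
print(product.category)  # e.g. "sneakers"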

Native support in the OpenAI SDK

Recent versions of the OpenAI SDK accept a Pydantic model directly:

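response = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract product information"},
        {"role": "user", "content": "I want to buy a pair of Nike running shoes, budget 800 yuan"},
    ],
    response_format=ProductInfo,
)

product = response.choices[0].message.parsed
# Already a ProductInfo instance, no manual parsing needed
# (parsed is None if the model refused; check message.refusal in that case)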

Complex nested structures

class OrderItem(BaseModel):
    product: str
    quantity: int
    unit_price: float

class OrderInfo(BaseModel):
    customer_name: str
    items: list[OrderItem]
    total_amount: float
    delivery_address: str
    notes: str = ""


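order = structured_extract(
    "Zhang San wants 3 pairs of Nike shoes at 800 yuan each and 2 Adidas T-shirts at 300 yuan each, delivered to xx Residential Community, Chaoyang District, Beijing. Note: invoice required",
    OrderInfo,
)
# e.g. OrderInfo(
#   customer_name="Zhang San",
#   items=[
#     OrderItem(product="Nike shoes", quantity=3, unit_price=800.0),
#     OrderItem(product="Adidas T-shirt", quantity=2, unit_price=300.0),
#   ],
#   total_amount=3000.0,
#   delivery_address="xx Residential Community, Chaoyang District, Beijing",
#   notes="invoice required",
# )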
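A schema constrains shape, not arithmetic: nothing forces total_amount to equal the sum of the line items. A cheap cross-check after parsing covers that. The sketch below works on a plain dict shaped like the order above (what model_dump() would produce); the helper name and the tolerance are assumptions made for this example.

```python
import math


def order_total_matches(order: dict, tolerance: float = 0.01) -> bool:
    """Check that total_amount equals the sum of quantity * unit_price."""
    expected = sum(item["quantity"] * item["unit_price"] for item in order["items"])
    return math.isclose(order["total_amount"], expected, abs_tol=tolerance)


order = {
    "customer_name": "Zhang San",
    "items": [
        {"product": "Nike shoes", "quantity": 3, "unit_price": 800.0},
        {"product": "Adidas T-shirt", "quantity": 2, "unit_price": 300.0},
    ],
    "total_amount": 3000.0,
}

consistent = order_total_matches(order)                                # 3*800 + 2*300 = 3000
inconsistent = order_total_matches({**order, "total_amount": 2800.0})  # model miscalculated
print(consistent, inconsistent)
```

For money fields it is usually safer to recompute the total in code and treat the model's figure as advisory.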

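Approach 3: Function Calling

Instead of asking the AI for JSON directly, you define a "function" and let the AI "call" it with arguments filled in. In essence this is structured output too: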

tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_product",
            "description": "Extract product information from the user description",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Product name"},
                    "price": {"type": "number", "description": "Price"},
                    "category": {"type": "string", "description": "Category"},
                },
                "required": ["name", "price", "category"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "I want to buy a pair of Nike running shoes, budget 800 yuan"}],
    tools=tools,
    # Force the model to call this specific function
    tool_choice={"type": "function", "function": {"name": "extract_product"}},
)

args = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
# e.g. {"name": "Nike running shoes", "price": 800, "category": "sneakers"}
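Once the arguments are parsed, the usual next step is to hand them to real code. The sketch below shows one way to dispatch a tool call by name, using a hard-coded arguments string in place of a live response. The handler body and the HANDLERS registry are hypothetical.

```python
import json


def extract_product(name: str, price: float, category: str) -> dict:
    """Hypothetical local handler: normalizes the record the model filled in."""
    return {"name": name.strip(), "price": float(price), "category": category}


# Registry mapping tool names (as declared in `tools`) to local callables
HANDLERS = {"extract_product": extract_product}


def dispatch(tool_name: str, arguments_json: str) -> dict:
    """Look up the handler for a tool call and invoke it with the parsed arguments."""
    handler = HANDLERS.get(tool_name)
    if handler is None:
        raise KeyError(f"model requested an unknown tool: {tool_name}")
    # A TypeError here means the model omitted or invented an argument
    return handler(**json.loads(arguments_json))


result = dispatch(
    "extract_product",
    '{"name": " Nike running shoes ", "price": 800, "category": "sneakers"}',
)
print(result)
```

With a live response, tool_name and arguments_json would come from tool_calls[0].function.name and tool_calls[0].function.arguments.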

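Function Calling vs Structured Outputs

Dimension	Function Calling	Structured Outputs
Best for	Performing actions (calling APIs, querying databases)	Pure data extraction
Schema strictness	Looser by default (fields may be missing); function definitions also accept strict: true	Matches the schema when strict: true (barring refusals or truncated output)
Choosing among tools	Supported (the AI picks which function to call)	Not supported
Nested structures	Handled well	Handled well

Rule of thumb: use Structured Outputs for pure data extraction, and Function Calling when the AI has to "choose what to do and supply the arguments".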

Frontend validation: Zod

Validate with Pydantic on the backend, and with Zod on the frontend or in Node.js:

import { z } from 'zod';

const ProductSchema = z.object({
  name: z.string(),
  price: z.number().positive(),
  category: z.enum(['clothing', 'sneakers', 'electronics', 'food', 'other']),
  tags: z.array(z.string()).default([]),
});

type Product = z.infer<typeof ProductSchema>;

function parseAIResponse(raw: string): Product {
  const data = JSON.parse(raw);
  return ProductSchema.parse(data);
}

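try {
  const product = parseAIResponse(response.choices[0].message.content!);
  console.log(product.name);
} catch (error) {
  if (error instanceof z.ZodError) {
    console.error('AI output does not match the expected format:', error.issues);
  } else {
    // JSON.parse throws SyntaxError on malformed JSON; don't swallow it silently
    console.error('AI output is not valid JSON:', error);
  }
}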

A handy Zod trick: generating the prompt from the schema

function zodToPrompt(schema: z.ZodObject<any>): string {
  const fields = Object.entries(schema.shape).map(([key, value]) => {
    let zodType = value as z.ZodTypeAny;
    // Unwrap .default(...) so the underlying type is visible (Zod v3 API);
    // without this, `tags` would be reported as unknown
    if (zodType instanceof z.ZodDefault) zodType = zodType.removeDefault();

    let type = 'unknown';
    if (zodType instanceof z.ZodString) type = 'string';
    else if (zodType instanceof z.ZodNumber) type = 'number';
    else if (zodType instanceof z.ZodArray) type = 'array';
    else if (zodType instanceof z.ZodEnum) type = `enum(${zodType.options.join('|')})`;
    return `  "${key}": ${type}`;
  });

  return `Return JSON in this format:\n{\n${fields.join(',\n')}\n}`;
}

const prompt = zodToPrompt(ProductSchema);
// Return JSON in this format:
// {
//   "name": string,
//   "price": number,
//   "category": enum(clothing|sneakers|electronics|food|other),
//   "tags": array
// }

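Fault tolerance

Even the best approach cannot guarantee 100% success. Production code needs fault tolerance:

1. Automatic repair

The JSON the AI returns may have small defects (explanatory text around it, a Markdown code fence, a trailing comma). Try to repair these automatically: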

import re

def safe_json_parse(text: str) -> dict | None:
    """Best-effort parse of model output; returns None if no JSON object is found."""
    candidates = [text.strip()]

    # The AI may wrap the JSON in explanatory text or a ``` fence:
    # take everything from the first "{" to the last "}"
    match = re.search(r'\{[\s\S]*\}', text)
    if match:
        candidates.append(match.group())

    for candidate in candidates:
        # Try as-is, then with trailing commas before } or ] removed
        # (this can also alter a string value that contains ",}")
        for attempt in (candidate, re.sub(r',\s*([}\]])', r'\1', candidate)):
            try:
                data = json.loads(attempt)
            except json.JSONDecodeError:
                continue
            if isinstance(data, dict):
                return data

    return None
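A greedy first-brace-to-last-brace match breaks when the surrounding text itself contains braces. One alternative, sketched here with the standard library's json.JSONDecoder.raw_decode, is to try decoding from each "{" in turn and keep the first object that parses. The function name first_json_object is invented for this example.

```python
import json
from typing import Optional


def first_json_object(text: str) -> Optional[dict]:
    """Return the first substring that decodes to a JSON object, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass
        else:
            if isinstance(value, dict):
                return value
        start = text.find("{", start + 1)
    return None


reply = 'Sure! {"name": "Nike running shoes", "price": 800} Remember to use {braces} carefully.'
parsed = first_json_object(reply)
print(parsed)
```

One limitation to keep in mind: on truncated output, where the outer object never closes, this can return a complete inner object instead, so schema validation is still needed afterwards.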

2. Retry + schema repair

When parsing fails, feed the error message back to the AI and have it regenerate:

from pydantic import ValidationError

def structured_call_with_retry(
    messages: list[dict],
    schema_model: type[BaseModel],
    max_retries: int = 3,
) -> BaseModel:
    messages = list(messages)  # work on a copy so the caller's list is not mutated
    last_error = None

    for attempt in range(max_retries):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            response_format={
                "type": "json_schema",
                "json_schema": {
                    "name": schema_model.__name__,
                    "strict": True,
                    # The schema must already satisfy strict-mode rules
                    # (all fields required, additionalProperties False)
                    "schema": schema_model.model_json_schema(),
                },
            },
            temperature=0,
        )
        raw = response.choices[0].message.content or ""

        try:
            return schema_model.model_validate(json.loads(raw))
        except (json.JSONDecodeError, ValidationError) as e:
            last_error = e
            # Feed the failed output and the error back so the model can correct itself
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": f"Your JSON has a problem: {e}. Please output it again, strictly following the schema.",
            })

    raise ValueError(f"Still failing after {max_retries} attempts: {last_error}")
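The feedback loop is easier to reason about, and to unit-test, when the model call is injected as a plain function. The sketch below shows the same retry idea with a fake model that fails once and then succeeds. call_with_feedback, fake_model, and require_price are names invented for this example.

```python
import json
from typing import Callable


def call_with_feedback(
    call_model: Callable[[list], str],
    messages: list,
    validate: Callable[[dict], None],
    max_retries: int = 3,
) -> dict:
    """Call the model, and on invalid output append the error and try again."""
    history = list(messages)
    last_error = None
    for _ in range(max_retries):
        raw = call_model(history)
        try:
            data = json.loads(raw)
            validate(data)
            return data
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            history.append({"role": "assistant", "content": raw})
            history.append({"role": "user", "content": f"Invalid output: {exc}. Try again."})
    raise ValueError(f"still failing after {max_retries} attempts: {last_error}")


# Fake model: the first reply lacks price, the second one is complete
replies = iter(['{"name": "Nike running shoes"}', '{"name": "Nike running shoes", "price": 800}'])
seen_lengths = []


def fake_model(history: list) -> str:
    seen_lengths.append(len(history))  # record how much context each call received
    return next(replies)


def require_price(data: dict) -> None:
    if "price" not in data:
        raise ValueError("missing field: price")


result = call_with_feedback(fake_model, [{"role": "user", "content": "extract"}], require_price)
print(result, seen_lengths)
```

The second call sees three messages instead of one: the original request, the failed reply, and the error report, and that extra context is what lets the model correct itself.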

3. Fallback strategy

When structured output fails outright, fall back step by step, ending with the raw text:

def extract_with_fallback(text: str) -> dict:
    # Preferred: Structured Outputs
    try:
        return structured_extract(text, ProductInfo).model_dump()
    except Exception:
        pass

    # Fallback: JSON Mode
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Extract product information and return JSON"},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},
        )
        return json.loads(response.choices[0].message.content)
    except Exception:
        pass

    # Last resort: return the original text, flagged as unparsed
    return {"raw_text": text, "parse_failed": True}
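The same fallback chain can be written generically, which also makes it possible to record which tier produced the answer and why the earlier ones failed. This is a sketch with stub strategies standing in for real API calls. first_success and the strategy names are hypothetical.

```python
from typing import Callable


def first_success(strategies: list[tuple[str, Callable[[str], dict]]], text: str) -> dict:
    """Try each named strategy in order; fall back to the raw text if all fail."""
    errors = []
    for name, strategy in strategies:
        try:
            return {"source": name, "data": strategy(text), "errors": errors}
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    return {"source": "raw", "data": {"raw_text": text, "parse_failed": True}, "errors": errors}


# Stub strategies standing in for the Structured Outputs and JSON Mode calls
def structured_stub(text: str) -> dict:
    raise RuntimeError("schema rejected")


def json_mode_stub(text: str) -> dict:
    return {"name": "Nike running shoes", "price": 800}


result = first_success(
    [("structured", structured_stub), ("json_mode", json_mode_stub)],
    "I want to buy a pair of Nike running shoes, budget 800 yuan",
)
print(result["source"], result["errors"])
```

Logging the source field over time shows how often the preferred tier actually succeeds, which is hard to tell when failures are silently swallowed.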

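Vendor support

Vendor	JSON Mode	Structured Outputs	Function Calling
OpenAI	gpt-4o-mini, gpt-4o	gpt-4o-mini, gpt-4o	All models
DeepSeek	Supported	Not supported	Supported
Claude	Not supported (controlled via prompt)	Not supported	Supported (tool_use)
Qwen (通义千问)	Supported	Not supported	Supported
Gemini	Supported	Supported	Supported

Vendor support changes quickly, so treat this table as a snapshot and check each vendor's current documentation. For models without Structured Outputs, prompt + validation + retry achieves a similar effect: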

def prompt_based_structured_output(
    client,
    model: str,
    text: str,
    schema_model: type[BaseModel],
) -> BaseModel:
    schema_json = json.dumps(
        schema_model.model_json_schema(),
        indent=2,
        ensure_ascii=False,
    )

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": f"""You must return data that strictly follows the JSON Schema below, with nothing else:

{schema_json}

Return only JSON: no markdown code block, no explanation.""",
            },
            {"role": "user", "content": text},
        ],
        temperature=0,
    )

    raw = response.choices[0].message.content or ""
    data = safe_json_parse(raw)
    if data is None:
        raise ValueError(f"Could not parse JSON: {raw[:200]}")

    return schema_model.model_validate(data)

Real-world scenarios

Scenario 1: AI-generated form configuration

class FormField(BaseModel):
    name: str
    label: str
    type: str = Field(description="input/select/textarea/date/number")
    required: bool = True
    options: list[str] = Field(default_factory=list, description="Options for select fields")
    placeholder: str = ""

class FormConfig(BaseModel):
    title: str
    fields: list[FormField]


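form = structured_extract(
    "Build a leave request form with: name, department (Engineering/Product/Design), leave type (annual/personal/sick), start date, end date, reason for leave",
    FormConfig,
)

# The frontend takes the FormConfig and renders a dynamic form directly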
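Because type is declared as a free string, a structurally valid config can still be unrenderable: an unknown field type, or a select without options. A semantic check before handing the config to the frontend catches these. The sketch below works on plain dicts shaped like FormConfig; the helper name and the sample data are made up for illustration.

```python
ALLOWED_TYPES = {"input", "select", "textarea", "date", "number"}


def form_config_problems(config: dict) -> list[str]:
    """Return human-readable problems that a schema check alone would miss."""
    problems = []
    seen = set()
    for field in config["fields"]:
        name = field["name"]
        if name in seen:
            problems.append(f"duplicate field name: {name}")
        seen.add(name)
        if field["type"] not in ALLOWED_TYPES:
            problems.append(f"{name}: unknown type {field['type']!r}")
        if field["type"] == "select" and not field.get("options"):
            problems.append(f"{name}: select field has no options")
    return problems


config = {
    "title": "Leave request",
    "fields": [
        {"name": "name", "label": "Name", "type": "input", "options": []},
        {"name": "department", "label": "Department", "type": "select", "options": []},
        {"name": "start", "label": "Start date", "type": "datepicker", "options": []},
    ],
}
problems = form_config_problems(config)
print(problems)
```

Declaring type as an enum in the schema would remove the first class of error at the source. The select-needs-options rule is cross-field, so it still needs a check like this.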

Scenario 2: extracting user intent + parameters

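class IntentWithParams(BaseModel):
    intent: str = Field(description="Intent type")
    params: dict = Field(description="Extracted parameters")
    confidence: float = Field(description="Confidence, 0-1")


# A free-form dict cannot be expressed under strict: True (every object must
# declare its properties), so this model goes through the prompt-based helper
# defined earlier instead of structured_extract
result = prompt_based_structured_output(
    client,
    "gpt-4o-mini",
    "Find me high-speed rail tickets from Beijing to Shanghai on March 15",
    IntentWithParams,
)
# e.g. IntentWithParams(
#   intent="search_train",
#   params={"from": "Beijing", "to": "Shanghai", "date": "March 15", "type": "high-speed rail"},
#   confidence=0.95,
# )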
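A result shaped like the one above is typically fed to a router. The sketch below dispatches on intent and asks for clarification when the intent is unknown or the self-reported confidence is low. The 0.8 threshold, the handler, and the wording are assumptions, and a model's own confidence number is not a calibrated probability.

```python
from typing import Callable

CLARIFY = "Could you tell me a bit more about what you need?"


def route(intent_result: dict, handlers: dict[str, Callable[[dict], str]], threshold: float = 0.8) -> str:
    """Dispatch to the handler for the intent, or fall back to a clarifying question."""
    handler = handlers.get(intent_result["intent"])
    if handler is None or intent_result["confidence"] < threshold:
        return CLARIFY
    return handler(intent_result["params"])


def search_train(params: dict) -> str:
    # Hypothetical handler; a real one would query a ticketing API
    return f"Searching {params['type']} tickets {params['from']} -> {params['to']} on {params['date']}"


handlers = {"search_train": search_train}

confident = route(
    {
        "intent": "search_train",
        "params": {"from": "Beijing", "to": "Shanghai", "date": "March 15", "type": "high-speed rail"},
        "confidence": 0.95,
    },
    handlers,
)
unsure = route({"intent": "search_train", "params": {}, "confidence": 0.4}, handlers)
print(confident)
print(unsure)
```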

Scenario 3: batch data labeling

class SentimentLabel(BaseModel):
    text: str
    sentiment: str = Field(description="positive/negative/neutral")
    keywords: list[str]

class BatchLabels(BaseModel):
    results: list[SentimentLabel]


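labels = structured_extract(
    """Label the sentiment of the following reviews:
1. This product is great to use, highly recommended!
2. Terrible quality, I returned it
3. It's okay, nothing special""",
    BatchLabels,
)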
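With batch labeling, the schema cannot guarantee that the model returned one label per input, or that it kept them in order. A post-check along these lines can catch dropped or shuffled items. It is a sketch over plain dicts, and the helper name and the allowed label set are assumptions based on the field description above.

```python
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def batch_problems(inputs: list[str], results: list[dict]) -> list[str]:
    """Compare labeled results against the original inputs, position by position."""
    problems = []
    if len(results) != len(inputs):
        problems.append(f"expected {len(inputs)} results, got {len(results)}")
    for i, (source, item) in enumerate(zip(inputs, results), start=1):
        if item["text"] != source:
            problems.append(f"item {i}: text does not match input")
        if item["sentiment"] not in ALLOWED_SENTIMENTS:
            problems.append(f"item {i}: invalid sentiment {item['sentiment']!r}")
    return problems


inputs = [
    "This product is great to use, highly recommended!",
    "Terrible quality, I returned it",
    "It's okay, nothing special",
]
# Hypothetical model output: the third review was dropped, one label is off-list
results = [
    {"text": inputs[0], "sentiment": "positive", "keywords": ["great"]},
    {"text": inputs[1], "sentiment": "bad", "keywords": ["quality"]},
]
problems = batch_problems(inputs, results)
print(problems)
```

When the check fails, re-running only the affected items is usually cheaper than repeating the whole batch.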

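Summary

JSON Mode is the entry level: it guarantees valid JSON, but not structure or types.
Structured Outputs is the first choice: JSON Schema + strict: true, with fields, types, and enums all strictly enforced.
Function Calling suits action scenarios: the AI chooses which function to call and fills in the arguments, so intent recognition and structuring happen in one step.
Pydantic / Zod for validation on both ends: Pydantic in Python, Zod in TypeScript, with no compromise on type safety.
The fault-tolerance trio: automatic repair (regex-extract the JSON) → retry (feed the error back to the AI) → fallback (drop back to free text).
For models without Structured Outputs: prompt constraints + validation + retry can also reach a 95%+ success rate.

The next article covers AI caching strategies: calling the API every time the same question is asked is wasteful. Exact cache + semantic cache saves money and speeds things up.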


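Discussion: what pitfalls have you hit when handling JSON from AI output? Have you ever had an AI that simply refused to follow the format? Share your solutions in the comments.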