Intent Recognition: From Keyword Matching to Semantic Routing

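This is part 07 of the 【AI 专题精讲】 series. Previous: RAG Retrieval Optimization: Hybrid Search, Reranking, and Multi-Path Recall | Next: Structured Output: Getting AI to Return JSON Reliably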

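What you will get from this article

The previous six articles focused on the RAG pipeline. Today we step back to a more general question: when a user says something, how does the system know what they want to do?

This is intent recognition, the "routing layer" of an AI application.

If the user types "look up the return process for me", the system should query the knowledge base. If the user types "what's the weather like today", it should call the weather API. If the user types "hi there", it should answer with small talk.

Without intent recognition, every input gets thrown at the LLM, and the result is predictable: knowledge-base lookups that never happen, tools that never get called, requests that should have been refused but were not. The user experience falls apart.

This article covers three levels of solutions:

Level 1: keyword / regex matching. Simple and blunt, done in 5 minutes
Level 2: LLM classification. Use a prompt to have the model judge the intent
Level 3: embedding-based semantic routing. The most flexible, with no prompt changes needed

Add multi-intent handling and the relationship with Function Calling, and you have a complete intent-routing framework.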

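Where intent recognition sits in an AI application

User input
    ↓
┌──────────────────────┐
│  Intent recognition  │  ← decides what the user wants
└──────────┬───────────┘
           ↓
┌──────────┴───────────────────────────────────┐
│                                               │
▼                 ▼              ▼              ▼
Knowledge query   Tool call      Chitchat       Refuse / redirect
(RAG)             (API/Agent)    (Chat)         (fallback)

Intent recognition is the entry layer: it decides which processing pipeline a request takes next. Its accuracy directly affects how the whole system performs.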

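Common intent categories

Taking an enterprise knowledge-base assistant as an example:

Intent	Example	Handling
`knowledge_query`	"What is the return process?"	RAG retrieval
`data_query`	"What were last month's sales?"	Call the data API
`task_execute`	"Create a ticket for me"	Call the business API
`chitchat`	"Hello", "Thanks"	Reply directly
`out_of_scope`	"Write me a novel"	Steer back on topic

Level 1: Keyword / regex matching

The most primitive but most controllable approach. It suits scenarios where intent boundaries are clear and phrasing is fixed.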

import re
from dataclasses import dataclass


@dataclass
class IntentResult:
    intent: str
    confidence: float
    matched_rule: str = ""


class KeywordIntentClassifier:
    def __init__(self):
        # Rules target Chinese-language input. List order decides keyword ties.
        self.rules = [
            {
                "intent": "knowledge_query",
                # "how", "how to", "what is", "process", "rule", "policy", "regulation"
                "keywords": ["怎么", "如何", "什么是", "流程", "规定", "政策", "制度"],
                "patterns": [r".*(?:查|搜|找).*(?:知识|文档|手册)"],
            },
            {
                "intent": "data_query",
                # "how much", "data", "statistics", "report", "sales", "revenue"
                "keywords": ["多少", "数据", "统计", "报表", "销售额", "营收"],
                "patterns": [r".*(?:上个?月|今年|Q\d).*(?:多少|数据)"],
            },
            {
                "intent": "task_execute",
                # "create", "submit", "initiate", "apply", "approve"
                "keywords": ["创建", "提交", "发起", "申请", "审批"],
                "patterns": [r"帮我(?:创建|提交|发起)"],
            },
            {
                "intent": "chitchat",
                # "hello", "thanks", "bye", "haha", "mm"
                "keywords": ["你好", "谢谢", "再见", "哈哈", "嗯"],
                "patterns": [r"^(?:你好|hi|hello|谢谢|好的)$"],
            },
        ]

    def classify(self, text: str) -> IntentResult:
        text_lower = text.lower().strip()

        # Pass 1: regex patterns across ALL rules. They are more specific, so a
        # pattern in a later rule must win over a keyword in an earlier rule.
        for rule in self.rules:
            for pattern in rule.get("patterns", []):
                if re.search(pattern, text_lower):
                    return IntentResult(
                        intent=rule["intent"],
                        confidence=0.9,
                        matched_rule=f"pattern: {pattern}",
                    )

        # Pass 2: keyword substring match; the first rule in list order wins
        for rule in self.rules:
            for keyword in rule.get("keywords", []):
                if keyword in text_lower:
                    return IntentResult(
                        intent=rule["intent"],
                        confidence=0.7,
                        matched_rule=f"keyword: {keyword}",
                    )

        return IntentResult(intent="unknown", confidence=0.0)

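Pros and cons

Pros	Cons
Negligible latency, zero cost	Cannot cover every way of phrasing a request
100% controllable and debuggable	A growing rule set becomes a maintenance nightmare
No dependency on external services	Handles synonyms and colloquial phrasing poorly

Good fit: few intent types (< 5), fixed phrasing (internal systems), extreme sensitivity to latency.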
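To make the coverage limitation concrete, here is a minimal, self-contained sketch. The trimmed-down rule table and the helper name `match_keyword_intent` are illustrative assumptions, not part of the classifier above. It shows two inputs that the rules catch and one colloquial phrasing of a return question that slips through.

```python
import re

# Hypothetical, trimmed-down rule table for illustration only:
# (intent, regex patterns, keywords)
RULES = [
    ("task_execute", [r"帮我(?:创建|提交|发起)"], ["创建", "提交", "申请"]),
    ("knowledge_query", [], ["怎么", "流程", "政策"]),
]


def match_keyword_intent(text: str) -> str:
    """Return the first intent whose pattern or keyword matches, else 'unknown'."""
    normalized = text.lower().strip()
    # Regex patterns are checked across all rules before any keyword.
    for intent, patterns, _ in RULES:
        if any(re.search(p, normalized) for p in patterns):
            return intent
    for intent, _, keywords in RULES:
        if any(k in normalized for k in keywords):
            return intent
    return "unknown"


print(match_keyword_intent("帮我创建一个工单"))  # task_execute ("create a ticket for me")
print(match_keyword_intent("退货流程是什么"))    # knowledge_query ("what is the return process")
print(match_keyword_intent("东西想退掉该咋办"))  # unknown (colloquial "how do I send this back")
```

The third input means the same thing as the second, but it uses none of the listed keywords, which is the kind of gap the next two levels are meant to close.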

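Level 2: LLM classification

Use a large model to understand natural language, and have it judge the intent through a prompt. Coverage and accuracy are typically far better than with keyword matching.

Basic implementation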

import json

from openai import OpenAI


class LLMIntentClassifier:
    def __init__(self, client: OpenAI, model: str = "gpt-4o-mini"):
        self.client = client
        self.model = model

    def classify(self, text: str, intents: list[dict]) -> IntentResult:
        # One "- name: description" line per candidate intent
        intent_desc = "\n".join(
            f"- {i['name']}: {i['description']}"
            for i in intents
        )

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {
                    "role": "system",
                    # Chinese prompt: "You are an intent classifier ... return JSON only"
                    "content": f"""你是一个意图分类器。根据用户输入判断意图类别。

可选意图：
{intent_desc}

返回 JSON 格式：
{{"intent": "意图名称", "confidence": 0.0-1.0, "reason": "简短理由"}}

只返回 JSON，不要其他内容。""",
                },
                {"role": "user", "content": text},
            ],
            temperature=0,
            response_format={"type": "json_object"},
        )

        result = json.loads(response.choices[0].message.content)

        # Guard against intent names that are not in the candidate list
        valid_names = {i["name"] for i in intents}
        intent = result.get("intent")
        if intent not in valid_names:
            return IntentResult(intent="unknown", confidence=0.0)

        return IntentResult(
            intent=intent,
            confidence=result.get("confidence", 0.8),
        )

Usage

client = OpenAI()
classifier = LLMIntentClassifier(client)

intents = [
    {"name": "knowledge_query", "description": "用户想查询知识库中的信息，如流程、规定、政策等"},
    {"name": "data_query", "description": "用户想查询数据报表，如销售额、统计数字等"},
    {"name": "task_execute", "description": "用户想执行某个操作，如创建工单、提交申请等"},
    {"name": "chitchat", "description": "闲聊、打招呼、感谢等社交性对话"},
    {"name": "out_of_scope", "description": "超出系统能力范围的请求"},
]

# Input means "show me last quarter's return rate"
result = classifier.classify("帮我看看上个季度的退货率", intents)
# Typical result (model output can vary): IntentResult(intent='data_query', confidence=0.95)

Intent recognition with context

Sometimes a single sentence is not enough to decide, and you need the conversation history:

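# Method of LLMIntentClassifier, shown here without the class indentation
def classify_with_context(
    self,
    text: str,
    history: list[dict],
    intents: list[dict],
) -> IntentResult:
    recent_history = history[-4:]  # keep only the last 4 messages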
    history_text = "\n".join(
        f"{'用户' if m['role'] == 'user' else '助手'}: {m['content']}"
        for m in recent_history
    )

    response = self.client.chat.completions.create(
        model=self.model,
        messages=[
            {
                "role": "system",
                "content": f"""你是一个意图分类器。结合对话上下文判断用户最新一句话的意图。

最近对话：
{history_text}

可选意图：
{chr(10).join(f"- {i['name']}: {i['description']}" for i in intents)}

返回 JSON：{{"intent": "意图名称", "confidence": 0.0-1.0}}""",
            },
            {"role": "user", "content": text},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )

    import json
    result = json.loads(response.choices[0].message.content)
    return IntentResult(intent=result["intent"], confidence=result.get("confidence", 0.8))

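For example:

User: What is the return process?        → knowledge_query
Assistant: Returns must be made within 7 days...
User: What about exchanges?              → knowledge_query (only clear with context)

"What about exchanges?" on its own does not say what is being asked. With the context, it is clearly a question about the exchange process.

Pros and cons

Pros	Cons
Understands natural language, broad coverage	Every classification is an API call, with latency and cost
Adding an intent only requires a prompt change	Occasionally misclassifies (non-deterministic)
Can use conversation context	Depends on an external service

Good fit: many intent types, varied phrasing, some tolerance for latency (200~500ms).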
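Because the model's reply is not deterministic, it can help to validate it before routing on it. The sketch below is one possible guard, under the assumption that the reply is the JSON string described in the prompt. The function name `parse_intent_response` and the `ALLOWED_INTENTS` set are hypothetical names introduced for illustration.

```python
import json

# Assumed set of valid intent names, mirroring the intents used in this article.
ALLOWED_INTENTS = {"knowledge_query", "data_query", "task_execute", "chitchat", "out_of_scope"}


def parse_intent_response(raw: str, allowed: set[str] = ALLOWED_INTENTS) -> tuple[str, float]:
    """Parse the classifier's JSON reply; fall back to ('unknown', 0.0) on anything unexpected."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return "unknown", 0.0
    if not isinstance(data, dict):
        return "unknown", 0.0

    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in allowed:
        return "unknown", 0.0

    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    # Clamp to [0, 1]: the model's self-reported score is not guaranteed to be in range.
    return intent, max(0.0, min(1.0, confidence))


print(parse_intent_response('{"intent": "data_query", "confidence": 0.95}'))
print(parse_intent_response('{"intent": "weather", "confidence": 0.9}'))
print(parse_intent_response("not json at all"))
```

Treating every malformed or unexpected reply as `unknown` keeps the failure mode explicit, so a fallback handler can take over instead of the request being routed on a bad label.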

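Level 3: Embedding-based semantic routing

Describe each intent with a few example sentences and convert them to vectors. Convert the user input to a vector as well, and see which intent it is closest to.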
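The idea of "closest intent" can be illustrated with a toy example before looking at a full router. The 3-dimensional vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the sketch uses only the standard library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings", one per intent.
intent_vectors = {
    "knowledge_query": [0.9, 0.1, 0.0],
    "data_query": [0.1, 0.9, 0.0],
    "chitchat": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.3, 0.1]  # made-up embedding of the user input

scores = {name: cosine_similarity(query_vector, vec) for name, vec in intent_vectors.items()}
best_intent = max(scores, key=scores.get)
print(best_intent)  # knowledge_query
```

The query vector points mostly along the first axis, so its cosine similarity with `knowledge_query` is the highest of the three.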

Implementation

import numpy as np

class SemanticRouter:
    def __init__(self, embedding_provider):
        self.embedding_provider = embedding_provider
        self.routes = []
        self.route_embeddings = []

    def add_route(self, name: str, examples: list[str], handler: str = ""):
        result = self.embedding_provider.embed(examples)
        avg_embedding = np.mean(result.embeddings, axis=0).tolist()

        self.routes.append({
            "name": name,
            "examples": examples,
            "handler": handler,
        })
        self.route_embeddings.append(avg_embedding)

    def classify(self, text: str, threshold: float = 0.5) -> IntentResult:
        if not self.routes:
            # No routes registered yet, so there is nothing to compare against
            return IntentResult(intent="unknown", confidence=0.0)

        result = self.embedding_provider.embed([text])
        query_vec = np.array(result.embeddings[0])

        route_vecs = np.array(self.route_embeddings)

        # Normalize to unit length
        query_vec = query_vec / np.linalg.norm(query_vec)
        norms = np.linalg.norm(route_vecs, axis=1, keepdims=True)
        route_vecs = route_vecs / norms

        # Cosine similarity (dot product of unit vectors)
        scores = route_vecs @ query_vec
        best_idx = int(np.argmax(scores))
        best_score = float(scores[best_idx])

        # Below the threshold, no route is close enough to trust
        if best_score < threshold:
            return IntentResult(intent="unknown", confidence=best_score)

        return IntentResult(
            intent=self.routes[best_idx]["name"],
            confidence=best_score,
        )

Configuring routes

router = SemanticRouter(embedding_provider=create_embedding_provider("openai"))

router.add_route(
    name="knowledge_query",
    examples=[
        "退货流程是什么",
        "怎么申请年假",
        "公司的报销制度",
        "出差审批流程",
        "新员工入职需要什么材料",
    ],
)

router.add_route(
    name="data_query",
    examples=[
        "上个月销售额多少",
        "今年Q3的营收数据",
        "退货率是多少",
        "本周订单量统计",
        "各部门人数",
    ],
)

router.add_route(
    name="task_execute",
    examples=[
        "帮我创建一个工单",
        "提交请假申请",
        "发起报销",
        "帮我预约会议室",
        "创建一个新项目",
    ],
)

router.add_route(
    name="chitchat",
    examples=[
        "你好", "谢谢", "再见", "哈哈", "你是谁",
        "今天心情不错", "辛苦了",
    ],
)

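# Usage. The input means "how many vacation days do I have left"
result = router.classify("假期还剩几天")
# Illustrative result (scores depend on the embedding model):
# IntentResult(intent='knowledge_query', confidence=0.82)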

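Updating routes dynamically

The biggest advantage of semantic routing: adding an intent only takes a few example sentences, with no prompt changes and no retraining: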

router.add_route(
    name="meeting_summary",
    examples=[
        "总结一下刚才的会议",
        "帮我整理会议纪要",
        "把讨论的要点列出来",
    ],
)

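Pros and cons

Pros	Cons
Semantic understanding, generalizes well	Requires embedding calls (cacheable)
Adding an intent only requires examples	Intents that are semantically close are easy to confuse
Route vectors can be precomputed offline	Example-sentence quality matters a great deal

Good fit: intents change often, fast iteration is needed, flexibility matters.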
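The "cacheable" note in the table can be sketched with `functools.lru_cache`. This is a minimal illustration, assuming queries repeat often enough for an in-process cache to pay off. `fake_embed` is a hypothetical stand-in for a real embedding call, and `call_count` exists only to show how many calls actually reach it.

```python
from functools import lru_cache

call_count = 0


def fake_embed(text: str) -> tuple[float, ...]:
    """Stand-in for a real embedding API call (hypothetical; returns a toy vector)."""
    global call_count
    call_count += 1
    return (float(len(text)), float(sum(map(ord, text)) % 97))


@lru_cache(maxsize=10_000)
def cached_embed(text: str) -> tuple[float, ...]:
    # A tuple is returned so callers cannot mutate the cached value.
    return fake_embed(text)


def embed_query(text: str) -> tuple[float, ...]:
    # Normalize before caching so trivially different inputs share one entry.
    return cached_embed(text.strip().lower())


embed_query("你好")
embed_query(" 你好 ")  # same text after normalization: served from the cache
embed_query("谢谢")
print(call_count)  # 2
```

Greetings and other high-frequency inputs are the obvious beneficiaries. For a multi-process deployment, a shared cache would be needed instead of this in-process one.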

Combined approach: fusing the three layers

In real projects I recommend combining all three:

class HybridIntentClassifier:
    def __init__(
        self,
        keyword_classifier: KeywordIntentClassifier,
        semantic_router: SemanticRouter,
        llm_classifier: LLMIntentClassifier,
        intents: list[dict],
    ):
        self.keyword = keyword_classifier
        self.semantic = semantic_router
        self.llm = llm_classifier
        self.intents = intents

    def classify(self, text: str, use_llm_fallback: bool = True) -> IntentResult:
        # Layer 1: fast keyword match (local, effectively 0ms).
        # Only regex hits (confidence 0.9) are trusted; bare keyword hits fall through.
        result = self.keyword.classify(text)
        if result.confidence >= 0.9:
            return result

        # Layer 2: semantic routing (~50ms)
        result = self.semantic.classify(text, threshold=0.75)
        if result.intent != "unknown":
            return result

        # Layer 3: LLM fallback (~300ms)
        if use_llm_fallback:
            return self.llm.classify(text, self.intents)

        return IntentResult(intent="unknown", confidence=0.0)

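Strategy:

What keywords can decide goes straight through, with negligible latency
What keywords cannot decide is picked up by semantic routing
What neither can decide falls back to the LLM

With this setup, most requests (roughly 80% as a rule of thumb; measure on your own traffic) finish intent recognition within about 50ms, and only the remaining edge cases (around 20%) need to go to the LLM.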
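Whether the split is really close to 80/20 depends on your traffic, so it is worth counting which layer answers each request. The sketch below shows the cascade pattern with stub layers. All three layer functions are hypothetical placeholders for the real classifiers, with English inputs used for brevity.

```python
from collections import Counter
from typing import Callable, Optional

# Each layer returns an intent name, or None when it is not confident enough.
Layer = Callable[[str], Optional[str]]


def keyword_layer(text: str) -> Optional[str]:
    return "chitchat" if text.strip().lower() in {"hi", "hello", "thanks"} else None


def semantic_layer(text: str) -> Optional[str]:
    # Stub: pretend anything mentioning "sales" is close to the data_query examples.
    return "data_query" if "sales" in text.lower() else None


def llm_layer(text: str) -> Optional[str]:
    # Stub for the expensive fallback; a real one would call the model.
    return "out_of_scope"


LAYERS: list[tuple[str, Layer]] = [
    ("keyword", keyword_layer),
    ("semantic", semantic_layer),
    ("llm", llm_layer),
]


def classify_cascade(text: str, hits: Counter) -> str:
    """Try layers from cheapest to most expensive and stop at the first answer."""
    for name, layer in LAYERS:
        intent = layer(text)
        if intent is not None:
            hits[name] += 1
            return intent
    hits["none"] += 1
    return "unknown"


hits: Counter = Counter()
for utterance in ["hello", "sales last month", "write me a novel", "thanks"]:
    classify_cascade(utterance, hits)
print(dict(hits))  # {'keyword': 2, 'semantic': 1, 'llm': 1}
```

Logging these per-layer counts in production gives the actual share of requests that reach the LLM, which is the number that drives both latency and cost.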

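Intent routing and dispatch

Once the intent is identified, the next step is dispatching the request to the corresponding pipeline: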

from openai import AsyncOpenAI

# Async client, so awaiting the LLM does not block the event loop
async_client = AsyncOpenAI()


class IntentRouter:
    def __init__(self):
        self.handlers = {}

    def register(self, intent: str, handler):
        self.handlers[intent] = handler

    async def route(self, text: str, intent_result: IntentResult, context: dict | None = None) -> str:
        handler = self.handlers.get(intent_result.intent)

        # No handler for this intent: try the generic fallback
        if not handler:
            handler = self.handlers.get("fallback")

        if not handler:
            # "Sorry, I didn't quite get that. Could you rephrase?"
            return "抱歉，我不太理解你的意思，可以换个说法吗？"

        return await handler(text, context or {})


# Example handlers
async def rag_handler(text: str, context: dict) -> str:
    results = await retriever.retrieve(text, top_k=5)
    # ... put the retrieved results into the prompt and let the LLM answer ...
    return answer

async def chitchat_handler(text: str, context: dict) -> str:
    response = await async_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            # "You are a friendly assistant; keep replies short."
            {"role": "system", "content": "你是一个友好的助手，简短回复即可。"},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

async def scope_handler(text: str, context: dict) -> str:
    # "That is outside what I can do; I mainly handle knowledge-base and data queries."
    return "这个问题超出了我的能力范围，我主要负责公司知识库查询和数据查询，换个问题试试？"


# Register handlers after they are defined.
# data_api_handler, task_handler and fallback_handler are defined elsewhere.
router = IntentRouter()

router.register("knowledge_query", rag_handler)
router.register("data_query", data_api_handler)
router.register("task_execute", task_handler)
router.register("chitchat", chitchat_handler)
router.register("out_of_scope", scope_handler)
router.register("fallback", fallback_handler)

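Multi-intent and ambiguous intent

Multi-intent

A single user sentence may contain more than one intent:

"Look up the return process for me, and while you're at it check last month's return rate"
→ knowledge_query + data_query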

class MultiIntentClassifier:
    def __init__(self, client: OpenAI):
        self.client = client

    def classify(self, text: str, intents: list[dict]) -> list[IntentResult]:
        intent_desc = "\n".join(f"- {i['name']}: {i['description']}" for i in intents)

        response = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
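                {
                    "role": "system",
                    # Chinese prompt: "Find all intents in the input (there may be several)"
                    "content": f"""分析用户输入中包含的所有意图（可能有多个）。

可选意图：
{intent_desc}

返回 JSON 对象：
{{"intents": [{{"intent": "名称", "confidence": 0.0-1.0, "sub_query": "对应的子查询"}}]}}""",
                },
                {"role": "user", "content": text},
            ],
            temperature=0,
            response_format={"type": "json_object"},
        )

        import json
        data = json.loads(response.choices[0].message.content)
        # json_object mode is meant for a top-level object, so the prompt asks for
        # {"intents": [...]}; a bare list or a single object is still tolerated.
        items = data if isinstance(data, list) else data.get("intents", [data])

        # Note: sub_query is requested but not stored; extend IntentResult to keep it.
        return [
            IntentResult(intent=item["intent"], confidence=item.get("confidence", 0.8))
            for item in items
        ]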
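Once several intents have been detected, each sub-query still has to reach its own handler. One way to do that, sketched below under the assumption that handlers are async and independent of each other, is to run them concurrently with `asyncio.gather`. The handler functions and the `dispatch_multi` helper are hypothetical stubs, and the `detected` list mimics the `intent` / `sub_query` shape requested in the prompt.

```python
import asyncio
from typing import Awaitable, Callable

Handler = Callable[[str], Awaitable[str]]


async def knowledge_handler(sub_query: str) -> str:
    return f"[knowledge] {sub_query}"  # stub: a real handler would run RAG


async def data_handler(sub_query: str) -> str:
    return f"[data] {sub_query}"  # stub: a real handler would call the data API


HANDLERS: dict[str, Handler] = {
    "knowledge_query": knowledge_handler,
    "data_query": data_handler,
}


async def dispatch_multi(items: list[dict]) -> list[str]:
    """Run one handler per detected intent concurrently, keeping the original order."""
    tasks = [
        HANDLERS[item["intent"]](item["sub_query"])
        for item in items
        if item["intent"] in HANDLERS  # intents without a handler are skipped
    ]
    return list(await asyncio.gather(*tasks))


detected = [
    {"intent": "knowledge_query", "sub_query": "return process"},
    {"intent": "data_query", "sub_query": "last month's return rate"},
]
answers = asyncio.run(dispatch_multi(detected))
print(answers)
```

`asyncio.gather` returns results in the order the awaitables were passed, so the answers line up with the order of the detected intents and can be merged into one reply.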

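Ambiguous intent

The user's wording may sit between two intents:

"The return rate is too high, what should we do?"
→ data_query (look up the return-rate data)? Or knowledge_query (look up ways to reduce the return rate)?

Handling strategy: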

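from typing import Optional


def handle_ambiguous(results: list[IntentResult], threshold: float = 0.15) -> Optional[str]:
    """Return a clarifying question when the top two intents are too close, else None."""
    if len(results) < 2:
        return None

    top_two = sorted(results, key=lambda r: r.confidence, reverse=True)[:2]
    gap = top_two[0].confidence - top_two[1].confidence

    if gap < threshold:
        # describe_intent maps an intent name to a user-facing phrase (defined elsewhere).
        # The question reads: "Do you want to <A>, or <B>?"
        return f"你是想{describe_intent(top_two[0].intent)}，还是{describe_intent(top_two[1].intent)}？"

    return None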

When the confidence gap between the top two intents is small, ask the user instead of guessing.
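As a self-contained illustration of the gap check, the sketch below fills in a hypothetical `describe_intent` with made-up English labels and works on a plain dict of scores instead of `IntentResult` objects. The names `INTENT_LABELS` and `clarifying_question` are assumptions for this example only.

```python
from typing import Optional

# Hypothetical human-readable labels; the wording is illustrative.
INTENT_LABELS = {
    "knowledge_query": "look up how something works",
    "data_query": "see the numbers",
}


def describe_intent(intent: str) -> str:
    # Fall back to the raw intent name when no label is configured.
    return INTENT_LABELS.get(intent, intent)


def clarifying_question(scores: dict[str, float], threshold: float = 0.15) -> Optional[str]:
    """Return a follow-up question when the top two scores are within `threshold`."""
    if len(scores) < 2:
        return None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (first, s1), (second, s2) = ranked[:2]
    if s1 - s2 < threshold:
        return f"Do you want to {describe_intent(first)}, or {describe_intent(second)}?"
    return None


print(clarifying_question({"data_query": 0.62, "knowledge_query": 0.58}))
# Do you want to see the numbers, or look up how something works?
print(clarifying_question({"data_query": 0.90, "knowledge_query": 0.40}))
# None
```

The 0.15 threshold is carried over from the function above. A sensible value depends on how your classifier's scores are distributed, so treat it as a starting point to tune.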

Relationship with Function Calling

OpenAI's Function Calling (tool calling) is essentially intent recognition too: the model decides which function to call based on the user input.

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",
            "description": "搜索公司知识库获取流程、制度等信息",
            "parameters": {"type": "object", "properties": {"query": {"type": "string"}}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "query_sales_data",
            "description": "查询销售数据和统计报表",
            "parameters": {"type": "object", "properties": {"metric": {"type": "string"}, "period": {"type": "string"}}},
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "上个月销售额多少"}],
    tools=tools,
)
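To use such a response as an intent signal, you read the tool call off the assistant message: in the Chat Completions API, `response.choices[0].message.tool_calls` is a list (or `None`), and each entry carries `function.name` plus `function.arguments` as a JSON string. The sketch below runs without an API key by using a stand-in object with that shape. `TOOL_TO_INTENT`, `intent_from_tool_calls`, and the `"no_tool"` label are hypothetical names introduced for illustration.

```python
import json
from types import SimpleNamespace

# Hypothetical mapping from tool names to the intents used in this article.
TOOL_TO_INTENT = {
    "search_knowledge_base": "knowledge_query",
    "query_sales_data": "data_query",
}


def intent_from_tool_calls(message) -> tuple[str, dict]:
    """Map the first tool call on an assistant message to (intent, arguments)."""
    tool_calls = getattr(message, "tool_calls", None)
    if not tool_calls:
        # The model answered in plain text, i.e. it chose no tool.
        return "no_tool", {}
    call = tool_calls[0]
    arguments = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    return TOOL_TO_INTENT.get(call.function.name, "unknown"), arguments


# Stand-in object that mimics the shape of response.choices[0].message.
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            function=SimpleNamespace(
                name="query_sales_data",
                arguments='{"metric": "sales", "period": "last_month"}',
            )
        )
    ]
)
print(intent_from_tool_calls(fake_message))
# ('data_query', {'metric': 'sales', 'period': 'last_month'})
```

This is the "one step" the comparison refers to: the same call yields both the route (`query_sales_data`) and the extracted parameters.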

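When should you use Function Calling instead of doing intent recognition yourself?

Dimension	In-house intent recognition	Function Calling
Latency	Can be < 50ms	300ms+ (an LLM call is required)
Cost	Semantic routing is nearly free	Every call consumes tokens
Flexibility	Fully under your control	Bounded by the LLM's capability
Best for	High-performance routing layer	Tool selection + parameter extraction in one step

My suggestion: use in-house intent recognition as the first routing layer (fast, cheap), and bring in Function Calling only when you need parameter extraction.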

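Summary

Intent recognition is the routing layer of an AI application. Without it, all requests are handled in one undifferentiated pile, and both quality and efficiency suffer.
Three levels build on each other: keywords (fast) → semantic routing (accurate) → LLM classification (comprehensive). Combining them works best.
Semantic routing is the sweet spot: adding an intent only takes example sentences, with no changes to classifier logic or prompts, and it typically completes within about 50ms.
Multi-intent and ambiguous intent need handling: one sentence may carry several intents, and when confidences are close you should ask the user.
Function Calling is not a cure-all: latency and cost are higher. It suits scenarios that need parameter extraction, not lightweight routing.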

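The next article returns to hands-on engineering: structured output. AI returns free text, so how does the frontend parse it reliably? What do you do when the JSON keeps coming back malformed?

Coming next: 08 | Structured Output: Getting AI to Return JSON Reliably

Discussion: Does your AI application do intent recognition? Which approach do you use? Have you run into user input so far off the wall that intent detection got it wrong? Let's talk in the comments.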