Question about OpenAI GPT-series models intermittently returning malformed JSON when using LangChain ChatOpenAI.with_structured_output

阿华AIGC实验室

2026-4-27

Answer

Yes, this is a known intermittent issue with OpenAI's structured output capabilities, especially on their smaller models like the mini/nano variants you mentioned. The problem often stems from unexpected generation truncation, internal model processing glitches, or occasional failure to adhere strictly to JSON schema requirements.

Here are several actionable fixes to resolve this:

1. Implement Retry Logic with Error Feedback

LangChain provides a RetryWithErrorOutputParser that automatically re-sends the request to the LLM when validation fails, along with the error message to guide the model to correct its output. This is one of the most effective solutions for intermittent JSON issues.

from langchain.output_parsers import RetryWithErrorOutputParser, PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from your_module import RouteDecision  # Import your Pydantic model

# Initialize the base LLM and parser
llm = ChatOpenAI(temperature=0, model="gpt-5-mini", timeout=15, max_retries=2)
parser = PydanticOutputParser(pydantic_object=RouteDecision)

# Set up the retry parser with error feedback
retry_parser = RetryWithErrorOutputParser.from_llm(
    llm=llm,
    parser=parser,
)

# Create a prompt that explicitly includes schema instructions
prompt = PromptTemplate(
    template="""Classify the user's intent into the required JSON format.
Strictly follow these format instructions:
{format_instructions}
Important: Return ONLY a valid, fully closed JSON object with no extra whitespace, newlines, or tabs.
User message: {message}""",
    input_variables=["message"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

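# Format the prompt, call the model, then parse with the prompt as context.
# RetryWithErrorOutputParser cannot be piped directly after the LLM because it
# does not implement plain parse(); it needs the original prompt, which is
# passed through parse_with_prompt().
prompt_value = prompt.format_prompt(message=your_user_message)
completion = llm.invoke(prompt_value)
try:
    decision = retry_parser.parse_with_prompt(completion.content, prompt_value)
except Exception as e:
    # Fallback logic if retries fail
    decision = None
    print(f"Failed after retries: {e}")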
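If you want to see what "retry with error feedback" amounts to without depending on a particular parser class, here is a minimal, library-agnostic sketch. It is not LangChain's implementation: `parse_with_feedback`, the `generate` callable, and the `route`/`confidence` keys are all hypothetical names used for illustration, and validation is reduced to a required-keys check rather than a full Pydantic model.

```python
import json
from typing import Callable, Optional


def parse_with_feedback(
    generate: Callable[[str], str],
    prompt: str,
    required_keys: set,
    max_attempts: int = 3,
) -> Optional[dict]:
    """Call `generate` until it returns a JSON object containing `required_keys`.

    `generate` is a hypothetical stand-in for any function that sends a prompt
    to a model and returns its raw text reply.
    """
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = generate(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level JSON value is not an object")
            missing = required_keys - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            # Feed the bad reply and the error back so the next attempt can correct it
            current_prompt = (
                f"{prompt}\n\nYour previous reply was:\n{raw}\n"
                f"It failed validation with: {err}\n"
                "Reply again with only a corrected JSON object."
            )
    return None  # every attempt failed; the caller decides on a fallback


# Demo with a fake model: the first reply is cut off, the second is valid
_replies = iter(['{"route": "billing"', '{"route": "billing", "confidence": 0.9}'])
result = parse_with_feedback(
    lambda _prompt: next(_replies),
    "Classify: refund request",
    {"route", "confidence"},
)
print(result)
```

Returning None instead of raising keeps the fallback decision with the caller, which suits a routing step where a default route is usually acceptable.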

2. Enforce Strict JSON Requirements in the Prompt

Explicitly reinforce in your prompt that the model must return a complete, syntactically correct JSON object with no trailing whitespace. Small models are more prone to cutting corners on these details, so clear, direct instructions help:

Add these lines to your prompt template:

"You must return a single, valid JSON object that fully matches the provided schema. Do NOT include any extra text, comments, whitespace, newlines, or tabs outside the JSON structure. Ensure the JSON is properly closed with a } at the end."

3. Manual JSON Cleaning and Repair

For cases where retries aren't sufficient, you can add a pre-validation step to clean the raw output and attempt to fix minor structural issues:

import json

def sanitize_raw_response(raw_response):
    # Trim only the surrounding whitespace. json.loads already ignores whitespace
    # between tokens, and collapsing it everywhere could alter string values.
    cleaned = raw_response.strip()
    # Discard any stray text before the first '{' or after the last '}'
    start = cleaned.find('{')
    end = cleaned.rfind('}')
    if start != -1 and end > start:
        cleaned = cleaned[start:end + 1]
    # A reply that was cut off mid-object cannot be repaired reliably by
    # appending '}', so it is returned as-is and left to fail parsing.
    return cleaned

# Usage example
raw_output = llm.invoke(your_messages).content
sanitized_output = sanitize_raw_response(raw_output)
try:
    decision = RouteDecision(**json.loads(sanitized_output))
except ValueError:
    # Covers json.JSONDecodeError and Pydantic's ValidationError (both subclass ValueError)
    # Trigger retry or fallback logic here
    decision = None
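Before attempting any repair, it can help to tell a reply that was cut off from one that is merely decorated with extra text. The sketch below is one way to do that, using only the standard library; `looks_truncated` is a hypothetical helper name, and it only checks brace/bracket balance and open strings, not full JSON validity.

```python
def looks_truncated(text: str) -> bool:
    """Return True if `text` ends inside a string or with unclosed braces/brackets."""
    depth = 0          # nesting level of { } and [ ]
    in_string = False  # currently inside a JSON string literal
    escaped = False    # previous character was a backslash inside a string
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1
    return in_string or depth > 0
```

If this returns True, cleaning is unlikely to help and a retry is the safer path; if it returns False but parsing still fails, the cleaning step is worth trying.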

4. Switch to a More Stable Model

Smaller models like gpt-5-mini and gpt-5.4-nano are optimized for speed and cost rather than maximum reliability. If consistency is critical, consider switching to GPT-4o or GPT-4 Turbo; these models tend to produce fewer structured output glitches, though they come with higher API costs.

5. Adjust Timeout Settings

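Your current timeout=15 may be too short for some generation tasks. Note that a client-side timeout normally surfaces as a raised error (followed by a retry, given max_retries=2) rather than as truncated JSON, so if you are seeing cut-off output, also check whether the response is hitting the output token limit. If requests are timing out, try increasing the timeout to 30 seconds to give the model enough time to complete the JSON: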

llm = ChatOpenAI(temperature=0, model="gpt-5-mini", timeout=30, max_retries=2)
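To check whether cut-off replies come from a token limit rather than from the timeout, you can look at the finish reason reported with each response. The sketch below assumes the message object exposes an OpenAI-style "finish_reason" key in its response_metadata dict, as langchain_openai messages typically do; `diagnose_reply` and its return labels are hypothetical names for illustration.

```python
def diagnose_reply(content: str, response_metadata: dict) -> str:
    """Classify a reply as 'truncated', 'filtered', 'empty', or 'ok'.

    Assumes `response_metadata` carries an OpenAI-style "finish_reason" key.
    """
    reason = response_metadata.get("finish_reason")
    if reason == "length":
        # The model stopped because it hit the output token limit
        return "truncated"
    if reason == "content_filter":
        return "filtered"
    if not content.strip():
        return "empty"
    return "ok"
```

With a LangChain message this would be called as diagnose_reply(msg.content, msg.response_metadata). A "truncated" result points at the token limit, which a longer timeout would not fix.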

The question in this post originates from Stack Exchange; it was asked by Gevv.
