Qwen2.5-7B fine-tuned with Unsloth loops forever in Ollama but works in Transformers

阿华AIGC实验室

2026-6-12

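Fine-tuned Qwen2.5-7B converted to GGUF loops indefinitely when served through Ollama

Problem overview

After fine-tuning Qwen2.5-7B with Unsloth, generation through the Transformers library works as expected. Once the model is converted to Q8_0 GGUF and served through Ollama, however, it produces its answer and then keeps generating without ever stopping. The same pipeline works without issue for Mistral-v0.3 and Llama-3.1; only Qwen2.5 shows the problem.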

Environment

Base model: unsloth/Qwen2.5-7B
Fine-tuning template: Alpaca
Quantization format: Q8_0 GGUF
Deployment: Ollama

Working baseline

The following Transformers code generates a sensible response:

from transformers import TextStreamer

# tokenizer, model and alpaca_prompt come from the fine-tuning session.
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Continue the fibonacci sequence.",  # instruction
            "1, 1, 2, 3, 5, 8",                  # input
            "",                                  # output: left blank for generation
        )
    ],
    return_tensors="pt",
).to("cuda")

# Stream tokens to stdout as they are generated.
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128)

Failing scenario

After conversion to GGUF, running the model in Ollama with the following Modelfile produces the infinite loop:

FROM /home/ilab/Desktop/ollama_model/unsloth.Q8_0.gguf

TEMPLATE """{{ if .System }}{{ .System }}{{ else }}Below are some instructions that describe some tasks. Write responses that appropriately complete each request.{{ end }}

USER: {{ .Prompt }}

ASSISTANT: {{ .Response }}{{ if .Response }}<eos>{{ end }}"""

PARAMETER stop "[toxicity=0]"
PARAMETER stop "[@BOS@]"
PARAMETER stop "<eos>"
PARAMETER stop "<unused"
PARAMETER stop "　"
PARAMETER stop "　"
PARAMETER stop "　"
PARAMETER stop "　"
PARAMETER temperature 1.5
PARAMETER min_p 0.1

SYSTEM "Below are some instructions that describe some tasks. Write responses that appropriately complete each request."

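Cause and fix

1. The stop strings do not match Qwen2.5's native tokens

The EOS token of the Qwen2.5 base model used here is <|endoftext|>, not the <eos> written in the Modelfile. The model never emits <eos>, so that stop string never fires. [toxicity=0], [@BOS@] and <unused likewise match nothing that Qwen2.5 produces, and the repeated full-width-space entries are redundant and can cut off output that legitimately contains that character.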
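A quick way to catch this kind of mismatch is to compare the stop strings in a Modelfile against the tokenizer's EOS token. The sketch below is only illustrative: `stop_strings` and `check_stops` are hypothetical helpers, the two Modelfile excerpts are shortened, and the EOS value is hard-coded here on the assumption that it equals what `tokenizer.eos_token` reports for the fine-tuned model.

```python
import re

def stop_strings(modelfile_text):
    """Return the value of every `PARAMETER stop "..."` line in a Modelfile."""
    pattern = re.compile(r'^\s*PARAMETER\s+stop\s+"(.*)"\s*$', re.MULTILINE)
    return pattern.findall(modelfile_text)

def check_stops(modelfile_text, eos_token):
    """Report whether the EOS token is a stop string and list duplicated stops."""
    stops = stop_strings(modelfile_text)
    return {
        "has_eos": eos_token in stops,
        "duplicates": sorted({s for s in stops if stops.count(s) > 1}),
    }

# Assumption: in practice this value would be read from tokenizer.eos_token.
EOS_TOKEN = "<|endoftext|>"

# Shortened excerpts of the two Modelfiles (\u3000 is the full-width space).
original = '''PARAMETER stop "[toxicity=0]"
PARAMETER stop "<eos>"
PARAMETER stop "\u3000"
PARAMETER stop "\u3000"
'''
fixed = '''PARAMETER stop "<|endoftext|>"
PARAMETER stop "###"
'''

original_report = check_stops(original, EOS_TOKEN)
fixed_report = check_stops(fixed, EOS_TOKEN)
print(original_report)
print(fixed_report)
```

For the original excerpt the report flags a missing EOS stop and a duplicated entry; for the fixed excerpt it flags neither.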

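2. The template does not match the Alpaca format used for fine-tuning

Fine-tuning used the Alpaca instruction format (with markers such as ### Instruction: and ### Response:), but the Ollama template uses a USER:/ASSISTANT: chat layout. The model therefore sees a prompt shape it was never trained on and has no learned cue for where its response should end.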
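To see why the layout matters, it helps to print what the model saw during training next to what it is asked to complete at inference time. The sketch below assumes the widely used Alpaca layout; the post does not show the exact `alpaca_prompt` string, so `ALPACA_PROMPT`, `training_text` and `inference_prompt` are illustrative stand-ins, and the sample output "13, 21, 34" is made up for the example.

```python
# Assumption: this mirrors the common Alpaca layout, not necessarily the exact
# alpaca_prompt string used in the fine-tuning run described in this post.
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = "<|endoftext|>"  # assumption: the value of tokenizer.eos_token

def training_text(instruction, input_text, output):
    """Text of one training example: filled template followed by the EOS token."""
    return ALPACA_PROMPT.format(instruction, input_text, output) + EOS_TOKEN

def inference_prompt(instruction, input_text):
    """Prompt at inference time: same template with the response left empty."""
    return ALPACA_PROMPT.format(instruction, input_text, "")

train = training_text("Continue the fibonacci sequence.", "1, 1, 2, 3, 5, 8", "13, 21, 34")
prompt = inference_prompt("Continue the fibonacci sequence.", "1, 1, 2, 3, 5, 8")

print(train)
print("-" * 40)
print(prompt)
```

If the training examples end with the EOS token right after the response, as in this sketch, the model learns to emit it only after text that follows a "### Response:" marker. A prompt ending in "ASSISTANT:" gives it no such context.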

Corrected Modelfile

FROM /home/ilab/Desktop/ollama_model/unsloth.Q8_0.gguf

TEMPLATE """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.

### Instruction:
{{ .Prompt }}

### Response:
{{ .Response }}{{ if .Response }}<|endoftext|>{{ end }}"""

PARAMETER stop "<|endoftext|>"
PARAMETER stop "###"
PARAMETER temperature 1.5
PARAMETER min_p 0.1

SYSTEM "Below are some instructions that describe some tasks. Write responses that appropriately complete each request."
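One way to check that the corrected Modelfile really ends generation is to call Ollama's local HTTP API with a capped token budget and look at why generation finished. The following is a minimal sketch, not a definitive test harness: it assumes Ollama is listening on its default port, that the response carries a `done_reason` field that reads "stop" when a stop condition fired and "length" when the `num_predict` cap was reached, and that the model was registered under the hypothetical name `qwen25-alpaca` with `ollama create`.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt, max_tokens=128):
    """Build a non-streaming generate request with a hard cap on new tokens."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": max_tokens},
    }

def stopped_on_its_own(reply):
    """True if generation ended on a stop condition rather than the token cap.

    Assumption: done_reason is "stop" for EOS or a stop string, "length" for the cap.
    """
    return reply.get("done_reason") == "stop"

def generate(model, prompt, max_tokens=128):
    """Send the request to a running Ollama server and return the parsed reply."""
    data = json.dumps(build_payload(model, prompt, max_tokens)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `reply = generate("qwen25-alpaca", "Continue the fibonacci sequence: 1, 1, 2, 3, 5, 8")` followed by `stopped_on_its_own(reply)` should return True once the stop strings and template are right. A model that is still looping would be expected to run into the cap instead.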