Separating Multiple-Choice Question Stems from Options with Regular Expressions in Python 3

阿华AIGC实验室

2026-4-30

A few tricks for separating a multiple-choice question's stem from its options

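Hey, let me help you sort this out. Your earlier regex gave unexpected results mainly because of greedy matching: .* grabs the longest possible match, so the captured stem comes out wrong. Here are two reliable ways to fix it: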
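To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The original failing regex is not shown in this post, so the greedy pattern below is a hypothetical stand-in chosen only to show the effect; `text`, `greedy`, and `lazy` are illustrative names.

```python
import re

text = "1 which season do you like best after looking at these pictures A spring B summer C autumn D winter E none"

# Hypothetical greedy pattern (an assumption, not the original regex):
# .* runs to the end of the string, then backtracks only as far as the
# LAST standalone capital letter, so the group swallows most of the options.
greedy = re.search(r'1\s*(.*)\s[A-Z]\s', text).group(1)

# The same pattern with .*? stops at the FIRST standalone capital letter.
lazy = re.search(r'1\s*(.*?)\s[A-Z]\s', text).group(1)

print(greedy)  # stem followed by options A to D
print(lazy)    # stem only
```

The only change between the two patterns is the extra ? after .*, which is why the quantifier is the first thing to check when a capture group grabs the wrong amount of text.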

Solution 1: Fix the regex with non-greedy matching

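Change .* in the regex to .*? (non-greedy mode). It then matches only up to the first option letter (A) instead of grabbing too much. Pair it with a second regex that extracts the text of every option: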

import re
newstr = "1 which season do you like best after looking at these pictures A spring B summer C autumn D winter E none"
# Capture the stem: skip the leading question number, then take everything before the first standalone A
stem_match = re.search(r'\d+\s*(.*?)\s+A\s', newstr)
str1 = stem_match.group(1).strip() if stem_match else ""

# Extract all options: the text that follows each capital letter
# (this assumes the stem itself contains no capital letters)
options = re.findall(r'[A-Z]\s*(.*?)\s*(?=[A-Z]|$)', newstr)
str2 = ' '.join(options)

print("Stem:", str1)
print("Options:", str2)

Running it gives the result you want:

Stem: which season do you like best after looking at these pictures
Options: spring summer autumn winter none
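The option-extraction pattern above treats every capital letter as an option label, so it only works when the stem is all lowercase, as in the sample. A slightly more defensive sketch follows, assuming option labels are single capital letters surrounded by whitespace and the question sits on one line. `split_question` is a hypothetical helper name, not something from the original code.

```python
import re

def split_question(text):
    """Hypothetical helper: return (stem, options), or None if no option label is found."""
    text = text.strip()
    # Optional leading question number, then a lazy stem that ends right
    # before the first standalone capital letter (letter followed by whitespace).
    m = re.match(r'(?:\d+\s+)?(.*?)\s+(?=[A-Z]\s)', text)
    if m is None:
        return None
    stem = m.group(1)
    # Each option: a standalone capital letter, then text up to the next
    # standalone capital letter or the end of the string.
    options = re.findall(r'(?:^|\s)[A-Z]\s+(.*?)(?=\s+[A-Z]\s|$)', text[m.end():])
    return stem, options

print(split_question("12 Which season do you like best A spring B summer C autumn D winter"))
```

This still misfires when the stem contains a standalone capital word such as "I", so treat it as a starting point rather than a complete parser.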

Solution 2: Split at the option prefix (more intuitive)

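If every option in your questions starts with a capital letter (A/B/C...), splitting the string into a stem part and an options part is easier to follow and needs no elaborate capture groups: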

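import re
newstr = "1 which season do you like best after looking at these pictures A spring B summer C autumn D winter E none"

# Find the first option letter; its position is the split point
# (fall back to the end of the string if no option letter is found)
split_match = re.search(r'\s+[A-Z]\s', newstr)
split_point = split_match.start() if split_match else len(newstr)
# Extract the stem: drop the leading question number, then trim extra whitespace
str1 = re.sub(r'^\d+\s*', '', newstr[:split_point]).strip()
# Handle the options part: split out the text of each option
option_section = newstr[split_point:].strip()
options = [opt.strip() for opt in re.split(r'[A-Z]\s*', option_section) if opt]
str2 = ' '.join(options)

print(str1)
print(str2)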
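The split above throws away the option letters and would also split on a capital letter inside an option's text. If you want to keep each letter paired with its text, one possible variation is to put a capture group in the re.split pattern, since re.split returns captured delimiters along with the pieces. This is a sketch under the assumption that labels are standalone capital letters; `split_with_labels` is a hypothetical name.

```python
import re

def split_with_labels(text):
    """Hypothetical helper: return (stem, {label: option_text})."""
    # The capture group makes re.split keep the option letters, giving
    # [head, 'A', text_a, 'B', text_b, ...].
    parts = re.split(r'\s+([A-Z])\s+', text.strip())
    # The head is the stem, possibly with a leading question number.
    stem = re.sub(r'^\d+\s*', '', parts[0])
    # Pair each label (odd indexes) with the text that follows it (even indexes).
    options = dict(zip(parts[1::2], parts[2::2]))
    return stem, options

newstr = "1 which season do you like best after looking at these pictures A spring B summer C autumn D winter E none"
stem, options = split_with_labels(newstr)
print(stem)
print(options)
```

When the string contains no option labels, the function returns the whole text as the stem and an empty dict, so no extra None check is needed.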