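Seeking a solution for accurately recognizing distorted 4-digit numeric captchas in Python + Selenium automation
Solutions for accurately recognizing distorted 4-digit numeric captchas
1. Preprocessing optimization (improving OCR accuracy)
Low pytesseract accuracy is mostly caused by a lack of targeted preprocessing. First clean up the image with OpenCV: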
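```python
import cv2
import numpy as np
import pytesseract


def preprocess_captcha(img_path):
    # Read the image and convert it to grayscale
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {img_path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Adaptive thresholding copes with uneven lighting across the distorted
    # digits; THRESH_BINARY_INV makes the digits white on a black background
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 11, 2)
    # Morphological opening (erode, then dilate) removes small noise
    kernel = np.ones((2, 2), np.uint8)
    denoised = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # Dilate once more to thicken the digit strokes
    denoised = cv2.dilate(denoised, kernel, iterations=1)
    return denoised


# Run pytesseract on the preprocessed image, restricted to digits
processed_img = preprocess_captcha("captcha.png")
# Tesseract generally works best with dark text on a light background,
# so invert the white-on-black image before recognition
result = pytesseract.image_to_string(
    cv2.bitwise_not(processed_img),
    config="--psm 8 --oem 3 -c tessedit_char_whitelist=0123456789",
)
print(result.strip())
```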
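Note: --psm 8 tells Tesseract to treat the image as a single word, which suits recognizing the 4 digits as one unit. Restricting the character set with tessedit_char_whitelist greatly reduces misreads.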
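Even with a whitelist, OCR output can contain stray whitespace or the wrong number of characters. A minimal sketch of a post-check follows; the helper name clean_ocr_result is hypothetical and not part of pytesseract:

```python
import re
from typing import Optional


def clean_ocr_result(raw_text: str, length: int = 4) -> Optional[str]:
    """Keep only ASCII digits; return None unless exactly `length` remain."""
    digits = re.sub(r"[^0-9]", "", raw_text)
    return digits if len(digits) == length else None
```

A None result is a signal to refresh the captcha and try again instead of submitting a guess.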
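2. Use a dedicated captcha recognition tool: ddddocr
ddddocr is optimized for the captchas commonly seen on Chinese websites and can recognize distorted digits without complex preprocessing: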
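```python
import ddddocr
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("target page URL")

# Screenshot the captcha element and save it
# (find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH)
captcha_elem = driver.find_element(By.XPATH, "XPath of the captcha element")
captcha_elem.screenshot("captcha.png")

# Recognize the captcha with ddddocr
ocr = ddddocr.DdddOcr()
with open("captcha.png", "rb") as f:
    captcha_bytes = f.read()
result = ocr.classification(captcha_bytes)
print(result)
```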
Install with: pip install ddddocr
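A general-purpose recognizer may occasionally return non-digit characters or the wrong length, so it can help to validate the result and retry. The sketch below is only an illustration: recognize_with_retry and its fetch_image argument are hypothetical names, and fetch_image is assumed to refresh the captcha and return fresh image bytes on each call.

```python
from typing import Callable, Optional


def recognize_with_retry(fetch_image: Callable[[], bytes],
                         recognize: Callable[[bytes], str],
                         max_attempts: int = 3,
                         length: int = 4) -> Optional[str]:
    """Return the first result that is exactly `length` ASCII digits, else None."""
    for _ in range(max_attempts):
        text = recognize(fetch_image()).strip()
        if len(text) == length and text.isascii() and text.isdigit():
            return text
    return None
```

With the script above, ocr.classification could be passed as the recognize argument.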
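3. Train a custom CNN model (highest accuracy, suited to captchas with a fixed style)
If the captcha style is fixed, a custom model is the most accurate option: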
- Generate training data: use the captcha library to generate samples that match the target style:
```python
import os
import random

from captcha.image import ImageCaptcha

os.makedirs("train_data", exist_ok=True)

# Create the generator once and reuse it for every sample
img_gen = ImageCaptcha(width=120, height=60)

# Generate 10000 labeled training images; the label is stored in the file name
for i in range(10000):
    code = "".join(random.choices("0123456789", k=4))
    img_gen.write(code, f"train_data/{code}_{i}.png")
```
- Build the CNN model: design it for the 4-digit classification task:
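```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, Dense, Dropout, Flatten,
                                     MaxPooling2D, Reshape, Softmax)

model = Sequential([
    # Input: 60x120 grayscale image (convert the generated images to grayscale first)
    Conv2D(32, (3, 3), activation="relu", input_shape=(60, 120, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(256, activation="relu"),
    Dropout(0.5),
    # Output: a 10-way classification for each of the 4 digits.
    # One softmax over all 40 units would mix the digits together,
    # so reshape to (4, 10) and apply softmax per digit.
    Dense(4 * 10),
    Reshape((4, 10)),
    Softmax(axis=-1),
])

# Labels are integer arrays of shape (batch, 4), one class index per digit
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```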
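- After training and saving the model, load it in the Selenium script for prediction. Provided the training samples closely resemble the target captcha, recognition accuracy can exceed 99%.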
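The steps above need labels for training and a way to turn a prediction back into text. The following is a minimal sketch under two assumptions: file names follow the {code}_{i}.png pattern from the generation script, and the prediction for one image is arranged as 4 rows of 10 class probabilities. Both helper names are hypothetical.

```python
import os
from typing import List, Sequence


def label_from_filename(path: str) -> List[int]:
    """Extract the digit labels from a name like 'train_data/0427_15.png'."""
    code = os.path.basename(path).split("_")[0]
    return [int(ch) for ch in code]


def decode_prediction(probabilities: Sequence[Sequence[float]]) -> str:
    """Pick the most probable class in each row and join them into a string."""
    return "".join(
        str(max(range(len(row)), key=lambda i: row[i]))
        for row in probabilities
    )
```

For a batch returned by model.predict, decode_prediction would be applied to each image's rows in turn.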
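4. Template matching (suited to lightly distorted captchas)
If the digits are only slightly distorted, you can create templates for the digits 0-9 and recognize each digit by its matching similarity: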
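```python
import cv2
import numpy as np


def match_captcha(processed_img, templates):
    # Split the captcha into 4 digit regions (assumes width 120, 30 pixels per digit)
    digit_parts = [processed_img[:, i * 30:(i + 1) * 30] for i in range(4)]
    result = ""
    for part in digit_parts:
        # TM_CCOEFF_NORMED scores lie in [-1, 1], so start below that range
        max_score = -2.0
        best_digit = ""
        for digit, template in templates.items():
            # Each template must be no larger than the digit region
            res = cv2.matchTemplate(part, template, cv2.TM_CCOEFF_NORMED)
            current_score = np.max(res)
            if current_score > max_score:
                max_score = current_score
                best_digit = digit
        result += best_digit
    return result


# Load the 0-9 digit templates as grayscale images; they should use the same
# polarity (white digits on black) as the preprocessed captcha
templates = {str(i): cv2.imread(f"templates/{i}.png", cv2.IMREAD_GRAYSCALE)
             for i in range(10)}
# Pass in the preprocessed captcha image (preprocess_captcha is defined in section 1)
processed_img = preprocess_captcha("captcha.png")
print(match_captcha(processed_img, templates))
```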
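The fixed 30-pixel split above only holds for a 120-pixel-wide image. As a small sketch, the column bounds can be derived from the actual image width instead; digit_slices is a hypothetical helper and still assumes the digits are evenly spaced:

```python
from typing import List, Tuple


def digit_slices(width: int, count: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) column bounds that split `width` into `count` parts."""
    return [(i * width // count, (i + 1) * width // count) for i in range(count)]
```

Each pair can then be used as processed_img[:, start:end] in place of the hard-coded slice.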
The question comes from Stack Exchange; it was asked by abyss.




