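Technical question: OpenCV multithreading gives no speedup
I have been checking how much OpenCV's multithreading speeds things up, and I found that no matter how I change the thread count, the image-processing time stays almost the same, as if multithreading had no effect at all. Below are my environment, test code, and results. I would appreciate help working out what is going wrong.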
1. Environment
First, my system and OpenCV version:
~$ mamba list | grep opencv
libopencv    4.12.0    qt6_py312h322f462_605    conda-forge
opencv       4.12.0    qt6_py312h7bb6282_605    conda-forge
py-opencv    4.12.0    qt6_py312h598be00_605    conda-forge
~$ nproc
64
The machine has a 64-core CPU, and OpenCV 4.12.0 was installed from conda-forge with mamba.
2. OpenCV parallel framework and thread configuration
I first checked OpenCV's build information to confirm that the parallel framework is OpenMP, and also looked at the thread-related settings:
import cv2

for line in cv2.getBuildInformation().split("\n"):
    if "Parallel framework" in line:
        print(line.strip())

threads = cv2.getNumThreads()
cpus = cv2.getNumberOfCPUs()
print(f"thread of cpu: {threads}/{cpus}")
Output:
Parallel framework: OpenMP
thread of cpu: 8/64
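Since getNumThreads() reports 8 on a 64-core machine, one further check that might be relevant is whether the environment itself limits the thread count. The sketch below was not part of the original test. It uses only the Python standard library and assumes the OpenMP runtime honours the standard OMP_* environment variables. The helper name thread_environment is made up for illustration, and whether any of these variables is actually set on this machine is unknown.

```python
import os

# Environment variables that commonly influence the thread count of
# OpenMP-based libraries. Whether any is set here is an assumption to verify.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OMP_THREAD_LIMIT", "OMP_DYNAMIC", "OMP_PROC_BIND")


def thread_environment():
    """Collect settings that may limit how many threads this process gets."""
    info = {name: os.environ.get(name) for name in THREAD_ENV_VARS}
    info["os.cpu_count"] = os.cpu_count()
    # sched_getaffinity exists only on some platforms (e.g. Linux); it reports
    # the CPUs this process is actually allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        info["usable_cpus"] = len(os.sched_getaffinity(0))
    else:
        info["usable_cpus"] = None
    return info


if __name__ == "__main__":
    for key, value in thread_environment().items():
        print(f"{key}: {value}")
```

If usable_cpus or one of the OMP_* values turned out to be 8, that would at least be consistent with the 8/64 shown above, although it would not by itself explain the flat timings.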
3. Test code
To test the multithreading behaviour, I wrote a simple script that measures how long Box Filter and Gaussian Blur take with different thread counts:
import time

import numpy as np
import cv2

if __name__ == "__main__":
    # Print the parallel framework and thread configuration
    for line in cv2.getBuildInformation().split("\n"):
        if "Parallel framework" in line:
            print(line.strip())
    threads = cv2.getNumThreads()
    cpus = cv2.getNumberOfCPUs()
    print(f"thread of cpu: {threads}/{cpus}")

    # Generate a random test image (float64 values in [0, 1])
    image = np.random.randint(0, 256, (4000, 3000))/255
    ksize = (51, 51)
    count = 1  # single run per setting

    # Test Box Filter
    print("Box filter")
    for i in [0, 1, 2, 4, 8]:
        cv2.setNumThreads(i)
        print(f"thread: {cv2.getNumThreads()}", end=": ")
        t1 = time.time()
        for _ in range(count):
            cv2.boxFilter(image, cv2.CV_32F, ksize)
        d1 = time.time() - t1
        print(f"consumed {int(d1*1000)}ms")

    # Test Gaussian Blur
    print("Gaussian blur")
    for i in [0, 1, 2, 4, 8]:
        cv2.setNumThreads(i)
        print(f"thread: {cv2.getNumThreads()}", end=": ")
        t1 = time.time()
        for _ in range(count):
            cv2.GaussianBlur(image, ksize, 0)
        d1 = time.time() - t1
        print(f"consumed {int(d1*1000)}ms")
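4. Results
Running the script gives the timings below. The first line of each group is the cv2.setNumThreads(0) run, for which getNumThreads() reports 1: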
Parallel framework: OpenMP
thread of cpu: 8/64
Box filter
thread: 1: consumed 62ms
thread: 1: consumed 60ms
thread: 2: consumed 60ms
thread: 4: consumed 59ms
thread: 8: consumed 59ms
Gaussian blur
thread: 1: consumed 610ms
thread: 1: consumed 613ms
thread: 2: consumed 612ms
thread: 4: consumed 651ms
thread: 8: consumed 615ms
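My questions
From the results:
- Box Filter takes about 60 ms whether I set 1, 2, 4, or 8 threads, so there is no visible multithreading speedup;
- Gaussian Blur is even stranger: with 4 threads it takes longer than with a single thread (651 ms versus about 610 ms), and 8 threads is about the same as one;
- OpenCV clearly has the OpenMP parallel framework enabled and the CPU has 64 cores, so why does multithreading have no effect at all?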
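To put numbers on these observations, the speedup relative to the 1-thread run can be computed as t_1 / t_n, where ideal scaling would give a value close to n. A minimal sketch using the timings listed above (the setNumThreads(0) rows are left out, and the function name speedups is just for illustration):

```python
# Timings in milliseconds copied from the run above, keyed by the value
# passed to cv2.setNumThreads (the setNumThreads(0) row is omitted).
box_ms = {1: 60, 2: 60, 4: 59, 8: 59}
gauss_ms = {1: 613, 2: 612, 4: 651, 8: 615}


def speedups(timings_ms):
    """Return the speedup relative to the 1-thread run for each thread count."""
    base = timings_ms[1]
    return {n: round(base / t, 2) for n, t in timings_ms.items()}


box_speedup = speedups(box_ms)
gauss_speedup = speedups(gauss_ms)
print("Box filter:", box_speedup)
print("Gaussian blur:", gauss_speedup)
```

Every ratio stays within a few percent of 1.0, and the 4-thread Gaussian Blur run comes out below 1.0, which matches the impression that the thread setting changes nothing measurable in this test.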
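Has anyone run into something similar, or can anyone help me work out the likely cause? For example, is my way of testing flawed, or is some OpenCV setting not configured correctly?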
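On the question of whether the test method is at fault: the script above times a single call with time.time(), so any one-off setup cost and timer noise end up in the figure. A slightly more careful harness, offered as a sketch rather than a diagnosis, could warm up first and report the median of several runs. The names median_ms and bench_filters are hypothetical helpers, and the float32 default for the image is an assumption that differs from the float64 image in the original script; pass dtype="float64" to match it.

```python
import statistics
import time


def median_ms(fn, repeats=5, warmup=1):
    """Median wall-clock time of fn() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up: the first calls may include one-off setup cost
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def bench_filters(thread_counts=(1, 2, 4, 8), shape=(4000, 3000),
                  ksize=(51, 51), dtype="float32"):
    """Time boxFilter and GaussianBlur per thread count; needs cv2 and numpy."""
    import cv2
    import numpy as np

    # Random image in [0, 1); dtype is configurable so that float32 and
    # float64 inputs can be compared.
    image = np.random.rand(*shape).astype(dtype)
    results = {}
    for n in thread_counts:
        cv2.setNumThreads(n)
        results[n] = {
            "threads_reported": cv2.getNumThreads(),
            "box_ms": median_ms(lambda: cv2.boxFilter(image, cv2.CV_32F, ksize)),
            "gauss_ms": median_ms(lambda: cv2.GaussianBlur(image, ksize, 0)),
        }
    return results
```

Calling bench_filters() and printing the returned dictionary would give one median per thread count. If those medians are still flat, the single-run timing is probably not the explanation.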




