How do I draw text and rectangles on an OpenCV cv::cuda::GpuMat or cv::ogl::Buffer object?

阿华AIGC实验室

2026-5-26

Hey there! Since you're building a fully GPU-accelerated pipeline (FFmpeg + NVDEC/NVENC + OpenCV CUDA + Caffe), keeping all drawing operations on the GPU is crucial to avoid performance hits from CPU-GPU data copies. Let's break down how to draw rectangles and text on cv::cuda::GpuMat and cv::ogl::Buffer:

Drawing on cv::cuda::GpuMat

Drawing Rectangles

OpenCV's CUDA modules don't ship drawing primitives such as cv::rectangle, but a rectangle outline is easy to build by calling cv::cuda::GpuMat::setTo on thin sub-regions. You have two options depending on your color requirements:

Option 1: Draw on Y-plane (Grayscale, Minimal Overhead)

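Since your input is NV12 (12 bits per pixel: a full-resolution Y plane followed by a half-resolution plane of interleaved U and V samples), you can write straight into the Y plane to draw high-contrast outlines without any format conversion. This is fast because it only touches luminance; the chroma under the outline stays as it was, so the line comes out bright rather than strictly white: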

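// Assume `src` is a CV_8UC1 cv::cuda::GpuMat holding an NV12 frame (rows = frame height * 3 / 2)
const int frame_h = src.rows * 2 / 3;
cv::cuda::GpuMat y_plane(src, cv::Rect(0, 0, src.cols, frame_h)); // Y plane (a view, shares memory with src)
cv::cuda::GpuMat uv_plane(src, cv::Rect(0, frame_h, src.cols, src.rows / 3)); // UV plane (no modification needed)

// There is no cv::cuda::rectangle, so fill four thin edge strips instead.
// White outline (255 = max luminance), line width t = 2.
// Assumes the rectangle lies inside the frame and width, height >= 2 * t.
const int t = 2;
y_plane(cv::Rect(x, y, width, t)).setTo(cv::Scalar(255));              // top
y_plane(cv::Rect(x, y + height - t, width, t)).setTo(cv::Scalar(255)); // bottom
y_plane(cv::Rect(x, y, t, height)).setTo(cv::Scalar(255));             // left
y_plane(cv::Rect(x + width - t, y, t, height)).setTo(cv::Scalar(255)); // right

// No need to merge: y_plane is a sub-matrix of src, so changes are reflected automatically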
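
The strip arithmetic is easy to get wrong at the frame borders, so here is a minimal CPU-only sketch of it in plain C++ (no OpenCV needed). `Box`, `clipToPlane` and `outlineStrips` are illustrative names of mine, not OpenCV API. It splits an outline into non-overlapping strips and clips each one to the plane, so a detection box that hangs off the frame doesn't produce an invalid ROI.

```cpp
#include <algorithm>
#include <vector>

// Axis-aligned box in pixel coordinates (hypothetical stand-in for cv::Rect).
struct Box {
    int x, y, w, h;
};

// Clip `b` to a plane of size planeW x planeH; the result may be empty (w or h == 0).
inline Box clipToPlane(const Box& b, int planeW, int planeH) {
    const int x0 = std::max(0, std::min(b.x, planeW));
    const int y0 = std::max(0, std::min(b.y, planeH));
    const int x1 = std::max(0, std::min(b.x + b.w, planeW));
    const int y1 = std::max(0, std::min(b.y + b.h, planeH));
    return Box{x0, y0, x1 - x0, y1 - y0};
}

// Split the outline of `r` (line width t) into strips: top, bottom, then the
// left and right pieces between them. Strips that fall outside the plane are
// dropped; a box too small for a hollow outline is returned as one filled box.
inline std::vector<Box> outlineStrips(const Box& r, int t, int planeW, int planeH) {
    std::vector<Box> raw;
    if (r.w <= 0 || r.h <= 0 || t <= 0) return raw;
    if (r.w <= 2 * t || r.h <= 2 * t) {
        raw.push_back(r);
    } else {
        raw.push_back(Box{r.x, r.y, r.w, t});                        // top
        raw.push_back(Box{r.x, r.y + r.h - t, r.w, t});              // bottom
        raw.push_back(Box{r.x, r.y + t, t, r.h - 2 * t});            // left
        raw.push_back(Box{r.x + r.w - t, r.y + t, t, r.h - 2 * t});  // right
    }
    std::vector<Box> clipped;
    for (const Box& b : raw) {
        const Box c = clipToPlane(b, planeW, planeH);
        if (c.w > 0 && c.h > 0) clipped.push_back(c);
    }
    return clipped;
}
```

Each returned box would then map to one call of the form `plane(cv::Rect(b.x, b.y, b.w, b.h)).setTo(value)` on the GPU side.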

Option 2: Draw Color Rectangles (Format Conversion Required)

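If you need colored rectangles, one route is to convert the NV12 frame to BGR on the GPU, draw, and convert back. This adds overhead but still avoids a round trip through host memory. One caveat: as far as I know, cv::cuda::cvtColor in OpenCV 4.x does not implement the NV12 codes (COLOR_YUV2BGR_NV12 is rejected, and there is no COLOR_BGR2YUV_NV12 constant), so both conversions have to come from somewhere else, such as NPP or a small custom kernel. The snippet marks them as placeholders:

// nv12ToBgr() and bgrToNv12() are placeholders for your own GPU conversion
// (NPP or a custom kernel); they are not OpenCV functions.
cv::cuda::GpuMat bgr_frame;
nv12ToBgr(src, bgr_frame);

// Draw a green outline (BGR order: 0=blue, 255=green, 0=red) from four edge strips, line width t = 2.
// There is no cv::cuda::rectangle; assumes the rectangle lies inside the frame.
const cv::Scalar color(0, 255, 0);
const int t = 2;
bgr_frame(cv::Rect(x, y, width, t)).setTo(color);              // top
bgr_frame(cv::Rect(x, y + height - t, width, t)).setTo(color); // bottom
bgr_frame(cv::Rect(x, y, t, height)).setTo(color);             // left
bgr_frame(cv::Rect(x + width - t, y, t, height)).setTo(color); // right

// Convert back to NV12 for encoding
bgrToNv12(bgr_frame, src);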
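
For reference, this is roughly the per-pixel math that a BGR-to-NV12 conversion performs before chroma subsampling. It is a sketch under one loud assumption: BT.601 limited-range coefficients. Your stream may well be BT.709 or full range, in which case the constants differ. `Yuv` and `bgrToYuv601` are illustrative names, not library functions.

```cpp
#include <cstdint>

struct Yuv {
    std::uint8_t y, u, v;
};

// 8-bit BGR -> YUV, BT.601 limited range (assumed; check your stream's color metadata).
// Inputs are expected to be in 0..255.
inline Yuv bgrToYuv601(int b, int g, int r) {
    const int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
    // 32768 == 128 << 8: folds the +128 chroma offset into the sum so the
    // value is non-negative before the shift.
    const int u = (-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8;
    const int v = (112 * r - 94 * g - 18 * b + 128 + 32768) >> 8;
    return Yuv{static_cast<std::uint8_t>(y),
               static_cast<std::uint8_t>(u),
               static_cast<std::uint8_t>(v)};
}
```

The same function is handy for pre-computing the Y, U and V values of a fixed drawing color on the host.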

Drawing Text

OpenCV CUDA doesn't have a native text-drawing function, so here are two practical approaches:

Option 1: Lightweight CPU-to-GPU Overlay (Small Text)

For small text labels, render the text on a tiny CPU Mat, upload it to GPU, and overlay it on your frame. The overhead is minimal because the text matrix is small:

// Step 1: Render text on CPU Mat
cv::Mat text_mat(60, 220, CV_8UC3, cv::Scalar(0, 0, 0)); // Black background
cv::putText(text_mat, "Person", cv::Point(10, 40), cv::FONT_HERSHEY_SIMPLEX, 1.2, cv::Scalar(0, 255, 0), 2);

// Step 2: Upload text to GPU
cv::cuda::GpuMat gpu_text;
gpu_text.upload(text_mat);

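// Step 3: Convert frame to BGR, overlay text, convert back.
// nv12ToBgr() and bgrToNv12() are placeholders for your own GPU conversion (NPP or a custom
// kernel), not OpenCV functions: as far as I know, cv::cuda::cvtColor in OpenCV 4.x rejects the NV12 codes.
cv::cuda::GpuMat bgr_frame;
nv12ToBgr(src, bgr_frame);

// Define ROI where text will be placed. It must lie fully inside the frame
// (so y >= 60 here); otherwise the GpuMat ROI constructor throws.
cv::cuda::GpuMat text_roi(bgr_frame, cv::Rect(x, y - 60, gpu_text.cols, gpu_text.rows));
// Additive blend: the black background leaves the frame unchanged,
// the glyph pixels are added with saturation.
cv::cuda::addWeighted(text_roi, 1.0, gpu_text, 1.0, 0.0, text_roi);

// Convert back to NV12
bgrToNv12(bgr_frame, src);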
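
The ROI in step 3 sits above the detection box, so it goes out of bounds for boxes near the top or right edge of the frame. A minimal sketch of one way to place the label safely, in plain C++; `LabelBox` and `placeLabel` are hypothetical helpers of mine, and the fallback policy (move the label inside the box when there is no room above) is just one reasonable choice.

```cpp
#include <algorithm>

struct LabelBox {
    int x, y, w, h;
};

// Place a labelW x labelH label directly above the box whose top-left corner
// is (boxX, boxY). If there is no room above, start the label at the box's
// top edge instead. The result is then shifted to lie fully inside the frame.
// Returns a box with w == 0 when the label cannot fit in the frame at all.
inline LabelBox placeLabel(int boxX, int boxY, int labelW, int labelH,
                           int frameW, int frameH) {
    if (labelW <= 0 || labelH <= 0 || labelW > frameW || labelH > frameH) {
        return LabelBox{0, 0, 0, 0};
    }
    int y = boxY - labelH;
    if (y < 0) y = boxY;  // no room above the box
    const int x = std::max(0, std::min(boxX, frameW - labelW));
    y = std::max(0, std::min(y, frameH - labelH));
    return LabelBox{x, y, labelW, labelH};
}
```

The returned box would replace the hard-coded `cv::Rect(x, y - 60, ...)` when building the text ROI, with the overlay skipped when `w == 0`.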

Option 2: Full GPU Text Rendering (Advanced)

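To avoid a per-frame upload, rasterize the glyphs you need once at startup (cv::putText and FreeType both run on the CPU; I'm not aware of a FreeType build that rasterizes with CUDA), upload that glyph atlas to a GpuMat a single time, and then compose each label on the GPU by copying glyph regions from the atlas into the frame, or with a custom CUDA kernel that samples the atlas. This is more work to set up but suits high-throughput pipelines, since no text data crosses the bus after initialization.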
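
A minimal sketch of the bookkeeping behind such a glyph atlas, in plain C++. Everything here is an assumption for illustration: a fixed-cell grid of printable ASCII with 16 glyphs per row, and the names `Cell`, `AtlasLayout`, `GlyphCopy` and `layoutText`. Proportional fonts would need per-glyph widths instead of a fixed cell.

```cpp
#include <string>
#include <vector>

struct Cell {
    int x, y, w, h;
};

constexpr int kFirstChar = 32;   // ' '
constexpr int kLastChar = 126;   // '~'
constexpr int kAtlasCols = 16;   // glyphs per atlas row (assumed layout)

// Hypothetical atlas: one fixed-size cell per printable ASCII character.
struct AtlasLayout {
    int cellW;
    int cellH;

    // Source cell of `c` inside the atlas; unsupported characters map to '?'.
    Cell cellFor(char c) const {
        int code = static_cast<unsigned char>(c);
        if (code < kFirstChar || code > kLastChar) code = '?';
        const int idx = code - kFirstChar;
        return Cell{(idx % kAtlasCols) * cellW, (idx / kAtlasCols) * cellH, cellW, cellH};
    }
};

struct GlyphCopy {
    Cell src;  // region in the atlas
    Cell dst;  // region in the frame
};

// One atlas-to-frame copy per character, laid out left to right from (x, y).
inline std::vector<GlyphCopy> layoutText(const AtlasLayout& atlas,
                                         const std::string& text, int x, int y) {
    std::vector<GlyphCopy> copies;
    copies.reserve(text.size());
    for (char c : text) {
        copies.push_back(GlyphCopy{atlas.cellFor(c), Cell{x, y, atlas.cellW, atlas.cellH}});
        x += atlas.cellW;
    }
    return copies;
}
```

On the GPU side, each `GlyphCopy` would become one ROI-to-ROI copy or blend between the atlas GpuMat and the frame.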

Drawing on cv::ogl::Buffer

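cv::ogl::Buffer wraps an OpenGL buffer object (vertex or pixel data). It is not a render target: OpenGL draw calls write to the currently bound framebuffer, whether or not the buffer is bound. That leaves two routes. If you want the outline baked into the pixel data, map the buffer into CUDA with cv::ogl::Buffer::mapDevice(), draw on the returned cv::cuda::GpuMat as in the previous section, and call unmapDevice() afterwards. If you only need the outline on screen, render the frame first and then draw over it with OpenGL.

Drawing Rectangles

The snippet below takes the second route. It uses the legacy fixed-function API, so it needs a current compatibility-profile context and a projection that maps vertex coordinates to pixels:

cv::ogl::Buffer ogl_buffer = ...; // Your existing ogl::Buffer, already rendered to the framebuffer

// Make sure no buffer is bound to ARRAY_BUFFER; otherwise glVertexPointer
// would treat `vertices` as an offset into that buffer instead of a pointer.
cv::ogl::Buffer::unbind(cv::ogl::Buffer::ARRAY_BUFFER);

// Set up vertex data for the rectangle (pixel coordinates; cast in case x, y, width, height are ints)
const float x0 = static_cast<float>(x), y0 = static_cast<float>(y);
const float x1 = static_cast<float>(x + width), y1 = static_cast<float>(y + height);
const float vertices[] = {
    x0, y0,
    x1, y0,
    x1, y1,
    x0, y1
};

// Configure OpenGL state
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, vertices);
glColor3f(0.0f, 1.0f, 0.0f); // Green color (RGB normalized)
glLineWidth(2.0f);

// Draw the rectangle as a line loop
glDrawArrays(GL_LINE_LOOP, 0, 4);

// Clean up state
glDisableClientState(GL_VERTEX_ARRAY);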

Drawing Text

With OpenGL, the usual approach is a FreeType glyph cache drawn into the framebuffer on top of the frame:

Load your font with FreeType and, once at startup, upload a glyph texture for each character you need.
For each character in your text, bind its glyph texture and draw a textured quad at the target position.
Make sure your OpenGL projection matrix maps to the frame's pixel coordinates so the glyphs aren't stretched.

FreeType rasterizes the glyphs on the CPU, but only once. After that, every frame is drawn entirely on the GPU, which is fast enough for real-time streams.

Key Considerations

Avoid CPU-GPU Copies: Every download()/upload() call adds latency. Prioritize GPU-native operations whenever possible.
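YUV Format Nuances: NV12 is semi-planar (a Y plane plus an interleaved, half-resolution UV plane), so color drawing normally means a format conversion. Drawing in grayscale on the Y plane skips that if the look is acceptable.
Context Compatibility: If you mix OpenGL and CUDA, go through the interop API instead of copying via the host: cv::ogl::Buffer::mapDevice()/unmapDevice() on the OpenCV side, or cudaGraphicsGLRegisterBuffer in the CUDA runtime. The OpenGL context must be current on the calling thread.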
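
To make the NV12 layout concrete, here is a small sketch that derives the plane geometry from a single-channel buffer laid out as in this answer (Y rows followed by UV rows). `Nv12Layout`, `nv12LayoutFromBuffer` and `nv12UOffset` are hypothetical helper names; `pitch` stands for the row stride in bytes, which on the GPU is usually larger than the width.

```cpp
#include <cstddef>

struct Nv12Layout {
    int width;   // luma width in pixels
    int height;  // luma height in pixels
    int uvRow0;  // first row of the UV plane inside the combined buffer
    int uvRows;  // number of UV rows (height / 2)
};

// Derive plane geometry from a single-channel buffer of `cols` x `rows`
// that stores the Y plane followed by the interleaved UV plane.
// Returns false when the dimensions cannot describe an NV12 frame.
inline bool nv12LayoutFromBuffer(int cols, int rows, Nv12Layout& out) {
    if (cols <= 0 || rows <= 0 || cols % 2 != 0 || rows % 3 != 0) return false;
    const int h = rows / 3 * 2;
    out = Nv12Layout{cols, h, h, rows / 3};
    return true;
}

// Byte offset of the U sample that covers luma pixel (x, y); the matching
// V sample is the next byte. `pitch` is the row stride in bytes.
inline std::size_t nv12UOffset(const Nv12Layout& l, int x, int y, std::size_t pitch) {
    return static_cast<std::size_t>(l.uvRow0 + y / 2) * pitch +
           static_cast<std::size_t>(x / 2) * 2;
}
```

Each UV sample covers a 2x2 block of luma pixels, which is why both coordinates are halved.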

The question comes from Stack Exchange; asked by Bin Zhou.