How can I get 3D coordinates from image XY coordinates and object distance in OpenCV, without camera calibration?

阿华AIGC实验室

2026-5-19

Yes, there are several workable approaches for your use case. None of them requires formal camera calibration, and all are suited to low-precision, multi-camera scenarios. Here are the most practical ones:

1. Approximate single-camera 3D coordinates (from a known target distance)

Since you already have the target's distance (Z) from the camera, you can use a simplified pinhole camera model with rough approximations (no formal calibration needed):

Core idea: Treat the image center as the principal point (cx, cy), i.e. use the midpoint of your image resolution (e.g., for a 1920x1080 image, cx=960, cy=540).
Approximate focal length: If you don't have the camera's exact focal length, use a reasonable estimate based on common camera types:
- For phone cameras and webcams, the focal length in pixels (fx) is typically between 1000 and 2000 pixels; the value scales with image resolution. You can refine it once by holding your marker at a known distance, measuring its width in pixels, and solving fx = (marker_pixel_width * known_distance) / marker_real_width. This is a one-time quick check, not a formal calibration.
Calculation formula:
```
X_3D = (X_pix - cx) * Z / fx
Y_3D = (Y_pix - cy) * Z / fx
```
Here, X_3D/Y_3D are the coordinates relative to the camera's optical axis (Z-axis points toward the target).
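
As a rough illustration, the focal length check and the back-projection formula can be sketched in plain Python. The function names `estimate_fx` and `pixel_to_camera_xyz` are hypothetical, and the sketch assumes square pixels (fy = fx), a principal point at the image center, no lens distortion, and that Z is the depth along the optical axis rather than the straight-line range to the target.

```python
def estimate_fx(marker_pixel_width, known_distance, marker_real_width):
    """One-time focal length estimate (in pixels) from a marker of known size."""
    return marker_pixel_width * known_distance / marker_real_width


def pixel_to_camera_xyz(x_pix, y_pix, z, image_size, fx, fy=None):
    """Back-project a pixel to camera-frame coordinates at depth z.

    Assumes the principal point is the image center and that pixels are
    square (fy defaults to fx). The result is in the same units as z.
    """
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0
    if fy is None:
        fy = fx
    x_3d = (x_pix - cx) * z / fx
    y_3d = (y_pix - cy) * z / fy
    return x_3d, y_3d, z


# Example values (made up): a 0.10 m wide marker spans 150 px at 1.0 m.
fx = estimate_fx(150.0, 1.0, 0.10)
# Target seen at pixel (1260, 540) in a 1920x1080 image, 2.0 m away.
point = pixel_to_camera_xyz(1260, 540, 2.0, (1920, 1080), fx)
print(fx, point)
```

With these example numbers the target ends up about 0.4 m to the right of the optical axis, at the same height as the camera.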

2. Approximate multi-camera triangulation (no strict extrinsic calibration)

For multi-camera setups, you can combine the single-camera approach with rough relative positioning of your cameras:

Step 1: Define a simple world coordinate system manually. For example, place Camera 1 at (0, 0, 0), measure the approximate distance/offset of Camera 2 relative to Camera 1 (e.g., (0.5m, 0, 0) if it's 50cm to the right) and note these rough positions.
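Step 2: For each camera, generate a 3D ray from the camera's position through the target's 2D pixel coordinate. In the camera's own frame the ray direction is ((X_pix - cx) / fx, (Y_pix - cy) / fx, 1), which follows from the single-camera formula above. If a camera is not pointing along the world Z-axis, rotate this direction by the camera's rough orientation before using it.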
Step 3: Compute the "best fit" intersection of all rays using a least-squares method. This accounts for small errors in your rough camera positions and pixel coordinates, giving you a reasonable 3D world coordinate.
Pro tip: Give your cameras a noticeable baseline (distance between them) rather than placing them close together. Rays from nearly the same viewpoint are almost parallel, so a small pixel error shifts their intersection a long way along the depth direction.
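
The three steps above could be sketched with NumPy roughly as follows. This is one common least-squares formulation (minimizing the summed squared distance from a point to every ray), not necessarily the only choice. The helper names `pixel_ray` and `intersect_rays` are hypothetical, and the example assumes both cameras face along the world Z-axis and share the same approximate fx.

```python
import numpy as np


def pixel_ray(x_pix, y_pix, image_size, fx, rotation=None):
    """Unit direction of the viewing ray through a pixel, in world axes.

    rotation is an optional 3x3 camera-to-world rotation (an assumption of
    this sketch); when omitted, camera axes are taken as the world axes.
    """
    width, height = image_size
    d = np.array([(x_pix - width / 2.0) / fx,
                  (y_pix - height / 2.0) / fx,
                  1.0])
    if rotation is not None:
        d = np.asarray(rotation, dtype=float) @ d
    return d / np.linalg.norm(d)


def intersect_rays(origins, directions):
    """Point minimizing the summed squared distance to all rays."""
    a = np.zeros((3, 3))
    b = np.zeros(3)
    for origin, direction in zip(origins, directions):
        d = np.asarray(direction, dtype=float)
        d = d / np.linalg.norm(d)
        # Projector onto the plane perpendicular to the ray direction.
        p = np.eye(3) - np.outer(d, d)
        a += p
        b += p @ np.asarray(origin, dtype=float)
    # lstsq still returns an answer when the rays are nearly parallel.
    return np.linalg.lstsq(a, b, rcond=None)[0]


# Example values (made up): Camera 2 sits 0.5 m to the right of Camera 1.
size, fx = (1920, 1080), 1500.0
origins = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
directions = [pixel_ray(1110, 615, size, fx), pixel_ray(735, 615, size, fx)]
target = intersect_rays(origins, directions)
print(target)  # approximately [0.2, 0.1, 2.0]
```

Note that this route does not need the known distance Z at all; if Z is available, it can serve as a sanity check on the triangulated depth.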

3. Pure proportional mapping (zero calibration parameters)

If you want to skip even the approximate focal length estimate, use the marker's known size to map pixels directly to real-world space:

Core idea: At distance Z, the marker's real width W spans w pixels in the image, so each pixel at that distance represents W / w real-world units. The distance is already reflected in w, provided w is measured with the marker at the target's distance. If w was measured at a different distance Z_ref, multiply the scale by Z / Z_ref.
Calculation:
- Find the image center (centerX, centerY) as before.
- Compute the real-world X/Y offset from the camera's optical axis:
```
X_3D = (X_pix - centerX) * W / w
Y_3D = (Y_pix - centerY) * W / w
```
This is the simplest method, though it assumes your camera has no lens distortion (which is acceptable for low-precision needs).
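
A minimal sketch of this mapping, assuming the marker's pixel width is measured at the target's current distance so that the scale is simply W / w (length per pixel). The function name `pixel_to_offset` is hypothetical.

```python
def pixel_to_offset(x_pix, y_pix, image_size, marker_real_width,
                    marker_pixel_width):
    """Map a pixel to a real-world X/Y offset from the optical axis.

    The scale (real units per pixel) comes from the marker itself, so no
    focal length is needed. It is only valid at the marker's distance.
    """
    width, height = image_size
    center_x, center_y = width / 2.0, height / 2.0
    scale = marker_real_width / marker_pixel_width
    return (x_pix - center_x) * scale, (y_pix - center_y) * scale


# Example values (made up): a 0.10 m wide marker spans 75 px in a 1920x1080 image.
x_off, y_off = pixel_to_offset(1260, 540, (1920, 1080), 0.10, 75.0)
print(x_off, y_off)
```

Here the pixel lies 300 px right of center, which at 0.10 m per 75 px corresponds to an offset of about 0.4 m.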

Key considerations

The accuracy of these methods depends heavily on the precision of your known target distance Z. In the single-camera formula, X_3D and Y_3D scale linearly with Z, so a 10% error in Z produces a 10% error in both.
For multi-camera setups, make sure the cameras' fields of view overlap generously over the working area. Triangulation only works where the target is visible from at least two cameras, so a narrow overlap limits where you can get a result.
If you need slightly better precision, you can do a one-time "quick calibration" with your marker at a few known distances to refine your focal length estimate, but this is still far less involved than formal camera calibration.
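
One possible way to do such a quick calibration is a one-parameter least-squares fit of fx to several (distance, pixel width) measurements, using the pinhole relation pixel_width = fx * marker_real_width / distance. The function name `refine_fx` and the sample values are hypothetical.

```python
def refine_fx(samples, marker_real_width):
    """Least-squares focal length (pixels) from (distance, pixel_width) pairs.

    Fits pixel_width = fx * marker_real_width / distance over all samples.
    """
    numerator = 0.0
    denominator = 0.0
    for distance, pixel_width in samples:
        k = marker_real_width / distance  # predicted pixel width per unit fx
        numerator += k * pixel_width
        denominator += k * k
    return numerator / denominator


# Example values (made up): a 0.10 m wide marker measured at three distances.
samples = [(0.5, 300.0), (1.0, 150.0), (2.0, 75.0)]
fx_refined = refine_fx(samples, 0.10)
print(fx_refined)
```

Because the fit weights close-range samples more heavily (the marker is larger in the image there), it can help to include measurements near the distances you actually plan to work at.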

The question comes from Stack Exchange; it was asked by Nucklear.