How to align the NumPy arrays from a PDFium-rendered PDF and an stb_image-loaded image (when the content is identical)

阿华AIGC实验室

2026-4-2

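I ran into similar pitfalls when doing pixel alignment between PDFs and images. The root cause is usually one of three things: mismatched dimensions, different resampling algorithms, or inconsistent color space/gamma handling. Based on your code, here is a step-by-step fix that keeps PDFium in place and brings the two results into line.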

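1. Force the same dimensions and render area first

Your image loading logic always resizes to 224x224, but the PDF render function derives its size from DPI=80 by default, so the two outputs will almost certainly differ in size. Unify this first: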

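Unify the PDF render target size
When calling render_page_helper, pass target_width=224, target_height=224 directly, so the PDF is rendered at the same fixed size as the image instead of a DPI-derived size. This guarantees that both arrays share the same pixel grid from the start.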

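Confirm the actual render area of the PDF
You currently take the page size from FPDF_GetPageWidth. Some PDFs define a crop box (CropBox), and the visible content is the part inside it. In the PDFium sources I have read, FPDF_GetPageWidth/FPDF_GetPageHeight already reflect the CropBox and the page rotation, so check whether this step changes anything on your build. If you want to compute the size from the CropBox explicitly, the call is FPDFPage_GetCropBox (declared in fpdf_transformpage.h); it takes float pointers and returns false when the page has no CropBox:

// Replaces the original width/height calculation (used when target_width/target_height are not passed in)
// Requires #include "fpdf_transformpage.h"
double page_width = FPDF_GetPageWidth(page);    // fallback when the page has no CropBox
double page_height = FPDF_GetPageHeight(page);
float crop_left, crop_bottom, crop_right, crop_top;
if (FPDFPage_GetCropBox(page, &crop_left, &crop_bottom, &crop_right, &crop_top)) {
    page_width = crop_right - crop_left;
    page_height = crop_top - crop_bottom;
}
width = static_cast<int>(page_width * dpi / 72.0);
height = static_cast<int>(page_height * dpi / 72.0);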

This ensures you render the visible content of the PDF, matching the area shown in the image.
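To see why the DPI-based path is unlikely to land on 224x224, here is a small sketch of the same points-to-pixels arithmetic in Python. The function name points_to_pixels and the US Letter page size (612 x 792 pt) are illustrative assumptions, not part of the original code.

```python
def points_to_pixels(width_pt, height_pt, dpi):
    """Convert a page size in PDF points (1/72 inch) to whole pixels.

    Mirrors the static_cast<int>(size * dpi / 72.0) truncation used in the C++ code.
    """
    return int(width_pt * dpi / 72.0), int(height_pt * dpi / 72.0)


# Hypothetical example page: US Letter, 612 x 792 pt.
letter_at_80 = points_to_pixels(612, 792, 80)
letter_at_300 = points_to_pixels(612, 792, 300)
print(letter_at_80, letter_at_300)
```

Neither result is square, let alone 224x224, which is why the target size has to be passed explicitly.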

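2. Use the same resampling algorithm (the core difference)

Your current PDF path lets FPDF_RenderPageBitmap scale the page straight to the target size, while the image path loads the original image and then resizes it with stbir_resize_uint8. These are two different resampling paths, so pixel differences are inevitable. The fix is to make the PDF path follow the same flow as the image: render a bitmap at the page's native size first, then scale it to the target size with the same stb_image_resize function:

Modify the render_page_helper function: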

py::array_t<uint8_t> render_page_helper(FPDF_PAGE page, int target_width = 224, int target_height = 224, int dpi = 300) {
    // 1. Render a high-resolution bitmap at the page's native size first (300 DPI reduces resampling loss).
    //    FPDFPage_GetCropBox (fpdf_transformpage.h) takes float* and returns false when there is no CropBox.
    double page_width = FPDF_GetPageWidth(page);
    double page_height = FPDF_GetPageHeight(page);
    float crop_left, crop_bottom, crop_right, crop_top;
    if (FPDFPage_GetCropBox(page, &crop_left, &crop_bottom, &crop_right, &crop_top)) {
        page_width = crop_right - crop_left;
        page_height = crop_top - crop_bottom;
    }
    int orig_width = static_cast<int>(page_width * dpi / 72.0);
    int orig_height = static_cast<int>(page_height * dpi / 72.0);

    FPDF_BITMAP bitmap = FPDFBitmap_Create(orig_width, orig_height, 1);  // 1 = with alpha, BGRA layout
    if (!bitmap) throw std::runtime_error("Failed to create bitmap");
    FPDFBitmap_FillRect(bitmap, 0, 0, orig_width, orig_height, 0xFFFFFFFF);  // opaque white background
    FPDF_RenderPageBitmap(bitmap, page, 0, 0, orig_width, orig_height, 0, FPDF_ANNOT);

    uint8_t* buffer = static_cast<uint8_t*>(FPDFBitmap_GetBuffer(bitmap));
    int stride = FPDFBitmap_GetStride(bitmap);

    // 2. Resize to the target size with stbir_resize_uint8 (the same resampler the image path uses)
    std::vector<uint8_t> resized(static_cast<size_t>(target_width) * target_height * 4);
    int ok = stbir_resize_uint8(buffer, orig_width, orig_height, stride,
                                resized.data(), target_width, target_height, 0,
                                4); // 4 channels, BGRA
    FPDFBitmap_Destroy(bitmap);  // the source buffer is no longer needed after the resize
    if (!ok) throw std::runtime_error("stbir_resize_uint8 failed");

    // 3. Convert to RGB (same channel order as the image path)
    py::array_t<uint8_t> result({target_height, target_width, 3});
    auto buf = result.mutable_unchecked<3>();
    for (int y = 0; y < target_height; ++y) {
        for (int x = 0; x < target_width; ++x) {
            int idx = (y * target_width + x) * 4;
            // BGRA → RGB: take R, G, B (BGRA indices 2, 1, 0) and drop alpha
            buf(y, x, 0) = resized[idx + 2];
            buf(y, x, 1) = resized[idx + 1];
            buf(y, x, 2) = resized[idx + 0];
        }
    }

    return result;
}

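With this change the PDF goes through the same resize step as the image, which removes the pixel differences caused by the resampling algorithm.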
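To illustrate why two resamplers disagree even on identical input, here is a minimal pure-Python sketch. The two functions are deliberately simple stand-ins (nearest-neighbor and box averaging); they are not the filters PDFium or stb_image_resize actually use, and all names are hypothetical.

```python
def downsample_nearest(img, factor):
    """Keep the top-left sample of every factor x factor block."""
    return [row[::factor] for row in img[::factor]]


def downsample_box(img, factor):
    """Average every factor x factor block (integer mean, rounded down)."""
    out = []
    for y in range(0, len(img), factor):
        out_row = []
        for x in range(0, len(img[0]), factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out


# A 4x4 grayscale checkerboard of single pixels: 0 and 255 alternating.
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

nearest = downsample_nearest(checker, 2)
box = downsample_box(checker, 2)
print(nearest)
print(box)
```

The same source pixels come out black under one filter and mid-gray under the other, which is the kind of gap that routing both paths through one resampler avoids.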

3. Unify color space and gamma handling

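PDF content is not necessarily sRGB (pages can use DeviceRGB, ICC-based, CMYK and other color spaces), and stb_image does not apply any gamma conversion when stbi_load reads an 8-bit JPG/PNG: it returns the stored sample values as they are. So rather than adding a correction, make sure neither side applies one that the other does not:

stb_image gamma settings
stb_image has no stbi_set_gamma function. The gamma controls it does have, stbi_hdr_to_ldr_gamma and stbi_ldr_to_hdr_gamma, only affect HDR/LDR conversion (stbi_loadf on an 8-bit file, or stbi_load on an .hdr file), so a plain stbi_load of a JPG/PNG needs no gamma call:
```
// No gamma call is needed before stbi_load for 8-bit JPG/PNG.
// Only if you load through stbi_loadf: this sets the LDR-to-HDR gamma (2.2 is already the default).
// stbi_ldr_to_hdr_gamma(2.2f);
```
If your image path uses stbi_loadf or converts to float somewhere else, that is where a gamma difference against PDFium's 8-bit output could come from.
PDFium render flags
As far as I know, fpdfview.h defines no FPDF_COLORSPACE_SRGB flag, so the output color space cannot be forced through the flags argument of FPDF_RenderPageBitmap. Keep the call as it is, and leave FPDF_LCD_TEXT and FPDF_GRAYSCALE off, since both change the pixel values:
```
FPDF_RenderPageBitmap(bitmap, page, 0, 0, orig_width, orig_height, 0, FPDF_ANNOT);
```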
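A gamma mismatch, if one were present, would show up as a large systematic shift in the midtones rather than a few stray levels. The sketch below uses a pure power-law curve with exponent 2.2 as a rough approximation of the sRGB transfer function (an assumption; the real sRGB curve is piecewise), and the function names are hypothetical.

```python
def decode_gamma(value, gamma=2.2):
    """Apply a pure power-law gamma to one 8-bit sample, returning an 8-bit result."""
    return round(255 * (value / 255) ** gamma)


def encode_gamma(value, gamma=2.2):
    """Inverse of decode_gamma, up to rounding."""
    return round(255 * (value / 255) ** (1 / gamma))


mid_gray = 128
shifted = decode_gamma(mid_gray)
print(mid_gray, shifted, abs(mid_gray - shifted))
```

If the difference between your two arrays is only a handful of levels, gamma is probably not the culprit; a shift of this size on mid-gray areas would point to an extra gamma step on one side.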

4. Finally, verify the pixel alignment

After the changes, you can check the difference on the Python side with the following code:

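import numpy as np

# Load the PDF-rendered array and the image-loaded array
arr_pdf = ...
arr_img = ...

# Absolute per-pixel difference; cast to a signed type first, because uint8 subtraction wraps around
diff = np.abs(arr_pdf.astype(np.int16) - arr_img.astype(np.int16))
print("Max pixel difference:", diff.max())
print("Mean pixel difference:", diff.mean())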
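One pitfall when comparing uint8 arrays is that NumPy subtraction wraps modulo 256, so a true difference of 2 can be reported as 254. The sketch below shows the effect on tiny stand-in arrays (hypothetical data, not real render output) and wraps the comparison in a helper; diff_stats and its threshold parameter are illustrative names.

```python
import numpy as np


def diff_stats(a, b, threshold=5):
    """Return (max, mean, fraction above threshold) of the absolute difference."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Widen before subtracting: uint8 - uint8 wraps modulo 256 in NumPy.
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return int(diff.max()), float(diff.mean()), float((diff > threshold).mean())


# Tiny stand-in arrays of shape (1, 1, 3).
a = np.array([[[10, 10, 10]]], dtype=np.uint8)
b = np.array([[[12, 10, 10]]], dtype=np.uint8)

wrapped = np.abs(a - b)  # uint8 arithmetic: 10 - 12 wraps around
max_diff, mean_diff, frac_over = diff_stats(a, b)
print(int(wrapped.max()), max_diff, mean_diff, frac_over)
```

The fraction above the threshold is useful alongside the maximum: a single outlier pixel and a whole misaligned region can share the same maximum but differ greatly in how many pixels are affected.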

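If the maximum difference is within 5, you can generally consider the two aligned (small errors between PDF rendering and image loading are unavoidable). If large differences remain, check the following: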