How do you generate uncompressed RLE encoding when creating a COCO annotation dataset?

阿华AIGC实验室

2026-5-21

Generating Uncompressed RLE from Binary Masks for COCO Datasets

You don’t need a specialized "RLE encoder" to generate uncompressed RLE for your COCO dataset: basic Python and numpy (or even pure Python if you prefer) are enough. pycocotools focuses mostly on compressed RLE, so let’s walk through how to build the uncompressed form yourself.

First: What is COCO’s Uncompressed RLE Format?

COCO’s uncompressed RLE is a simple list of integers giving the lengths of consecutive runs of 0s and 1s in your flattened binary mask. The mask is flattened in column-major (Fortran) order, and the sequence alternates between counts of 0s and 1s, always starting with the count of 0s. If the first pixel is 1, the list therefore starts with a count of 0. For example, a flattened mask [0,0,1,1,0,1] would translate to the uncompressed RLE [2,2,1,1] (2 zeros → 2 ones → 1 zero → 1 one), while [1,1,0] would translate to [0,2,1].
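The format can also be produced without numpy. The following is a minimal pure-Python sketch, assuming the convention pycocotools decodes: pixels are visited column by column, and the counts start with the number of 0s. The function name `rle_counts_pure_python` and the list-of-rows input are illustrative choices, not part of any library.

```python
def rle_counts_pure_python(mask_rows):
    """Encode a 2D list of 0/1 values as COCO-style uncompressed counts."""
    height = len(mask_rows)
    width = len(mask_rows[0]) if height else 0

    counts = []
    current_value = 0  # counts always begin with a run of 0s
    run_length = 0
    # Walk the pixels column by column (column-major order)
    for x in range(width):
        for y in range(height):
            pixel = 1 if mask_rows[y][x] else 0
            if pixel == current_value:
                run_length += 1
            else:
                # The value changed: close the current run and start a new one
                counts.append(run_length)
                current_value = pixel
                run_length = 1
    if height and width:
        counts.append(run_length)  # close the final run
    return counts


print(rle_counts_pure_python([[0, 0, 1, 1], [0, 1, 1, 0]]))  # [3, 4, 1]
print(rle_counts_pure_python([[1, 1], [0, 1]]))              # [0, 1, 1, 2]
```

The second call shows the leading 0 that appears when the first pixel is a 1.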

Method 1: Manual Implementation with NumPy

This is the most straightforward approach. We’ll flatten the mask, detect transitions between 0s and 1s, and calculate run lengths:

import numpy as np

def mask_to_uncompressed_rle(mask):
    # Binarize and flatten in column-major (Fortran) order, the pixel order COCO RLE uses
    flat_mask = (np.asarray(mask) != 0).astype(np.uint8).flatten(order="F")

    # An empty mask has no runs
    if flat_mask.size == 0:
        return []

    # Mark positions where the value changes (non-zero where it changes, 0 otherwise)
    # The sentinel 1s at both ends close the first and last runs
    transitions = np.concatenate([[1], np.diff(flat_mask), [1]])

    # Get indices where transitions occur
    transition_indices = np.where(transitions != 0)[0]

    # Calculate the length of each run by subtracting consecutive transition indices
    run_lengths = np.diff(transition_indices).tolist()

    # COCO counts always start with a run of 0s, so prepend an empty
    # zero-run when the mask starts with a 1
    if flat_mask[0] == 1:
        run_lengths.insert(0, 0)

    # A regular Python list of ints (required for COCO JSON serialization)
    return run_lengths

Example Usage:

# Sample 2D binary mask
mask = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 0]
])

uncompressed_rle = mask_to_uncompressed_rle(mask)
print(uncompressed_rle)  # Output: [3, 4, 1]

This output corresponds to: 3 zeros → 4 ones → 1 zero, which matches the mask flattened column by column: [0,0,0,1,1,1,1,0]. Note that flattening row by row would give [0,0,1,1,0,1,1,0] and a different RLE ([2, 2, 1, 2, 1]), which COCO tools would decode into the wrong mask.
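A quick way to check any encoder is to decode its counts and compare the result with the original mask. Below is a minimal pure-Python sketch of such a decoder, assuming column-major pixel order and counts that start with the number of 0s (the layout pycocotools decodes). The name `decode_uncompressed_rle` is hypothetical and only for illustration.

```python
def decode_uncompressed_rle(counts, height, width):
    """Rebuild a 2D list mask from COCO-style uncompressed counts."""
    flat = []
    value = 0  # the first count is assumed to describe 0s
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value  # runs alternate between 0s and 1s
    if len(flat) != height * width:
        raise ValueError("counts do not cover height * width pixels")
    # flat is column-major: pixel (y, x) sits at index x * height + y
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


print(decode_uncompressed_rle([3, 4, 1], 2, 4))
# [[0, 0, 1, 1], [0, 1, 1, 0]]
```

If the decoded mask differs from the input, the usual culprits are row-major flattening or a missing leading 0.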

Method 2: Using pycocotools (Indirectly)

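If you already use pycocotools for other mask operations, you can convert your mask to compressed RLE first, then decode it back to a mask and generate uncompressed RLE using the method above. This adds an extra step and still relies on the function from Method 1 for the actual counts, but it confirms that the mask you serialize is exactly what pycocotools reads back, which is useful if you’re already working within the pycocotools ecosystem: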

from pycocotools import mask as mask_utils
import numpy as np

def mask_to_uncompressed_rle_via_pycocotools(mask):
    # Convert mask to compressed RLE (encode expects a Fortran-ordered uint8 array)
    compressed_rle = mask_utils.encode(np.asfortranarray(mask, dtype=np.uint8))
    
    # Decode compressed RLE back to a binary mask (to reuse our earlier function)
    decoded_mask = mask_utils.decode(compressed_rle).astype(np.uint8)
    
    # Generate uncompressed RLE from the decoded mask
    return mask_to_uncompressed_rle(decoded_mask)

Important Notes for COCO JSON

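When adding this to your COCO dataset JSON, the annotation entry with uncompressed RLE should look like this (height, width, x and y are placeholders, and the // comments are explanatory only, since JSON does not allow comments):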

{
  "id": 1,
  "image_id": 123,
  "category_id": 4,
  "segmentation": {
    "size": [height, width],
    "counts": [2, 2, 1, 2, 1]  // Your uncompressed RLE list
  },
  "area": 4,  // Total number of 1s in the mask
  "bbox": [x, y, width, height],
  "iscrowd": 0
}
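To get valid JSON out of Python, one option is to build the annotation as a dict and serialize it with the standard `json` module. The sketch below uses hypothetical values (a 2x4 mask whose column-major counts are [3, 4, 1]); the ids are made up for illustration.

```python
import json

# Hypothetical values for illustration: a 2x4 mask encoded as [3, 4, 1]
annotation = {
    "id": 1,
    "image_id": 123,
    "category_id": 4,
    "segmentation": {"size": [2, 4], "counts": [3, 4, 1]},
    "area": 4,
    "bbox": [1, 0, 3, 2],
    "iscrowd": 0,
}

# Sanity check: the counts must cover every pixel, so they sum to height * width
height, width = annotation["segmentation"]["size"]
assert sum(annotation["segmentation"]["counts"]) == height * width

print(json.dumps(annotation))
```

The sum check is cheap and catches truncated or mis-sized counts before they reach the dataset file.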

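Make sure to calculate the area (the sum of all 1-run lengths, i.e. every second count starting from the second one) and the bbox ([x, y, width, height], with x and y the top-left corner of the box) correctly for each annotation.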
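Both values can be derived directly from the counts without rebuilding the mask. Here is a minimal sketch, assuming column-major order and zero-first counts; `area_and_bbox_from_counts` is a hypothetical helper name, and returning [0, 0, 0, 0] for an empty mask is an arbitrary choice made for this example.

```python
def area_and_bbox_from_counts(counts, height, width):
    """Return (area, [x, y, w, h]) for column-major, zero-first counts."""
    area = 0
    x_min, y_min, x_max, y_max = width, height, -1, -1
    position = 0  # flat column-major index where the current run starts
    for i, run in enumerate(counts):
        if i % 2 == 1 and run > 0:  # odd positions are runs of 1s
            area += run
            first, last = position, position + run - 1
            x_first, x_last = first // height, last // height
            x_min, x_max = min(x_min, x_first), max(x_max, x_last)
            if x_first == x_last:
                # The run stays inside a single column
                y_min = min(y_min, first % height)
                y_max = max(y_max, last % height)
            else:
                # The run wraps past the bottom of a column into the next one,
                # so it touches both the top and bottom rows
                y_min, y_max = 0, height - 1
        position += run
    if area == 0:
        return 0, [0, 0, 0, 0]
    return area, [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]


print(area_and_bbox_from_counts([3, 4, 1], 2, 4))  # (4, [1, 0, 3, 2])
```

For the 2x4 example mask, the four 1s span columns 1 to 3 and both rows, which gives the bbox [1, 0, 3, 2].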

The question comes from Stack Exchange; it was asked by waspinator.