PyTorch Beginner Question: How to Annotate an Image Segmentation Training Set (Weeds and Crops)

阿华AIGC实验室

2026-5-22

Hey there! As someone who’s worked through image segmentation projects with PyTorch before, I totally get how tricky the annotation step can be when you’re just starting out. Let’s break down exactly how to label your weed-and-crop images for training models like ResNet-based segmentation networks.

Step 1: Choose a Beginner-Friendly Annotation Tool

You don’t need fancy paid tools to get started—these free, open-source options are perfect for your two-class segmentation task:

LabelMe: Super intuitive. The widely used Python version runs as a desktop app (the original MIT LabelMe is browser-based). You draw a polygon around each crop/weed instance, and the JSON files it saves are easy to convert into pixel-wise masks that work with PyTorch.
VGG Image Annotator (VIA): A lightweight tool that runs entirely in your browser from a single HTML page, great for quick, precise labeling. It supports polygon drawing and exports annotations as JSON or CSV, both of which you can convert easily.

Step 2: Annotate Your Images with Consistency

Once you’ve picked your tool, follow these steps for every training image:

Set up your classes first: Define two clear labels—crop and weed—and stick to them across all images. Consistency here is key for your model to learn correctly.
Draw accurate boundaries: For each plant, trace its full outline with a polygon. Polygon tools such as LabelMe store only the outline; the enclosed area is filled in later, when you convert the annotation to a mask, so the outline itself needs to follow the plant closely. Zoom in on small patches to avoid missing weeds hiding between crops!
Review as you go: After labeling a few images, flip back to check for mistakes. A single mislabeled patch can throw off your model’s training, so it’s worth taking the extra time to be precise.
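
One way to back up the review step is a small script that counts the labels used across your annotation files and flags anything other than crop or weed (a stray "Weed" or "weeds" is a common slip). This is only a minimal sketch, assuming LabelMe-style JSON files with a top-level "shapes" list whose entries carry a "label" field, all stored in one folder; the function name count_labels and the EXPECTED_LABELS set are hypothetical names chosen for illustration.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed class names; adjust to match the labels you defined in your tool.
EXPECTED_LABELS = {"crop", "weed"}


def count_labels(annotation_dir):
    """Count every label in a folder of LabelMe-style JSON files.

    Returns (counts, unexpected), where `unexpected` maps each label that is
    not in EXPECTED_LABELS to the list of files it appears in.
    """
    counts = Counter()
    unexpected = {}
    for json_path in sorted(Path(annotation_dir).glob("*.json")):
        with open(json_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        for shape in data.get("shapes", []):
            label = shape["label"]
            counts[label] += 1
            if label not in EXPECTED_LABELS:
                files = unexpected.setdefault(label, [])
                if json_path.name not in files:
                    files.append(json_path.name)
    return counts, unexpected
```

Calling count_labels on your annotation folder after each labeling session gives a quick picture of class balance and points you at the files that need fixing.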

Step 3: Convert Annotations to PyTorch-Ready Format

LabelMe saves one JSON file per image, and VIA exports JSON or CSV. You’ll need to turn these into class-index masks (not binary ones, since there are three possible values), where each pixel holds the ID of its class (e.g., 0 for background, 1 for crop, 2 for weed). Here’s a quick Python snippet to convert LabelMe JSON files to masks:

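import json
from PIL import Image, ImageDraw

# Pixel value written into the mask for each class; 0 is left for background
LABEL_TO_VALUE = {'crop': 1, 'weed': 2}

def labelme_to_mask(json_file_path, original_image_path, output_mask_path):
    # Load the LabelMe annotation data
    with open(json_file_path, 'r', encoding='utf-8') as f:
        annotation_data = json.load(f)

    # Open the original image only to read its size
    with Image.open(original_image_path) as original_img:
        mask = Image.new('L', original_img.size, 0)  # 'L' mode: 8-bit, single channel
    draw = ImageDraw.Draw(mask)

    # Iterate over each labeled shape (later shapes overwrite earlier ones where they overlap)
    for shape in annotation_data['shapes']:
        label = shape['label']
        if label not in LABEL_TO_VALUE:
            # Fail loudly instead of silently treating every non-crop label as weed
            raise ValueError(f"Unknown label {label!r} in {json_file_path}")
        if shape.get('shape_type', 'polygon') != 'polygon':
            continue  # only polygons are rasterized here; other shape types are skipped
        # Draw and fill the polygon on the mask
        draw.polygon([tuple(point) for point in shape['points']], fill=LABEL_TO_VALUE[label])

    # Save the mask as PNG; a lossy format such as JPEG would corrupt the class values
    mask.save(output_mask_path)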

If you use a tool that exports directly to mask images (like some paid options), you can skip this step—just ensure the pixel values match your class mapping.
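
Whichever route you take, it is worth checking that a mask really contains only the expected class values before you train on it. Here is a minimal sketch of such a check, assuming the 0/1/2 mapping above and single-channel PNG masks; check_mask and CLASS_NAMES are hypothetical names, not part of any library.

```python
import numpy as np
from PIL import Image

# Assumed mapping from the conversion step: 0 = background, 1 = crop, 2 = weed.
CLASS_NAMES = {0: "background", 1: "crop", 2: "weed"}


def check_mask(mask_path):
    """Return {pixel_value: pixel_count}; raise ValueError on unexpected values."""
    mask = np.array(Image.open(mask_path))
    if mask.ndim != 2:
        raise ValueError(
            f"{mask_path}: expected a single-channel mask, got shape {mask.shape}"
        )
    values, counts = np.unique(mask, return_counts=True)
    summary = {int(v): int(c) for v, c in zip(values, counts)}
    unexpected = sorted(set(summary) - set(CLASS_NAMES))
    if unexpected:
        raise ValueError(f"{mask_path}: unexpected pixel values {unexpected}")
    return summary
```

Keep in mind that a mask holding only the values 0, 1 and 2 looks almost completely black in an ordinary image viewer, so a numeric check like this is more reliable than eyeballing the file.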

Step 4: Organize Your Dataset for PyTorch Training

Structure your data folder like this to make loading easy with PyTorch’s Dataset class:

weed_crop_dataset/
    train/
        images/
            img_001.jpg
            img_002.jpg
            ...
        masks/
            img_001_mask.png
            img_002_mask.png
            ...
    validation/
        images/
            ...
        masks/
            ...

This setup lets you write a simple custom dataset class that loads an image and its corresponding mask together during training.
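
As a rough illustration of that idea, here is a minimal sketch of a map-style dataset for the folder layout above. It assumes .jpg images, masks named NAME_mask.png, and the 0/1/2 class values from Step 3; the class name WeedCropDataset is hypothetical. To keep the sketch dependent only on NumPy and Pillow, it returns NumPy arrays rather than tensors.

```python
from pathlib import Path

import numpy as np
from PIL import Image


class WeedCropDataset:
    """Map-style dataset: supports len(ds) and ds[i].

    Assumes <root>/<split>/images/NAME.jpg pairs with
    <root>/<split>/masks/NAME_mask.png.
    """

    def __init__(self, root, split="train"):
        split_dir = Path(root) / split
        self.mask_dir = split_dir / "masks"
        self.image_paths = sorted((split_dir / "images").glob("*.jpg"))
        # Catch missing masks up front rather than halfway through an epoch.
        missing = [p.name for p in self.image_paths if not self._mask_path(p).exists()]
        if missing:
            raise FileNotFoundError(f"No mask found for: {missing}")

    def _mask_path(self, image_path):
        return self.mask_dir / f"{image_path.stem}_mask.png"

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        image = np.array(Image.open(image_path).convert("RGB"))   # (H, W, 3), uint8
        mask = np.array(Image.open(self._mask_path(image_path)))  # (H, W), class indices
        # Channels-first float image in [0, 1]; integer mask of class IDs.
        image = image.transpose(2, 0, 1).astype(np.float32) / 255.0
        return image, mask.astype(np.int64)
```

In an actual training script you would typically have this class inherit from torch.utils.data.Dataset and convert both arrays with torch.from_numpy before returning them. The mask is cast to int64 because losses such as torch.nn.CrossEntropyLoss expect integer class indices as targets.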

Quick Pro Tips to Make Annotation Easier

Batch your labeling sessions: Don’t try to label 100 images in one go—split it into 10-15 image batches to avoid fatigue.
Use keyboard shortcuts: Most tools have shortcuts for switching between classes, undoing mistakes, or zooming—learn these to speed up your workflow.
Label diverse scenarios: Include images taken in different lighting, angles, and weed densities. This helps your model generalize better to real-world conditions.

The question answered in this post comes from Stack Exchange and was asked by DukeLover.