
How to Expand an ImageFolder Dataset with Multiple Transforms in PyTorch (Custom Dataset for DCGAN)

You want to take your existing ImageFolder dataset, apply 5 different transforms, and append each transformed version to the original dataset to expand its size. Here is one way to do it in PyTorch:

Core Idea

We'll create several ImageFolder instances over the same folder, each with its own augmentation transform followed by the shared base preprocessing, then concatenate them (original + 5 augmented versions) into one dataset using ConcatDataset. ImageFolder stores only file paths and decodes images when a sample is requested, so the extra instances cost little memory, and the combined dataset works with a standard PyTorch DataLoader.
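To make the concatenation idea concrete, here is a minimal pure-Python sketch of how a concatenated dataset can route one global index to the right sub-dataset. It is an illustration only, not PyTorch's implementation; the class name `SimpleConcat` and the toy string "images" are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


class SimpleConcat:
    """Toy stand-in for a concatenated dataset (hypothetical, for illustration)."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of the sub-dataset lengths, e.g. [3, 6]
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First sub-dataset whose running total exceeds the global index
        dataset_idx = bisect_right(self.cumulative_sizes, index)
        # Where that sub-dataset starts in the global index space
        start = self.cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
        return self.datasets[dataset_idx][index - start]


# Two "views" of the same three images: the original and a flipped copy
original = ["img0", "img1", "img2"]
flipped = [name + "_flipped" for name in original]
combined = SimpleConcat([original, flipped])
```

Indices 0 to 2 come from the first view and indices 3 to 5 from the second, so a shuffled loader draws from every view with equal probability.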

Step-by-Step Implementation

1. Import Required Libraries

First, pull in the necessary modules:

import torch
from torchvision.datasets import ImageFolder
from torchvision import transforms
from torch.utils.data import ConcatDataset

2. Define Transforms

Separate your base preprocessing (transforms every image needs, like resizing, converting to tensor, normalization) from your augmentation transforms (the 5 unique variations you want to apply):

# Base transforms: Applied to all images (original and augmented)
base_transform = transforms.Compose([
    transforms.Resize((64, 64)),  # Match DCGAN's typical input size
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Standard DCGAN normalization
])

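# 5 unique augmentation pipelines (each adds one augmentation before base transforms).
# Note: with p=0.5, a flip pipeline leaves about half of its samples unflipped;
# use p=1.0 if every sample in that copy should be flipped.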
augment_pipelines = [
    # Augmentation 1: Random horizontal flip
    transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),
        base_transform
    ]),
    # Augmentation 2: Random resized crop
    transforms.Compose([
        transforms.RandomResizedCrop(64, scale=(0.8, 1.0)),
        base_transform
    ]),
    # Augmentation 3: Random rotation
    transforms.Compose([
        transforms.RandomRotation(degrees=15),
        base_transform
    ]),
    # Augmentation 4: Color jitter (brightness/contrast adjustment)
    transforms.Compose([
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        base_transform
    ]),
    # Augmentation 5: Random vertical flip (adjust based on your dataset's suitability)
    transforms.Compose([
        transforms.RandomVerticalFlip(p=0.5),
        base_transform
    ])
]
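As a quick check of the arithmetic behind the Normalize step in base_transform: with a mean of 0.5 and a standard deviation of 0.5 per channel, each value x in [0, 1] produced by ToTensor becomes (x - 0.5) / 0.5, which lies in [-1, 1]. That matches the output range of the tanh activation that DCGAN generators commonly end with. The helper names below are hypothetical and work on plain floats rather than tensors.

```python
def normalize(x, mean=0.5, std=0.5):
    """Same per-value formula that Normalize applies: (x - mean) / std."""
    return (x - mean) / std


def denormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying generated images."""
    return y * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]                   # values as ToTensor would scale them
normalized = [normalize(p) for p in pixels]      # now in [-1, 1]
restored = [denormalize(v) for v in normalized]  # back in [0, 1]
```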

3. Create and Concatenate Datasets

Now, create the original dataset, plus a dataset for each augmentation pipeline, then merge them all:

# Path to your custom dataset folder (structured per ImageFolder requirements)
dataset_root = "path/to/your/custom/dataset"

# Original dataset (no extra augmentation, just base transforms)
original_dataset = ImageFolder(root=dataset_root, transform=base_transform)

# Initialize a list with the original dataset
expanded_datasets = [original_dataset]

# Add each augmented dataset to the list
for pipeline in augment_pipelines:
    augmented_dataset = ImageFolder(root=dataset_root, transform=pipeline)
    expanded_datasets.append(augmented_dataset)

# Concatenate all datasets into one large expanded dataset
final_expanded_dataset = ConcatDataset(expanded_datasets)
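A quick sanity check of the concatenation can be sketched without any image files. The snippet below is an assumption-laden stand-in: it uses a TensorDataset of random tensors in place of ImageFolder, and reuses the same dataset six times in place of the six transformed views, purely to show how lengths and indices line up.

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

# Stand-in for ImageFolder: 10 fake RGB images (64x64) with dummy labels
fake_images = torch.rand(10, 3, 64, 64)
fake_labels = torch.zeros(10, dtype=torch.long)
stand_in = TensorDataset(fake_images, fake_labels)

# One "original" plus five "augmented" views, mirroring the structure above
combined = ConcatDataset([stand_in] * 6)

n = len(stand_in)
total = len(combined)  # 6 * n

# Index i + k * n refers to the i-th source item in the k-th view
image_a, _ = combined[3]
image_b, _ = combined[3 + 2 * n]
same_source = torch.equal(image_a, image_b)
```

With real ImageFolder views the two images would differ by whatever transform each view applies, but they would still come from the same file.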

4. Use with DataLoader

You can now use this expanded dataset with a standard DataLoader just like your original dataset:

from torch.utils.data import DataLoader

dataloader = DataLoader(final_expanded_dataset, batch_size=64, shuffle=True, num_workers=4)
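For a rough sense of what the expansion does to an epoch, the batch count can be worked out by hand. The figure of 1000 original images is a made-up example; the calculation assumes the DataLoader default of keeping the final partial batch (drop_last=False).

```python
import math

num_original = 1000   # hypothetical number of images in the folder
num_views = 6         # 1 original + 5 augmented views
batch_size = 64

total_samples = num_original * num_views
# The last partial batch is kept, so round up
batches_per_epoch = math.ceil(total_samples / batch_size)
last_batch_size = total_samples - (batches_per_epoch - 1) * batch_size
```

Under these assumptions an epoch has 94 batches, and the last one holds 48 samples instead of 64.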

Key Notes

  • Memory Efficiency: ImageFolder loads images on demand (lazy loading), so creating multiple dataset instances won't duplicate images in memory. Each instance keeps its own list of file paths and applies its transform only when a sample is fetched.
  • Customization: Adjust the augmentation transforms to fit your dataset. For example, avoid vertical flips if you're working with face data, or tweak rotation degrees based on your objects' orientation constraints.
  • Scale: The final dataset has 6x as many samples as the original (1x original + 5x augmented), which can help when training a DCGAN on a small dataset. Because the random transforms are re-sampled on every access, the augmented samples also vary from epoch to epoch.
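The lazy-loading point can be illustrated with a small pure-Python sketch. The class `LazyView` and the `fake_loader` function are hypothetical stand-ins, not torchvision code; they only mimic the pattern of storing paths up front and doing the expensive load inside `__getitem__`.

```python
class LazyView:
    """Hypothetical lazy dataset: stores paths only, loads on access."""

    def __init__(self, paths, loader, transform):
        self.paths = paths        # file paths, not pixel data
        self.loader = loader      # called only inside __getitem__
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        return self.transform(self.loader(self.paths[index]))


load_calls = []


def fake_loader(path):
    # Stand-in for reading an image from disk; records each call
    load_calls.append(path)
    return [1, 2, 3]


paths = ["a.png", "b.png"]
original_view = LazyView(paths, fake_loader, transform=lambda px: px)
flipped_view = LazyView(paths, fake_loader, transform=lambda px: px[::-1])

calls_before_access = len(load_calls)  # nothing has been loaded yet
sample = flipped_view[0]               # triggers exactly one load
```

Creating the two views loads nothing; only indexing does, which is why adding more views mainly costs extra decoding and transform time per epoch rather than memory.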

The question comes from Stack Exchange; it was asked by Stefan Radonjic.
