Why Can't My CNN Script Predict Correctly? A Beginner's Image Recognition Troubleshooting Request

Ahua AIGC Lab

2026-5-26

Hey there! Let’s walk through troubleshooting your multi-class image classification script step by step. Since you’re adapting a binary cat/dog tutorial, there are a few key areas that often cause issues when switching to 7 classes—let’s start with the most likely culprits.

1. First: Check Your Data (The Foundation!)

Your dataset size is a big red flag here—10-15 images per class is way too small for a deep learning model to generalize. Here’s how to fix and verify this:

Add data augmentation immediately: This artificially expands your dataset by applying random transformations to training images. For Keras, use ImageDataGenerator like this:

```
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    zoom_range=0.15
)
```

Make sure you only apply augmentation to training data, not validation/test data.

Confirm preprocessing consistency: If your training pipeline scales images to [0,1] (with rescale=1./255), you must do the exact same scaling when loading images for prediction. Skipping this will make your model’s outputs meaningless.
Verify label mapping: When using flow_from_directory, Keras assigns integer labels to the class folders in alphanumeric order of their names. Print train_generator.class_indices to check which folder maps to which integer label; if you assume a different order, you will misread the prediction results.
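
To see why the label order can be surprising, here is a minimal standard-library sketch that mimics sorting folder names the way described above. The folder names are hypothetical, and infer_class_indices is an illustrative helper, not a Keras function; for your own data, train_generator.class_indices remains the source of truth.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Map each class subfolder to an integer, in sorted (alphanumeric) name order."""
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Hypothetical dataset layout: one subfolder per class.
with tempfile.TemporaryDirectory() as root:
    for folder in ["class_1", "class_2", "class_10"]:
        os.mkdir(os.path.join(root, folder))
    mapping = infer_class_indices(root)

# "class_10" sorts before "class_2" because names are compared as strings.
print(mapping)
```

If your folders are numbered, zero-padding the names (class_01, class_02, ...) is one way to make the string order match the order you expect.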

2. Fix Model Architecture for Multi-Class Classification

Binary and multi-class models have critical differences—double-check these:

Output layer activation: Replace the binary sigmoid activation with softmax (since it outputs a probability distribution across all 7 classes, summing to 1).
Output layer size: Change the output layer’s neuron count from 1 to 7 (one per class).
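Loss function: Swap binary_crossentropy for either categorical_crossentropy (if you’re using one-hot encoded labels) or sparse_categorical_crossentropy (if you’re using integer labels). A loss that doesn’t match the label format and output shape will either raise a shape error or train toward the wrong objective. If your tutorial code passes class_mode='binary' to flow_from_directory, change that too: 'categorical' yields one-hot labels and 'sparse' yields integer labels.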
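
The relationship between softmax and the two loss variants can be illustrated with plain NumPy. This is a sketch for intuition only, not how Keras implements these losses; the scores and the label below are made up.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)


def categorical_crossentropy(one_hot_label, probs):
    """Cross-entropy for a one-hot encoded label."""
    return -np.sum(one_hot_label * np.log(probs))


def sparse_categorical_crossentropy(int_label, probs):
    """Cross-entropy for an integer label."""
    return -np.log(probs[int_label])


num_classes = 7
logits = np.array([0.2, 1.5, -0.3, 0.0, 2.1, -1.0, 0.4])  # made-up raw scores
probs = softmax(logits)

int_label = 4                                    # integer label format
one_hot_label = np.eye(num_classes)[int_label]   # one-hot label format

loss_one_hot = categorical_crossentropy(one_hot_label, probs)
loss_sparse = sparse_categorical_crossentropy(int_label, probs)
print(probs.sum(), loss_one_hot, loss_sparse)
```

Both functions compute the same quantity; they differ only in the label format they expect, which is why the loss has to match how your labels are encoded.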

3. Diagnose Training Behavior

Track training metrics: If your accuracy stays around 14% (1/7, random guess), your model isn’t learning. Try reducing your batch size (since data is small, use 2-4) or lowering the learning rate. If training accuracy is high but validation accuracy is garbage, you’re overfitting—add Dropout layers (e.g., Dropout(0.5) before your final dense layer) or use transfer learning (pre-trained models like VGG16 or MobileNet work wonders for small datasets).
Check model summary: Run model.summary() to confirm your output layer has 7 neurons with softmax activation—this is an easy mistake to make when adapting a binary model.
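
As a rough sketch of what a 7-class model with Dropout before the final dense layer might look like, assuming TensorFlow 2.x is installed: the input size, layer widths, and optimizer here are assumptions chosen for illustration, not values taken from the original script.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 7               # one output neuron per class
INPUT_SHAPE = (150, 150, 3)   # assumed image size; use whatever your generators produce

model = models.Sequential([
    layers.Input(shape=INPUT_SHAPE),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                              # regularization before the final dense layer
    layers.Dense(NUM_CLASSES, activation="softmax"),  # probability distribution over 7 classes
])

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",  # assumes one-hot labels (class_mode='categorical')
    metrics=["accuracy"],
)

model.summary()  # the last row should report an output shape of (None, 7)
```

If your generators produce integer labels instead, the loss in compile() would be sparse_categorical_crossentropy.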

4. Debug the Prediction Pipeline

Even if your model trains well, prediction can fail due to small oversights:

Resize and preprocess input images: Ensure your prediction images match the input size your model expects (e.g., (224, 224, 3) for most pre-trained models). Don’t forget to scale them to [0,1] just like training data, and add a batch dimension so the array passed to model.predict() has shape (1, height, width, 3).
Parse predictions correctly: model.predict() returns a probability array for each class. Use np.argmax() to get the index of the highest probability, then map it back to your class names:
```
import numpy as np

predictions = model.predict(processed_img)
predicted_idx = np.argmax(predictions[0])
class_names = list(train_generator.class_indices.keys())
predicted_class = class_names[predicted_idx]
```
Don’t confuse the raw probability values with class labels!
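
The resize-and-scale step that produces processed_img could be sketched like this, assuming Pillow and NumPy are available. prepare_image is a hypothetical helper name, the 224x224 target size is an assumption, and the interpolation Pillow uses may differ slightly from what your Keras generator applies.

```python
import numpy as np
from PIL import Image


def prepare_image(img, target_size=(224, 224)):
    """Resize a PIL image, scale pixels to [0, 1], and add a batch dimension."""
    # Force 3 channels and match the model's input size (PIL expects (width, height)).
    img = img.convert("RGB").resize(target_size)
    arr = np.asarray(img, dtype=np.float32) / 255.0  # same scaling as rescale=1./255
    return np.expand_dims(arr, axis=0)               # shape: (1, height, width, 3)


# Stand-in for Image.open("your_image.jpg"); the file name would be your own.
example = Image.new("RGB", (640, 480), color=(255, 128, 0))
processed_img = prepare_image(example)
print(processed_img.shape, processed_img.min(), processed_img.max())
```

Whatever size you pass as target_size should be the same one used for target_size in flow_from_directory during training.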

5. Quick Sanity Check

Grab an image from your training set and run it through your prediction pipeline. If it predicts the wrong class, your preprocessing or model architecture is broken. If it predicts correctly but new images fail, your model is overfitting and needs more data/augmentation/regularization.
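
This check can be extended from one image to a handful of training images. The sketch below uses made-up probabilities in place of real model.predict() output, and sanity_check is a hypothetical helper name.

```python
import numpy as np


def sanity_check(probabilities, true_indices):
    """Return the fraction of samples whose argmax prediction matches the known label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(true_indices)))


# Made-up probabilities for three training images and 7 classes; in practice
# these would come from model.predict() on images the model has already seen.
fake_probs = np.array([
    [0.70, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.10, 0.50, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.70],
])
known_labels = [0, 2, 3]  # the third label deliberately disagrees with the prediction

train_hit_rate = sanity_check(fake_probs, known_labels)
print(f"correct on training samples: {train_hit_rate:.2f}")
```

A hit rate well below 1.0 on images the model was trained on points at the preprocessing, the label mapping, or the architecture rather than at overfitting.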

The question in this post comes from Stack Exchange; it was asked by Christoffer.