Sample-count mismatch error when training a handwritten-digit DNN: how do I fix it?

阿华AIGC实验室

2026-5-7

Fixing the input/target sample-count mismatch when training a DNN on MNIST

Hey, I spotted a few key problems in your code right away. They are what triggers that annoying ValueError, so let's fix them one step at a time:

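Root-cause analysis

The error you hit, Input arrays should have the same number of samples as target arrays, comes down to two serious problems stacked on top of each other:

The number of classes is wrong: you set num_classes to 60000, which is the number of training samples. MNIST classifies the digits 0-9, so num_classes must be 10.
The MNIST data is loaded twice: after preprocessing the features and labels, you load the raw data again, which overwrites the processed X_train and Y_train. Training then runs on 28x28 images that were never reshaped and on raw integer labels that were never one-hot encoded, neither of which matches what the model expects.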
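To make the first point concrete, here is a minimal NumPy sketch of how the label matrix shape depends on num_classes. The helper one_hot is a hypothetical stand-in for keras.utils.to_categorical, and the six labels are made up for illustration rather than taken from MNIST.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical stand-in for to_categorical, for illustration only."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Set a single 1.0 per row, in the column given by that row's label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Six made-up digit labels standing in for the 60000 training labels.
sample_labels = np.array([5, 0, 4, 1, 9, 2])

correct = one_hot(sample_labels, num_classes=10)
wrong = one_hot(sample_labels, num_classes=60000)  # sample count used as class count

print(correct.shape)  # (6, 10)
print(wrong.shape)    # (6, 60000)
```

The number of columns is what has to line up with the width of the model's output layer, which is why a 10-unit output layer needs num_classes = 10.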

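Smaller issues that also need fixing

Beyond the core errors, a few details in the code need adjusting:

You add a Flatten() layer, but the images are already reshaped into 784-dimensional feature vectors, so Flatten is not needed.
The code uses Flatten and init without importing or defining them, which raises further errors.
The last layer should use softmax (paired with the categorical_crossentropy loss) rather than sigmoid.
The visualization code belongs after loading the data and before the reshape. Otherwise you are plotting an array that has already been flattened to one dimension, and no image shows up.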
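As a rough illustration of the softmax point, the sketch below applies both activations to one made-up vector of 10 output scores. The sigmoid and softmax functions are written by hand in NumPy for this example and are not the Keras implementations; the logits are arbitrary.

```python
import numpy as np

def sigmoid(z):
    # Squashes each score independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum first for numerical stability.
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Arbitrary scores for the 10 digit classes of a single sample.
logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.0, 0.5, -0.5, 1.5, 0.3, -2.0]])

sigmoid_scores = sigmoid(logits)
softmax_probs = softmax(logits)

print("sigmoid sum:", sigmoid_scores.sum())
print("softmax sum:", softmax_probs.sum())
```

The softmax outputs sum to 1 across the 10 classes, so they can be read as one probability distribution, which is what categorical_crossentropy assumes. The sigmoid outputs are independent per-class scores with no such constraint.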

Complete fixed code

# Imports - fill in the ones that were missing
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from keras.initializers import RandomNormal # replaces your undefined init; RandomNormal is just one example choice

# Configuration options - corrected number of classes
feature_vector_length = 784
num_classes = 10 # MNIST has 10 classes, the digits 0-9

# Load the data - only once!
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# Visualize a sample first (before the reshape, otherwise there is no image to show)
plt.imshow(X_train[0], cmap='Greys')
plt.show()

# Reshape the data into the flat feature vectors an MLP expects
X_train = X_train.reshape(X_train.shape[0], feature_vector_length)
X_test = X_test.reshape(X_test.shape[0], feature_vector_length)

# Normalize pixel values to the range [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# Convert target classes to one-hot encoding (with the correct num_classes)
Y_train = to_categorical(Y_train, num_classes)
Y_test = to_categorical(Y_test, num_classes)

# Set the input shape
input_shape = (feature_vector_length,)
print(f'Feature shape: {input_shape}')

# Create the model - corrected structure
init = RandomNormal(mean=0.0, stddev=0.05) # weight initializer
model = Sequential()
# No Flatten layer needed, because the data is already reshaped
model.add(Dense(350, input_shape=input_shape, activation="sigmoid", kernel_initializer=init))
model.add(Dense(50, activation="sigmoid", kernel_initializer=init))
model.add(Dense(num_classes, activation="softmax", kernel_initializer=init)) # softmax on the last layer

# Configure the model and start training
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=10, batch_size=250, verbose=1, validation_split=0.2)

# Test the model after training
test_results = model.evaluate(X_test, Y_test, verbose=1)
print(f'Test results - Loss: {test_results[0]} - Accuracy: {test_results[1]*100}%') # shown as a percentage for readability

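Why do these fixes work?

With num_classes corrected, the one-hot labels have shape (60000, 10): 60000 rows to match the 60000 input samples, and 10 columns to match the 10-unit output layer.
Removing the duplicate data load ensures that training uses the preprocessed features and labels.
The adjusted model structure and activation functions follow common practice for multi-class classification.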
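One way to catch this class of mistake before calling model.fit is a small shape check. The sketch below is only an illustration: check_training_shapes is a hypothetical helper, not part of Keras, and the zero-filled arrays stand in for a handful of MNIST samples.

```python
import numpy as np

def check_training_shapes(features, targets, feature_vector_length, num_classes):
    """Hypothetical pre-fit sanity check; raises ValueError with a readable message."""
    if features.ndim != 2 or features.shape[1] != feature_vector_length:
        raise ValueError(
            f"features have shape {features.shape}, expected (n, {feature_vector_length})"
        )
    if targets.ndim != 2 or targets.shape[1] != num_classes:
        raise ValueError(
            f"targets have shape {targets.shape}, expected (n, {num_classes})"
        )
    if features.shape[0] != targets.shape[0]:
        raise ValueError(
            f"{features.shape[0]} feature rows but {targets.shape[0]} target rows"
        )

# Preprocessed arrays: flattened features and one-hot labels pass silently.
X_ok = np.zeros((6, 784), dtype="float32")
Y_ok = np.zeros((6, 10), dtype="float32")
check_training_shapes(X_ok, Y_ok, 784, 10)

# Arrays as they look right after a second load: 28x28 images, integer labels.
X_raw = np.zeros((6, 28, 28), dtype="uint8")
Y_raw = np.zeros((6,), dtype="uint8")
message = None
try:
    check_training_shapes(X_raw, Y_raw, 784, 10)
except ValueError as err:
    message = str(err)
    print("Shape check failed:", message)
```

Calling such a check right before model.fit makes an accidental second mnist.load_data() fail early, with a message that names the offending array.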

The question comes from Stack Exchange and was asked by Bernardo Augusto.
