Why does using X[0] in MNIST classifier code trigger a KeyError?

阿华AIGC实验室

2026-5-11

Hey there, let's fix that frustrating KeyError you're facing with the MNIST dataset!

Root cause

The issue is that fetch_openml returns the MNIST data as a pandas DataFrame, not a NumPy array (since scikit-learn 0.24 the as_frame parameter defaults to 'auto', which yields a DataFrame for this dataset). On a DataFrame, X[0] is a column lookup, not a row lookup: pandas searches for a column labeled 0. The columns are actually labeled pixel1, pixel2, ..., pixel784, so no column with the label 0 exists, hence the KeyError.
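To see the behavior without downloading MNIST, here is a minimal sketch on a tiny stand-in DataFrame. The 3x4 frame and the helper name lookup_raises_keyerror are made up for illustration; only the pixelN column naming mirrors the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for MNIST: 3 samples, 4 "pixel" columns
df = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
    columns=["pixel1", "pixel2", "pixel3", "pixel4"],
)


def lookup_raises_keyerror(frame, key):
    """Return True if frame[key] raises KeyError (illustrative helper)."""
    try:
        frame[key]
    except KeyError:
        return True
    return False


# df[0] is a column lookup, and there is no column labeled 0
print(lookup_raises_keyerror(df, 0))         # True
# A real column label works fine
print(lookup_raises_keyerror(df, "pixel1"))  # False
```

The same thing happens with the full 784-column MNIST frame: the square brackets select by column label, so the integer 0 never reaches the rows.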

Two solutions

You have two straightforward ways to resolve this:

1. Convert the DataFrame to a numpy array

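If you prefer working with NumPy arrays (the usual input format for scikit-learn estimators), call .to_numpy() on the data when extracting it. The older .values attribute, used below, returns the same array; the pandas documentation recommends .to_numpy() for new code: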

from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
# Convert DataFrame to numpy array for array-style indexing
X, y = mnist["data"].values, mnist["target"]
# Now you can access the first sample with X[0]
print(X[0].shape)  # Output: (784,)

2. Use pandas' positional indexing (keep DataFrame)

If you want to retain the pandas DataFrame format, use .iloc[] to access rows by their numerical position:

from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
X, y = mnist["data"], mnist["target"]
# Access the first row using positional indexing with iloc
first_sample = X.iloc[0]
print(first_sample.shape)  # Output: (784,)
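As a quick side-by-side check of the two fixes, the sketch below runs both on a small stand-in DataFrame rather than the real dataset. The 3x4 shape and the variable names are assumptions for illustration; with MNIST each row would have 784 values and reshape to (28, 28).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for MNIST: 3 samples, 4 "pixel" columns
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=[f"pixel{i}" for i in range(1, 5)],
)

first_as_array = df.to_numpy()[0]  # fix 1: convert, then index like an array
first_as_series = df.iloc[0]       # fix 2: positional row access on the DataFrame

print(type(first_as_array).__name__)   # ndarray
print(type(first_as_series).__name__)  # Series

# Both hold the same values; only the container type differs
same_values = np.array_equal(first_as_array, first_as_series.to_numpy())
print(same_values)  # True

# A Series has no reshape method, so convert it before reshaping into an image
image = first_as_series.to_numpy().reshape(2, 2)
print(image.shape)  # (2, 2)
```

The practical difference shows up in later steps: code that expects an array (for example, reshaping a sample for plotting) needs the .to_numpy() conversion when you keep the DataFrame.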

Either approach will let you access individual samples without hitting that KeyError. Feel free to reach out if you run into any other roadblocks!

The question comes from Stack Exchange; it was asked by Arjun Deshwal.