
Troubleshooting MNIST Image Misclassification: How to Visualize What the Neural Network Perceives

Got it, let's fix this misclassification visualization step by step. You've got an MNIST model that's labeling a '1' as a '2', and you want to see what the network is actually "perceiving" by turning activated nodes into images, then comparing that to your test image. Here's how to extend your existing code to make this happen:

First, Tweak Your Network Class to Capture Hidden Layer Activations

Right now, your query method only returns the final output. We need to grab the hidden layer’s activation values too, since those are key to seeing what the network picks up from your image:

import numpy as np

class Network():
    def __init__(self, inn, hidd, outt, lr):
        self.InNodes = inn
        self.HiddenNodes = hidd
        self.OutNodes = outt
        self.LearningRate = lr
        # Weight init: std = 1/sqrt(number of incoming links) into each layer
        self.wih = np.random.normal(0.0, pow(self.InNodes, -0.5), (self.HiddenNodes, self.InNodes))
        self.woh = np.random.normal(0.0, pow(self.HiddenNodes, -0.5), (self.OutNodes, self.HiddenNodes))
        
    def train(self, inputs_list, targets_list):
        inputs = np.array(inputs_list, ndmin=2).T
        targets = np.array(targets_list, ndmin=2).T
        hidden_inputs = np.dot(self.wih, inputs)
        hidden_outputs = sigmoid(hidden_inputs)
        final_inputs = np.dot(self.woh, hidden_outputs)
        final_outputs = sigmoid(final_inputs)
        output_errors = targets - final_outputs
        hidden_errors = np.dot(self.woh.T, output_errors)
        self.woh += self.LearningRate * np.dot((output_errors * final_outputs * (1 - final_outputs)), np.transpose(hidden_outputs))
        self.wih += self.LearningRate * np.dot((hidden_errors * hidden_outputs * (1 - hidden_outputs)), np.transpose(inputs))
        
    def query(self, inputs_list):
        inputs = np.array(inputs_list, ndmin=2).T
        hidden_inputs = np.dot(self.wih, inputs)
        hidden_outputs = sigmoid(hidden_inputs)
        final_inputs = np.dot(self.woh, hidden_outputs)
        final_outputs = sigmoid(final_inputs)
        return final_outputs, hidden_outputs  # Return both outputs and hidden activations

def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))
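As a quick sanity check on the shapes involved, here's a standalone sketch of the same forward pass with hypothetical random weights (sizes match the 784-input, 10-hidden, 10-output network used in this answer):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Hypothetical random weights with the same shapes the Network class builds
wih = rng.normal(0.0, 784 ** -0.5, (10, 784))   # input -> hidden
woh = rng.normal(0.0, 10 ** -0.5, (10, 10))     # hidden -> output

inputs = rng.random((784, 1))                          # column vector, as query() builds it
hidden_outputs = sigmoid(np.dot(wih, inputs))          # (10, 1) -- what we visualize later
final_outputs = sigmoid(np.dot(woh, hidden_outputs))   # (10, 1), one score per digit
print(hidden_outputs.shape, final_outputs.shape)       # → (10, 1) (10, 1)
```

Returning `hidden_outputs` alongside `final_outputs` costs nothing extra, since the forward pass already computes it.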

Preprocess Your Test Image to Match Training Data

Your raw image needs to be normalized the same way as the MNIST training data (0.01 to 1.0 range) — mismatched preprocessing is a common cause of weird predictions, so don’t skip this:

import imageio
import matplotlib.pyplot as plt

# Load and prep your test image
img_array = imageio.imread(r"\IMG\output-onlinepngtools.png", as_gray=True)
# MNIST digits are white ink on a black background, so invert if your scan is black-on-white
img_data = 255.0 - img_array.reshape(784)  # assumes a 28x28 grayscale image
# Normalize to match MNIST's input range
img_data = (img_data / 255.0 * 0.99) + 0.01
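If you want to verify the scaling without loading a file, here's a quick check using a synthetic 28x28 array as a stand-in for your PNG (hypothetical pixel values):

```python
import numpy as np

# Synthetic stand-in for a scanned digit: black ink on a white background
img_array = np.full((28, 28), 255.0)  # white background
img_array[8:20, 12:16] = 0.0          # black vertical stroke, roughly a '1'

img_data = 255.0 - img_array.reshape(784)    # invert so ink is bright, like MNIST
img_data = (img_data / 255.0 * 0.99) + 0.01  # scale into MNIST's [0.01, 1.0] range
print(img_data.min(), img_data.max())        # → 0.01 1.0
```

If the printed range isn't [0.01, 1.0], the network is seeing inputs unlike anything it trained on, which alone can explain a bad prediction.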

Run the Query and Grab Activation Data

Now get the model’s prediction and the hidden layer activations we need for visualization:

# Load training data (assuming mnist_loader is the standard Nielsen implementation)
import mnist_loader
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

# Initialize and train the network (you'll want to train it properly first!)
input_nodes = 784
hidden_nodes = 10  # deliberately small so every hidden node fits in the plot grid
output_nodes = 10
learning = 0.1
Net = Network(input_nodes, hidden_nodes, output_nodes, learning)

# Train for a few epochs (adjust based on your needs)
# mnist_loader returns (784, 1) and (10, 1) column vectors; flatten them so
# this Network's train() can rebuild its own column vectors
for epoch in range(5):
    for x, y in training_data:
        Net.train(x.flatten(), y.flatten())

# Get model outputs and hidden layer activations
outputs, hidden_outputs = Net.query(img_data)
predicted_label = np.argmax(outputs)
print(f"Predicted Label: {predicted_label} | Actual Label: 1")
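Beyond the argmax, the full output vector tells you how close the call was: if '1' is the runner-up with a similar score, the network nearly got it right. A sketch with a hypothetical output vector (in your code, use the `outputs` array returned by `query`):

```python
import numpy as np

# Hypothetical (10, 1) output vector from query(): one sigmoid score per digit
outputs = np.array([[0.04], [0.31], [0.52], [0.05], [0.02],
                    [0.03], [0.01], [0.06], [0.08], [0.02]])

ranked = np.argsort(outputs.flatten())[::-1]  # digits sorted by score, best first
print(f"Top guess:  {ranked[0]} (score {outputs[ranked[0]][0]:.2f})")   # → 2 (score 0.52)
print(f"Runner-up:  {ranked[1]} (score {outputs[ranked[1]][0]:.2f})")   # → 1 (score 0.31)
```

A narrow margin between '2' and '1' suggests a borderline input; a wide one points at a preprocessing or training problem.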

Visualize: Test Image vs. Network's "Perception"

We’ll plot three things: your original test image, the hidden layer’s learned features (with their activation levels), and what the network associates with the incorrect label ('2').

Full Visualization Code

plt.figure(figsize=(14, 6))

# Plot your original test image
plt.subplot(2, 6, 1)
plt.imshow(img_array, cmap='gray')
plt.title("Test Image\n(Actual: 1)")
plt.axis('off')

# Plot each hidden node's learned feature + activation level
for i in range(Net.HiddenNodes):
    plt.subplot(2, 6, i+2)
    # Reshape the hidden node's weights into a 28x28 image
    weight_img = Net.wih[i].reshape(28, 28)
    plt.imshow(weight_img, cmap='gray')
    # Show how active this node was for your test image
    plt.title(f"Hidden Node {i}\nActivation: {hidden_outputs[i][0]:.2f}")
    plt.axis('off')

plt.tight_layout()
plt.show()

# Now plot what the network associates with the incorrect label ('2')
plt.figure(figsize=(6, 6))
# Combine output layer weights (for label 2) with hidden layer weights to see the "perceived" image
perceived_img = np.dot(Net.woh[2], Net.wih).reshape(28, 28)
plt.imshow(perceived_img, cmap='gray')
plt.title("Network's Perception of '2'")
plt.axis('off')
plt.show()

What This Shows You

  • The hidden node images reveal the low-level features the network learned (like edges, curves, or stroke patterns).
  • The activation values tell you which of those features the network detected in your test image. If nodes tied to "2"-like features are highly activated, that’s exactly why the model misclassified your '1'.
  • The "perceived '2'" image shows the composite feature the network uses to identify a '2' — you’ll probably see it matches some part of your test image that tricked the model.
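To make the second point concrete, you can rank hidden nodes by how hard they push the '2' output: multiply each node's activation by its outgoing weight into output 2 (its pre-sigmoid contribution). A sketch with hypothetical stand-ins for `Net.woh` and `hidden_outputs` (swap in the real arrays in your code):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical stand-ins -- in your code, use Net.woh and the hidden_outputs from query()
woh = rng.normal(0.0, 10 ** -0.5, (10, 10))
hidden_outputs = rng.random((10, 1))

# Each hidden node's pre-sigmoid contribution to the '2' output
contributions = woh[2] * hidden_outputs.flatten()
top_nodes = np.argsort(contributions)[::-1][:3]  # the three strongest pushes toward '2'
for i in top_nodes:
    print(f"Hidden node {i}: contribution {contributions[i]:+.3f}")
```

The nodes this prints are the ones whose weight images (from the grid above) best explain the misclassification: look for the part of your '1' that overlaps their bright regions.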

The question in this post originally comes from Stack Exchange, asked by user mavish.
