How do you fuse the predictions of CNN, LSTM, and DNN models with Keras?

阿华AIGC实验室

2026-5-21

Hey there! Fusing predictions from CNN, LSTM, and DNN models in Keras is very doable. There are a few go-to methods, depending on whether you're working on classification or regression and how much complexity you want to add. Let's walk through the most practical approaches:

1. Weighted Average (Great for Regression & Probabilistic Classification)

This is the simplest method: take a weighted sum of the models' predictions, with weights that are either tuned by hand or learned by a small neural network. All three prediction arrays must have the same shape.

Manual Weighted Average

If you have a good sense of each model's performance (e.g., your LSTM outperformed the others), you can assign higher weights to better models:

import numpy as np

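# model1_pred, model2_pred, model3_pred: same-shaped outputs of the CNN, LSTM, and DNN
# Manually set weights (tune them on validation metrics; keep them summing to 1)
weights = [0.25, 0.5, 0.25]  # the LSTM (model2) gets the highest weight here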
ensemble_pred = weights[0] * model1_pred + weights[1] * model2_pred + weights[2] * model3_pred
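
If you'd rather not guess the weights, one option is a coarse grid search on the validation set. Here's a minimal sketch, assuming three same-shaped prediction arrays and the true targets. The names `search_weights` and `step` are hypothetical (not part of Keras), and MSE is just one possible selection criterion:

```python
import itertools
import numpy as np

def search_weights(preds, y_true, step=0.1):
    """Grid-search non-negative weights summing to 1 that minimize MSE.

    preds: list of exactly three same-shaped prediction arrays.
    """
    n_steps = int(round(1 / step))
    best_w, best_err = None, np.inf
    for i, j in itertools.product(range(n_steps + 1), repeat=2):
        if i + j > n_steps:
            continue  # the third weight would be negative
        w = np.array([i, j, n_steps - i - j]) / n_steps
        fused = sum(wk * p for wk, p in zip(w, preds))
        err = np.mean((fused - y_true) ** 2)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

You'd call it as `best_w, best_err = search_weights([model1_pred, model2_pred, model3_pred], y_val)`. Because the weights are tuned on `y_val`, it's worth confirming the gain on separate test data.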

Learned Weighted Average

If you want the model to automatically learn optimal weights, you can build a tiny Keras model for this:

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

# Define inputs for each model's predictions
input_cnn = Input(shape=(model1_pred.shape[1],))
input_lstm = Input(shape=(model2_pred.shape[1],))
input_dnn = Input(shape=(model3_pred.shape[1],))

# Concatenate the predictions
concat_preds = Concatenate()([input_cnn, input_lstm, input_dnn])

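# Learn per-sample weights (softmax makes each sample's three weights sum to 1).
# This assumes every model outputs one value per sample, so concat_preds has shape (batch, 3).
weight_layer = Dense(3, activation='softmax')(concat_preds)

# Compute the weighted sum: per-sample dot product of predictions and weights, shape (batch, 1)
ensemble_output = tf.keras.layers.Dot(axes=1)([concat_preds, weight_layer])

# Build and compile the fusion model
ensemble_model = Model(inputs=[input_cnn, input_lstm, input_dnn], outputs=ensemble_output)
# The output is one value per sample, which fits regression ('mse').
# With multi-class probability inputs the Dot shapes no longer match; use soft voting or stacking instead.
ensemble_model.compile(optimizer='adam', loss='mse')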

# Train on your validation set (you'll need the true labels y_val)
ensemble_model.fit(
    [model1_pred, model2_pred, model3_pred], 
    y_val, 
    epochs=10, 
    batch_size=32,
    validation_split=0.1
)

# Generate fused predictions (for an unbiased estimate, feed in predictions on held-out test data instead)
ensemble_pred = ensemble_model.predict([model1_pred, model2_pred, model3_pred])

2. Voting (Ideal for Classification Tasks)

For classification, voting combines the models' individual class predictions into a single decision. There are two types:

Hard Voting

Pick the class that the majority of models predict:

import numpy as np

# Convert probability predictions to class labels
cnn_classes = np.argmax(model1_pred, axis=1)
lstm_classes = np.argmax(model2_pred, axis=1)
dnn_classes = np.argmax(model3_pred, axis=1)

# For each sample, select the most frequent class
# (if all three models disagree, bincount().argmax() returns the smallest class index)
ensemble_classes = np.array([
    np.bincount([c, l, d]).argmax() 
    for c, l, d in zip(cnn_classes, lstm_classes, dnn_classes)
])
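
With three models and more than two classes, all three can disagree, and the plain `bincount` approach then silently favors the lowest class index. Here's a minimal sketch of one alternative tie-break, assuming each model outputs an `(n_samples, n_classes)` probability array. `hard_vote` is a hypothetical helper name, and breaking ties by mean probability is just one reasonable choice:

```python
import numpy as np

def hard_vote(prob_list):
    """Majority vote over predicted labels; ties go to the class with the
    higher mean probability across models."""
    probs = np.stack(prob_list)        # (n_models, n_samples, n_classes)
    n_classes = probs.shape[2]
    votes = probs.argmax(axis=2)       # (n_models, n_samples)
    # Count the votes each class receives for every sample
    counts = np.stack(
        [(votes == c).sum(axis=0) for c in range(n_classes)], axis=1
    )                                  # (n_samples, n_classes)
    # Tie-breaker bounded by 0.5, so it can never outweigh a whole vote
    tie_break = 0.5 * probs.mean(axis=0)
    return (counts + tie_break).argmax(axis=1)
```

Usage would look like `ensemble_classes = hard_vote([model1_pred, model2_pred, model3_pred])`.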

Soft Voting

Average the probability predictions across models, then pick the class with the highest average probability:

# Average the probability distributions
avg_probs = (model1_pred + model2_pred + model3_pred) / 3
ensemble_classes = np.argmax(avg_probs, axis=1)
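
The two voting schemes don't always agree, because soft voting takes each model's confidence into account. Here's a tiny worked example with made-up probabilities (one sample, two classes) where two lukewarm models are outweighed by one confident model:

```python
import numpy as np

# Toy probabilities for one sample and two classes (made-up numbers for illustration)
cnn_prob = np.array([[0.51, 0.49]])   # weakly favors class 0
lstm_prob = np.array([[0.51, 0.49]])  # weakly favors class 0
dnn_prob = np.array([[0.10, 0.90]])   # strongly favors class 1

# Hard voting: two of the three models vote for class 0
votes = np.stack([p.argmax(axis=1) for p in (cnn_prob, lstm_prob, dnn_prob)], axis=1)
hard_classes = np.array([np.bincount(row).argmax() for row in votes])

# Soft voting: the confident DNN pulls the averaged distribution toward class 1
avg_prob = (cnn_prob + lstm_prob + dnn_prob) / 3
soft_classes = avg_prob.argmax(axis=1)

print(hard_classes, soft_classes)  # [0] [1]
```

Which behavior you want depends on how much you trust the models' probabilities; soft voting only helps if they are reasonably well calibrated.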

3. Stacking (Advanced, Often Higher Performance)

Stacking uses a "meta-model" (another neural network, usually a simple DNN) to learn how to combine the predictions of your base models. Here's how to implement it:

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
import numpy as np

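# Note: You'll need predictions from your base models on the TRAINING set too
# (e.g., model1_pred_train, model2_pred_train, model3_pred_train).
# Ideally these are out-of-fold predictions: base models score their own training
# data too optimistically, which lets the meta-model overfit.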

# Concatenate training predictions for the meta-model
train_meta_input = np.concatenate([model1_pred_train, model2_pred_train, model3_pred_train], axis=1)
val_meta_input = np.concatenate([model1_pred, model2_pred, model3_pred], axis=1)

# Build a simple meta-model
input_meta = Input(shape=(train_meta_input.shape[1],))
x = Dense(64, activation='relu')(input_meta)
x = Dense(32, activation='relu')(x)
# Use 'softmax' for classification, 'linear' for regression
output_meta = Dense(model1_pred.shape[1], activation='softmax')(x)

meta_model = Model(inputs=input_meta, outputs=output_meta)
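# categorical_crossentropy expects one-hot labels; for regression use loss='mse' and drop the accuracy metric
meta_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])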

# Train the meta-model on the training predictions and true labels
meta_model.fit(
    train_meta_input, 
    y_train, 
    epochs=15, 
    batch_size=32,
    validation_split=0.1
)

# Generate final fused predictions
ensemble_pred = meta_model.predict(val_meta_input)
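
A common way to get training-set predictions that aren't overly optimistic is to produce them out-of-fold: split the training data into K folds and predict each fold with a model trained on the other folds. Here's a minimal sketch, assuming NumPy array inputs. `out_of_fold_predictions` and `build_model` are hypothetical names; `build_model` is assumed to return a fresh, compiled model exposing `fit` and `predict`:

```python
import numpy as np

def out_of_fold_predictions(build_model, x, y, n_folds=5, seed=0, **fit_kwargs):
    """Predict every row of x with a model that never trained on that row."""
    n = len(x)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, n_folds)
    oof = None
    for k in range(n_folds):
        held_out = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = build_model()  # fresh, untrained model for each fold
        model.fit(x[train_idx], y[train_idx], **fit_kwargs)
        fold_pred = np.asarray(model.predict(x[held_out]))
        if oof is None:
            # Allocate once the per-sample output shape is known
            oof = np.zeros((n,) + fold_pred.shape[1:], dtype=fold_pred.dtype)
        oof[held_out] = fold_pred
    return oof
```

For example, `model1_pred_train = out_of_fold_predictions(build_cnn, x_train, y_train, epochs=10, verbose=0)`, where `build_cnn` is your own (hypothetical) factory function. The trade-off is K extra training runs per base model.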

Bonus: Model-Level Fusion (If You Haven't Finished Training Yet)

If you haven't fully trained your base models yet, you can fuse their architectures directly instead of using precomputed predictions. This lets you train everything end-to-end:

from tensorflow.keras.layers import Concatenate, Dense
from tensorflow.keras.models import Model

# Assume nn_model1 (CNN), nn_model2 (LSTM), nn_model3 (DNN) are already defined
# (layer names must not collide across the three models, e.g. when they were loaded from disk)
# Get the output layers of each base model
cnn_output = nn_model1.output
lstm_output = nn_model2.output
dnn_output = nn_model3.output

# Concatenate the outputs
concat_output = Concatenate()([cnn_output, lstm_output, dnn_output])

# Add a few dense layers to learn the fusion
x = Dense(128, activation='relu')(concat_output)
x = Dense(64, activation='relu')(x)
# Adjust the output layer to your task (num_classes = number of target classes)
final_output = Dense(num_classes, activation='softmax')(x)

# Build the fused model
ensemble_model = Model(
    inputs=[nn_model1.input, nn_model2.input, nn_model3.input], 
    outputs=final_output
)

# Optional: if the base models are already trained, freeze them so only the fusion layers learn
# (set this before compile(); setting trainable on a model applies to all of its layers)
for base_model in (nn_model1, nn_model2, nn_model3):
    base_model.trainable = False

# Compile and train
ensemble_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
ensemble_model.fit(
    [x_train, x_train, x_train],  # All models take the same input here
    y_train,
    epochs=20,
    batch_size=32,
    validation_data=([x_val, x_val, x_val], y_val)
)

Pick the method that fits your task and workflow: weighted averaging and voting are great for quick wins, while stacking or model-level fusion can give better results if you have the data and compute to spare.

The question comes from Stack Exchange; it was asked by the user deep.