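How do I fuse the predictions of three models (CNN, LSTM, DNN) using Keras?
Hey there! Fusing predictions from CNN, LSTM, and DNN models in Keras is totally doable. There are a few go-to methods, depending on whether you're working on classification or regression and how much complexity you want to add. In the snippets below, model1_pred, model2_pred, and model3_pred are the CNN, LSTM, and DNN predictions on the same validation samples. Let's walk through the most practical approaches: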
1. Weighted Average (Great for Regression & Probabilistic Classification)
This is the simplest method: you combine each model's predictions with a set of weights (either manually tuned or learned via a small neural network).
Manual Weighted Average
If you have a good sense of each model's performance (e.g., your LSTM outperformed the others), you can assign higher weights to better models:
    import numpy as np

    # Manually set weights (adjust based on your model validation metrics)
    weights = [0.25, 0.5, 0.25]  # model2 (the LSTM here) gets the highest weight
    ensemble_pred = (
        weights[0] * model1_pred
        + weights[1] * model2_pred
        + weights[2] * model3_pred
    )
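If you'd rather not pick the weights by eye, one simple heuristic for regression is to make each weight proportional to the inverse of that model's validation error. The sketch below is only an illustration: `inverse_error_weights` is a made-up helper name, and the arrays are synthetic stand-ins for `y_val` and the three models' validation predictions.

```python
import numpy as np

def inverse_error_weights(y_true, preds):
    """Return one weight per model, proportional to 1 / validation MSE."""
    mse = np.array([np.mean((y_true - p) ** 2) for p in preds])
    inv = 1.0 / (mse + 1e-12)  # small constant guards against division by zero
    return inv / inv.sum()     # normalize so the weights sum to 1

# Synthetic stand-ins for y_val and the three models' validation predictions
rng = np.random.default_rng(0)
y_val_demo = rng.normal(size=(200, 1))
preds_demo = [
    y_val_demo + rng.normal(scale=0.5, size=y_val_demo.shape),  # noisiest model
    y_val_demo + rng.normal(scale=0.1, size=y_val_demo.shape),  # most accurate model
    y_val_demo + rng.normal(scale=0.3, size=y_val_demo.shape),
]

weights = inverse_error_weights(y_val_demo, preds_demo)
ensemble_pred_demo = sum(w * p for w, p in zip(weights, preds_demo))
print(weights)  # the second model should receive the largest weight
```

For classification you could swap the MSE for another validation metric (accuracy, say) and weight by the score directly instead of its inverse.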
Learned Weighted Average
If you want the model to automatically learn optimal weights, you can build a tiny Keras model for this:
    from tensorflow.keras.layers import Input, Dense, Concatenate, Reshape, Dot
    from tensorflow.keras.models import Model

    n_out = model1_pred.shape[1]  # outputs per model (1 for scalar regression)

    # Define inputs for each model's predictions
    input_cnn = Input(shape=(n_out,))
    input_lstm = Input(shape=(n_out,))
    input_dnn = Input(shape=(n_out,))

    # Concatenate the predictions: shape (batch, 3 * n_out)
    concat_preds = Concatenate()([input_cnn, input_lstm, input_dnn])

    # Learn one weight per model (softmax makes the weights sum to 1).
    # Note: the weights are computed per sample from the predictions themselves.
    weight_layer = Dense(3, activation='softmax')(concat_preds)

    # Weighted sum over the models: (batch, 3) x (batch, 3, n_out) -> (batch, n_out)
    stacked_preds = Reshape((3, n_out))(concat_preds)
    ensemble_output = Dot(axes=1)([weight_layer, stacked_preds])

    # Build and compile the fusion model
    ensemble_model = Model(inputs=[input_cnn, input_lstm, input_dnn], outputs=ensemble_output)
    # Use 'mse' for regression, 'categorical_crossentropy' for classification
    ensemble_model.compile(optimizer='adam', loss='mse')

    # Train on your validation set (you'll need the true labels y_val)
    ensemble_model.fit(
        [model1_pred, model2_pred, model3_pred], y_val,
        epochs=10, batch_size=32, validation_split=0.1,
    )

    # Generate final fused predictions (score them on data not used for this fit)
    ensemble_pred = ensemble_model.predict([model1_pred, model2_pred, model3_pred])
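The softmax layer above produces weights that depend on each sample's predictions. If one fixed set of weights is enough, ordinary least squares is a lighter alternative for single-output regression. Here's a minimal sketch on synthetic data. All arrays and names are made up for illustration, and the fitted weights are unconstrained: nothing forces them to be non-negative or to sum to 1.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic single-output predictions standing in for the three base models
p1, p2, p3 = (rng.normal(size=200) for _ in range(3))
# Synthetic target built from known weights so the result can be checked
y_demo = 0.2 * p1 + 0.5 * p2 + 0.3 * p3

# Each column holds one model's predictions: shape (n_samples, 3)
P = np.column_stack([p1, p2, p3])
fixed_weights, *_ = np.linalg.lstsq(P, y_demo, rcond=None)

# Fused prediction is a plain weighted sum of the columns
ensemble_pred_demo = P @ fixed_weights
print(np.round(fixed_weights, 3))
```

As with the Keras version, fit the weights on held-out predictions rather than on the data the base models were trained on.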
2. Voting (Ideal for Classification Tasks)
For classification, voting combines the models' predictions to pick a single class per sample. There are two types:
Hard Voting
Pick the class that the majority of models predict:
    import numpy as np

    # Convert probability predictions to class labels
    cnn_classes = np.argmax(model1_pred, axis=1)
    lstm_classes = np.argmax(model2_pred, axis=1)
    dnn_classes = np.argmax(model3_pred, axis=1)

    # For each sample, select the most frequent class
    ensemble_classes = np.array([
        np.bincount([c, l, d]).argmax()
        for c, l, d in zip(cnn_classes, lstm_classes, dnn_classes)
    ])
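One thing to watch: with three models and more than two classes, all three can disagree, and `np.bincount(...).argmax()` then silently returns the lowest class index. The sketch below wraps the same idea in a helper that takes any number of models. The name `hard_vote` and the toy label arrays are made up for illustration.

```python
import numpy as np

def hard_vote(class_preds, num_classes):
    """Majority vote over label arrays; ties go to the lowest class index."""
    votes = np.stack(class_preds, axis=1)  # shape (n_samples, n_models)
    counts = np.apply_along_axis(np.bincount, 1, votes, minlength=num_classes)
    return counts.argmax(axis=1)           # shape (n_samples,)

# Toy labels from three models for four samples
cnn_demo = np.array([0, 2, 0, 1])
lstm_demo = np.array([1, 2, 1, 1])
dnn_demo = np.array([1, 0, 2, 1])

voted = hard_vote([cnn_demo, lstm_demo, dnn_demo], num_classes=3)
print(voted)  # [1 2 0 1]
```

The third sample is a three-way tie, so it falls back to class 0. If that matters for your task, break ties with the averaged probabilities instead.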
Soft Voting
Average the probability predictions across models, then pick the class with the highest average probability:
    # Average the probability distributions
    avg_probs = (model1_pred + model2_pred + model3_pred) / 3
    ensemble_classes = np.argmax(avg_probs, axis=1)
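Hard and soft voting don't always agree, because soft voting lets one very confident model outweigh two lukewarm ones. A tiny worked example with made-up probabilities for a single sample and two classes:

```python
import numpy as np

# Toy class probabilities for one sample and two classes
cnn_prob = np.array([[0.90, 0.10]])
lstm_prob = np.array([[0.40, 0.60]])
dnn_prob = np.array([[0.45, 0.55]])

# Hard voting: each model votes for its own top class
votes = [int(np.argmax(p, axis=1)[0]) for p in (cnn_prob, lstm_prob, dnn_prob)]
hard_class = int(np.bincount(votes).argmax())

# Soft voting: average the probabilities first, then take the top class
avg_prob = (cnn_prob + lstm_prob + dnn_prob) / 3
soft_class = int(np.argmax(avg_prob, axis=1)[0])

print(votes, hard_class, soft_class)  # [0, 1, 1] 1 0
```

Two of the three models vote for class 1, but the averaged probability of class 0 is about 0.58, so soft voting picks class 0.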
3. Stacking (Advanced, Often Higher Performance)
Stacking uses a "meta-model" (another neural network, usually a simple DNN) to learn how to combine the predictions of your base models. Here's how to implement it:
    from tensorflow.keras.layers import Input, Dense
    from tensorflow.keras.models import Model
    import numpy as np

    # Note: you'll need base-model predictions on the TRAINING set too
    # (e.g., model1_pred_train, model2_pred_train, model3_pred_train).
    # Caveat: the base models have already seen these rows, so their predictions
    # on them are optimistic; out-of-fold predictions are the safer input.

    # Concatenate predictions for the meta-model
    train_meta_input = np.concatenate(
        [model1_pred_train, model2_pred_train, model3_pred_train], axis=1)
    val_meta_input = np.concatenate([model1_pred, model2_pred, model3_pred], axis=1)

    # Build a simple meta-model
    input_meta = Input(shape=(train_meta_input.shape[1],))
    x = Dense(64, activation='relu')(input_meta)
    x = Dense(32, activation='relu')(x)
    # Use 'softmax' for classification, 'linear' for regression
    output_meta = Dense(model1_pred.shape[1], activation='softmax')(x)
    meta_model = Model(inputs=input_meta, outputs=output_meta)
    # For regression, also switch the loss to 'mse' and drop the accuracy metric
    meta_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Train the meta-model on the training predictions and true labels
    meta_model.fit(
        train_meta_input, y_train,
        epochs=15, batch_size=32, validation_split=0.1,
    )

    # Generate final fused predictions
    ensemble_pred = meta_model.predict(val_meta_input)
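A caveat on the training-set predictions: base models tend to look better on rows they were trained on, so a meta-model fed those predictions can learn to over-trust them. A common remedy is to use out-of-fold predictions, where each row is predicted by a copy of the model that never saw it. Below is a rough sketch of the bookkeeping. `out_of_fold_predictions` and `fit_predict` are hypothetical names, and a least-squares model stands in for the Keras base models so the example runs on its own.

```python
import numpy as np

def out_of_fold_predictions(fit_predict, X, y, n_splits=5, seed=0):
    """Predict every row with a model that never saw that row during fitting.

    fit_predict(X_fit, y_fit, X_holdout) is a hypothetical callable that trains
    a fresh base model and returns its predictions for X_holdout.
    """
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, n_splits)
    oof = None
    for holdout_idx in folds:
        fit_idx = np.setdiff1d(order, holdout_idx)  # all rows outside this fold
        fold_pred = np.asarray(fit_predict(X[fit_idx], y[fit_idx], X[holdout_idx]))
        if oof is None:
            oof = np.zeros((len(X),) + fold_pred.shape[1:])
        oof[holdout_idx] = fold_pred
    return oof

# Stand-in base model (ordinary least squares) so the sketch runs without Keras
def least_squares_fit_predict(X_fit, y_fit, X_holdout):
    coef, *_ = np.linalg.lstsq(X_fit, y_fit, rcond=None)
    return X_holdout @ coef

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = X_demo @ np.array([[1.0], [-2.0], [0.5], [0.0]])

oof_demo = out_of_fold_predictions(least_squares_fit_predict, X_demo, y_demo)
print(oof_demo.shape)  # (100, 1)
```

With Keras, `fit_predict` would build and fit a fresh copy of the CNN, LSTM, or DNN on each fold. Running it once per base model and concatenating the three results along axis 1 gives a replacement for `train_meta_input`.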
Bonus: Model-Level Fusion (If You Haven't Finished Training Yet)
If you haven't fully trained your base models yet, you can fuse their architectures directly instead of using precomputed predictions. This lets you train everything end-to-end:
    from tensorflow.keras.layers import Concatenate, Dense
    from tensorflow.keras.models import Model

    # Assume nn_model1 (CNN), nn_model2 (LSTM), nn_model3 (DNN) are already defined,
    # along with num_classes, x_train, y_train, x_val, and y_val

    # Get the output tensors of each base model
    cnn_output = nn_model1.output
    lstm_output = nn_model2.output
    dnn_output = nn_model3.output

    # Concatenate the outputs
    concat_output = Concatenate()([cnn_output, lstm_output, dnn_output])

    # Add a few dense layers to learn the fusion
    x = Dense(128, activation='relu')(concat_output)
    x = Dense(64, activation='relu')(x)
    # Adjust output layer based on your task
    final_output = Dense(num_classes, activation='softmax')(x)

    # Build the fused model
    ensemble_model = Model(
        inputs=[nn_model1.input, nn_model2.input, nn_model3.input],
        outputs=final_output
    )

    # Optional: freeze the base models so only the fusion layers are trained
    # (skip this step if you want end-to-end training)
    for base_model in (nn_model1, nn_model2, nn_model3):
        for layer in base_model.layers:
            layer.trainable = False

    # Compile (after setting trainable) and train
    ensemble_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    ensemble_model.fit(
        # One array per base model; all three take the same input here.
        # If their input shapes differ, pass a suitably reshaped array for each.
        [x_train, x_train, x_train],
        y_train,
        epochs=20,
        batch_size=32,
        validation_data=([x_val, x_val, x_val], y_val)
    )
Pick the method that fits your task and workflow: weighted averaging and voting are great for quick wins, while stacking or model-level fusion can give better results if you have the compute to spare.
This question comes from Stack Exchange; it was asked by deep.




