All predictions are identical when testing a neural network with continuous output in R
Looks like you’ve hit a very common pitfall with R’s neuralnet package: when all predictions come out identical, the most likely cause is unscaled input features. Neural networks are sensitive to the scale of the data. If your features have wildly different ranges (e.g., one stays between 0 and 1 while another runs from 1000 to 10000), the hidden units tend to saturate, training stalls, and the model falls back to predicting a constant value (usually close to the mean of your target variable).
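To make the saturation point concrete, here is a minimal sketch in base R. It assumes a single hidden unit with the logistic activation (neuralnet’s default) and a made-up weight; the feature values and the weight are hypothetical, not taken from your data.

```r
# Illustrative only: one logistic hidden unit with a hypothetical weight.
sigmoid <- function(z) 1 / (1 + exp(-z))

raw_feature <- c(1000, 2500, 5000, 10000)  # hypothetical unscaled values
w <- 0.5                                   # hypothetical starting weight

# With raw inputs the unit saturates: every activation is the same number.
raw_activation <- sigmoid(w * raw_feature)

# After z-score scaling the inputs land where the sigmoid still varies.
scaled_feature <- (raw_feature - mean(raw_feature)) / sd(raw_feature)
scaled_activation <- sigmoid(w * scaled_feature)

print(raw_activation)
print(scaled_activation)
```

When every hidden unit outputs the same value for every row, the output layer has nothing to distinguish one observation from another, which is how you end up with a single repeated prediction.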
Let’s walk through the fix step by step, including proper scaling and model adjustments to get your predictions back on track:
Step 1: Standardize Your Training Data (Critical!)
First, we’ll standardize both your input features and the target variable (SalaryNormalized) using z-score scaling (subtract the mean, divide by the standard deviation). Crucially, we’ll save the training set’s mean and standard deviation so the exact same transformation can be applied to the test set. Never compute scaling statistics from the test set: that leaks information from the test data and makes your evaluation unreliable.
    # Define columns we'll work with
    feature_cols <- c("factor1", "factor2", "factor3")
    target_col <- "SalaryNormalized"
    all_cols <- c(feature_cols, target_col)

    # Save training set statistics for later test set use.
    # data.frame() keeps the column names as row names, not as names on the
    # vectors, so each statistic is looked up by row name below.
    train_stats <- data.frame(
      mean = sapply(GC_train[, all_cols], mean),
      sd   = sapply(GC_train[, all_cols], sd)
    )

    # Scale training features and target
    GC_train_scaled <- GC_train
    for (col in all_cols) {
      GC_train_scaled[[col]] <- (GC_train[[col]] - train_stats[col, "mean"]) / train_stats[col, "sd"]
    }
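If you would rather not write the loop by hand, base R’s scale() does the same z-score transformation and stores the statistics as attributes. The sketch below uses small toy data frames as stand-ins for GC_train and GC_test1; the names train, test, center and spread are made up for illustration.

```r
# Toy stand-ins for GC_train / GC_test1 (hypothetical values).
train <- data.frame(factor1 = c(1, 2, 3, 4),
                    factor2 = c(1000, 3000, 2000, 6000))
test  <- data.frame(factor1 = c(2.5, 5),
                    factor2 = c(1500, 4000))

# scale() centers and scales each column and returns a matrix.
train_scaled <- scale(train)
center <- attr(train_scaled, "scaled:center")  # training means
spread <- attr(train_scaled, "scaled:scale")   # training standard deviations

# Reuse the training statistics on the test set (no test-set statistics).
test_scaled <- scale(test, center = center, scale = spread)

print(train_scaled)
print(test_scaled)
```

Note that scale() returns a matrix, so wrap the result in as.data.frame() before passing it to a formula-based model.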
Step 2: Retrain the Neural Network
Now use the scaled training data to train your model. You might also want to tweak the number of hidden units—hidden=2 is very small and might not have enough capacity to learn complex patterns in your data. Try 5 or 10 units first:
    library(neuralnet)

    m1 <- neuralnet(
      SalaryNormalized ~ factor1 + factor2 + factor3,
      data = GC_train_scaled,
      hidden = 5,            # Adjust this based on your data's complexity
      err.fct = "sse",
      linear.output = TRUE,  # Linear output unit for a continuous target
      stepmax = 1e6
    )
Step 3: Prepare the Test Set and Generate Predictions
Apply the same scaling rules from the training set to your test features, then generate predictions. We’ll also convert the scaled predictions back to the original SalaryNormalized scale so they’re meaningful:
    # Scale test features using training set stats (looked up by row name)
    GC_test1_scaled <- GC_test1
    for (col in feature_cols) {
      GC_test1_scaled[[col]] <- (GC_test1[[col]] - train_stats[col, "mean"]) / train_stats[col, "sd"]
    }

    # Generate scaled predictions, passing only the feature columns
    scaled_predictions <- compute(m1, GC_test1_scaled[, feature_cols])$net.result

    # Convert predictions back to the original SalaryNormalized scale
    original_predictions <- scaled_predictions * train_stats[target_col, "sd"] + train_stats[target_col, "mean"]
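Once the predictions are back on the original scale, it is worth confirming that they actually vary and measuring the error in salary units. This is a small sketch with made-up numbers; target_mean, target_sd, actual and the prediction values are hypothetical placeholders for your training statistics and test data.

```r
# Hypothetical training statistics for the target and scaled model outputs.
target_mean <- 50000
target_sd <- 12000
scaled_predictions <- c(-1.2, 0.1, 0.8)
actual <- c(36000, 52000, 58000)  # hypothetical true test values

# Undo the z-score transformation.
original_predictions <- scaled_predictions * target_sd + target_mean

# Root mean squared error on the original scale.
rmse <- sqrt(mean((original_predictions - actual)^2))

# TRUE would mean the model is still predicting a single constant value.
is_constant <- sd(original_predictions) < 1e-8

print(original_predictions)
print(rmse)
print(is_constant)
```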
Additional Checks If You Still See Issues
If you still get identical predictions after scaling, try these quick checks:
- Verify feature-target correlation: If your features show no relationship with SalaryNormalized, the model has little to learn from. Run cor(GC_train[, feature_cols], GC_train[, target_col]) to check for at least a linear relationship.
- Increase stepmax: Even with 1e6 steps, the model might not have converged. neuralnet warns when training stops without converging, and m1$result.matrix reports the final error, reached.threshold and steps. If training ran out of steps, bump up stepmax to 2e6 or higher.
- Try more hidden layers: For complex data, a single hidden layer with 2 units might be too simple. Experiment with hidden=c(5,3) for a two-layer network to add more modeling capacity.
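The correlation check from the first bullet can be sketched on synthetic data as follows. Everything here is made up for illustration: toy_train stands in for GC_train, the target is constructed to depend on factor1 and factor2 only, and the 0.1 cutoff is an arbitrary assumption rather than a rule.

```r
set.seed(42)  # fixed seed so the toy data is reproducible
n <- 200
toy_train <- data.frame(
  factor1 = rnorm(n),
  factor2 = rnorm(n),
  factor3 = rnorm(n)
)
# Hypothetical target: driven by factor1 and factor2, unrelated to factor3.
toy_train$SalaryNormalized <- 3 * toy_train$factor1 -
  2 * toy_train$factor2 + rnorm(n, sd = 0.5)

feature_cols <- c("factor1", "factor2", "factor3")

# One correlation per feature, returned as a 3 x 1 matrix.
cors <- cor(toy_train[, feature_cols], toy_train$SalaryNormalized)
print(round(cors, 2))

# Features whose absolute correlation falls below the arbitrary cutoff.
weak <- rownames(cors)[abs(cors[, 1]) < 0.1]
print(weak)
```

Keep in mind that cor() only measures linear association, so a low value does not rule out a nonlinear relationship the network could still pick up.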
The question answered here comes from Stack Exchange; asked by xyn.




