How do I build a 3D surface + scatter plot and a 4D visualization from a logistic regression model?
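Hi, so you've already got the single-variable logistic regression plot with predicted probabilities working, and now you want to move up to 3D (two predictors + a predicted-probability surface + observed outcomes as points) and 4D (three predictors + predicted probability as a colour gradient)? Totally doable! I'll walk you through it step by step. R's plotly package is enough to get clear, interactive plots.
Part 1: 3D visualization (two predictors + predicted-probability surface + observed-outcome points)
We'll use plotly for an interactive 3D figure that shows the model's predicted-probability surface and overlays the actual data as points, coloured by the observed Outcome.
Step 1: Set up the packages and data
First make sure the required packages are installed, then build the grid used for prediction. Because the logistic regression has several predictors, the ones you are not plotting have to be held at a fixed value (here, their mean) so that you see the effect of just the two chosen variables: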
    # Install (if needed) and load the packages
    if (!require(plotly)) install.packages("plotly")
    if (!require(dplyr)) install.packages("dplyr")
    library(plotly)
    library(dplyr)

    # Assumes your fitted model is final_model and your data frame is dataset
    # Pick the two predictors to visualize, e.g. Glucose and BMI
    var1 <- "Glucose"
    var2 <- "BMI"

    # Build a grid covering the range of both variables;
    # the other predictors are held at their means
    grid <- expand.grid(
      Glucose = seq(min(dataset[[var1]]), max(dataset[[var1]]), length.out = 50),
      BMI = seq(min(dataset[[var2]]), max(dataset[[var2]]), length.out = 50),
      Pregnancies = mean(dataset$Pregnancies),
      Age = mean(dataset$Age)
    )

    # Predict the probability at every grid point
    grid$PP_model <- predict(final_model, newdata = grid, type = "response")
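A quick aside on `type = "response"`: for a binomial `glm`, it returns probabilities, i.e. the inverse logit of the linear predictor, which is why the values can be plotted directly on a 0 to 1 axis. Here's a minimal sketch that checks this on simulated data. The variables `x1`, `x2`, `y` and the coefficients are made up for illustration and have nothing to do with your dataset:

```r
set.seed(7)

# Simulated predictors and a 0/1 response (hypothetical, for illustration only)
x1 <- rnorm(150)
x2 <- rnorm(150)
y  <- rbinom(150, 1, plogis(0.5 + x1 - 0.8 * x2))

# Logistic regression on the simulated data
fit <- glm(y ~ x1 + x2, family = binomial)

# A few new points; x2 is held fixed while x1 varies
new_points <- data.frame(x1 = c(-1, 0, 1), x2 = 0)

eta <- predict(fit, newdata = new_points, type = "link")      # log-odds
p   <- predict(fit, newdata = new_points, type = "response")  # probabilities

# The probabilities are the inverse logit of the log-odds
isTRUE(all.equal(p, plogis(eta)))
```

Holding `x2` fixed while `x1` varies is the same idea as fixing the unplotted predictors at their mean in the grid.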
Step 2: Draw the 3D surface + points
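    # add_surface() needs the axes as vectors and z as a matrix
    # (rows follow y, columns follow x), so reshape the long grid first
    x_seq <- unique(grid[[var1]])
    y_seq <- unique(grid[[var2]])
    z_mat <- t(matrix(grid$PP_model, nrow = length(x_seq), ncol = length(y_seq)))

    # Start the plotly object with the predicted-probability surface
    # (semi-transparent so the points stay visible)
    p_3d <- plot_ly() %>%
      add_surface(
        x = x_seq,
        y = y_seq,
        z = z_mat,
        colorscale = "Viridis",
        opacity = 0.7,
        name = "Predicted Probability Surface"
      ) %>%
      # Add the observed data as points, coloured by Outcome
      add_markers(
        data = dataset,
        x = ~Glucose,
        y = ~BMI,
        z = ~Outcome,  # observed 0/1 result on the z axis (must be numeric)
        color = ~factor(Outcome),
        colors = c("#1f77b4", "#ff7f0e"),
        size = I(5),
        name = "Actual Outcome"
      ) %>%
      # Layout and title
      layout(
        title = paste("3D Logistic Regression:", var1, "vs", var2),
        scene = list(
          xaxis = list(title = var1),
          yaxis = list(title = var2),
          zaxis = list(title = "Outcome / Predicted Probability")
        )
      )

    # Show the interactive figure
    p_3d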
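One detail worth checking in this step: a plotly surface trace takes `z` as a matrix in which rows follow `y` and columns follow `x`, whereas `expand.grid()` returns one long column with the first variable changing fastest. Here's a minimal base-R sketch of the reshape. The made-up function `f` stands in for the model's predictions so that the orientation is easy to verify, and the axis values are toy numbers:

```r
# Toy axes (hypothetical values, deliberately of different lengths)
x_seq <- c(1, 2, 3)   # would be the Glucose sequence
y_seq <- c(10, 20)    # would be the BMI sequence

# expand.grid() varies the first variable fastest
grid <- expand.grid(x = x_seq, y = y_seq)

# Stand-in for predict(): any function of x and y works for this check
f <- function(x, y) x + y / 100
grid$pp <- f(grid$x, grid$y)

# Fill a matrix column by column (x runs down the rows), then transpose
# so that rows follow y and columns follow x
z_mat <- t(matrix(grid$pp, nrow = length(x_seq), ncol = length(y_seq)))

# Element [i, j] should equal f(x_seq[j], y_seq[i])
expected <- outer(y_seq, x_seq, function(y, x) f(x, y))
isTRUE(all.equal(z_mat, expected))
dim(z_mat)
```

Using axes of different lengths in a check like this makes a wrong orientation show up right away as a dimension mismatch instead of a silently rotated surface.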
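You can rotate and zoom the figure to see how the model's predicted surface relates to the distribution of the observed data.
Part 2: 4D visualization (three predictors + predicted probability as a colour gradient)
The 4D plot is just a 3D scatter plot with the predicted probability mapped to a colour gradient, so you can see three predictors and the predicted probability at the same time.
Basic version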
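    # 4D visualization: three predictors as 3D points, colour mapped to predicted probability
    # Assumes dataset already has a PP_model column holding the predicted probabilities
    p_4d <- plot_ly(
      data = dataset,
      x = ~Pregnancies,
      y = ~Glucose,
      z = ~BMI,
      color = ~PP_model,
      colors = viridisLite::viridis(256),  # gradient from dark purple (low) to yellow (high)
      size = I(5),
      marker = list(opacity = 0.8)
    ) %>%
      add_markers() %>%
      layout(
        title = "4D Visualization: 3 Variables + Predicted Probability (Color Gradient)",
        scene = list(
          xaxis = list(title = "Pregnancies"),
          yaxis = list(title = "Glucose"),
          zaxis = list(title = "BMI")
        )
      ) %>%
      # The colour bar title is set with colorbar(), not inside layout()
      colorbar(title = "Predicted Probability")

    # Show the figure
    p_4d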
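The 4D plots read their colour from a `PP_model` column in `dataset`. If that column isn't there yet, it can be added with `predict()` on the original rows. Here's a rough, self-contained sketch. The simulated data and the coefficients are invented purely as a stand-in for your real `dataset` and `final_model`:

```r
set.seed(42)
n <- 200

# Simulated stand-in data (hypothetical values, not the real dataset)
dataset <- data.frame(
  Pregnancies = rpois(n, 3),
  Glucose     = rnorm(n, 120, 30),
  BMI         = rnorm(n, 32, 6),
  Age         = round(runif(n, 21, 70))
)

# Invented coefficients, used only to generate a 0/1 Outcome
lin <- -9 + 0.04 * dataset$Glucose + 0.08 * dataset$BMI +
  0.1 * dataset$Pregnancies + 0.01 * dataset$Age
dataset$Outcome <- rbinom(n, 1, plogis(lin))

# Stand-in for the already-fitted model
final_model <- glm(Outcome ~ Pregnancies + Glucose + BMI + Age,
                   data = dataset, family = binomial)

# Predicted probability for every observed row
dataset$PP_model <- predict(final_model, newdata = dataset, type = "response")

range(dataset$PP_model)
```

With your real data, only the last `predict()` line should be needed, since the model and the data frame already exist.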
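Advanced version: show the observed Outcome and the predicted probability together
If you want to show both the actual result and the predicted probability, use the marker symbol for Outcome and keep the colour for the predicted probability: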
    p_4d_advanced <- plot_ly(
      data = dataset,
      x = ~Pregnancies,
      y = ~Glucose,
      z = ~BMI,
      color = ~PP_model,
      colors = viridisLite::viridis(256),
      symbol = ~factor(Outcome),
      symbols = c("circle", "square"),  # circle for 0, square for 1
      size = I(6),
      marker = list(opacity = 0.8)
    ) %>%
      add_markers() %>%
      layout(
        title = "4D Visualization: 3 Variables + Predicted Probability + Actual Outcome",
        scene = list(
          xaxis = list(title = "Pregnancies"),
          yaxis = list(title = "Glucose"),
          zaxis = list(title = "BMI")
        ),
        legend = list(title = list(text = "Actual Outcome"))
      ) %>%
      # The colour bar title is set with colorbar(), not inside layout()
      colorbar(title = "Predicted Probability")

    p_4d_advanced
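A few notes
- If your model includes other predictors that are not shown on the axes (e.g. BloodPressure, Insulin), fix them at their mean or median as well when you build the 3D grid. predict() needs a value for every predictor in the model, and only then do the plotted probabilities match what the model actually outputs.
- Interactive plotly figures support drag-to-rotate and zoom, so you can look at the data and the model from different angles, which is often more useful than a static plot.
- If you need static images, you could consider the rgl package, although its interactive experience is not as good as plotly's.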
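To make the first note concrete, here's one way the grid could be built so that every remaining predictor is filled in automatically. `make_grid()` is a hypothetical helper written for this answer (it is not part of plotly or base R), and the toy data frame only imitates the column names mentioned above:

```r
# Hypothetical helper: grid over two predictors, with the listed
# other predictors held at their median
make_grid <- function(data, var1, var2, others, n = 50) {
  grid <- expand.grid(
    seq(min(data[[var1]]), max(data[[var1]]), length.out = n),
    seq(min(data[[var2]]), max(data[[var2]]), length.out = n)
  )
  names(grid) <- c(var1, var2)
  for (v in others) {
    grid[[v]] <- median(data[[v]])
  }
  grid
}

# Toy data standing in for the real dataset (made-up values)
set.seed(1)
toy <- data.frame(
  Glucose       = rnorm(100, 120, 30),
  BMI           = rnorm(100, 32, 6),
  BloodPressure = rnorm(100, 70, 12),
  Insulin       = rexp(100, 1 / 80)
)

grid <- make_grid(toy, "Glucose", "BMI",
                  others = c("BloodPressure", "Insulin"), n = 20)
dim(grid)
head(grid, 3)
```

The result can then go straight into `predict(final_model, newdata = grid, type = "response")`. The median is used here because it is less sensitive to skewed columns such as Insulin, but the mean would work the same way.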
Note: content sourced from Stack Exchange; question asked by Libellule.




