Help request: looping in R to estimate bootstrap 95% confidence intervals for R-squared

阿华AIGC实验室

2026-5-22

Looping over each predictor to compute a bootstrap confidence interval for R²

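Hey there! I understand you're new to R and need to bootstrap each predictor in your data frame (X1, X2, ..., Xn) against the response Dependant, one at a time, to estimate a 95% confidence interval for R². You already have working code for a single variable, so all that's left is to wrap it in a loop.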

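Overall approach

First collect the names of all predictor columns, then go through them one by one: build the linear-model formula dynamically, run the bootstrap resampling, and compute the confidence interval. At the end, gather the results for every variable into a single data frame so they are easy to inspect.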
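The formula-building step can be tried on its own before any resampling is involved. Here's a minimal sketch, assuming the column names below (they are made up and simply stand in for names(INs3)):

```r
# Hypothetical column names, standing in for names(INs3)
all_cols <- c("Dependant", "X1", "X2", "X3")

# Everything except the response is treated as a predictor
predictor_cols <- setdiff(all_cols, "Dependant")

# Build one formula per predictor: Dependant ~ X1, Dependant ~ X2, ...
formulas <- lapply(predictor_cols, function(x_col) {
  as.formula(paste("Dependant ~", x_col))
})
names(formulas) <- predictor_cols

# Show the formulas as plain text
print(sapply(formulas, deparse))
```

Note that pasting names into a string like this only works for syntactic column names; a name containing spaces would need backticks around it.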

Complete code example

# Load the boot package (if it is not installed yet, run install.packages("boot") once first)
library(boot)

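# Collect the predictor column names: every column except the response Dependant
# (this assumes all remaining columns of INs3 are predictors)
predictor_cols <- setdiff(names(INs3), "Dependant")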

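# Create an empty data frame to hold the results for each variable
results_df <- data.frame(
  Predictor = character(),  # predictor name
  R2_mean = numeric(),      # mean of the bootstrap R² values
  CI_lower = numeric(),     # lower bound of the 95% confidence interval
  CI_upper = numeric(),     # upper bound of the 95% confidence interval
  stringsAsFactors = FALSE
)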

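# Fix the random seed so the resampling is reproducible (any integer works)
set.seed(123)

# Loop over each predictor
for (x_col in predictor_cols) {
  # Build the linear-model formula dynamically, e.g. "Dependant ~ X1", "Dependant ~ X2"
  model_formula <- as.formula(paste("Dependant ~", x_col))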
  
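  # Define the statistic the bootstrap computes: here, the model's R-squared
  boot_stat <- function(data, indices) {
    # Pick out the resampled rows using the indices supplied by boot()
    sampled_data <- data[indices, ]
    # Fit the linear model on the resample
    model <- lm(model_formula, data = sampled_data)
    # Return that model's R-squared
    return(summary(model)$r.squared)
  }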
  
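  # Run the bootstrap: R is the number of resamples, set to 1000 here
  # (adjust as needed; more resamples give more stable intervals but run slower)
  boot_obj <- boot(data = INs3, statistic = boot_stat, R = 1000)
  
  # Compute the 95% confidence interval with the percentile method (type = "perc"),
  # a simple and widely used choice; boot.ci() also supports other types such as "bca"
  boot_ci <- boot.ci(boot_obj, type = "perc")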
  
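  # Extract the values we need: the mean bootstrap R² and the interval bounds.
  # The component is named "percent" ($perc only works through partial matching);
  # its 4th and 5th entries are the lower and upper bounds.
  r2_avg <- mean(boot_obj$t)
  ci_low <- boot_ci$percent[4]
  ci_high <- boot_ci$percent[5]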
  
  # Append the current variable's results to the data frame
  results_df <- rbind(results_df, data.frame(
    Predictor = x_col,
    R2_mean = r2_avg,
    CI_lower = ci_low,
    CI_upper = ci_high,
    stringsAsFactors = FALSE
  ))
}

# Inspect the final results table
print(results_df)
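
If you want to sanity-check the idea without INs3, here's a rough sketch on simulated data. To keep it dependency-free it does the percentile bootstrap by hand with sample() and quantile() instead of calling boot() and boot.ci(), so the bounds may differ slightly from what the boot package reports. The data frame toy and the helper boot_r2_ci are invented purely for illustration.

```r
set.seed(42)  # reproducible toy data and resampling

# Made-up data: X1 is strongly related to Dependant, X2 is pure noise
n <- 100
toy <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
toy$Dependant <- 2 * toy$X1 + rnorm(n)

# Hypothetical helper: percentile bootstrap CI for the R² of Dependant ~ x_col
boot_r2_ci <- function(data, x_col, n_boot = 500, level = 0.95) {
  model_formula <- as.formula(paste("Dependant ~", x_col))
  r2 <- replicate(n_boot, {
    idx <- sample(nrow(data), replace = TRUE)  # resample rows with replacement
    summary(lm(model_formula, data = data[idx, ]))$r.squared
  })
  alpha <- 1 - level
  ci <- quantile(r2, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  data.frame(Predictor = x_col, R2_mean = mean(r2),
             CI_lower = ci[1], CI_upper = ci[2],
             stringsAsFactors = FALSE)
}

# Apply the helper to every predictor and stack the one-row results
predictor_cols <- setdiff(names(toy), "Dependant")
toy_results <- do.call(rbind, lapply(predictor_cols, boot_r2_ci, data = toy))
print(toy_results)
```

In this toy setup the interval for X1 should sit well above the one for X2, which is a quick way to confirm the loop is pairing each predictor with the right result.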