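Converting a gt object with a two-level header to a data frame after a weighted gtsummary analysis, merging the headers into concatenated column names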
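Hey, I get what you're running into! After a weighted analysis with tbl_strata and tbl_svysummary, the resulting table clearly has a two-level header, but once you convert it to a data frame only the lower-level popgroup column names survive, and the upper-level gender strata are lost, right? Don't worry: there are two simple ways to concatenate the two header levels into the Female A, N = XX format you want. Let's go through them: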
Method 1: Set the column headers with modify_header (recommended)
This approach fits the gtsummary workflow best: you don't need to compute sample sizes yourself, because {level} and {n} are filled in by modify_header. One caveat: as far as I know, the {strata} placeholder is only available in the .header argument of tbl_strata, not in modify_header. So the stratum label is taken from tbl_strata2 instead, which passes it to .tbl_fun as a second argument (.y). Setting .header = "{strata}" keeps that label free of markdown bold markers.

    library(dplyr)
    library(gtsummary)
    library(srvyr)

    # Your original data
    data <- structure(
      list(
        id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
               19, 20, 21, 22, 23, 24, 25),
        strata = c(10, 20, 30, 10, 20, 20, 10, 20, 30, 30, 10, 30, 30, 20, 10,
                   20, 20, 20, 10, 20, 20, 30, 30, 20, 30),
        weight = c(10, 8, 17, 15, 9, 10, 25, 8, 8, 13, 17, 24, 12, 15, 3, 12,
                   16, 17, 24, 12, 3, 2, 8, 14, 4),
        popgroup = c("A", "B", "A", "A", "A", "A", "B", "B", "B", "A", "A", "B",
                     "A", "B", "A", "A", "B", "A", "A", "B", "A", "B", "B", "B",
                     "B"),
        gender = c("Male", "Female", "Female", "Male", "Female", "Female",
                   "Male", "Male", "Female", "Male", "Female", "Female", "Male",
                   "Female", "Male", "Female", "Female", "Male", "Female",
                   "Female", "Male", "Female", "Male", "Female", "Male"),
        inc_01 = c(1500, 1200, 130, 500, 750, 2000, 10000, 1500, 1050, 400, 360,
                   490, 250, 400, 2500, 1300, 800, 540, 690, 520, 600, 700, 700,
                   600, 400),
        inc_02 = c(360, 450, 120, 300, 900, 560, 450, 280, 720, 360, 1000, 900,
                   530, 820, 640, 520, 130, 140, 150, 650, 240, 130, 200, 300,
                   500)
      ),
      class = c("tbl_df", "tbl", "data.frame"),
      row.names = c(NA, -25L)
    )

    design <- data |>
      as_survey_design(strata = strata, weights = weight)

    # The per-stratum summary, shared by both tables below
    summarise_income <- function(svy) {
      tbl_svysummary(
        svy,
        by = popgroup,
        type = where(is.numeric) ~ "continuous",
        statistic = list(c(inc_01, inc_02) ~ "{mean} ({mean.std.error})"),
        missing = "no",
        digits = list(c(inc_01, inc_02) ~ c(4, 4)),
        include = c(inc_01, inc_02)
      )
    }

    # The stratified table with the usual two-level header (not yet a data frame)
    tbl_gt <- design |>
      tbl_strata(strata = gender, .tbl_fun = summarise_income)

    # Same table, but each column header is built from the stratum label (.y),
    # the group level and the weighted N
    tbl_gt_modified <- design |>
      tbl_strata2(
        strata = gender,
        .tbl_fun = ~ summarise_income(.x) %>%
          modify_header(all_stat_cols() ~ paste0(.y, " {level}, N = {n}")),
        .header = "{strata}"
      )

    # Convert to a data frame; the column names are now the concatenated labels
    final_tbl_df <- tbl_gt_modified |>
      as.data.frame()

    # Inspect the result
    final_tbl_df
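Depending on the gtsummary version, the exported column names may still carry markdown bold markers (for example `**Characteristic**`). Here is a minimal base-R sketch for stripping them. The `exported` data frame is a hypothetical stand-in with placeholder values, not the real output of the code above:

```r
# Hypothetical stand-in for the exported data frame; values are placeholders.
exported <- data.frame(
  a = c("inc_01", "inc_02"),
  b = c("x", "y"),
  check.names = FALSE
)
names(exported) <- c("**Characteristic**", "Female A, N = XX")

# Remove the literal "**" markers from the column names only;
# fixed = TRUE treats "**" as plain text rather than a regular expression.
names(exported) <- gsub("**", "", names(exported), fixed = TRUE)

print(names(exported))
```

Only the names are touched, so the cell contents stay exactly as gtsummary formatted them.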
Method 2: Compute the sample sizes yourself, then rename the columns
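If you want more control over how the sample size is displayed (for example keeping decimals or changing the format), you can first compute the weighted sample size for each stratum-group combination yourself and then rename the data frame columns. Note that srvyr expects vartype = NULL (not "none") to suppress the variance columns. The renaming is positional, so it assumes the rows of n_table come out in the same order as the statistic columns (Female A, Female B, Male A, Male B).

    # Weighted sample size for each gender-popgroup combination
    n_table <- data |>
      as_survey_design(strata = strata, weights = weight) |>
      group_by(gender, popgroup) |>
      summarize(n = survey_total(vartype = NULL)) |>
      ungroup() |>
      mutate(n_label = paste0(gender, " ", popgroup, ", N = ", round(n)))

    # Convert to a data frame
    tbl_df <- tbl_gt |>
      as.data.frame()

    # Guard against a mismatch before renaming by position
    stopifnot(ncol(tbl_df) == nrow(n_table) + 1)

    # Keep "Characteristic" for the first column, use the labels for the rest
    colnames(tbl_df) <- c("Characteristic", n_table$n_label)

    # Inspect the result
    tbl_df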
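Because the renaming in Method 2 is positional, it helps to make the label order explicit rather than rely on however the summary happens to be sorted. Here is a small base-R sketch under that assumption. The counts are made-up placeholders, and the stratum-first ordering (gender, then popgroup) is my assumption about the table's column layout:

```r
# Hypothetical weighted counts, deliberately out of order; in practice these
# would come from n_table.
n_table <- data.frame(
  gender   = c("Male", "Female", "Male", "Female"),
  popgroup = c("B", "A", "A", "B"),
  n        = c(10.4, 20.6, 30.2, 40.7)
)

# Sort by stratum first, then by group, to match the assumed column layout.
n_table <- n_table[order(n_table$gender, n_table$popgroup), ]

# Build the concatenated labels in that order.
n_table$n_label <- paste0(
  n_table$gender, " ", n_table$popgroup, ", N = ", round(n_table$n)
)

new_names <- c("Characteristic", n_table$n_label)
print(new_names)
```

Comparing `length(new_names)` against `ncol()` of the exported data frame before assigning is a cheap way to catch a mismatch.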
Both methods give you the column name format you want. The first is more concise, so I'd recommend trying it first.
Note: content sourced from Stack Exchange; the question was asked by Stephen Okiya.




