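Help: ordering a pathway enrichment score bar chart in R with positive and negative NES sorted in opposite directions
Solving the custom ordering problem for an enriched-pathway bar chart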
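Let me help you sort this out! You want the positive-NES pathways in descending order and the negative-NES pathways in ascending order, but using abs(NES) as the sort key did not give the expected result, because that logic does not correspond to the ordering rule you are after.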
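Problem analysis
In your original code, the negative-NES pathways are sorted by abs(NES):
    mutate(NAME=fct_reorder(NAME, ifelse(NES>0, NES, abs(NES))))
fct_reorder() orders the factor levels by ascending key, and this key is simply abs(NES) for every row. The negative pathways are therefore ordered by increasing absolute value: -2.68 has absolute value 2.68, which is larger than the 2.65 of -2.65, so it ends up at the back of the negative group. What you want is for the negative pathways to be ordered by their actual value, ascending (-2.68 < -2.65 < -2.55 ...), with the largest absolute value first, so this logic does not match the requirement.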
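To make the level order concrete, here is a small sketch, assuming the forcats package is installed. The short labels are stand-ins made up for the full KEGG names, and the values are rounded from your data.

```r
library(forcats)

# Three positive and three negative NES values, rounded from the question's data.
# The labels are hypothetical abbreviations of the KEGG pathway names.
nes  <- c(1.94, 1.88, 1.87, -2.68, -2.65, -2.55)
name <- c("BETA_ALANINE", "PEROXISOME", "VALINE_DEG",
          "FOCAL_ADHESION", "ECM_RECEPTOR", "CAMS")

# The original key: identical to abs(nes) for every row
key_abs <- ifelse(nes > 0, nes, abs(nes))

# fct_reorder() sorts the levels by ascending key
lv_abs <- levels(fct_reorder(name, key_abs))
print(lv_abs)
```

The levels come out as VALINE_DEG, PEROXISOME, BETA_ALANINE, CAMS, ECM_RECEPTOR, FOCAL_ADHESION, so within the negative group the most negative pathway is last.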
Solution
We can use case_when to build a precise sort key:
- For positive-NES pathways, use -NES as the key (sorting ascending then puts the larger NES first, which gives descending order)
- For negative-NES pathways, use NES itself as the key (sorting ascending puts the smaller, more negative NES first, which gives ascending order)
Both branches produce negative numbers, so the key alone cannot keep the two groups apart. The code below therefore sorts by sign first and by the key within each group. Because coord_flip() draws the first factor level at the bottom, the levels are reversed with fct_rev() so that the sorted row order reads from top to bottom.
The complete revised code:
    library(tidyverse)

    # Your data frame
    df <- structure(list(
      NAME = c("KEGG_BETA_ALANINE_METABOLISM", "KEGG_PEROXISOME",
        "KEGG_VALINE_LEUCINE_AND_ISOLEUCINE_DEGRADATION",
        "KEGG_DRUG_METABOLISM_OTHER_ENZYMES",
        "KEGG_ALANINE_ASPARTATE_AND_GLUTAMATE_METABOLISM",
        "KEGG_OXIDATIVE_PHOSPHORYLATION", "KEGG_AMINOACYL_TRNA_BIOSYNTHESIS",
        "KEGG_GLYOXYLATE_AND_DICARBOXYLATE_METABOLISM",
        "KEGG_GLYCINE_SERINE_AND_THREONINE_METABOLISM", "KEGG_FOCAL_ADHESION",
        "KEGG_ECM_RECEPTOR_INTERACTION", "KEGG_CELL_ADHESION_MOLECULES_CAMS",
        "KEGG_LEUKOCYTE_TRANSENDOTHELIAL_MIGRATION",
        "KEGG_HEMATOPOIETIC_CELL_LINEAGE", "KEGG_LEISHMANIA_INFECTION"),
      SIZE = c(22L, 77L, 39L, 25L, 27L, 98L, 22L, 16L, 30L, 192L, 83L, 105L,
        112L, 67L, 52L),
      ES = c(0.6333836, 0.4741722, 0.54287475, 0.53727466, 0.52466995,
        0.39599127, 0.5367668, 0.57810646, 0.47952536, -0.63034177,
        -0.6984717, -0.65617377, -0.638508, -0.6932509, -0.69873965),
      NES = c(1.9397553, 1.8766444, 1.8667365, 1.7000551, 1.6511785,
        1.6349672, 1.6263499, 1.5636431, 1.5489777, -2.6781485, -2.64518,
        -2.5497398, -2.4806988, -2.4616463, -2.4117668),
      NOM.p.val = c(0, 0, 0, 0.0093240095, 0.023148147, 0.0024271845,
        0.011520738, 0.046620045, 0.028697573, 0, 0, 0, 0, 0, 0),
      FDR.q.val = c(0.018558597, 0.01772788, 0.013456956, 0.059479948,
        0.06941477, 0.066045, 0.060804494, 0.08560156, 0.0844615,
        0, 0, 0, 0, 0, 0),
      FWER.p.val = c(0.031, 0.058, 0.067, 0.341, 0.454, 0.501, 0.53, 0.711,
        0.753, 0, 0, 0, 0, 0, 0),
      RANK.AT.MAX = c(8546L, 6704L, 6861L, 3072L, 6397L, 11642L, 10387L,
        5387L, 6522L, 5367L, 3928L, 5004L, 5034L, 3514L, 5809L)
    ), row.names = c(NA, 15L), class = "data.frame")

    # Revised plotting code
    df %>%
      filter(FDR.q.val < 0.05) %>%
      select(NAME, NES) %>%
      mutate(
        sign = ifelse(NES > 0, "CondA", "CondB"),
        # Custom sort key
        sort_key = case_when(
          NES > 0 ~ -NES,  # positive NES: descending
          TRUE    ~ NES    # negative NES: ascending
        )
      ) %>%
      # Positive group first, then by the key within each group
      arrange(NES < 0, sort_key) %>%
      # Row order is now the desired top-to-bottom order; coord_flip() draws
      # the first level at the bottom, so reverse the levels
      mutate(NAME = fct_rev(fct_inorder(NAME))) %>%
      ggplot(aes(x = NAME, y = NES, fill = sign)) +
      geom_bar(stat = "identity") +
      coord_flip() +
      xlab("") +
      ylab("Normalized Enrichment Score")
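One caveat worth checking with a key of this form: -NES for positive pathways and NES for negative pathways both land on the negative axis, so the key by itself keeps the two groups apart only when their ranges do not overlap. The sketch below uses made-up values (not from your data) and assumes dplyr is installed; it shows the interleaving and how ordering by sign first avoids it.

```r
library(dplyr)

# Made-up NES values for illustration: the positive 2.60 is larger in
# magnitude than one of the negatives
toy <- data.frame(
  NAME = c("P1", "P2", "N1", "N2"),
  NES  = c(2.60, 1.90, -2.70, -2.40),
  stringsAsFactors = FALSE
) %>%
  mutate(key = case_when(NES > 0 ~ -NES, TRUE ~ NES))

# Key alone: the positive P1 lands between the two negatives
by_key <- toy %>% arrange(key) %>% pull(NAME)

# Sign first, then key: each group stays contiguous
by_group <- toy %>% arrange(NES < 0, key) %>% pull(NAME)

print(by_key)
print(by_group)
```

With your filtered data the ranges happen not to overlap (positives up to 1.94, negatives from -2.41 down), so the issue only shows up on other inputs.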
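Checking the result
After running this code, you will see:
- Positive-NES pathways (CondA), from top to bottom: KEGG_BETA_ALANINE_METABOLISM (1.94), KEGG_PEROXISOME (1.88), KEGG_VALINE_LEUCINE_AND_ISOLEUCINE_DEGRADATION (1.87), which meets the descending requirement. These are the only three positive pathways that pass the FDR.q.val < 0.05 filter.
- Negative-NES pathways (CondB), from top to bottom: KEGG_FOCAL_ADHESION (-2.68), KEGG_ECM_RECEPTOR_INTERACTION (-2.65), KEGG_CELL_ADHESION_MOLECULES_CAMS (-2.55), KEGG_LEUKOCYTE_TRANSENDOTHELIAL_MIGRATION (-2.48), KEGG_HEMATOPOIETIC_CELL_LINEAGE (-2.46), KEGG_LEISHMANIA_INFECTION (-2.41), which meets the ascending requirement.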
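Rather than reading the order off the plot, you can also check it from the factor levels. coord_flip() draws the first level at the bottom, so the reversed level vector is the top-to-bottom reading order. The helper name nes_top_to_bottom and the short labels below are hypothetical, and the factor f is only a stand-in for the NAME column your own pipeline produces.

```r
# Hypothetical helper: NES values in the top-to-bottom order that
# coord_flip() will display for factor `f`
nes_top_to_bottom <- function(f, nes) {
  top_down <- rev(levels(f))  # the first level is drawn at the bottom
  nes[match(top_down, as.character(f))]
}

# Stand-in data: rounded NES of the nine pathways that pass the FDR filter,
# with made-up short labels, listed in the intended top-to-bottom order
name <- c("BETA_ALANINE", "PEROXISOME", "VALINE_DEG", "FOCAL_ADHESION",
          "ECM_RECEPTOR", "CAMS", "LEUKOCYTE", "HEMATOPOIETIC", "LEISHMANIA")
nes  <- c(1.94, 1.88, 1.87, -2.68, -2.65, -2.55, -2.48, -2.46, -2.41)
f    <- factor(name, levels = rev(name))

shown <- nes_top_to_bottom(f, nes)
pos <- shown[shown > 0]
neg <- shown[shown < 0]

# Positives must descend and negatives must ascend when read top to bottom
ok <- !is.unsorted(rev(pos)) && !is.unsorted(neg)
print(ok)
```

To apply it to your own data, store the data frame just before the ggplot() call and pass its NAME and NES columns to the helper.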
The question comes from Stack Exchange and was asked by Dos Santos Alexandre.




