How do I add R² and a p-value to a ggplot in R? (plotting code included)
Got it, let's tackle this problem step by step. First, I'll provide a reproducible dataset that matches your variable structure, then walk you through two reliable ways to add the correlation stats (with p-value prioritized) to your plot.
Reproducible Dataset
First, let's create simulated data you can test with:
library(tidyverse)

set.seed(123)  # Ensures results are consistent every time you run the code

sim_data <- tibble(
  Year = rep(2010:2020, each = 30),
  Position = rep(c("Manager", "Staff", "Intern"), times = 11 * 10),
  Score = case_when(
    Position == "Manager" ~ rnorm(330, mean = 70 + 0.8 * (Year - 2010), sd = 5),
    Position == "Staff"   ~ rnorm(330, mean = 65 + 0.5 * (Year - 2010), sd = 6),
    Position == "Intern"  ~ rnorm(330, mean = 60 + 0.3 * (Year - 2010), sd = 7)
  )
)
Method 1: Manual Calculation + Custom Annotations
This method gives you full control over where to place your labels. We'll first compute the Pearson correlation stats for each Position group, then add them to the plot.
Step 1: Compute Grouped Correlation Stats
cor_stats <- sim_data %>%
  group_by(Position) %>%
  summarize(
    # Run one Pearson test per Position and keep the result in a list column
    cor_test = list(cor.test(Score, Year, method = "pearson")),
    # Extract the correlation coefficient and p-value from that test
    r = round(cor_test[[1]]$estimate, 3),
    p_val = round(cor_test[[1]]$p.value, 4),
    # Format label to prioritize p-value, include r for context
    label = str_glue("p = {p_val}\nr = {r}")
  )
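The question title asks for R² rather than r. For a straight-line fit with a single predictor, R² is simply the square of Pearson's r, and the slope's p-value from lm() is the same test as cor.test(), so either route should give the same numbers. A minimal base-R sketch under that assumption, using a small made-up data frame (toy, fit_stats and the column names are hypothetical, not your data):

```r
# Hypothetical toy data: two groups with different linear trends
set.seed(1)
toy <- data.frame(
  group = rep(c("A", "B"), each = 50),
  x = rep(1:50, times = 2)
)
toy$y <- ifelse(toy$group == "A", 2 + 0.5 * toy$x, 10 + 0.1 * toy$x) +
  rnorm(nrow(toy), sd = 3)

# Fit y ~ x within each group and pull R^2 and the slope p-value
fit_stats <- do.call(rbind, lapply(split(toy, toy$group), function(d) {
  s <- summary(lm(y ~ x, data = d))
  data.frame(
    group = d$group[1],
    r2    = s$r.squared,
    p_val = coef(s)["x", "Pr(>|t|)"]
  )
}))

# Cross-check for group A: squaring Pearson's r reproduces R^2
ct_A <- with(subset(toy, group == "A"), cor.test(y, x))
r_A  <- unname(ct_A$estimate)

print(fit_stats)
```

If you want R² in the Method 1 labels, squaring the unrounded r inside summarize() would be the smallest change.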
Step 2: Plot with Custom Labels
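We'll place labels on the right side of the plot and give each Position its own y position so the three labels don't print on top of each other (adjust x/y values to fit your actual data):
ggplot(sim_data, aes(x = Score, y = Year, color = Position)) +
  geom_smooth(method = "lm", se = FALSE) +
  # Add precomputed stats as text annotations, one label per Position
  geom_text(
    data = cor_stats,
    aes(
      x = quantile(sim_data$Score, 0.95),
      # Offset each group by 2 years so the labels do not overlap
      y = median(sim_data$Year) + 2 * (as.integer(factor(Position)) - 2),
      label = label
    ),
    # hjust = 1 right-aligns the text at x so it stays inside the panel
    hjust = 1, vjust = 0.5, size = 4, show.legend = FALSE
  ) +
  theme_minimal() +
  labs(title = "Score vs. Year by Position", x = "Performance Score", y = "Year")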
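One caveat with rounded labels: round(p, 4) turns any p-value below 0.00005 into 0, so the annotation would read "p = 0". A small sketch of a formatting helper that reports a threshold instead; fmt_p and its eps cutoff are hypothetical names chosen for illustration, not part of any package:

```r
# Hypothetical helper: print "p < eps" for tiny p-values,
# otherwise the p-value rounded to two significant digits
fmt_p <- function(p, eps = 0.001) {
  ifelse(
    p < eps,
    paste0("p < ", format(eps, scientific = FALSE)),
    paste0("p = ", signif(p, 2))
  )
}

p_vals <- c(3.2e-12, 0.0312, 0.5)
labels <- fmt_p(p_vals)
print(labels)
```

In Method 1 this could stand in for the round() call on the p-value, as long as you pass it the unrounded value.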
Method 2: Automated Labeling with ggpmisc
For a more streamlined workflow, use the ggpmisc package: stat_poly_eq fits the model per group and adds the resulting stats to the plot as text labels, so you don't need to compute them yourself.
Step 1: Install & Load the Package
install.packages("ggpmisc")  # only needed once
library(ggpmisc)
Step 2: Plot with Auto-Generated Stats
We'll configure the label to show the p-value first, followed by R² (rr.label is R², not r). The older ..name.. notation is deprecated in recent ggplot2 releases, so the code uses after_stat() instead:
ggplot(sim_data, aes(x = Score, y = Year, color = Position)) +
  geom_smooth(method = "lm", se = FALSE) +
  # Add the p-value and R² of each group's linear fit to the plot
  stat_poly_eq(
    aes(label = paste(after_stat(p.value.label), after_stat(rr.label), sep = "~~~")),
    formula = y ~ x,
    parse = TRUE,
    size = 4,
    label.x = "right"  # Place labels on the right edge
  ) +
  theme_minimal() +
  labs(title = "Score vs. Year by Position", x = "Performance Score", y = "Year")
Quick Notes:
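- The ~~~ in the label string adds horizontal space between the p-value and R². In plotmath each ~ is a space, so this does not create a line break.
- Adjust label.x / label.y in stat_poly_eq to move labels to your preferred position.
- If you only want to display the p-value, simplify the label to after_stat(p.value.label) (written ..p.value.label.. in older ggplot2 code).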
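On the separator: with parse = TRUE the label string is read as an R plotmath expression, where ~ is a space, * joins two pieces without a space, and "\n" is not interpreted. If you do want the p-value stacked above R², plotmath's atop() is one option. The base-R sketch below only checks that both kinds of string parse as plotmath; the label text is made up, and whether atop() combines cleanly with after_stat() in your ggpmisc version is an assumption to verify:

```r
# Made-up label fragments in the style stat_poly_eq produces
p_lab  <- "italic(P) == 0.012"
r2_lab <- "italic(R)^2 == 0.45"

spaced  <- paste(p_lab, r2_lab, sep = "~~~")          # side by side, three spaces apart
stacked <- paste0("atop(", p_lab, ", ", r2_lab, ")")  # p-value above R^2

# Both strings are valid plotmath: each parses to exactly one expression
exprs <- lapply(c(spaced, stacked), function(s) parse(text = s))
print(exprs)

# To preview the rendering in base graphics:
# plot.new(); text(0.5, 0.6, exprs[[1]][[1]]); text(0.5, 0.3, exprs[[2]][[1]])
```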
This question originally comes from Stack Exchange; asked by Evan.




