
How do I add R² and a p-value to a ggplot in R? (Plotting code included, help wanted)

Solution: Add Pearson r, R² & P-values to ggplot Linear Smooth Plots

Let's tackle this step by step. First, I'll provide a reproducible dataset that matches your variable structure, then walk you through two ways to add the statistics to your plot, with the p-value listed first. Method 1 reports the Pearson correlation coefficient r; Method 2 reports R² from the fitted linear model.

Reproducible Dataset

First, let's create simulated data you can test with:

library(tidyverse)
set.seed(123) # Ensures results are consistent every time you run the code

sim_data <- tibble(
  Year = rep(2010:2020, each = 30),
  Position = rep(c("Manager", "Staff", "Intern"), times = 11*10),
  Score = case_when(
    Position == "Manager" ~ rnorm(330, mean = 70 + 0.8*(Year-2010), sd = 5),
    Position == "Staff" ~ rnorm(330, mean = 65 + 0.5*(Year-2010), sd = 6),
    Position == "Intern" ~ rnorm(330, mean = 60 + 0.3*(Year-2010), sd = 7)
  )
)

Method 1: Manual Calculation + Custom Annotations

This method gives you full control over where to place your labels. We'll first compute the Pearson correlation stats for each Position group, then add them to the plot.

Step 1: Compute Grouped Correlation Stats

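cor_stats <- sim_data %>%
  group_by(Position) %>%
  summarize(
    # Run the test once per group; the list column keeps the full htest object
    cor_test = list(cor.test(Score, Year, method = "pearson")),
    r = round(unname(cor_test[[1]]$estimate), 3),
    # signif() keeps very small p-values readable (e.g. 1.2e-06) instead of rounding them to 0
    p_val = signif(cor_test[[1]]$p.value, 3),
    # Format label to prioritize p-value, include r for context
    label = as.character(str_glue("p = {p_val}\nr = {r}"))
  )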
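
Since the question asks for R² while the table above reports Pearson r, it may help to see how the two relate. The following is a minimal base-R sketch, assuming a single predictor and an ordinary least-squares fit; the variable names (`year`, `score`, `ct`, `fit`) are illustrative and not part of the original data. For simple linear regression, r squared should equal the model's R², and the slope's p-value should match the one from cor.test().

```r
# Illustrative data: one group with a linear trend plus noise (assumed values)
set.seed(123)
year  <- rep(2010:2020, each = 10)
score <- 70 + 0.8 * (year - 2010) + rnorm(length(year), sd = 5)

# Pearson correlation test and a simple linear regression on the same data
ct  <- cor.test(score, year, method = "pearson")
fit <- summary(lm(score ~ year))

# R² two ways: squared correlation vs. the value reported by lm()
r_squared_from_r  <- unname(ct$estimate)^2
r_squared_from_lm <- fit$r.squared

# p-value two ways: correlation test vs. the t-test on the slope
p_from_cor <- ct$p.value
p_from_lm  <- fit$coefficients["year", "Pr(>|t|)"]

# Both comparisons check agreement up to floating-point tolerance
same_r2 <- isTRUE(all.equal(r_squared_from_r, r_squared_from_lm))
same_p  <- isTRUE(all.equal(p_from_cor, p_from_lm))
```

If you want R² in the Method 1 label, squaring the r column is enough; this shortcut does not carry over to models with more than one predictor.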

Step 2: Plot with Custom Labels

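We'll place labels on the right side of the plot and stagger them vertically so the three groups do not print on top of each other (adjust the x/y values to fit your actual data):

ggplot(sim_data, aes(x = Score, y = Year, color = Position)) +
  geom_smooth(method = "lm", se = FALSE) +
  # Add precomputed stats as text annotations, one label per Position
  geom_text(
    data = cor_stats,
    aes(
      x = quantile(sim_data$Score, 0.95),
      # Offset each label by 1.5 years so the labels do not overlap
      y = max(sim_data$Year) - 1.5 * (seq_along(Position) - 1),
      label = label
    ),
    # Right-align at x so the text stays inside the panel
    hjust = 1, vjust = 1, size = 4, show.legend = FALSE
  ) +
  theme_minimal() +
  labs(title = "Score vs. Year by Position", x = "Performance Score", y = "Year")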
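
Hand-picked coordinates tend to break when the data change, so one alternative is to derive each label position from the fitted line itself. The sketch below is a base-R illustration under assumed names (`df`, `group`, `label_pos` are hypothetical, not from the original post): it fits one line per group and records the point where that line ends, which can then be supplied as the `data` for geom_text().

```r
# Illustrative data: three groups with different slopes (assumed values)
set.seed(1)
df <- data.frame(
  x = rep(1:20, times = 3),
  group = rep(c("A", "B", "C"), each = 20)
)
slopes <- c(A = 0.5, B = 1.0, C = 1.5)
df$y <- 2 + unname(slopes[df$group]) * df$x + rnorm(nrow(df))

# For each group, fit y ~ x and take the fitted value at the largest x
label_pos <- do.call(rbind, lapply(split(df, df$group), function(d) {
  fit <- lm(y ~ x, data = d)
  x_end <- max(d$x)
  data.frame(
    group = d$group[1],
    x = x_end,
    y = unname(predict(fit, newdata = data.frame(x = x_end)))
  )
}))

label_pos
```

Joining `label_pos` to a stats table by group gives one row per label with data-driven coordinates, so each label sits at the end of its own line.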

Method 2: Automated Labeling with ggpmisc

For a more streamlined workflow, use the ggpmisc package, which fits the model per group and places the formatted statistics on the plot for you.

Step 1: Install & Load the Package

install.packages("ggpmisc")
library(ggpmisc)

Step 2: Plot with Auto-Generated Stats

We'll configure the label to show the p-value first, followed by R² (stat_poly_eq reports the model's R², not Pearson r):

ggplot(sim_data, aes(x = Score, y = Year, color = Position)) +
  geom_smooth(method = "lm", se = FALSE) +
  # Add correlation stats to the plot
  stat_poly_eq(
    # after_stat() replaces the deprecated ..var.. notation (ggplot2 >= 3.3.0)
    aes(label = paste(after_stat(p.value.label), after_stat(rr.label), sep = "~~~")),
    formula = y ~ x,
    parse = TRUE,
    size = 4,
    label.x = "right" # Place labels on the right edge
  ) +
  theme_minimal() +
  labs(title = "Score vs. Year by Position", x = "Performance Score", y = "Year")

Quick Notes:

  • The ~~~ in the label string inserts spaces between the p-value and R². Because the label is parsed as a plotmath expression, each ~ is one space, not a line break.
  • Adjust label.x/label.y in stat_poly_eq to move labels to your preferred position.
  • If you only want to display the p-value, simplify the label to after_stat(p.value.label).
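
To make the last two notes concrete, here is a minimal sketch of a p-value-only label anchored to the top-left corner. It assumes ggplot2 (3.3.0 or later, for after_stat()) and ggpmisc are installed; the data frame `demo` and its columns are made up for illustration.

```r
library(ggplot2)
library(ggpmisc)

# Illustrative data: two groups, one with a clear trend and one with a weak trend
set.seed(42)
demo <- data.frame(
  x = rep(1:30, times = 2),
  grp = rep(c("A", "B"), each = 30)
)
demo$y <- ifelse(demo$grp == "A", 1.0, 0.2) * demo$x + rnorm(nrow(demo), sd = 4)

p <- ggplot(demo, aes(x = x, y = y, color = grp)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Show only the p-value, anchored to the top-left corner of the panel
  stat_poly_eq(
    aes(label = after_stat(p.value.label)),
    formula = y ~ x,
    parse = TRUE,
    label.x = "left",
    label.y = "top"
  )

p
```

With a grouping aesthetic such as color, stat_poly_eq computes one label per group and stacks them, so no manual offsets should be needed here.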

This question originates from Stack Exchange; asked by Evan.
