How can I use R code to count the farmers whose c2 values are all identical?

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-9

Solution: Count Farmers with All Identical c2 Values

Hey there! Let's tackle this problem together. First, let's confirm the data frame we're working with:

df <- data.frame(farmer=c("F1","F1","F1","F2","F2","F2","F3","F4","F4"), 
                 c2=c(4,4,5,3,3,3,1,2,1))

Our goal is to count how many farmers have all identical values in the c2 column. In this sample, F2 (3, 3, 3) and F3 (a single row) qualify, so the expected result is 2. Here are two approaches that produce it:

Approach 1: Using `dplyr` (Tidyverse Style)

This method is super readable and intuitive, especially if you're used to tidy data workflows:

# Load the dplyr package (install first if you haven't: install.packages("dplyr"))
library(dplyr)

# Calculate the count
valid_farmer_count <- df %>%
  group_by(farmer) %>%  # Split data into groups by each farmer
  summarise(all_c2_same = n_distinct(c2) == 1) %>%  # Check if group has only 1 unique c2 value
  filter(all_c2_same) %>%  # Keep only farmers with identical c2 values
  nrow()  # Count the number of valid farmers

valid_farmer_count
# Output: 2

Key Functions Used:

group_by(): Splits the data frame into groups based on the farmer column.
summarise(): Creates a summary column (all_c2_same) that returns TRUE if all c2 values in the group are identical.
n_distinct(): Counts the number of unique values in a vector (we check if this equals 1).
filter(): Retains only rows where all_c2_same is TRUE.
nrow(): Counts the number of rows in the filtered result, which gives our final count.
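To make the pipeline easier to follow, here is a small sketch that stops after summarise() to show the per-farmer table, then counts the TRUE values directly instead of filtering. The names per_farmer and count_via_sum are only illustrative, and the sketch assumes dplyr is installed.

```r
library(dplyr)

# Same sample data as above, repeated so the sketch runs on its own
df <- data.frame(farmer = c("F1","F1","F1","F2","F2","F2","F3","F4","F4"),
                 c2 = c(4, 4, 5, 3, 3, 3, 1, 2, 1))

# One row per farmer, with a TRUE/FALSE flag (illustrative name: per_farmer)
per_farmer <- df %>%
  group_by(farmer) %>%
  summarise(all_c2_same = n_distinct(c2) == 1)

per_farmer$all_c2_same
# F1 = FALSE, F2 = TRUE, F3 = TRUE, F4 = FALSE

# Summing the logical flag gives the same count as filter() + nrow()
count_via_sum <- per_farmer %>% pull(all_c2_same) %>% sum()
count_via_sum
# Output: 2
```

Whether you prefer filter() plus nrow() or summing the flag is mostly a matter of taste; both give the same count here.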

Approach 2: Base R (No External Packages)

If you prefer not to load additional packages, this base R method works perfectly:

# Check for each farmer if all c2 values are identical
same_c2_check <- aggregate(c2 ~ farmer, data = df, function(x) length(unique(x)) == 1)

# Sum the number of TRUE results (since TRUE = 1 in R)
valid_farmer_count <- sum(same_c2_check$c2)

valid_farmer_count
# Output: 2

Key Functions Used:

aggregate(): Groups the data by farmer and applies a custom function to the c2 column.
unique(): Extracts all unique values from the c2 vector of each group.
length(): Counts how many unique values exist (if it's 1, all values are identical).
sum(): Adds up the TRUE values (treated as 1) to get the total count of valid farmers.
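If you also want to know which farmers qualify, not just how many, the logical c2 column returned by aggregate() can be used as an index. This is a minimal sketch; matching_farmers is an illustrative name, and as.character() is there only in case farmer is stored as a factor.

```r
# Same sample data as above, repeated so the sketch runs on its own
df <- data.frame(farmer = c("F1","F1","F1","F2","F2","F2","F3","F4","F4"),
                 c2 = c(4, 4, 5, 3, 3, 3, 1, 2, 1))

# One row per farmer; c2 now holds TRUE/FALSE instead of the raw values
same_c2_check <- aggregate(c2 ~ farmer, data = df,
                           FUN = function(x) length(unique(x)) == 1)

# Use the logical column to pick out the matching farmer names
matching_farmers <- as.character(same_c2_check$farmer[same_c2_check$c2])
matching_farmers
# Output: "F2" "F3"
```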

Another base R alternative using by():

# Check each farmer's c2 values match the first value in their group
same_c2_list <- by(df$c2, df$farmer, function(x) all(x == x[1]))

# Flatten the by() result into a logical vector and sum the TRUE values
valid_farmer_count <- sum(unlist(same_c2_list))

valid_farmer_count
# Output: 2

Here, by() applies the check to each farmer's c2 values, unlist() flattens the result into a logical vector, and sum() counts the TRUE entries, giving the number of valid farmers.
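One thing to watch if your real data has missing values: the two checks used above do not treat NA the same way. The sample data has no NA, so the vector x below is purely hypothetical, and the sketch only shows how each expression behaves on it.

```r
# Hypothetical group with one missing value (not part of the sample data)
x <- c(3, NA, 3)

# Comparing with NA gives NA, so all() cannot decide
first_value_check <- all(x == x[1])
first_value_check
# Output: NA

# Dropping the NA comparisons lets the remaining values decide
first_value_check_narm <- all(x == x[1], na.rm = TRUE)
first_value_check_narm
# Output: TRUE

# unique() keeps NA as its own value, so two distinct values are found
unique_check <- length(unique(x)) == 1
unique_check
# Output: FALSE
```

Which behaviour is right depends on whether a missing c2 value should count as a different value, so it is worth deciding that before picking an approach.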

This question originally comes from Stack Exchange, asked by Martini Sun.