Filling NA values in a height column by gender fails in RStudio: looking for a solution
Hey there! Let’s work through why your code isn’t updating the NA values and get it sorted out.
The Core Issue: Logical Operators (&& vs &)
First off, the biggest problem here is using && instead of & in your subsetting condition.
In R:
- && is a scalar, short-circuit operator meant for single TRUE/FALSE values, as in an if () condition. In older versions of R it silently looked at only the first element of each vector, so if the first row of your data isn’t a female with a missing height, the entire condition is FALSE and no rows get selected for updating. That’s why your data table stayed unchanged! (Since R 4.3.0, passing a vector longer than one to && is an error instead.)
- & is the element-wise logical operator. It checks every row individually, which is exactly what you need for vector-based operations in R.
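To make the difference concrete, here is a small sketch on made-up vectors. The names gender and height are just stand-ins for your columns, and && is only used on single values so the snippet runs on any R version:

```r
# Toy vectors standing in for the real columns (values are made up)
gender <- c("male", "female", "female")
height <- c(180, NA, 165)

# `&` compares element by element and returns one TRUE/FALSE per row
rows_to_fill <- is.na(height) & gender == "female"
print(rows_to_fill)

# `&&` is meant for single values, for example inside if ()
first_needs_fill <- is.na(height[1]) && gender[1] == "female"
print(first_needs_fill)
```

Only the element-wise result has one entry per row, which is what subsetting with [ ] needs.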
The Data Type Error
The message you saw (argument is not numeric or logical: returning NA) is a warning from mean(), and it means your height column isn’t stored as a numeric type. It’s probably a character vector (maybe because there was non-numeric text in the column when you imported the data), or possibly a factor. If it turns out to be a factor, go through as.character() first, since as.numeric() on a factor returns the internal level codes rather than the values. Before calculating means, you’ll need to convert it to numeric:
    # Check the current type of the height column
    class(data$height)

    # Convert to numeric (non-numeric values become NA, with a warning, which is expected)
    data$height <- as.numeric(data$height)
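Here is a rough illustration of what the conversion does, using invented values rather than your actual data. The names height_chr and height_fct are hypothetical:

```r
# Made-up character column with one non-numeric entry
height_chr <- c("170", "165.5", "unknown", NA)

# as.numeric() turns text that is not a number into NA (and warns about it)
height_num <- suppressWarnings(as.numeric(height_chr))
print(height_num)

# Caveat for factors: as.numeric() returns the level codes, not the values
height_fct <- factor(c("170", "165.5"))
codes <- as.numeric(height_fct)
values <- as.numeric(as.character(height_fct))
print(codes)
print(values)
```

If class(data$height) reports "factor", the as.character() step is the one to use.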
Correct Code to Fill NAs
Now let’s rewrite your code with the right operator and fixed data type:
- Calculate the mean height for each gender, excluding NA values:
    female_mean <- mean(data$height[data$gender == "female"], na.rm = TRUE)
    male_mean <- mean(data$height[data$gender == "male"], na.rm = TRUE)
- Use & to target the correct rows and assign the means:
    # Fill NAs for females
    data$height[is.na(data$height) & data$gender == "female"] <- female_mean

    # Fill NAs for males
    data$height[is.na(data$height) & data$gender == "male"] <- male_mean
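If you want to try the two steps before touching your real data, here is a self-contained sketch on a toy data frame. The data frame name toy and its values are invented for illustration, and it assumes gender is coded exactly as "female" and "male":

```r
# Toy data frame (hypothetical values) with one missing height per gender
toy <- data.frame(
  gender = c("female", "female", "female", "male", "male", "male"),
  height = c(160, NA, 170, 180, NA, 176)
)

# Step 1: per-gender means, ignoring NA
female_mean <- mean(toy$height[toy$gender == "female"], na.rm = TRUE)
male_mean <- mean(toy$height[toy$gender == "male"], na.rm = TRUE)

# Step 2: element-wise condition picks only the NA rows of each gender
toy$height[is.na(toy$height) & toy$gender == "female"] <- female_mean
toy$height[is.na(toy$height) & toy$gender == "male"] <- male_mean

print(toy)
```

Each missing value ends up as the mean of the observed heights for its own gender, and the non-missing rows are left alone.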
A Cleaner Alternative with dplyr
If you’re open to using the dplyr package (super common for data manipulation in R), this approach is more readable and handles both genders in one go:
    library(dplyr)

    data <- data %>%
      group_by(gender) %>%
      mutate(height = ifelse(is.na(height), mean(height, na.rm = TRUE), height)) %>%
      ungroup()
This groups the data by gender, then replaces NA heights with the mean of that group’s heights.
Quick Check to Verify
After running the code, you can confirm the NAs are gone with:
    # Check how many NAs are left in height
    sum(is.na(data$height))
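If that total is not zero, a per-gender breakdown can show which group still has gaps, for instance a group whose heights are all missing, where the mean itself comes out as NaN. A minimal sketch on invented data (toy is a hypothetical name, and one NA is left in on purpose):

```r
# Toy data frame (made-up values) with one NA deliberately left unfilled
toy <- data.frame(
  gender = c("female", "female", "male", "male"),
  height = c(160, NA, 180, 176)
)

# Total number of missing heights
total_na <- sum(is.na(toy$height))
print(total_na)

# Missing heights counted separately for each gender
na_by_gender <- tapply(is.na(toy$height), toy$gender, sum)
print(na_by_gender)
```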
This question was originally asked on Stack Exchange by LouC.




