基于R与栅格数据的行政级别气温/降水计算及代码优化求助

阿华AIGC实验室

2026-5-8

Hey Roberto, let's break down solutions for your two needs—starting with optimizing your precipitation workflow since that's where you've shared your current code and pain points.

1. Extract Daily Administrative-Level Temperatures (Long Time Series)

The core logic here mirrors the optimized precipitation workflow we'll cover below. Here's a straightforward, efficient approach:

Key Steps

Use the terra package (a faster, more memory-efficient successor to raster) to handle your temperature raster data (e.g., ERA5, CRU, or other daily temperature products).
Load administrative boundaries for your target countries (you can batch import them or use a global admin boundary dataset).
Crop and mask the temperature raster stack to your administrative region in one go.
Calculate daily average temperatures per administrative unit using terra::zonal().
Reshape the output into a tidy data frame for easy analysis.

Example Code

pacman::p_load(terra, sf, tidyverse, lubridate)

# Load multi-country admin boundaries (example: China, USA, India)
china <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_CHN_shp.zip", layer = "gadm41_CHN_1")
usa <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_USA_shp.zip", layer = "gadm41_USA_1")
india <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_IND_shp.zip", layer = "gadm41_IND_1")
multi_admin <- terra::vect(list(china, usa, india))

# Read all daily temperature rasters (adjust file path/pattern to match your data)
temp_files <- list.files("./data/era5_daily/", pattern = "\\.tif$", full.names = TRUE)
temp_stack <- terra::rast(temp_files)

# Extract dates from filenames (adjust format to match your naming convention)
dates <- basename(temp_files) %>%
  str_extract("\\d{8}") %>%
  as.Date(format = "%Y%m%d")
names(temp_stack) <- dates

# Crop + mask to admin boundaries (one operation = faster processing)
temp_admin <- terra::crop(temp_stack, multi_admin) %>% terra::mask(., multi_admin)

# Calculate daily average temp per administrative unit
admin_daily_temp <- terra::zonal(temp_admin, multi_admin, fun = "mean", na.rm = TRUE)

# Convert to tidy format for analysis/visualization
admin_daily_temp_tidy <- admin_daily_temp %>%
  as.data.frame() %>%
  pivot_longer(cols = -c(NAME_1, GID_1, COUNTRY), names_to = "date", values_to = "mean_temp") %>%
  mutate(date = as.Date(date))

2. Optimize Administrative-Level Precipitation Calculations

Your current code has a few key bottlenecks: unnecessary format conversion (.tif.gz → .asc), repetitive cropping/masking, and manual month extraction. Let's fix these with a streamlined, faster workflow.

2.1 Ditch `raster` for `terra`

terra is designed for large raster datasets—it’s faster, uses memory more efficiently, and supports direct reading of .tif.gz files (no need to convert to .asc!). This alone will drastically speed up your processing.

2.2 Optimized Precipitation Processing Workflow

pacman::p_load(terra, sf, tidyverse, lubridate)

# Load China's level-1 admin boundaries (use same method for multi-country)
china <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_CHN_shp.zip", layer = "gadm41_CHN_1")

# Read all CHIRPS .tif.gz files directly (no conversion needed!)
chirps_files <- list.files("./data/chirps/", pattern = "\\.tif\\.gz$", full.names = TRUE)
chirps_stack <- terra::rast(chirps_files)

# Extract dates from CHIRPS filenames (matches your 2020.01.01.tif.gz format)
dates <- basename(chirps_files) %>%
  str_remove("\\.tif\\.gz$") %>%
  as.Date(format = "%Y.%m.%d")
names(chirps_stack) <- dates

# Crop + mask to China's boundaries (single operation = no redundant work)
chirps_china <- terra::crop(chirps_stack, china) %>% terra::mask(., china)

# Calculate DAILY average precipitation per province
admin_daily_precip <- terra::zonal(chirps_china, china, fun = "mean", na.rm = TRUE)

# Convert to tidy format
admin_daily_precip_tidy <- admin_daily_precip %>%
  as.data.frame() %>%
  pivot_longer(cols = -c(NAME_1, GID_1), names_to = "date", values_to = "mean_precip") %>%
  mutate(date = as.Date(date))

# If you need MONTHLY averages, aggregate with dplyr
admin_monthly_precip <- admin_daily_precip_tidy %>%
  mutate(year = year(date),
         month = month(date, label = TRUE)) %>%
  group_by(NAME_1, GID_1, year, month) %>%
  summarise(mean_precip_monthly = mean(mean_precip, na.rm = TRUE)) %>%
  ungroup()

2.3 Key Improvements Over Your Original Code

No more .asc conversion: terra reads .tif.gz directly, saving hours of processing time and disk space.
Batch processing: All rasters are loaded and processed in one go, avoiding repetitive loops and cropping/masking.
Automatic date handling: Dates are extracted directly from filenames, so you don’t need to manually filter files by month.
Faster zonal stats: terra::zonal() computes averages for all time steps at once, eliminating the need to loop through each month.
Tidy data output: The final data frame is easy to use for visualization, modeling, or further analysis.

Bonus Tips for Large Datasets

If you’re working with decades of daily data, process it in yearly chunks to avoid memory issues.
Use terra::writeRaster() to save cropped/masked stacks as compressed .tif files for quick access later.
For multi-country analysis, combine all admin boundaries into a single SpatVector object before processing.

内容的提问来源于stack exchange，提问作者RobertoSupe