基于R与栅格数据的行政级别气温/降水计算及代码优化求助
Hey Roberto, let's break down solutions for your two needs—starting with optimizing your precipitation workflow since that's where you've shared your current code and pain points.
The core logic here mirrors the optimized precipitation workflow we'll cover below. Here's a straightforward, efficient approach:
Key Steps
- Use the
terrapackage (a faster, more memory-efficient successor toraster) to handle your temperature raster data (e.g., ERA5, CRU, or other daily temperature products). - Load administrative boundaries for your target countries (you can batch import them or use a global admin boundary dataset).
- Crop and mask the temperature raster stack to your administrative region in one go.
- Calculate daily average temperatures per administrative unit using
terra::zonal(). - Reshape the output into a tidy data frame for easy analysis.
Example Code
pacman::p_load(terra, sf, tidyverse, lubridate) # Load multi-country admin boundaries (example: China, USA, India) china <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_CHN_shp.zip", layer = "gadm41_CHN_1") usa <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_USA_shp.zip", layer = "gadm41_USA_1") india <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_IND_shp.zip", layer = "gadm41_IND_1") multi_admin <- terra::vect(list(china, usa, india)) # Read all daily temperature rasters (adjust file path/pattern to match your data) temp_files <- list.files("./data/era5_daily/", pattern = "\\.tif$", full.names = TRUE) temp_stack <- terra::rast(temp_files) # Extract dates from filenames (adjust format to match your naming convention) dates <- basename(temp_files) %>% str_extract("\\d{8}") %>% as.Date(format = "%Y%m%d") names(temp_stack) <- dates # Crop + mask to admin boundaries (one operation = faster processing) temp_admin <- terra::crop(temp_stack, multi_admin) %>% terra::mask(., multi_admin) # Calculate daily average temp per administrative unit admin_daily_temp <- terra::zonal(temp_admin, multi_admin, fun = "mean", na.rm = TRUE) # Convert to tidy format for analysis/visualization admin_daily_temp_tidy <- admin_daily_temp %>% as.data.frame() %>% pivot_longer(cols = -c(NAME_1, GID_1, COUNTRY), names_to = "date", values_to = "mean_temp") %>% mutate(date = as.Date(date))
Your current code has a few key bottlenecks: unnecessary format conversion (.tif.gz → .asc), repetitive cropping/masking, and manual month extraction. Let's fix these with a streamlined, faster workflow.
2.1 Ditch raster for terra
terra is designed for large raster datasets—it’s faster, uses memory more efficiently, and supports direct reading of .tif.gz files (no need to convert to .asc!). This alone will drastically speed up your processing.
2.2 Optimized Precipitation Processing Workflow
pacman::p_load(terra, sf, tidyverse, lubridate) # Load China's level-1 admin boundaries (use same method for multi-country) china <- terra::vect("https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_CHN_shp.zip", layer = "gadm41_CHN_1") # Read all CHIRPS .tif.gz files directly (no conversion needed!) chirps_files <- list.files("./data/chirps/", pattern = "\\.tif\\.gz$", full.names = TRUE) chirps_stack <- terra::rast(chirps_files) # Extract dates from CHIRPS filenames (matches your 2020.01.01.tif.gz format) dates <- basename(chirps_files) %>% str_remove("\\.tif\\.gz$") %>% as.Date(format = "%Y.%m.%d") names(chirps_stack) <- dates # Crop + mask to China's boundaries (single operation = no redundant work) chirps_china <- terra::crop(chirps_stack, china) %>% terra::mask(., china) # Calculate DAILY average precipitation per province admin_daily_precip <- terra::zonal(chirps_china, china, fun = "mean", na.rm = TRUE) # Convert to tidy format admin_daily_precip_tidy <- admin_daily_precip %>% as.data.frame() %>% pivot_longer(cols = -c(NAME_1, GID_1), names_to = "date", values_to = "mean_precip") %>% mutate(date = as.Date(date)) # If you need MONTHLY averages, aggregate with dplyr admin_monthly_precip <- admin_daily_precip_tidy %>% mutate(year = year(date), month = month(date, label = TRUE)) %>% group_by(NAME_1, GID_1, year, month) %>% summarise(mean_precip_monthly = mean(mean_precip, na.rm = TRUE)) %>% ungroup()
2.3 Key Improvements Over Your Original Code
- No more
.ascconversion:terrareads.tif.gzdirectly, saving hours of processing time and disk space. - Batch processing: All rasters are loaded and processed in one go, avoiding repetitive loops and cropping/masking.
- Automatic date handling: Dates are extracted directly from filenames, so you don’t need to manually filter files by month.
- Faster zonal stats:
terra::zonal()computes averages for all time steps at once, eliminating the need to loop through each month. - Tidy data output: The final data frame is easy to use for visualization, modeling, or further analysis.
Bonus Tips for Large Datasets
- If you’re working with decades of daily data, process it in yearly chunks to avoid memory issues.
- Use
terra::writeRaster()to save cropped/masked stacks as compressed.tiffiles for quick access later. - For multi-country analysis, combine all admin boundaries into a single
SpatVectorobject before processing.
内容的提问来源于stack exchange,提问作者RobertoSupe




