How do you delay code execution in R? Fixing read failures right after a download

阿华AIGC实验室

2026-5-27

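Fixing the "file not found" error when an R function reads a file immediately after downloading it

Problem description

I wrote an R function that reads a CSV file of historical stock data from the NSE India website. The first run always fails with "file.string not found", but running it again a few seconds later works. My guess is that the program calls read.csv immediately after download.file, before the file has been fully saved to disk. I want the function to succeed on the first run: is there a way to delay read.csv until the file has been completely saved? (Please ignore the fact that the destfile passed to download.file and the path in file.string differ; that is a quirk of my Windows 7 system.)

Original function code: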

function(){ 
  s <- 1; #first get the bhav copy 
  today <- c();ty <- c();tm <- c();tmu <- c();td <- c(); 
  # get the URL first 
  today <- Sys.Date() 
  ty <- format(today, format = "%Y") 
  tm <- format(today, format = "%b") 
  tmu <- toupper(tm) 
  td <- format(today, format = "%d") 
  dynamic.URL <- paste("https://www.nseindia.com/content/historical/EQUITIES/",ty,"/",tmu,"/cm",td,tmu,ty,"bhav.csv.zip", sep = "") 
  file.string <- paste("C:/Users/user/AppData/Local/Temp/cm",td,tmu,ty,"bhav.csv") 
  download.file(dynamic.URL, "C:/Users/user/Desktop/bhav.csv.zip") 
  bhav.copy <- read.csv(file.string) 
  return(bhav.copy) 
}
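
One detail in the function above is worth checking before adding any delay: paste() separates its arguments with a space unless sep is given, so the file.string built there contains spaces that the real file name probably does not. The short sketch below contrasts the two calls. The date parts are fixed example values chosen for illustration, not values from the original question.

```r
# Example date parts (illustrative values only)
td  <- "27"
tmu <- "MAY"
ty  <- "2026"

# paste() inserts a space between every argument by default
with_spaces <- paste("C:/Users/user/AppData/Local/Temp/cm", td, tmu, ty, "bhav.csv")

# paste0() (equivalent to paste(..., sep = "")) concatenates with no separator
without_spaces <- paste0("C:/Users/user/AppData/Local/Temp/cm", td, tmu, ty, "bhav.csv")

print(with_spaces)     # "C:/Users/user/AppData/Local/Temp/cm 27 MAY 2026 bhav.csv"
print(without_spaces)  # "C:/Users/user/AppData/Local/Temp/cm27MAY2026bhav.csv"
```

If the path passed to read.csv contains these extra spaces, no amount of waiting will make the file appear, so it is worth ruling this out first.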

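Solution

The core idea is to call read.csv only after the target file has been fully written. Here are two ways to do that:

1. Poll until the file exists (most general)

After download.file, add a loop that keeps checking whether the target file exists and continues only once it appears. A short sleep on each pass keeps the loop from burning CPU: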

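get_bhav_copy <- function(){
  # Date parts for today's file name
  # (%b is locale-dependent; an English locale is assumed here)
  today <- Sys.Date()
  ty <- format(today, format = "%Y")
  tm <- format(today, format = "%b")
  tmu <- toupper(tm)
  td <- format(today, format = "%d")

  # Build the URL and the file paths
  dynamic.URL <- paste("https://www.nseindia.com/content/historical/EQUITIES/",ty,"/",tmu,"/cm",td,tmu,ty,"bhav.csv.zip", sep = "")
  zip_dest <- "C:/Users/user/Desktop/bhav.csv.zip"
  # sep = "" is needed: the default separator would put spaces in the path
  file.string <- paste("C:/Users/user/AppData/Local/Temp/cm",td,tmu,ty,"bhav.csv", sep = "")

  # Download the file
  download.file(dynamic.URL, zip_dest)

  # Wait for the target CSV file to appear
  start_time <- Sys.time()
  while(!file.exists(file.string)){
    Sys.sleep(0.5) # check every 0.5 seconds
    # Timeout so the loop cannot wait forever (optional)
    if(difftime(Sys.time(), start_time, units = "secs") > 30){
      stop("Timed out waiting for the download/extraction; check the network or the file path")
    }
  }

  # Read the file
  bhav.copy <- read.csv(file.string)
  return(bhav.copy)
}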
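
A file can exist before it has been written completely, so checking existence alone may still let read.csv see a partial file. One possible refinement, sketched here as an assumption rather than a guaranteed fix, is to wait until the file size is non-zero and unchanged between two consecutive checks. The helper name wait_for_file and its parameters are hypothetical, and a stable size is only a heuristic.

```r
# Hypothetical helper: block until `path` exists and its size has stopped changing.
# Stops with an error if that does not happen within `timeout` seconds.
wait_for_file <- function(path, timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout   # POSIXct + seconds
  last_size <- -1
  while (Sys.time() < deadline) {
    size <- file.size(path)          # NA while the file does not exist
    if (!is.na(size)) {
      # Two consecutive identical, non-zero sizes suggest the writer is done
      if (size > 0 && size == last_size) {
        return(invisible(TRUE))
      }
      last_size <- size
    }
    Sys.sleep(interval)
  }
  stop("Timed out waiting for file: ", path)
}

# Usage with a throwaway file created locally
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), tmp, row.names = FALSE)
wait_for_file(tmp, timeout = 5, interval = 0.1)
df <- read.csv(tmp)
```

In the function above, a call such as wait_for_file(file.string) could stand in for the while loop.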

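2. Unzip manually before reading (more control)

Since what you download is a ZIP archive, you can call unzip yourself. unzip returns only after extraction has finished, so the read that follows does not depend on the timing of any automatic extraction by the system: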

get_bhav_copy <- function(){
  # Date parts for today's file name
  # (%b is locale-dependent; an English locale is assumed here)
  today <- Sys.Date()
  ty <- format(today, format = "%Y")
  tm <- format(today, format = "%b")
  tmu <- toupper(tm)
  td <- format(today, format = "%d")

  # Build the URL and the file paths
  dynamic.URL <- paste("https://www.nseindia.com/content/historical/EQUITIES/",ty,"/",tmu,"/cm",td,tmu,ty,"bhav.csv.zip", sep = "")
  zip_dest <- "C:/Users/user/Desktop/bhav.csv.zip"
  temp_dir <- "C:/Users/user/AppData/Local/Temp/"

  # Download in binary mode, then unzip explicitly
  download.file(dynamic.URL, zip_dest, mode = "wb")
  unzip(zip_dest, exdir = temp_dir) # extract into the target directory

  # Read the extracted file
  file.string <- paste(temp_dir, "cm",td,tmu,ty,"bhav.csv", sep = "")
  bhav.copy <- read.csv(file.string)
  return(bhav.copy)
}

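This approach is more reliable because you control exactly when extraction happens, instead of relying on a background process in Windows to unzip the archive.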
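
A possible variation on the manual-unzip approach is to avoid rebuilding the CSV file name by hand: unzip() returns the paths of the files it extracted, so the CSV can be picked from that result. The sketch below assumes the archive contains at least one .csv file. The helper name read_csv_from_zip is hypothetical, and extracting into tempdir() is an illustrative default rather than the path from the question.

```r
# Hypothetical helper: extract a ZIP archive and read the first CSV it contains.
read_csv_from_zip <- function(zip_path, exdir = tempdir()) {
  # unzip() returns (invisibly) the paths of the files it extracted
  extracted <- unzip(zip_path, exdir = exdir)
  csv_files <- extracted[grepl("\\.csv$", extracted, ignore.case = TRUE)]
  if (length(csv_files) == 0) {
    stop("No CSV file found in archive: ", zip_path)
  }
  read.csv(csv_files[1])
}

# Example call, using the download destination from the functions above:
# bhav.copy <- read_csv_from_zip("C:/Users/user/Desktop/bhav.csv.zip")
```

Because the path comes from unzip() itself, a mismatch between the expected and the actual file name can no longer cause a "file not found" error.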

The question comes from Stack Exchange; it was asked by AKshayKulkarni.