How can I read a table file from a subfolder inside a tar.gz archive directly, without unpacking the archive?
Great question! You absolutely can read a specific nested file (like firstf/secondf/table.txt) inside a tar.gz archive directly in R—no need to extract the entire archive or delete the original compressed file afterward. Here are three reliable, straightforward methods:
Method 1: Use Base R's untar() Function
Base R's tar() only creates archives; the function for reading them is untar(). It cannot return a connection, but its files argument lets you pull out just the one file you need into a temporary directory, leaving the rest of the archive and the original tar.gz untouched. This works without any extra packages:
# Extract only the nested file into a throwaway directory
tmp_dir <- tempfile("untar_")
untar("myFile.tar.gz", files = "firstf/secondf/table.txt", exdir = tmp_dir)
# Read the table from the extracted copy
myData <- read.table(file.path(tmp_dir, "firstf/secondf/table.txt"))
# Remove the temporary copy when finished
unlink(tmp_dir, recursive = TRUE)
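To try the base R route end to end, here is a minimal sketch that builds a throwaway archive in a temporary directory and reads one nested file back with untar(). The file contents and the column names id and value are made up for the demo, and tar = "internal" is an assumption chosen so the example does not depend on a system tar being installed.

```r
# Build a small throwaway archive (all data here is illustrative)
work_dir <- tempfile("demo_")
dir.create(file.path(work_dir, "firstf", "secondf"), recursive = TRUE)
write.table(
  data.frame(id = 1:3, value = c(10.5, 20.1, 30.7)),
  file = file.path(work_dir, "firstf", "secondf", "table.txt"),
  row.names = FALSE, quote = FALSE
)

# Archive the firstf directory using paths relative to work_dir
old_wd <- setwd(work_dir)
tar("myFile.tar.gz", files = "firstf", compression = "gzip", tar = "internal")
setwd(old_wd)

archive_path <- file.path(work_dir, "myFile.tar.gz")
member <- "firstf/secondf/table.txt"

# Extract only the requested member into a separate temporary directory
out_dir <- tempfile("untar_")
untar(archive_path, files = member, exdir = out_dir, tar = "internal")

# Read the extracted copy, then remove every temporary file
myData <- read.table(file.path(out_dir, member), header = TRUE)
unlink(c(work_dir, out_dir), recursive = TRUE)

print(myData)
```

Only the single requested file is ever written to disk, and it is removed again once the data frame is in memory.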
Method 2: Tidyverse-Friendly Approach with readr
readr does not ship a tar-specific helper, but its readers accept any R connection. If you're already using the tidyverse, you can stream the file out of the archive with the system tar command and hand the resulting pipe() connection to readr. This assumes a tar executable is on your PATH, which is standard on Linux/macOS and on recent Windows versions:
library(readr)
# -x extract, -z gunzip, -O write the member to stdout, -f archive name
tar_cmd <- paste(
  "tar -xzOf",
  shQuote("myFile.tar.gz"),
  shQuote("firstf/secondf/table.txt")
)
# Read the table (use read_csv() if it's a CSV instead of a plain table)
# read_table() opens the unopened connection and closes it again when done
myData <- read_table(pipe(tar_cmd))
Method 3: Flexible Handling with the archive Package
For broader support of different compression formats (tar, zip, 7z, etc.), the archive package is a great choice. It lets you read the target file in one concise line:
library(archive)
# Read the specific nested file directly into your data frame
myData <- read.table(
  archive_read("myFile.tar.gz", file = "firstf/secondf/table.txt")
)
Quick Notes:
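- Double-check the file path inside the archive. Member names are matched exactly, including case and any leading "./", so the path must be identical to what is stored in the tar.gz. You can list the stored names with untar("myFile.tar.gz", list = TRUE).
- None of these methods unpacks the whole archive to disk; at most the single requested file is written out or streamed through a connection. A tar.gz is one compressed stream, though, so the reader still has to decompress everything stored ahead of the target file.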
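If you are unsure of the exact path stored in the archive, listing its members first can help. The sketch below builds a tiny throwaway archive so it runs on its own; the file contents are made up, and tar = "internal" is an assumption used to avoid depending on a system tar.

```r
# Throwaway archive so the sketch is self-contained (contents are illustrative)
work_dir <- tempfile("demo_")
dir.create(file.path(work_dir, "firstf", "secondf"), recursive = TRUE)
writeLines(c("a b", "1 2"), file.path(work_dir, "firstf", "secondf", "table.txt"))

old_wd <- setwd(work_dir)
tar("myFile.tar.gz", files = "firstf", compression = "gzip", tar = "internal")
setwd(old_wd)

# List the member names; nothing is extracted when list = TRUE
members <- untar(file.path(work_dir, "myFile.tar.gz"), list = TRUE, tar = "internal")

# Keep only the entries whose name ends in the file we are looking for
hits <- grep("table\\.txt$", members, value = TRUE)
print(hits)

unlink(work_dir, recursive = TRUE)
```

Whatever string appears in hits is the one to pass as the file path in the methods above, character for character.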
The question comes from Stack Exchange; asked by Ahdee.




