Why Doesn’t the Linux cp Command Use a Zero-Copy Mechanism? A Technical Question Based on strace Output

阿华AIGC实验室

2026-4-29

Why Doesn't the cp Command Use Zero-Copy Syscalls Like sendfile()?

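Great question! When you trace cp a.txt b.txt with strace and see it relying on read() and write(), it’s natural to wonder why it doesn’t use a zero-copy mechanism such as sendfile() to skip the round trip through a user-space buffer. The answer comes down to compatibility, flexibility, and the full scope of what cp is designed to do, which is more than raw byte transfer. One caveat before the details: what strace shows depends on the cp version. GNU coreutils 9.0 and later try copy_file_range() (and a reflink clone where the filesystem supports it) first and fall back to read()/write(), so a trace full of read()/write() calls usually means an older cp or a case where the fallback was taken. Let’s break it down: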
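To make the traced pattern concrete, here is a minimal sketch of the loop that produces alternating read() and write() calls. It is an illustration, not cp's actual source: the function name copy_read_write and the 128 KiB buffer size are assumptions chosen for the example.

```python
import os


def copy_read_write(src_path: str, dst_path: str, buf_size: int = 128 * 1024) -> int:
    """Copy src_path to dst_path with a plain read()/write() loop.

    Returns the number of bytes written. Every chunk crosses the
    kernel/user boundary twice, which is the cost zero-copy calls avoid.
    """
    total = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src, buf_size)  # kernel -> user-space buffer
                if not chunk:                   # empty read means end of file
                    break
                view = memoryview(chunk)
                while view:                     # write() may accept only part of the buffer
                    written = os.write(dst, view)  # user-space buffer -> kernel
                    view = view[written:]
                    total += written
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return total
```

Running a script that calls copy_read_write("a.txt", "b.txt") under strace should show the same kind of read()/write() pairs as the trace in the question.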

Broad Compatibility & Universal Support
Zero-copy syscalls like sendfile() have limits on where they work. Before Linux 2.6.33, sendfile() required the output descriptor to be a socket (think web servers serving static content), so it could not copy between two regular files. Later kernels lifted that restriction, and Linux 4.5 added copy_file_range() specifically for file-to-file copies, but neither call is guaranteed to work for every pair of files: the input must support mmap()-like operations, and some filesystems, special files, or cross-filesystem copies on older kernels return an error instead. cp needs to run reliably everywhere, from ext4 to NFS to tmpfs, so it has to keep a read()/write() path, which every filesystem and kernel version supports. Older cp versions use only that path; newer ones try the faster call first and fall back to it.
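The "try the fast call, keep the portable path" idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of any particular cp release: the name copy_data and the return strings are hypothetical, and the rule "fall back on any error as long as nothing has been copied yet" is a deliberate simplification. os.copy_file_range() needs Python 3.8 or later on Linux.

```python
import os


def copy_data(src_fd: int, dst_fd: int, buf_size: int = 128 * 1024) -> str:
    """Copy everything from src_fd to dst_fd; return which path did the work."""
    copied = 0
    if hasattr(os, "copy_file_range"):          # Linux-only, Python 3.8+
        try:
            while True:
                # The kernel moves the bytes; they never enter this process.
                n = os.copy_file_range(src_fd, dst_fd, 1 << 30)
                if n == 0:                       # end of file (or nothing copyable)
                    break
                copied += n
        except OSError:
            if copied:
                raise                            # failed midway: do not guess
            # Nothing copied yet: assume "unsupported here" and fall through.
        if copied:
            return "copy_file_range"
    # Portable fallback, supported by every filesystem.
    while True:
        chunk = os.read(src_fd, buf_size)
        if not chunk:
            return "read_write"
        view = memoryview(chunk)
        while view:                              # handle partial writes
            view = view[os.write(dst_fd, view):]
```

Falling through when the first call returns 0 also covers sources that report a size of zero but still produce data when read; the read() loop then decides whether the file is really empty.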
cp Does More Than Just Move Bytes
Copying a file isn’t just about transferring data; it’s about preserving the file’s identity. Depending on the options given (for example -p or -a), cp replicates permissions, ownership, timestamps, and extended attributes, and it handles sparse files. All of this is user-space logic built on separate syscalls: read the metadata from the source, then apply it to the destination once the data is in place. Zero-copy syscalls only move data and do nothing for metadata, so they can replace at most the inner copy loop; the surrounding logic stays the same either way. Sparse-file handling is the part most tied to the data path, because deciding where to leave holes means either asking the filesystem where the holes are or inspecting the bytes.
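As a rough illustration of the metadata side, the sketch below copies mode, ownership, timestamps, and extended attributes with separate calls after the data has been written. The function name copy_metadata is hypothetical, the ordering is one reasonable choice rather than cp's documented behavior, and failures to set ownership or an individual attribute are silently ignored here, which a real tool would report.

```python
import os
import stat


def copy_metadata(src_path: str, dst_path: str) -> None:
    """Apply src_path's metadata to an already-written dst_path."""
    st = os.stat(src_path)

    # Extended attributes: os.listxattr exists only on Linux.
    try:
        names = os.listxattr(src_path)
    except (AttributeError, OSError):
        names = []
    for name in names:
        try:
            os.setxattr(dst_path, name, os.getxattr(src_path, name))
        except OSError:
            pass  # e.g. attributes an unprivileged user may not set

    # Ownership first: chown can clear setuid/setgid bits, so chmod comes after.
    try:
        os.chown(dst_path, st.st_uid, st.st_gid)
    except OSError:
        pass  # changing owner usually needs privileges

    os.chmod(dst_path, stat.S_IMODE(st.st_mode))

    # Timestamps last, so the calls above do not disturb them.
    os.utime(dst_path, ns=(st.st_atime_ns, st.st_mtime_ns))
```

None of these calls touch file contents, which is why they work the same whether the data was moved by read()/write() or by a zero-copy call.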
Flexibility for Edge Cases
cp has to handle a surprising number of edge scenarios: short reads and writes, reads interrupted by signals, sources whose reported size is wrong or zero (such as files under /proc), special files like device nodes and FIFOs (read as data streams by default, recreated as nodes with -R), and sparse output. With --sparse=always, GNU cp scans the data for runs of zero bytes and turns them into holes in the destination. Zero-copy syscalls never expose the data to user space, so anything that requires looking at the bytes mid-transfer is impossible with them. cp does not modify file contents (it has no line-ending conversion option, for instance), but it does need to inspect them in cases like this, and read()/write() keeps the data in a user-space buffer where that is possible.
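Zero-block detection is a good example of work that needs the bytes in user space. The sketch below shows the general technique behind an option like --sparse=always; it is not GNU cp's implementation. The name copy_sparse and the 4096-byte block size are assumptions, and dst_fd is assumed to be a freshly truncated file positioned at offset 0.

```python
import os


def copy_sparse(src_fd: int, dst_fd: int, block_size: int = 4096) -> int:
    """Copy src_fd to dst_fd, seeking over all-zero blocks instead of writing them.

    Returns the number of blocks that were skipped (left as holes).
    """
    zeros = bytes(block_size)
    skipped = 0
    size = 0
    while True:
        chunk = os.read(src_fd, block_size)
        if not chunk:
            break
        size += len(chunk)
        if chunk == zeros[:len(chunk)]:
            # All zeros: advance the output offset without writing anything.
            os.lseek(dst_fd, len(chunk), os.SEEK_CUR)
            skipped += 1
        else:
            view = memoryview(chunk)
            while view:                          # handle partial writes
                view = view[os.write(dst_fd, view):]
    # If the file ends in a hole, no write extended it; set the length explicitly.
    os.ftruncate(dst_fd, size)
    return skipped
```

The comparison against a zero buffer is exactly the step a zero-copy call cannot perform, because the data never reaches the process.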
Performance Tradeoffs Aren’t a Priority for General Use
For small files, the cost of the user-space copy is negligible compared to the cost of the surrounding metadata operations (opening files, setting permissions, and so on). For large files, zero-copy saves CPU cycles and memory bandwidth, but read()/write() already benefits from the page cache and readahead, and the copy is often limited by storage speed rather than by the extra memory copy, so the gap is usually smaller than you might expect. cp prioritizes reliability and consistency over raw speed. If you need more, the alternatives are not really other zero-copy tools: dd with iflag=direct or oflag=direct bypasses the page cache but still uses read()/write(), while cp --reflink on a filesystem such as Btrfs or XFS avoids copying the data at all by sharing extents. cp is built to be a trustworthy general-purpose tool first.