Cloud Spanner: The Impact of Oversized Splits and a Question About Hard Limits

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-25

What Happens When a Cloud Spanner Split Grows to 20-30GB?

Great question—let’s break down the practical impacts of oversized splits and whether you need to tweak your primary key design.

Key Impacts of 20-30GB Splits

While Cloud Spanner doesn’t enforce a hard limit on split size, straying far beyond the recommended "few GB" range leads to tangible issues:

Increased split migration overhead: When Spanner needs to move splits between nodes (for load balancing, maintenance, or scaling), larger splits mean more data to transfer. This extends migration windows, leading to temporary elevated read/write latency, and in some cases, brief throttling as nodes coordinate replication and traffic handoffs.
Node-level performance bottlenecks: At any given time, each split is served by a single node. A 20-30GB split with high read/write concurrency can monopolize that node's CPU, memory, or IO resources. You'll see reduced throughput, higher P99 latencies, and potentially overwhelmed nodes that can't keep up with demand. Writes are hit hardest, because they require synchronous replication across multiple replicas.
Longer backup/restore cycles: Backing up a large split takes significantly more time, and restoring it will delay your recovery time objective (RTO) if you ever need to restore that segment of data.
Amplified hotspot risks: If your primary key design already causes write hotspots (e.g., using monotonically increasing IDs), a large split will make this problem worse. All new writes pile onto the same overloaded split, compounding latency and throughput issues.
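
To make the hotspot point concrete, here is a toy model in Python. It is only a sketch: the eight fixed key ranges, the `split_for` and `hashed` helpers, and the starting ID are all assumptions made for illustration. Real Spanner moves its split boundaries dynamically based on size and load, so this shows the shape of the problem rather than Spanner's actual behavior.

```python
import bisect
import hashlib
from collections import Counter

# Assumption: a 64-bit key space cut into 8 equal, fixed key ranges.
NUM_SPLITS = 8
KEY_SPACE = 2 ** 64
BOUNDARIES = [KEY_SPACE // NUM_SPLITS * i for i in range(1, NUM_SPLITS)]


def split_for(key: int) -> int:
    """Return the index of the key range that owns `key`."""
    return bisect.bisect_right(BOUNDARIES, key)


def hashed(key: int) -> int:
    """Map a key to a well-spread 64-bit value using SHA-256."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# 10,000 inserts with monotonically increasing IDs (hypothetical start value).
start = 10 ** 12
new_ids = range(start, start + 10_000)

# Count how many inserts each key range receives under both key designs.
sequential_load = Counter(split_for(k) for k in new_ids)
hashed_load = Counter(split_for(hashed(k)) for k in new_ids)

print("sequential keys:", dict(sequential_load))
print("hashed keys:    ", dict(sorted(hashed_load.items())))
```

In this model every sequential insert lands in one key range, while the hashed keys are spread across all eight.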

Should You Adjust Your Primary Key Design?

Absolutely—if your splits are hitting 20-30GB and you’re seeing any of the above issues, optimizing your primary key is critical to keep Spanner running smoothly. Here’s how:

Add a sharding column to your primary key: Introduce a low-cardinality shard key (e.g., a hash prefix of a user ID, or a numeric range like 0-9) as the first column in your primary key. For example, change (user_id) to (shard_key, user_id)—this distributes data across multiple splits, keeping individual split sizes small.
Avoid monotonic primary key prefixes: Steer clear of using auto-incrementing IDs or timestamps as the first column in your primary key. These send every new write to the split at the end of the key range, so load and growth both concentrate there. Instead, lead with a high-cardinality, evenly distributed value, such as a hash of the key or a UUIDv4. A tenant ID can also work, provided tenants are numerous and similar in size. A region code spreads load only as evenly as your traffic is spread across regions.
Monitor split metrics: Use Spanner's built-in monitoring tools, such as Key Visualizer, to track how data volume and traffic are distributed across key ranges, alongside node CPU load. As a rule of thumb, if a key range keeps growing beyond 10GB while carrying heavy traffic, that's a clear sign your primary key needs adjustment.
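
The sharding-column idea above could be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `NUM_SHARDS`, `shard_key`, `primary_key`, and the `Users` table with its columns are hypothetical names chosen for the example. A stable hash (SHA-256) is used instead of Python's built-in `hash()`, which is randomized per process for strings and so would not reproduce the same shard on a later read.

```python
import hashlib
from typing import Tuple

NUM_SHARDS = 10  # assumption: matches the 0-9 range suggested above


def shard_key(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Derive a stable shard number in [0, num_shards) from the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def primary_key(user_id: str) -> Tuple[int, str]:
    """Build the composite (shard_key, user_id) key for a row."""
    return (shard_key(user_id), user_id)


# Hypothetical schema using the composite key; table and column names are
# illustrative only.
USERS_DDL = """
CREATE TABLE Users (
  ShardKey INT64 NOT NULL,
  UserId   STRING(64) NOT NULL,
  Name     STRING(MAX)
) PRIMARY KEY (ShardKey, UserId)
"""

if __name__ == "__main__":
    for uid in ("alice", "bob", "carol"):
        print(uid, "->", primary_key(uid))
```

Because the shard is computed from the user ID, a point read for one user can recompute `ShardKey` and still hit a single row. The trade-off is that a scan over many users has to query every shard value and merge the results.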

The question discussed here comes from Stack Exchange, where it was asked by Christian Gintenreiter.