ByteHouse 唯一键表最佳实践--ByteHouse云数仓版-火山引擎

文档中心

ByteHouse云数仓版

最佳实践

ByteHouse 唯一键表最佳实践

ByteHouse 唯一键表主要用于实现 upsert 功能。该能力是 ByteHouse 团队自研的独有特性，既能保持高效的查询性能、又支持唯一键更新。ByteHouse 唯一键表主要解决了开源 ClickHouse 不能支持高效更新操作的痛点，帮助业务更简单地开发实时分析应用。您可通过指定唯一键 UNIQUE KEY 来实现 Upsert 更新写语义，查询自动返回每个唯一键的最新值。

功能概述

唯一键表即指定唯一键索引（UNIQUE KEY）的 CnchMergeTree 表，具有以下特点：

您通过 UNIQUE KEY 配置唯一键，支持 upsert 更新写语义，查询时自动返回每个唯一键的最新值。
在保证实时更新能力的情况下，依然保持较高的查询性能。
唯一键索引（UNIQUE KEY）支持多字段和表达式。
唯一键表支持多种去重粒度（如分区级去重、bucket 去重等）。
支持自定义版本字段，写入低版本数据时自动忽略。
支持根据 UNIQUE KEY 实时删除数据。
支持根据 UNIQUE KEY 进行部分列更新操作。

更多关于唯一键表的能力介绍、功能逻辑、使用限制等内容，请参见唯一键表。

建表指南

UNIQUE KEY 数量

建议 UNIQUE KEY 字段设置不超过 5 个；如果多个字段组合构成唯一键，可以使用它们的 toString(sipHash128()) 哈希作为唯一键。

去重粒度选择

如需去重，建议您优先选择使用分区级去重，再根据数据量级和写入性能需求 RPS（record per second）判断是否需要开启 bucket 去重优化。

Bucket 去重优化

对于大表，建议开启 bucket 去重。

什么是 bucket 去重

唯一键表默认使用分区内去重，新写入的数据需要和对应分区的所有存量数据进行去重。当分区的总行数达到千万～亿级别时，去重耗时会显著增加。Bucket 去重就是通过指定 CLUSTER BY，将分区内的数据切成 N 个唯一键不相交的分桶，此时每个分桶的数据行数变为原来的 1/N**，并且各个分桶的去重可以独立&并行进行，从而显著提升写入性能。**
表级去重（设置了 partition_level_unique_keys = 0）的表，也支持 bucket 去重。

开启 bucket 去重

开启 bucket 去重需满足两个条件：

表同时指定了 UNIQUE KEY 和 CLUSTER BY。CLUSTER BY 的推荐写法如下：

-- c1, c2 是分桶字段
-- N 是分桶数
CLUSTER BY EXPRESSION cityHash64V2(c1, c2) % N INTO N BUCKETS

CLUSTER BY 引用的表字段是 UNIQUE KEY 引用表字段的子集。示例如下：

-- 示例 1：UNIQUE KEY 与 CLUSTER BY 字段相同，满足 bucket 去重
UNIQUE KEY c1
CLUSTER BY EXPRESSION cityHash64V2(c1) % 4 INTO 4 BUCKETS

-- 示例 2：UNIQUE KEY 包含了所有 CLUSTER BY 的字段，满足 bucket 去重
UNIQUE KEY sipHash64(c1, c2)
CLUSTER BY EXPRESSION cityHash64V2(c2) % 4 INTO 4 BUCKETS

-- 示例 3：UNIQUE KEY 与 CLUSTER BY 字段相同，满足 bucket 去重
UNIQUE KEY sipHash64(c1, c2)
CLUSTER BY EXPRESSION cityHash64V2(c1, c2) % 4 INTO 4 BUCKETS

-- 示例 4：CLUSTER BY包 含不在 UNIQUE KEY 中的字段，不满足 bucket 去重
UNIQUE KEY c1
CLUSTER BY EXPRESSION cityHash64V2(c1, c2) % 4 INTO 4 BUCKETS

-- 示例 5：没有 CLUSTER BY，不满足 Bucket 去重
UNIQUE KEY c1

注意

为了避免 bucket 出现数据倾斜，通常需要选择基数高、分布均衡的唯一键字段作为分桶键。

Bucket 数量选择

通过两步计算，可以得到推荐的 bucket 数设置：

根据分区/表的行数，预估初始 bucket 数：
- 对于分区唯一，将单分区的总行数除以 2000 万，并向上取整。
- 对于表级唯一，将表的总行数除以 2000 万，并向上取整。
将上一步得到的 bucket 数，向上对齐到计算组 worker 数的整数倍。

参考上述步骤，bucket 数量计算示例如下：

例 1：分区唯一表，单分区预估 2 亿行，计算组包含 4 个节点
首先，200m / 20m 得到 10 个分桶，然后，对齐到 4 的倍数，最终 bucket 数量为 12 个分桶。
例 2：表级唯一表，单表预估30亿，计算组包含2个节点
首先，3000m / 20m 得到 150 个分桶，然后，对齐到 2 的倍数，最终 bucket 数量为 150 个分桶。

Bucket 表达式

推荐使用CLUSTER BY EXPRESSION cityHash64V2(cluster_column) % N INTO N BUCKETS 子句，具体使用方法可参考创建动态分区表。
CLUSTER BY 的列建议使用高基数列，否则容易出现 bucket 数据倾斜，从而影响写入和查询性能。

Bucket 表参数推荐配置

建议使用以下 SETTINGS 参数：

max_dedup_worker_number = [cnch_vw_write 的 pod 数], pick_dedup_worker_by_mod = 1, dedup_impl_version = 'dedup_in_txn_commit'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]max_dedup_worker_number = [cnch_vw_write 的 pod 数], pick_dedup_worker_by_mod = 1, dedup_impl_version = 'dedup_in_txn_commit';

参数名	参数类型	可选值	无配置默认值	参数说明
max_dedup_worker_number	UInt64	整数	1	Dedup worker 的数量，建议和 cnch_vw_write 设置的计算组 pod 数对齐。
pick_dedup_worker_by_mod	Bool	true/false	false	Dedup worker 的分布模式，配置为 true 时按计算组的节点总数取模可以保证分布均匀；默认为 false 时按照 hash 值分布，不一定均匀。
dedup_impl_version	DedupImplVersion	dedup_in_write_suffix/ dedup_in_txn_commit	dedup_in_write_suffix	去重实现，dedup_in_write_suffix 在写入阶段结束时执行去重逻辑；dedup_in_txn_commit 在事务预提交阶段执行去重逻辑，为优化版本。

在线扩容 Bucket 数

动态扩容是指您可根据数据表中数据量或负载的变化，调整分桶的数量或分布。当分桶表中的数据持续增长或查询负载增加时，原有的桶数量可能不足以高效处理数据，影响写入/查询性能。动态扩容功能支持通过调整桶的数量很好地解决这一问题，优化负载均衡，提升查询并行度。更多动态扩容场景及操作请参见Bucket Table 动态扩容最佳实践。
操作步骤：

修改分桶数量。新设定的桶数量必须是之前设定的桶数量的倍数。

-- 前置：确保表包含 enable_recluster_multiple_parts=1, enable_scale_bucket_optimization=1 这两个参数
ALTER TABLE t MODIFY SETTING enable_recluster_multiple_parts=1, enable_scale_bucket_optimization=1 -- 每个表仅需执行一次该命令
-- 比如表当前的分桶数是4，可以通过以下命令扩容到8个分桶
ALTER TABLE t MODIFY BUCKET NUMBER 8

（可选）扩容分桶后，表中的存量数据不会移动/重新分桶，但并不影响正常使用。如需修改存量数据的分桶数，可以执行如下的 RECLUSTER 命令。

-- 对执行分区进行 recluster
ALTER TABLE t RECLUSTER PARTITION '2025-04-14';

-- 对所有分区进行 recluster
ALTER TABLE t RECLUSTER PARTITION WHERE 1=1;

-- 对符合WHERE条件的多个分区进行 recluster
ALTER TABLE t RECLUSTER PARTITION WHERE ...;

-- 您可通过以下命令查看任务状态，确认重新分桶任务是否执行完成。
SELECT * FROM system.mutations WHERE database='...' and table = '...';

数据导入

攒批优先

ByteHouse 整体设计对于大批量数据友好，因此尽量优先选择在上游进行攒批。

非去重模式

注意

一般非去重模式在历史数据迁移阶段使用，正式生产环境慎用，否则会引入重复数据问题。

当您需要导入的数据满足如下两个条件时，可在 query 级别开启非去重导入：

导入的数据本身不存在重复唯一键；
导入的数据和历史数据不存在重复唯一键。

使用以下参数为 query 级别开启非去重导入：

dedup_key_mode='append', enable_unique_partial_update = 0

完整使用示例如下：

INSERT INTO target_table SELECT * FROM source_table SETTINGS dedup_key_mode='append', enable_unique_partial_update = 0;

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
dedup_key_mode	DedupKeyMode	replace/throw/append/ignore	replace	设置去重模式，不同参数对应不同的去重模式： replace 为 upsert 模式； throw 和 ignore 模式仅能在非 staging area 场景下使用，当出现相同 key 时抛异常或者忽略； append 模式没有去重阶段，性能最优。
enable_unique_partial_update	Bool	true/false	true	是否使用部分列更新模式。需要配合表级同名参数来使用，当且仅当该参数和表级都开启时部分列更新模式生效。

参数名

参数类型

可选值

无配置默认值

参数说明

dedup_key_mode

DedupKeyMode

replace/throw/append/ignore

replace

设置去重模式，不同参数对应不同的去重模式：

replace 为 upsert 模式；
throw 和 ignore 模式仅能在非 staging area 场景下使用，当出现相同 key 时抛异常或者忽略；
append 模式没有去重阶段，性能最优。

enable_unique_partial_update

Bool

true/false

true

是否使用部分列更新模式。需要配合表级同名参数来使用，当且仅当该参数和表级都开启时部分列更新模式生效。

并发写入

数据导入时如果存在并发写入，您可优先考虑上层对去重粒度进行写入对齐。在难以做到的情况下，可以在有去重瓶颈时开启并发写优化。ByteHouse 可通过设置表级参数，实现基于后台线程攒批去重，大批量去重性能更佳。
表级参数设置如下：

cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible';

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
cloud_enable_staging_area	Bool	true/false	false	是否开启暂存区，可支持异步写入或者并发写入优化场景。
cloud_staging_area_wait_mode	StagingAreaWaitMode	no_wait/wait_txn_visible	no_wait	开启暂存区后，数据写入模式。no_wait 表示异步写入模式，wait_txn_visible 表示同步写入模式。

大 ELT 任务

当一次 INSERT SELECT/DELETE/UPDATE 操作影响的数据超过 1 亿行时，可以认为本次任务是大 ELT 任务。ByteHouse 支持通过设置表级参数和 query 级别参数，充分利用多 worker 资源，中心化调度去重任务。参数设置示例如下：

# 表级参数设置
dedup_impl_version = 'dedup_in_txn_commit'

# query 级别参数设置
optimize_unique_table_write = 1, max_insert_threads = 8

完整使用示例如下：

表级参数设置

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]dedup_impl_version = 'dedup_in_txn_commit';

Query 级别参数设置

INSERT INTO target_table SELECT * FROM source_table SETTINGS optimize_unique_table_write = 1, max_insert_threads = 8;

注意

若想提升性能，可同时调大 min_insert_block_size_bytes 和 min_insert_block_size_rows，可设置为当前值×2。
若遇到 OOM（内存溢出，Out Of Memory）的资源问题，可同时调小 min_insert_block_size_bytes 和 min_insert_block_size_rows，可设置为当前值/2。

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
dedup_impl_version	DedupImplVersion	dedup_in_write_suffix/ dedup_in_txn_commit	dedup_in_write_suffix	去重实现，dedup_in_write_suffix 在写入阶段结束时执行去重逻辑；dedup_in_txn_commit 在事务预提交阶段执行去重逻辑，为优化版本。
optimize_unique_table_write	Bool	true/false	false	唯一键表 insert select 导入优化，仅当未开启暂存区且 dedup_impl_version='dedup_in_txn_commit' 时建议开启。
max_insert_threads	UInt64	正整数	0	worker 执行 insert select 请求的写入线程池大小。
min_insert_block_size_bytes	UInt64	正整数	1073710080	`INSERT SELECT` 攒批 block 块的最小大小。
min_insert_block_size_rows	UInt64	正整数	1048545	`INSERT SELECT` 攒批 block 块的最小行数。

离在线导入共存

注意

该功能当前正在灰度发布中，若使用中遇到问题，可提交工单或联系 ByteHouse 团队。

在线导入任务与离线导入任务共存时，数据量大的离线任务会严重影响实时导入的写入延迟，导致后者有非常大的性能波动，该场景可开启优化解决。可通过通过表级参数设置，依赖暂存区，高优为实时任务分配去重带宽，减少离线大 ELT 任务对于实时写入延迟的影响。
表级参数设置示例如下：

cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'，real_time_task_optimize_algo='strict_optimize'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'，real_time_task_optimize_algo='strict_optimize';

参数说明

参数名

参数类型

可选值

无配置默认值

参数说明

cloud_enable_staging_area

Bool

true/false

false

是否开启暂存区，可支持异步写入或者并发写入优化场景。

cloud_staging_area_wait_mode

StagingAreaWaitMode

no_wait/wait_txn_visible

no_wait

开启暂存区后，数据写入模式。no_wait 表示异步写入模式，wait_txn_visible 表示同步写入模式。

real_time_task_optimize_algo

RealTimeTaskOptimizeAlgo

disabled/strict_optimize/ loose_optimize

disabled

设置离线/在线导入优化算法，不同参数值对应的效果如下：

disabled 为关闭优化；
strict_optimize 算法会严格控制离线任务的量，优先执行实时导入任务；
loose_optimize 算法会在高优执行导入任务的同时尽可能提升离线任务性能。

使用示例

例1：分区级别唯一键

-- 引擎默认保证 unique key 在分区内的唯一性
CREATE TABLE t1
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t1 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 写入相同 key 的数据可以实现更新(upsert语义)
INSERT INTO t1 VALUES
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 4, 400),
('2020-10-29 23:50:00', 10003, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100),
('2020-10-30 00:00:05', 10001, 'Beijing', '男装', 1, 100),
('2020-10-30 00:00:05', 10002, 'Beijing', '男装', 2, 200);

-- 查询自动返回每个key最新的数据
select * from t1 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │      5 │     500 │
│ 2020-10-29 23:50:00 │      10002 │ Beijing │ 男装     │      4 │     400 │
│ 2020-10-29 23:50:00 │      10003 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-30 00:00:05 │      10001 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-30 00:00:05 │      10002 │ Beijing │ 男装     │      2 │     200 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- UNIQUE KEY 可以包含多个字段和表达式
-- sipHash64 是一种快速且低冲突率的哈希函数，使用 sipHash64 作为 unique key 需要考虑到可能的 hash 冲突
-- sipHash64：https://clickhouse.com/docs/en/sql-reference/functions/hash-functions#siphash64
CREATE TABLE t1m
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY (product_id, sipHash64(city));

INSERT INTO t1m VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100),
('2020-10-29 23:50:00', 10002, 'Shanghai', '男装', 4, 400),
('2020-10-29 23:50:00', 10003, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100);

select * from t1m;
┌──────────event_time─┬─product_id─┬─city─────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing  │ 男装     │      5 │     500 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing  │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10003 │ Beijing  │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing  │ 男装     │      1 │     100 │
│ 2020-10-29 23:50:00 │      10002 │ Shanghai │ 男装     │      4 │     400 │
└─────────────────────┴────────────┴──────────┴──────────┴────────┴─────────┘

例2：自定义版本字段

说明

使用版本字段时有以下限制：

如需要使用整数作为版本字段，建议使用兼容UInt64的无符号整数
支持Date、Datetime等时间类型作为版本字段
不支持Float、Decimal等浮点数类型作为版本字段

默认情况下，相同 unique key 后写入的数据会覆盖已有的数据。这可能会带来以下问题

回溯上游数据时，老数据可能覆盖新数据，导致查询到的数据结果出现回退
Lambda 架构下，如果离线和实时任务同时写一个分区，最终保留哪条数据取决于任务的执行顺序

为了解决上面的问题，ByteHouse 唯一键表支持将表中的某个字段指定为版本字段。引擎保证写入相同 key 的数据时，只有数据版本 >= 已有版本时，才会进行覆盖。版本字段支持所有UInt类型和Data/DateTime，且不能为 Nullable。

-- CnchMergeTree 括号内参数为可选的版本字段
CREATE TABLE t3
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree(event_time)
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t3 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 8, 800),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 5, 500);

-- 回溯前两条数据，由于版本 < 已有版本，写入时自动跳过
INSERT INTO t3 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200);

-- 10001 和 10002 的版本没有回退
select * from t3 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:50:00 │      10001 │ Beijing │ 男装     │      8 │     800 │
│ 2020-10-29 23:50:00 │      10002 │ Beijing │ 男装     │      5 │     500 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 继续回溯后两条数据，并写入两条新版本数据
INSERT INTO t3 VALUES
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 8, 800),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 5, 500),
('2020-10-29 23:55:00', 10001, 'Beijing', '男装', 10, 1000),
('2020-10-29 23:55:00', 10002, 'Beijing', '男装', 7, 700);

-- 查询自动返回最新版本的数据
select * from t3 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:55:00 │      10001 │ Beijing │ 男装     │     10 │    1000 │
│ 2020-10-29 23:55:00 │      10002 │ Beijing │ 男装     │      7 │     700 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例3：实时删除指定唯一键的数据

在某些应用场景下，用户希望在INSERT时加上一个字段来标识是否删除来扩展INSERT语义。
在 ByteHouse 唯一键表中，为每张表都添加了一个保留字段_delete_flag_，类型为UInt8， 0表示数据写入，非0表示数据删除。该字段不可在CREATE TABLE时指定，也不可查询该字段，仅可以在INSERT时指定，包括INSERT和INSERT SELECT。此外，ByteHouse 唯一键表基于保留字段_delete_flag_，支持了 DELETE FROM 子句。
用法示例如下：

INSERT

-- 引擎默认保证 unique key 在分区内的唯一性
CREATE TABLE t5
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 指定删除字段进行数据删除，删除字段设置非0时表示删除，设置为0时表示正常的upsert操作
INSERT INTO t5 (event_time, product_id, city, category, amount, revenue, _delete_flag_) VALUES
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 4, 400, 5),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 2, 200, 1),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100, 0);

-- 查询结果中包含了新加入的一行数据，并删除了两行旧数据
select * from t5 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

INSERT WITH VERSION

-- 引擎默认保证 unique key 在分区内的唯一性

-- 指定版本号
CREATE TABLE t5m
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64,
  `version` UInt64
)
ENGINE = CnchMergeTree(version)
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5m VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500, 10),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200, 10),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 10);

-- 指定删除字段并指定版本号，版本号小于查询结果中相应行的版本号，删除操作不会起作用
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, version, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 4, 400, 5, 1),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200, 5, 1),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 5, 1);

-- 查询结果不变，没有任何数据被删除
select * from t5m order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┬─version─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │      5 │     500 │      10 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │      10 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │      10 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┴─────────┘

-- 指定删除字段进行数据删除，不指定版本号或者版本号设置为0，删除操作会跳过版本检查，直接执行
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 4, 400, 1);
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, version, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 0, 1);

-- 查询结果删除了两行旧数据
select * from t5m order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┬─version─┐
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │      10 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┴─────────┘

INSERT SELECT

-- 引擎默认保证 unique key 在分区内的唯一性

CREATE TABLE t5x
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5x VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过INSERT SELECT 来删除revenue >= 200的数据
INSERT INTO t5x (event_time, product_id, city, category, amount, revenue, _delete_flag_) SELECT *, 1 as _delete_flag_ from t5x where revenue >= 200;

-- 查询结果中已删除revenue >= 200的数据
select * from t5x order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

DELETE FROM

-- 引擎默认保证 unique key 在分区内的唯一性

CREATE TABLE t5y
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5y VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过 DELETE FROM 来删除revenue == 500的数据
DELETE FROM t5y WHERE revenue = 500;
-- 通过 DELETE FROM 来删除revenue >= 200的数据
DELETE FROM t5y WHERE revenue >= 200;

-- 查询结果中已删除revenue >= 200的数据
select * from t5y order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例4：实时更新指定唯一键的数据

-- 引擎默认保证 unique key 在分区内的唯一性

CREATE TABLE t6
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t6 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过 UPDATE 语句来进行更新
UPDATE t6 SET amount = 10, revenue = 1000 WHERE event_time = '2020-10-29 23:40:00' and product_id=10001;

-- 查询结果中 10001 数据行进行了 UPDATE 变更
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │     10 │    1000 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 通过 UPDATE 语句来进行更新，使用 limit 限定范围
UPDATE t6 SET category = concat('新',category) WHERE event_time = '2020-10-29 23:40:00' order by product_id limit 1;

-- 查询结果中仅 10001 数据行进行了 UPDATE 变更
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 新男装   │     10 │    1000 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 通过 UPDATE 语句交换列
UPDATE t6 SET amount=revenue, revenue=amount  WHERE event_time = '2020-10-29 23:40:00';

-- 查询结果 amount 、revenue 列进行了交换
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 新男装   │   1000 │      10 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │    200 │       2 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │    100 │       1 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例5：Bucket table

Bucket table 是 ByteHouse 在建表的时候的一种性能优化选项，在ByteHouse 中使用 Bucket table 时，系统会依据用户建表语句中提供的一个或者多个列、表达式整理表数据，将相同值的数据聚簇在同一个 bucket number 下，从而在查询计算中获得更好的性能。

非 bucket 级别唯一
当不使用 enable_bucket_level_unique_keys 指定 bucket 级别唯一时。ByteHouse 唯一键表使用表引擎指定的去重范围，默认为分区级别唯一，此时同分区下不同 bucket 不会存在重复数据。

-- 引擎默认保证 unique key 在分区内的唯一性
CREATE TABLE t7
(
    `d` Date,
    `id` Int32,
    `s` String
)
ENGINE = CnchMergeTree
PARTITION BY d
CLUSTER BY s INTO 10 BUCKETS
ORDER BY s
UNIQUE KEY id;

INSERT INTO t7 VALUES 
('2023-06-26', 1, '1a'), ('2023-06-26', 2, '2a'), ('2023-06-26', 3, '3a'), ('2023-06-26', 3, '3b'), ('2023-06-26', 3, '3c');

select * from t7 order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3c │
└────────────┴────┴────┘

Bucket 级别唯一
当 cluster by 所需的列在 unique key 字段里都包含时，相同 unique key 的数据一定会落到固定的 bucket 中，可以优化为 bucket 去重；此时 partition_level_unique_keys 用于指定 bucket 的去重范围（分区级/表级）。
Bucket 唯一可以降低去重的数据量，从而提升写入 rps 和 P99 延时；enable_bucket_level_unique_keys = 1可以跳过 part 的 bucket 校验，进一步提升写入效率；

-- 引擎默认保证 unique key 在分区内的唯一性
CREATE TABLE t7m
(
    `d` Date,
    `id` Int32,
    `s` String
)
ENGINE = CnchMergeTree
PARTITION BY d
CLUSTER BY id INTO 10 BUCKETS
ORDER BY s
UNIQUE KEY id
SETTINGS enable_bucket_level_unique_keys = 1;

INSERT INTO t7m VALUES 
('2023-06-26', 1, '1a'), ('2023-06-26', 2, '2a'), ('2023-06-26', 3, '3a'), ('2023-06-26', 3, '3b'), ('2023-06-26', 3, '3c');

select * from t7m order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3c │
└────────────┴────┴────┘

INSERT INTO t7m VALUES ('2023-06-26', 3, '3d');

select * from t7m order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3d │
└────────────┴────┴────┘

例6：部分列更新

ByteHouse云数仓版支持部分列更新模式。
在行更新模式时，缺省列采用默认值填充。而在列更新模式下，缺省列如果有原值会保留，否则填充默认值。详情请参见部分列更新。

例7：MySQL DML 对标 & dedup_key_mode

ByteHouse 唯一键表对标并支持了 MySQL 的不同写入模式。

MySQL DML 语义	MySQL	ByteHouse 唯一键表
DDL 示例	`CREATE TABLE mysql_test ( a int NOT NULL, b int DEFAULT NULL, PRIMARY KEY (a) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci`	CREATE TABLE mysql_test( `a` Int32, `b` Nullable(Int32) ) ENGINE = CnchMergeTree ORDER BY a UNIQUE KEY a
insert into 检查唯一键是否存在，如果存在则写入报错	`MySQL [db]> insert into mysql_test values (1,1), (1,1); ERROR 1062 (23000): Duplicate entry '1' for key 'mysql_test.PRIMARY'`	`insert into mysql_test FORMAT Values SETTINGS dedup_key_mode='throw' (1,1), (1,1) ; DB::Exception: Found duplication in the block when insert with setting dedup_key_mode=DedupKeyMode::THROW SQLSTATE: 22000.`
insert ignore 检查唯一键是否存在，如果存在则忽略当前写入的新数据	`MySQL [db]> insert ignore into mysql_test values (1,1), (1,1); Query OK, 1 row affected, 1 warning (0.07 sec) Records: 2 Duplicates: 1 Warnings: 1 MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 row in set (0.00 sec)	`insert ignore into mysql_test values (1,1), (1,1) ; select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set.
replace into 如果唯一键存在，使用写入数据进行替换；如果唯一键不存在，效果同 insert into；	`MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set. 1 row in set (0.00 sec) `MySQL [db]> replace into mysql_test values (1, 2);` Query OK, 2 rows affected (0.11 sec) `MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 2 │ └───┴───┘ 1 rows in set. 1 row in set (0.00 sec)	`select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set. `insert into mysql_test values (1, 2) ; select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 2 │ └───┴───┘

最近更新时间：2026.02.13 18:01:23

这个页面对您有帮助吗？

有用

无用

ByteHouse云数仓版

UNIQUE KEY 数量 #

去重粒度选择 #

Bucket 去重优化 #

什么是 bucket 去重 #

开启 bucket 去重 #

Bucket 数量选择 #

Bucket 表达式 #

Bucket 表参数推荐配置 #

在线扩容 Bucket 数 #

攒批优先 #

非去重模式 #

并发写入 #

大 ELT 任务 #

离在线导入共存 #

例1：分区级别唯一键 #

例2：自定义版本字段 #

例3：实时删除指定唯一键的数据 #

例4：实时更新指定唯一键的数据 #

例5：Bucket table #

例6：部分列更新 #

例7：MySQL DML 对标 & dedup_key_mode #

UNIQUE KEY 数量

去重粒度选择

Bucket 去重优化

什么是 bucket 去重

开启 bucket 去重

Bucket 数量选择

Bucket 表达式

Bucket 表参数推荐配置

在线扩容 Bucket 数

攒批优先

非去重模式

并发写入

大 ELT 任务

离在线导入共存

例1：分区级别唯一键

例2：自定义版本字段

例3：实时删除指定唯一键的数据

例4：实时更新指定唯一键的数据

例5：Bucket table

例6：部分列更新

例7：MySQL DML 对标 & dedup_key_mode