ByteHouse 唯一键表最佳实践--ByteHouse云数仓版-火山引擎

文档中心

导航

ByteHouse 唯一键表最佳实践

最近更新时间：2025.09.08 11:08:42首次发布时间：2022.10.06 11:03:34

ByteHouse 唯一键表主要用于实现 upsert 功能。该能力是 ByteHouse 团队自研的独有特性，既能保持高效的查询性能、又支持唯一键更新。ByteHouse 唯一键表主要解决了开源 ClickHouse 不能支持高效更新操作的痛点，帮助业务更简单地开发实时分析应用。您可通过指定唯一键 UNIQUE KEY 来实现 Upsert 更新写语义，查询自动返回每个唯一键的最新值。

功能概述

唯一键表即指定唯一键索引（UNIQUE KEY）的 CnchMergeTree 表，具有以下特点：

您通过 UNIQUE KEY 配置唯一键，支持 upsert 更新写语义，查询时自动返回每个唯一键的最新值。
在保证实时更新能力的情况下，依然保持较高的查询性能。
唯一键索引（UNIQUE KEY）支持多字段和表达式。
唯一键表支持多种去重粒度（如分区级去重、bucket 去重等）。
支持自定义版本字段，写入低版本数据时自动忽略。
支持根据 UNIQUE KEY 实时删除数据。
支持根据 UNIQUE KEY 进行部分列更新操作。

更多关于唯一键表的能力介绍、功能逻辑、使用限制等内容，请参见唯一键表。

建表指南

UNIQUE KEY 数量

建议 UNIQUE KEY 字段设置不超过 5 个；如果多个字段组合构成唯一键，可以使用它们的 toString(sipHash128()) 哈希作为唯一键。

去重粒度选择

如需去重，建议您优先选择使用分区级去重，再根据数据量级和写入性能需求 RPS（record per second）判断是否需要开启 bucket 去重优化。

Bucket 去重优化

出现如下任意一种情况时，建议开启 bucket 去重优化：

单去重粒度数据量超过 2 千万（20M）；
单去重粒度写入 RPS 超过 2 万（20K）/s。

可通过在建表时设置 CLUSTER BY 子句所需列为 UNIQUE KEY 所需列子集，进而开启 bucket 去重优化。

Bucket 数量选择

写入侧

当去重粒度数据量（拆分 bucket 前）为 m，需求峰值 RPS 为 n 时。以 max{⌈m / 20M⌉, ⌈n / 20K⌉} 取 bucket 数量（number）。如果您需要对数据量/RPS 预留一定余量，以 2 * max{⌈m /20M⌉, ⌈n / 20K⌉} 取 bucket 数量。

查询侧

对于查询来说，相同 bucket 数量的 parts 在查询时会发送到同一个 worker 节点上计算。因此选择一个合适的 bucket 数量对于查询有较大影响，一般遵循如下原则：

确保 bucket 数量为 worker 数量的倍数，可以保证查询负载均衡。
不建议使用小于 worker 数量的 bucket 数量，会出现负载倾斜。

最佳实践

满足写入侧需求后，使用 2 * max{⌈m / 20M⌉, ⌈n / 20K⌉} 为 write_bucket_num，bucket 数量可以根据 write_bucket_num 和 worker 数量设置：

当 write_bucket_num ≤ worker 数量时，可直接使用 worker 数量。若考虑长期横向扩容，可使用 worker 数量 * 2作为最终 bucket 数量。
当 write_bucket_num > worker 数量，向上取 worker 整数倍作为最终 bucket 数量。

Bucket 表达式

推荐使用CLUSTER BY EXPRESSION cityHash64V2(cluster_column) % N INTO N BUCKETS 子句，具体使用方法可参考创建动态分区表。
CLUSTER BY 的列建议使用高基数列，否则容易出现 bucket 数据倾斜，从而影响写入和查询性能。

Bucket 表参数推荐配置

建议使用以下 SETTINGS 参数：

max_dedup_worker_number = [cnch_vw_write 的 pod 数], pick_dedup_worker_by_mod = 1, dedup_impl_version = 'dedup_in_txn_commit'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]max_dedup_worker_number = [cnch_vw_write 的 pod 数], pick_dedup_worker_by_mod = 1, dedup_impl_version = 'dedup_in_txn_commit';

参数名	参数类型	可选值	无配置默认值	参数说明
max_dedup_worker_number	UInt64	整数	1	Dedup worker 的数量，建议和 cnch_vw_write 设置的计算组 pod 数对齐。
pick_dedup_worker_by_mod	Bool	true/false	false	Dedup worker 的分布模式，配置为 true 时按计算组的节点总数取模可以保证分布均匀；默认为 false 时按照 hash 值分布，不一定均匀。
dedup_impl_version	DedupImplVersion	dedup_in_write_suffix/ dedup_in_txn_commit	dedup_in_write_suffix	去重实现，dedup_in_write_suffix 在写入阶段结束时执行去重逻辑；dedup_in_txn_commit 在事务预提交阶段执行去重逻辑，为优化版本。

数据导入

攒批优先

ByteHouse 整体设计对于大批量数据友好，因此尽量优先选择在上游进行攒批。

非去重模式

注意

一般非去重模式在历史数据迁移阶段使用，正式生产环境慎用，否则会引入重复数据问题。

当您需要导入的数据满足如下两个条件时，可在 query 级别开启非去重导入：

导入的数据本身不存在重复唯一键；
导入的数据和历史数据不存在重复唯一键。

使用以下参数为 query 级别开启非去重导入：

dedup_key_mode='append', enable_unique_partial_update = 0

完整使用示例如下：

INSERT INTO target_table SELECT * FROM source_table SETTINGS dedup_key_mode='append', enable_unique_partial_update = 0;

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
dedup_key_mode	DedupKeyMode	replace/throw/append/ignore	replace	设置去重模式，不同参数对应不同的去重模式： replace 为 upsert 模式； throw 和 ignore 模式仅能在非 staging area 场景下使用，当出现相同 key 时抛异常或者忽略； append 模式没有去重阶段，性能最优。
enable_unique_partial_update	Bool	true/false	true	是否使用部分列更新模式。需要配合表级同名参数来使用，当且仅当该参数和表级都开启时部分列更新模式生效。

参数名

参数类型

可选值

无配置默认值

参数说明

dedup_key_mode

DedupKeyMode

replace/throw/append/ignore

replace

设置去重模式，不同参数对应不同的去重模式：

replace 为 upsert 模式；
throw 和 ignore 模式仅能在非 staging area 场景下使用，当出现相同 key 时抛异常或者忽略；
append 模式没有去重阶段，性能最优。

enable_unique_partial_update

Bool

true/false

true

是否使用部分列更新模式。需要配合表级同名参数来使用，当且仅当该参数和表级都开启时部分列更新模式生效。

并发写入

数据导入时如果存在并发写入，您可优先考虑上层对去重粒度进行写入对齐。在难以做到的情况下，可以在有去重瓶颈时开启并发写优化。ByteHouse 可通过设置表级参数，实现基于后台线程攒批去重，大批量去重性能更佳。
表级参数设置如下：

cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible';

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
cloud_enable_staging_area	Bool	true/false	false	是否开启暂存区，可支持异步写入或者并发写入优化场景。
cloud_staging_area_wait_mode	StagingAreaWaitMode	no_wait/wait_txn_visible	no_wait	开启暂存区后，数据写入模式。no_wait 表示异步写入模式，wait_txn_visible 表示同步写入模式。

大 ELT 任务

当一次 INSERT SELECT/DELETE/UPDATE 操作影响的数据超过 1 亿行时，可以认为本次任务是大 ELT 任务。ByteHouse 支持通过设置表级参数和 query 级别参数，充分利用多 worker 资源，中心化调度去重任务。参数设置示例如下：

# 表级参数设置
dedup_impl_version = 'dedup_in_txn_commit'

# query 级别参数设置
optimize_unique_table_write = 1, max_insert_threads = 8

完整使用示例如下：

表级参数设置

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]dedup_impl_version = 'dedup_in_txn_commit';

Query 级别参数设置

INSERT INTO target_table SELECT * FROM source_table SETTINGS optimize_unique_table_write = 1, max_insert_threads = 8;

注意

若想提升性能，可同时调大 min_insert_block_size_bytes 和 min_insert_block_size_rows，可设置为当前值×2。
若遇到 OOM（内存溢出，Out Of Memory）的资源问题，可同时调小 min_insert_block_size_bytes 和 min_insert_block_size_rows，可设置为当前值/2。

参数说明

参数名	参数类型	可选值	无配置默认值	参数说明
dedup_impl_version	DedupImplVersion	dedup_in_write_suffix/ dedup_in_txn_commit	dedup_in_write_suffix	去重实现，dedup_in_write_suffix 在写入阶段结束时执行去重逻辑；dedup_in_txn_commit 在事务预提交阶段执行去重逻辑，为优化版本。
optimize_unique_table_write	Bool	true/false	false	唯一键表 insert select 导入优化，仅当未开启暂存区且 dedup_impl_version='dedup_in_txn_commit' 时建议开启。
max_insert_threads	UInt64	正整数	0	worker 执行 insert select 请求的写入线程池大小。
min_insert_block_size_bytes	UInt64	正整数	1073710080	`INSERT SELECT` 攒批 block 块的最小大小。
min_insert_block_size_rows	UInt64	正整数	1048545	`INSERT SELECT` 攒批 block 块的最小行数。

离在线导入共存

注意

该功能当前正在灰度发布中，若使用中遇到问题，可提交工单或联系 ByteHouse 团队。

在线导入任务与离线导入任务共存时，数据量大的离线任务会严重影响实时导入的写入延迟，导致后者有非常大的性能波动，该场景可开启优化解决。可通过通过表级参数设置，依赖暂存区，高优为实时任务分配去重带宽，减少离线大 ELT 任务对于实时写入延迟的影响。
表级参数设置示例如下：

cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'，real_time_task_optimize_algo='strict_optimize'

完整建表命令示例如下：

CREATE TABLE orders
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
CLUSTER BY EXPRESSION cityHash64V2(product_id) % 6 INTO 6 BUCKETS
UNIQUE KEY product_id
SETTINGS [..., ]cloud_enable_staging_area=1, cloud_staging_area_wait_mode='wait_txn_visible'，real_time_task_optimize_algo='strict_optimize';

参数说明

参数名

参数类型

可选值

无配置默认值

参数说明

cloud_enable_staging_area

Bool

true/false

false

是否开启暂存区，可支持异步写入或者并发写入优化场景。

cloud_staging_area_wait_mode

StagingAreaWaitMode

no_wait/wait_txn_visible

no_wait

开启暂存区后，数据写入模式。no_wait 表示异步写入模式，wait_txn_visible 表示同步写入模式。

real_time_task_optimize_algo

RealTimeTaskOptimizeAlgo

disabled/strict_optimize/ loose_optimize

disabled

设置离线/在线导入优化算法，不同参数值对应的效果如下：

disabled 为关闭优化；
strict_optimize 算法会严格控制离线任务的量，优先执行实时导入任务；
loose_optimize 算法会在高优执行导入任务的同时尽可能提升离线任务性能。

使用示例

例1：分区级别唯一键

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable
CREATE TABLE t1
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t1 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 写入相同 key 的数据可以实现更新(upsert语义)
INSERT INTO t1 VALUES
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 4, 400),
('2020-10-29 23:50:00', 10003, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100),
('2020-10-30 00:00:05', 10001, 'Beijing', '男装', 1, 100),
('2020-10-30 00:00:05', 10002, 'Beijing', '男装', 2, 200);

-- 查询自动返回每个key最新的数据
select * from t1 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │      5 │     500 │
│ 2020-10-29 23:50:00 │      10002 │ Beijing │ 男装     │      4 │     400 │
│ 2020-10-29 23:50:00 │      10003 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-30 00:00:05 │      10001 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-30 00:00:05 │      10002 │ Beijing │ 男装     │      2 │     200 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- UNIQUE KEY 可以包含多个字段和表达式
-- sipHash64 是一种快速且低冲突率的哈希函数，使用 sipHash64 作为 unique key 需要考虑到可能的 hash 冲突
-- sipHash64：https://clickhouse.com/docs/en/sql-reference/functions/hash-functions#siphash64
CREATE TABLE t1m
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY (product_id, sipHash64(city));

INSERT INTO t1m VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100),
('2020-10-29 23:50:00', 10002, 'Shanghai', '男装', 4, 400),
('2020-10-29 23:50:00', 10003, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100);

select * from t1m;
┌──────────event_time─┬─product_id─┬─city─────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing  │ 男装     │      5 │     500 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing  │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10003 │ Beijing  │ 男装     │      2 │     200 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing  │ 男装     │      1 │     100 │
│ 2020-10-29 23:50:00 │      10002 │ Shanghai │ 男装     │      4 │     400 │
└─────────────────────┴────────────┴──────────┴──────────┴────────┴─────────┘

例2：自定义版本字段

说明

使用版本字段时有以下限制：

如需要使用整数作为版本字段，建议使用兼容UInt64的无符号整数
支持Date、Datetime等时间类型作为版本字段
不支持Float、Decimal等浮点数类型作为版本字段

默认情况下，相同 unique key 后写入的数据会覆盖已有的数据。这可能会带来以下问题

回溯上游数据时，老数据可能覆盖新数据，导致查询到的数据结果出现回退
Lambda 架构下，如果离线和实时任务同时写一个分区，最终保留哪条数据取决于任务的执行顺序

为了解决上面的问题，ByteHouse 唯一键表支持将表中的某个字段指定为版本字段。引擎保证写入相同 key 的数据时，只有数据版本 >= 已有版本时，才会进行覆盖。版本字段支持所有UInt类型和Data/DateTime，且不能为 Nullable。

-- CnchMergeTree 括号内参数为可选的版本字段
CREATE TABLE t3
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree(event_time)
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t3 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 8, 800),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 5, 500);

-- 回溯前两条数据，由于版本 < 已有版本，写入时自动跳过
INSERT INTO t3 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200);

-- 10001 和 10002 的版本没有回退
select * from t3 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:50:00 │      10001 │ Beijing │ 男装     │      8 │     800 │
│ 2020-10-29 23:50:00 │      10002 │ Beijing │ 男装     │      5 │     500 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 继续回溯后两条数据，并写入两条新版本数据
INSERT INTO t3 VALUES
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 8, 800),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 5, 500),
('2020-10-29 23:55:00', 10001, 'Beijing', '男装', 10, 1000),
('2020-10-29 23:55:00', 10002, 'Beijing', '男装', 7, 700);

-- 查询自动返回最新版本的数据
select * from t3 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:55:00 │      10001 │ Beijing │ 男装     │     10 │    1000 │
│ 2020-10-29 23:55:00 │      10002 │ Beijing │ 男装     │      7 │     700 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例3：实时删除指定唯一键的数据

在某些应用场景下，用户希望在INSERT时加上一个字段来标识是否删除来扩展INSERT语义。
在 ByteHouse 唯一键表中，为每张表都添加了一个保留字段_delete_flag_，类型为UInt8， 0表示数据写入，非0表示数据删除。该字段不可在CREATE TABLE时指定，也不可查询该字段，仅可以在INSERT时指定，包括INSERT和INSERT SELECT。此外，ByteHouse 唯一键表基于保留字段_delete_flag_，支持了 DELETE FROM 子句。
用法示例如下：

INSERT

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable
CREATE TABLE t5
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 指定删除字段进行数据删除，删除字段设置非0时表示删除，设置为0时表示正常的upsert操作
INSERT INTO t5 (event_time, product_id, city, category, amount, revenue, _delete_flag_) VALUES
('2020-10-29 23:50:00', 10001, 'Beijing', '男装', 4, 400, 5),
('2020-10-29 23:50:00', 10002, 'Beijing', '男装', 2, 200, 1),
('2020-10-29 23:50:00', 10004, 'Beijing', '男装', 1, 100, 0);

-- 查询结果中包含了新加入的一行数据，并删除了两行旧数据
select * from t5 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
│ 2020-10-29 23:50:00 │      10004 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

INSERT WITH VERSION

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable

-- 指定版本号
CREATE TABLE t5m
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64,
  `version` UInt64
)
ENGINE = CnchMergeTree(version)
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5m VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500, 10),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200, 10),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 10);

-- 指定删除字段并指定版本号，版本号小于查询结果中相应行的版本号，删除操作不会起作用
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, version, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 4, 400, 5, 1),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200, 5, 1),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 5, 1);

-- 查询结果不变，没有任何数据被删除
select * from t5m order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┬─version─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │      5 │     500 │      10 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │      10 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │      10 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┴─────────┘

-- 指定删除字段进行数据删除，不指定版本号或者版本号设置为0，删除操作会跳过版本检查，直接执行
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 4, 400, 1);
INSERT INTO t5m (event_time, product_id, city, category, amount, revenue, version, _delete_flag_) VALUES
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100, 0, 1);

-- 查询结果删除了两行旧数据
select * from t5m order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┬─version─┐
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │      10 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┴─────────┘

INSERT SELECT

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable

CREATE TABLE t5x
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5x VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过INSERT SELECT 来删除revenue >= 200的数据
INSERT INTO t5x (event_time, product_id, city, category, amount, revenue, _delete_flag_) SELECT *, 1 as _delete_flag_ from t5x where revenue >= 200;

-- 查询结果中已删除revenue >= 200的数据
select * from t5x order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

DELETE FROM

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable

CREATE TABLE t5y
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t5y VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过 DELETE FROM 来删除revenue == 500的数据
DELETE FROM t5y WHERE revenue = 500;
-- 通过 DELETE FROM 来删除revenue >= 200的数据
DELETE FROM t5y WHERE revenue >= 200;

-- 查询结果中已删除revenue >= 200的数据
select * from t5y order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例4：实时更新指定唯一键的数据

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable

CREATE TABLE t6
(
  `event_time` DateTime,
  `product_id` UInt64,
  `city` String,
  `category` String,
  `amount` UInt32,
  `revenue` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY toDate(event_time)
ORDER BY (city, category)
UNIQUE KEY product_id;

INSERT INTO t6 VALUES
('2020-10-29 23:40:00', 10001, 'Beijing', '男装', 5, 500),
('2020-10-29 23:40:00', 10002, 'Beijing', '男装', 2, 200),
('2020-10-29 23:40:00', 10003, 'Beijing', '男装', 1, 100);

-- 通过 UPDATE 语句来进行更新
UPDATE t6 SET amount = 10, revenue = 1000 WHERE event_time = '2020-10-29 23:40:00' and product_id=10001;

-- 查询结果中 10001 数据行进行了 UPDATE 变更
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 男装     │     10 │    1000 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 通过 UPDATE 语句来进行更新，使用 limit 限定范围
UPDATE t6 SET category = concat('新',category) WHERE event_time = '2020-10-29 23:40:00' order by product_id limit 1;

-- 查询结果中仅 10001 数据行进行了 UPDATE 变更
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 新男装   │     10 │    1000 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │      2 │     200 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │      1 │     100 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

-- 通过 UPDATE 语句交换列
UPDATE t6 SET amount=revenue, revenue=amount  WHERE event_time = '2020-10-29 23:40:00';

-- 查询结果 amount 、revenue 列进行了交换
select * from t6 order by toDate(event_time), product_id;
┌──────────event_time─┬─product_id─┬─city────┬─category─┬─amount─┬─revenue─┐
│ 2020-10-29 23:40:00 │      10001 │ Beijing │ 新男装   │   1000 │      10 │
│ 2020-10-29 23:40:00 │      10002 │ Beijing │ 男装     │    200 │       2 │
│ 2020-10-29 23:40:00 │      10003 │ Beijing │ 男装     │    100 │       1 │
└─────────────────────┴────────────┴─────────┴──────────┴────────┴─────────┘

例5：Bucket table

Bucket table 是 ByteHouse 在建表的时候的一种性能优化选项，在ByteHouse 中使用 Bucket table 时，系统会依据用户建表语句中提供的一个或者多个列、表达式整理表数据，将相同值的数据聚簇在同一个 bucket number 下，从而在查询计算中获得更好的性能。

非 bucket 级别唯一
当不使用 enable_bucket_level_unique_keys 指定 bucket 级别唯一时。ByteHouse 唯一键表使用表引擎指定的去重范围，默认为分区级别唯一，此时同分区下不同 bucket 不会存在重复数据。

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable
CREATE TABLE t7
(
    `d` Date,
    `id` Int32,
    `s` String
)
ENGINE = CnchMergeTree
PARTITION BY d
CLUSTER BY s INTO 10 BUCKETS
ORDER BY s
UNIQUE KEY id;

INSERT INTO t7 VALUES 
('2023-06-26', 1, '1a'), ('2023-06-26', 2, '2a'), ('2023-06-26', 3, '3a'), ('2023-06-26', 3, '3b'), ('2023-06-26', 3, '3c');

select * from t7 order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3c │
└────────────┴────┴────┘

Bucket 级别唯一
当 cluster by 所需的列在 unique key 字段里都包含时，相同 unique key 的数据一定会落到固定的 bucket 中，可以优化为 bucket 去重；此时 partition_level_unique_keys 用于指定 bucket 的去重范围（分区级/表级）。
Bucket 唯一可以降低去重的数据量，从而提升写入 rps 和 P99 延时；enable_bucket_level_unique_keys = 1可以跳过 part 的 bucket 校验，进一步提升写入效率；

-- 引擎默认保证 unique key 在分区内的唯一性
-- 注：UNIQUE KEY 不支持 Nullable
CREATE TABLE t7m
(
    `d` Date,
    `id` Int32,
    `s` String
)
ENGINE = CnchMergeTree
PARTITION BY d
CLUSTER BY id INTO 10 BUCKETS
ORDER BY s
UNIQUE KEY id
SETTINGS enable_bucket_level_unique_keys = 1;

INSERT INTO t7m VALUES 
('2023-06-26', 1, '1a'), ('2023-06-26', 2, '2a'), ('2023-06-26', 3, '3a'), ('2023-06-26', 3, '3b'), ('2023-06-26', 3, '3c');

select * from t7m order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3c │
└────────────┴────┴────┘

INSERT INTO t7m VALUES ('2023-06-26', 3, '3d');

select * from t7m order by id;
┌──────────d─┬─id─┬─s──┐
│ 2023-06-26 │  1 │ 1a │
│ 2023-06-26 │  2 │ 2a │
│ 2023-06-26 │  3 │ 3d │
└────────────┴────┴────┘

例6：部分列更新

ByteHouse云数仓版支持部分列更新模式。
在行更新模式时，缺省列采用默认值填充。而在列更新模式下，缺省列如果有原值会保留，否则填充默认值。详情请参见部分列更新。

例7：MySQL DML 对标 & dedup_key_mode

ByteHouse 唯一键表对标并支持了 MySQL 的不同写入模式。

MySQL DML 语义	MySQL	ByteHouse 唯一键表
DDL 示例	`CREATE TABLE mysql_test ( a int NOT NULL, b int DEFAULT NULL, PRIMARY KEY (a) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci`	CREATE TABLE mysql_test( `a` Int32, `b` Nullable(Int32) ) ENGINE = CnchMergeTree ORDER BY a UNIQUE KEY a
insert into 检查唯一键是否存在，如果存在则写入报错	`MySQL [db]> insert into mysql_test values (1,1), (1,1); ERROR 1062 (23000): Duplicate entry '1' for key 'mysql_test.PRIMARY'`	`insert into mysql_test FORMAT Values SETTINGS dedup_key_mode='throw' (1,1), (1,1) ; DB::Exception: Found duplication in the block when insert with setting dedup_key_mode=DedupKeyMode::THROW SQLSTATE: 22000.`
insert ignore 检查唯一键是否存在，如果存在则忽略当前写入的新数据	`MySQL [db]> insert ignore into mysql_test values (1,1), (1,1); Query OK, 1 row affected, 1 warning (0.07 sec) Records: 2 Duplicates: 1 Warnings: 1 MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 row in set (0.00 sec)	`insert ignore into mysql_test values (1,1), (1,1) ; select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set.
replace into 如果唯一键存在，使用写入数据进行替换；如果唯一键不存在，效果同 insert into；	`MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set. 1 row in set (0.00 sec) `MySQL [db]> replace into mysql_test values (1, 2);` Query OK, 2 rows affected (0.11 sec) `MySQL [db]> select * from mysql_test;` ┌─a─┬─b─┐ │ 1 │ 2 │ └───┴───┘ 1 rows in set. 1 row in set (0.00 sec)	`select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 1 │ └───┴───┘ 1 rows in set. `insert into mysql_test values (1, 2) ; select * from mysql_test` ┌─a─┬─b─┐ │ 1 │ 2 │ └───┴───┘