ByteHouse 作为一款 MPP 架构的分布式数据库,为了实现数据的均衡存储和查询的均衡分发,每张表的定义必须在每个节点上都存在且一致。
本文将详细介绍在 ByteHouse 中,当表状态出现“缺损”和“冲突”时的解决步骤。
表状态反映了表在集群中的分布状态,具体描述如下:
您可以在 ByteHouse 控制台”数据管理 “页面,查看到库表的状态。
表缺损状态产生的原因,常见于建表时未带上“on cluster”语法,致使未能将表同步至部分节点。
找到表缺失的节点,可参考下面的语句。
SELECT table_name, create_table_query, host_address from system.clusters as clusters left join ( SELECT name as table_name, create_table_query, host() AS node FROM cluster(<cluster_name>,system,tables) where database = '<table>' and (name = '<table_name>' or name = '<table_name_local>') ) as systable on clusters.host_address = systable.node;
在缺表的某个节点上再执行带上“on cluster”语法的建表语句,即可补齐每个节点上的表定义。
例如,defalut库中,存在一张test表为缺损状态。
查询表缺失的节点。参考上面的SQL语句执行后,可以看到控制台返回结果test表在某节点(192.18.0.232)缺失。
SELECT table_name, create_table_query, host_address from system.clusters as clusters left join ( SELECT name as table_name, create_table_query, host() AS node FROM cluster(your_cluster_name,system,tables) where database = 'default' and (name = 'test') ) as systable on clusters.host_address = systable.node;
您可复制查询结果中“create_table_query”中的建表语句至SQL语句窗口中,并添加“on cluster your_cluster_name”,单击自动分配节点,在下拉列表中单击选择一个节点,选择指定节点 192.18.0.232,单击运行,即可在 192.18.0.232 节点上执行建表语句,补全缺损的表。
您也可以通过执行 show create table 的结果查看建表语句,指定缺损表的节点,在该节点上再次执行建表语句,补全缺损表。
建表语句示例如下,实际执行时请以您具体的语句为准:
CREATE TABLE default.test ON CLUSTER your_cluster_name ( `id` String) ENGINE = MergeTree;
SELECT other_nodes.name as table_name, if(other_node_sql=first_node_sql,'same','diff') AS status, other_nodes.create_table_query AS other_node_sql, first_node.create_table_query AS first_node_sql, other_nodes.node FROM ( SELECT name, create_table_query, host() AS node FROM cluster('bytehouse_cluster_test', 'system', 'tables') WHERE database = '<db_name>' AND (name = '<table_name_local>' or name = '<table_name>') ) AS other_nodes LEFT JOIN ( SELECT name, create_table_query FROM system.tables WHERE database = '<db_name>' AND (name = '<table_name_local>' or name = '<table_name>') ) AS first_node ON other_nodes.name = first_node.name ORDER BY name;
如果有不一致,则:
SELECT name AS column_name, type AS column_type, host() AS node_name FROM cluster('bytehouse_cluster_test','system','columns') WHERE database = '<db_name>' AND (table = '<table_name>' or table = '<table_name_local>') ORDER BY column_name, node_name;
如果有不一致,则:
通过以上步骤,我们可以解决 ByteHouse 中表状态异常的问题,确保表在集群中的分布正常,从而提高数据库的性能和稳定性。