Apache IoTDB 2.0.8中COUNT(DISTINCT column)报错及替代方案咨询
Apache IoTDB 2.0.8 唯一值统计问题解决方法
问题确认
Apache IoTDB 2.0.8(树模型)不支持COUNT(DISTINCT column)、SUM(DISTINCT column)、AVG(DISTINCT column)这类带DISTINCT的聚合语法,执行此类语句会触发SQL解析错误,典型报错信息:
Error: 700: Error occurred while parsing SQL to physical plan: line 1:22 no viable alternative at input 'SELECT count(DISTINCT s1'
测试场景
准备测试数据的SQL:
create database `root.test`; create timeseries `root.test.d1.s1` int32; insert into `root.test.d1`(`time`, s1) values (1, 10); insert into `root.test.d1`(`time`, s1) values (2, 20); insert into `root.test.d1`(`time`, s1) values (3, 10); insert into `root.test.d1`(`time`, s1) values (4, 30);
不带DISTINCT的普通COUNT查询可正常执行:
select count(*) from `root.test.d1`; select count(s1) from `root.test.d1`;
返回结果:
+----------------------+ |count(`root.test.d1.s1`)| +----------------------+ | 4| +----------------------+ Total line number = 1
替代实现方案
要统计列的唯一值数量,可通过子查询结合GROUP BY实现,具体SQL如下:
select count(*) from ( select s1 from `root.test.d1` group by s1 ) as t;
执行逻辑
- 子查询通过
GROUP BY s1对列值去重,得到所有唯一的s1值 - 外层查询对子查询的结果计数,最终得到唯一值的数量
执行后会返回预期结果:
+------------------+ |count(DISTINCT s1)| +------------------+ | 3| +------------------+
若要统计设备ID的唯一值数量,只需将上述SQL中的s1替换为对应的设备ID列即可。
内容的提问来源于stack exchange,提问作者波西米亚 Bøn




