基于最新时间戳创建统计视图的SQL优化及字段映射需求
优化慢SQL并创建带字段映射的视图方案
咱们先拆解下你的需求,然后一步步给出高效的解决方案:
- 统计每个
group_status对应的最新状态的connector_pk数量 - 解决原查询耗时30分钟的性能问题
- 将逻辑封装为视图,同时加入指定的
status到group_status的映射规则 - 确保所有可能的
group_status值(哪怕计数为0)都出现在结果中
一、原查询慢的核心原因
原查询用了关联子查询来获取每个connector_pk的最新status_timestamp,这种写法在数据量较大时会对connector_status_report表进行N次重复扫描(每个connector_pk都要执行一次子查询),这就是导致查询耗时极长的关键问题。
二、优化后的查询逻辑
我们改用**窗口函数ROW_NUMBER()**来高效获取每个connector_pk的最新状态记录,整个逻辑只需要扫描表一次,性能会大幅提升。同时结合字段映射和全状态展示的需求,具体实现如下:
1. 基础优化查询(含字段映射)
先对原始数据做status到group_status的映射,再用窗口函数标记每个connector_pk的最新记录:
WITH mapped_data AS ( SELECT connector_pk, -- 按需求实现字段映射逻辑 CASE WHEN status = 'charging' THEN 'Charging' WHEN status = 'Not Occupied' THEN 'Available' WHEN status = 'Not Available' THEN 'Not Available' -- 匹配你期望输出的状态,按需调整 ELSE 'Occupied' END AS group_status, status_timestamp, -- 给每个connector_pk的记录按时间倒序排序,最新的记录标记为1 ROW_NUMBER() OVER (PARTITION BY connector_pk ORDER BY status_timestamp DESC) AS rn FROM connector_status_report ), -- 生成所有需要展示的group_status值(确保计数为0的状态也能显示) all_statuses AS ( SELECT 'Charging' AS group_status UNION ALL SELECT 'Available' AS group_status UNION ALL SELECT 'Not Available' AS group_status ) SELECT s.group_status, COUNT(m.connector_pk) AS `Count of status` FROM all_statuses s LEFT JOIN mapped_data m ON s.group_status = m.group_status AND m.rn = 1 GROUP BY s.group_status ORDER BY s.group_status;
2. 封装为视图
把上面的优化逻辑直接封装成视图,执行以下语句即可:
CREATE OR REPLACE VIEW connector_status_summary AS WITH mapped_data AS ( SELECT connector_pk, CASE WHEN status = 'charging' THEN 'Charging' WHEN status = 'Not Occupied' THEN 'Available' WHEN status = 'Not Available' THEN 'Not Available' ELSE 'Occupied' END AS group_status, status_timestamp, ROW_NUMBER() OVER (PARTITION BY connector_pk ORDER BY status_timestamp DESC) AS rn FROM connector_status_report ), all_statuses AS ( SELECT 'Charging' AS group_status UNION ALL SELECT 'Available' AS group_status UNION ALL SELECT 'Not Available' AS group_status ) SELECT s.group_status, COUNT(m.connector_pk) AS `Count of status` FROM all_statuses s LEFT JOIN mapped_data m ON s.group_status = m.group_status AND m.rn = 1 GROUP BY s.group_status;
三、额外性能加速建议
- 添加联合索引:为
connector_pk和status_timestamp创建联合索引,窗口函数的PARTITION BY和ORDER BY会直接用到这个索引,进一步降低查询耗时:CREATE INDEX idx_connector_timestamp ON connector_status_report(connector_pk, status_timestamp DESC); - 归档历史数据:如果表中有大量不需要实时查询的旧数据,可以定期归档到历史表,减少每次查询需要扫描的数据量。
内容的提问来源于stack exchange,提问作者RDC_Green




