如何使用SQL计算聚合后行的中位数?
嘿,我来帮你搞定这个SQL计算中位数的需求!先把你的数据清晰列出来,再给你不同场景下的解决方案~
初始原始表数据
| Name | Date | Hour | Score |
|---|---|---|---|
| Bob | 01 jan 2018 | 9 | 0 |
| Bob | 01 jan 2018 | 10 | 0 |
| Bob | 01 jan 2018 | 11 | 3.03 |
| Bob | 01 jan 2018 | 11 | 3.43 |
| Bob | 01 jan 2018 | 12 | 0 |
| Bob | 01 jan 2018 | 13 | 0 |
程序包聚合后的目标表(每小时一行)
| Name | Date | Hour | Score |
|---|---|---|---|
| Bob | 01 jan 2018 | 9 | 0 |
| Bob | 01 jan 2018 | 10 | 0 |
| Bob | 01 jan 2018 | 11 | 3.23 |
| Bob | 01 jan 2018 | 12 | 0 |
| Bob | 01 jan 2018 | 13 | 0 |
场景1:已经有聚合后的表,计算Score列的中位数
如果你的聚合后数据已经存在一个表(比如叫aggregated_scores),可以用下面的方法计算中位数:
通用SQL方案(适配大多数数据库)
如果你的数据库没有内置中位数函数,用窗口函数就能实现,兼容大多数SQL环境:
WITH ordered_scores AS ( SELECT Score, -- 给每个分数按大小排序 ROW_NUMBER() OVER (ORDER BY Score) AS row_num, -- 统计总行数 COUNT(*) OVER () AS total_rows FROM aggregated_scores ) SELECT AVG(Score) AS median_score FROM ordered_scores -- 选中中间位置的行:奇数行取中间1行,偶数行取中间2行的平均值 WHERE row_num IN ( FLOOR((total_rows + 1)/2), CEIL((total_rows + 1)/2) );
数据库专属快捷方案(如果支持)
很多现代数据库都内置了中位数相关函数,写起来更简洁:
- PostgreSQL:
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Score) AS median_score FROM aggregated_scores; - MySQL 8.0+:
-- 方案1:用PERCENTILE_CONT SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Score) AS median_score FROM aggregated_scores; -- 方案2:MySQL 8.0.22+直接用MEDIAN函数 SELECT MEDIAN(Score) AS median_score FROM aggregated_scores; - SQL Server:
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Score) OVER () AS median_score FROM aggregated_scores GROUP BY (); -- 确保只返回一行结果
场景2:从原始表直接完成聚合+中位数计算
如果不想先单独聚合,想一步到位从原始表算出最终的中位数,可以把聚合步骤和中位数计算结合:
WITH hourly_agg AS ( -- 第一步:按Name、Date、Hour聚合,计算每小时的平均分数(和你程序包的结果一致) SELECT Name, Date, Hour, AVG(Score) AS Score FROM original_scores -- 这里替换成你的原始表名 GROUP BY Name, Date, Hour ), ordered_scores AS ( -- 第二步:给聚合后的分数排序并统计总行数 SELECT Score, ROW_NUMBER() OVER (ORDER BY Score) AS row_num, COUNT(*) OVER () AS total_rows FROM hourly_agg ) -- 第三步:计算中位数 SELECT AVG(Score) AS median_score FROM ordered_scores WHERE row_num IN ( FLOOR((total_rows + 1)/2), CEIL((total_rows + 1)/2) );
内容的提问来源于stack exchange,提问作者Jess8766




