
How do I group by date range? Grouping patient data by the id, date, and treatment fields

Implementing grouping by date range

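It looks like you need to group patients' treatment data by date range while making sure every patient ID is covered, even when a patient has no treatment records in a given range. Below I'll give a concrete approach in SQL, which is the most common way to handle this kind of grouping.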

1. Define the date ranges first

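First decide which kind of date range you want. There are two common kinds:

  • Fixed custom ranges: predefined quarters, half-years, or a specific period (for example, 2023-01-01 to 2023-03-31)
  • Dynamic periodic ranges: dates split automatically by week, month, or calendar quarter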
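
Fixed ranges do not have to be typed by hand. As a minimal sketch, the hypothetical helper `quarter_ranges` below (not part of the original answer) derives the four calendar quarters of a year as (name, start, end) tuples that could be fed into a date_ranges table or CTE:

```python
from datetime import date, timedelta


def quarter_ranges(year):
    """Return (name, start, end) tuples for the four calendar quarters of `year`."""
    ranges = []
    for q in range(4):
        start = date(year, 3 * q + 1, 1)
        # The day before the next quarter starts is the last day of this quarter.
        next_start = date(year + 1, 1, 1) if q == 3 else date(year, 3 * q + 4, 1)
        end = next_start - timedelta(days=1)
        ranges.append((f"{year}Q{q + 1}", start.isoformat(), end.isoformat()))
    return ranges


for name, start, end in quarter_ranges(2023):
    print(name, start, end)
```

The ISO strings sort the same way the dates do, which is what makes them safe to compare with BETWEEN against a date column.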

2. Grouping by fixed custom date ranges

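Suppose we want to split the dates into three fixed quarterly ranges while making sure every patient ID appears in the result. You can do it like this: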

-- Step 1: define all the date ranges to split into
WITH date_ranges AS (
    SELECT '2023Q1' AS range_name, '2023-01-01' AS start_date, '2023-03-31' AS end_date
    UNION ALL
    SELECT '2023Q2' AS range_name, '2023-04-01' AS start_date, '2023-06-30' AS end_date
    UNION ALL
    SELECT '2023Q3' AS range_name, '2023-07-01' AS start_date, '2023-09-30' AS end_date
),
-- Step 2: extract every unique patient ID so that no patient is left out
all_patients AS (
    SELECT DISTINCT id AS patient_id FROM your_table_name
)
-- Step 3: cross join to build every (patient, date range) pair, then left join the original data to count treatments
SELECT 
    ap.patient_id,
    dr.range_name,
    -- Count the occurrences of each treatment type
    COUNT(CASE WHEN t.treatment = 'Cold' THEN 1 END) AS cold_count,
    COUNT(CASE WHEN t.treatment = 'fever' THEN 1 END) AS fever_count,
    COUNT(CASE WHEN t.treatment = 'cholera' THEN 1 END) AS cholera_count,
    -- Alternatively, list the treatment types the patient received in this range (GROUP_CONCAT ... SEPARATOR is MySQL syntax)
    GROUP_CONCAT(DISTINCT t.treatment SEPARATOR ', ') AS treatments_received
FROM all_patients ap
CROSS JOIN date_ranges dr
LEFT JOIN your_table_name t 
    ON ap.patient_id = t.id 
    AND t.date BETWEEN dr.start_date AND dr.end_date
GROUP BY ap.patient_id, dr.range_name
ORDER BY ap.patient_id, dr.range_name;

The output of this query covers every patient. If a patient has no treatment records in a given date range, the counts show 0 and treatments_received shows NULL.
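
To see the zero rows appear, here is a minimal runnable sketch of the same cross join plus left join pattern. It makes several assumptions that are not in the original answer: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a hypothetical table named `treatments`, invented sample rows, and a single total count instead of one column per treatment type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: patient 1 is treated only in Q1, patient 2 only in Q2.
conn.executescript("""
CREATE TABLE treatments (id INTEGER, date TEXT, treatment TEXT);
INSERT INTO treatments VALUES
    (1, '2023-01-10', 'Cold'),
    (1, '2023-02-03', 'fever'),
    (2, '2023-05-20', 'cholera');
""")

query = """
WITH date_ranges AS (
    SELECT '2023Q1' AS range_name, '2023-01-01' AS start_date, '2023-03-31' AS end_date
    UNION ALL
    SELECT '2023Q2', '2023-04-01', '2023-06-30'
),
all_patients AS (
    SELECT DISTINCT id AS patient_id FROM treatments
)
SELECT ap.patient_id, dr.range_name, COUNT(t.id) AS treatment_count
FROM all_patients ap
CROSS JOIN date_ranges dr
LEFT JOIN treatments t
    ON t.id = ap.patient_id
    AND t.date BETWEEN dr.start_date AND dr.end_date
GROUP BY ap.patient_id, dr.range_name
ORDER BY ap.patient_id, dr.range_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

COUNT(t.id) only counts non-NULL values, so a (patient, range) pair with no matching treatment row still shows up, with a count of 0. The date filter has to stay in the ON clause: moving it to WHERE would discard the unmatched pairs and turn the left join back into an inner join.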

3. Grouping by dynamic periodic date ranges

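If you want to group automatically by a dynamic period (monthly or weekly, for example), the syntax differs slightly between databases. Here are two common examples: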

MySQL: group by month (including all patients)

WITH all_months AS (
    -- Collect every month that appears in the data
    SELECT DISTINCT DATE_FORMAT(date, '%Y-%m') AS month_range FROM your_table_name
),
all_patients AS (
    SELECT DISTINCT id AS patient_id FROM your_table_name
)
SELECT 
    ap.patient_id,
    am.month_range,
    COUNT(CASE WHEN t.treatment = 'Cold' THEN 1 END) AS cold_count,
    COUNT(CASE WHEN t.treatment = 'fever' THEN 1 END) AS fever_count,
    COUNT(CASE WHEN t.treatment = 'cholera' THEN 1 END) AS cholera_count
FROM all_patients ap
CROSS JOIN all_months am
LEFT JOIN your_table_name t 
    ON ap.patient_id = t.id 
    AND DATE_FORMAT(t.date, '%Y-%m') = am.month_range
GROUP BY ap.patient_id, am.month_range
ORDER BY ap.patient_id, am.month_range;

PostgreSQL: group by week (including all patients)

WITH all_weeks AS (
    SELECT DISTINCT DATE_TRUNC('week', date)::DATE AS week_start FROM your_table_name
),
all_patients AS (
    SELECT DISTINCT id AS patient_id FROM your_table_name
)
SELECT 
    ap.patient_id,
    CONCAT('Week starting ', aw.week_start) AS week_range,
    COUNT(CASE WHEN t.treatment = 'Cold' THEN 1 END) AS cold_count,
    COUNT(CASE WHEN t.treatment = 'fever' THEN 1 END) AS fever_count,
    COUNT(CASE WHEN t.treatment = 'cholera' THEN 1 END) AS cholera_count
FROM all_patients ap
CROSS JOIN all_weeks aw
LEFT JOIN your_table_name t 
    ON ap.patient_id = t.id 
    AND DATE_TRUNC('week', t.date)::DATE = aw.week_start
GROUP BY ap.patient_id, aw.week_start
ORDER BY ap.patient_id, aw.week_start;
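
Both queries in this section build their list of periods with SELECT DISTINCT over the data, so a month or week in which nobody was treated never appears in the result. If those empty periods matter, one option is to generate a gap-free calendar instead. The sketch below is an assumption-laden illustration rather than part of the original answer: it uses Python's sqlite3 module with SQLite date functions, a hypothetical `treatments` table, invented rows, and a recursive CTE that walks month by month from the earliest to the latest date in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data with no treatments in February or March.
conn.executescript("""
CREATE TABLE treatments (id INTEGER, date TEXT, treatment TEXT);
INSERT INTO treatments VALUES
    (1, '2023-01-15', 'Cold'),
    (1, '2023-04-02', 'fever');
""")

query = """
WITH RECURSIVE months(month_start) AS (
    -- Anchor: first day of the earliest month in the data
    SELECT date(MIN(date), 'start of month') FROM treatments
    UNION ALL
    -- Step: advance one month until the latest month in the data is reached
    SELECT date(month_start, '+1 month') FROM months
    WHERE month_start < (SELECT date(MAX(date), 'start of month') FROM treatments)
)
SELECT strftime('%Y-%m', m.month_start) AS month_range,
       COUNT(t.id) AS treatment_count
FROM months m
LEFT JOIN treatments t
    ON strftime('%Y-%m', t.date) = strftime('%Y-%m', m.month_start)
GROUP BY m.month_start
ORDER BY m.month_start
"""
monthly = conn.execute(query).fetchall()
for row in monthly:
    print(row)
```

The generated `months` CTE can then take the place of all_months in the cross join with all_patients. MySQL 8 also supports WITH RECURSIVE, and PostgreSQL offers generate_series for the same purpose, although the date arithmetic is written differently in each.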

4. Example output format

Taking the fixed quarterly range query as an example, the output looks something like this:

patient_id  range_name  cold_count  fever_count  cholera_count  treatments_received
1           2023Q1      2           1            0              Cold, fever
1           2023Q2      0           0            1              cholera
1           2023Q3      0           0            0              NULL
2           2023Q1      1           0            0              Cold
...         ...         ...         ...          ...            ...

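If what you need is to list each patient's treatment records by date range (rather than count them), adjust the query as follows. A range with no records still appears once, with NULL in the patient columns: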

WITH date_ranges AS (
    SELECT '2023Q1' AS range_name, '2023-01-01' AS start_date, '2023-03-31' AS end_date
    UNION ALL
    SELECT '2023Q2' AS range_name, '2023-04-01' AS start_date, '2023-06-30' AS end_date
)
SELECT 
    dr.range_name,
    t.id AS patient_id,
    t.date,
    t.treatment
FROM date_ranges dr
LEFT JOIN your_table_name t 
    ON t.date BETWEEN dr.start_date AND dr.end_date
ORDER BY dr.range_name, t.id;
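
As a quick check of how the listing query treats an empty range, here is a minimal sketch under the same kind of assumptions as before: Python's sqlite3 module standing in for the real database, a hypothetical `treatments` table, and invented rows that all fall in the first quarter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: both records fall in 2023Q1, none in 2023Q2.
conn.executescript("""
CREATE TABLE treatments (id INTEGER, date TEXT, treatment TEXT);
INSERT INTO treatments VALUES
    (1, '2023-01-10', 'Cold'),
    (2, '2023-02-14', 'fever');
""")

listing = conn.execute("""
WITH date_ranges AS (
    SELECT '2023Q1' AS range_name, '2023-01-01' AS start_date, '2023-03-31' AS end_date
    UNION ALL
    SELECT '2023Q2', '2023-04-01', '2023-06-30'
)
SELECT dr.range_name, t.id AS patient_id, t.date, t.treatment
FROM date_ranges dr
LEFT JOIN treatments t
    ON t.date BETWEEN dr.start_date AND dr.end_date
ORDER BY dr.range_name, t.id
""").fetchall()
for row in listing:
    print(row)
```

Each record is listed under the range it falls in, and the empty 2023Q2 range contributes a single row whose patient columns are None (NULL in SQL). Unlike the counting queries, this form does not produce one row per patient per range.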

The question comes from Stack Exchange; asked by Cur123
