SQL Server递归CTE查询超时问题排查与优化求助
问题:SQL Server 2012递归CTE超时排查与查询优化
我需要在SQL Server 2012版本中构建查询,每日返回CATEGORY_3(别名DM)的结果。业务表NOTES仅在列变更时新增行,无每日数据,部分列值为空。使用日期表生成结果行数,但递归CTE执行10分钟超时,预期仅4000行。现有SQL代码如下:
WITH DAILY_DRIVER AS ( SELECT DD1.DRIVER, DD1.Date, DN.CATEGORY_3 AS DM, ROW_NUMBER() OVER (PARTITION BY DD1.DRIVER ORDER BY DD1.Date) AS RowNum FROM ( SELECT DISTINCT(SDN.DRIVER_ID) AS DRIVER, DD.Date FROM NOTES SDN JOIN DATES DD ON DD.Date BETWEEN '2022-01-01' AND '2022-11-28' ) DD1 LEFT JOIN NOTES DN ON DD1.DRIVER = DN.DRIVER_ID AND DD1.Date = CAST(DN.NOTES_DATE AS DATE) ), INITIAL_DM AS ( SELECT D.DRIVER, D.DM FROM ( SELECT SDN.DRIVER_ID AS DRIVER, SDN.CATEGORY_3 AS DM, MAX(SDN.NOTES_DATE) AS NDATE, ROW_NUMBER() OVER (PARTITION BY SDN.DRIVER_ID ORDER BY SDN.NOTES_DATE DESC) AS RowNum FROM NOTES SDN WHERE SDN.CATEGORY_3 IS NOT NULL AND SDN.NOTES_DATE < '2021-01-01' GROUP BY SDN.DRIVER_ID , SDN.CATEGORY_3 , SDN.NOTES_DATE ) D JOIN DRIVER P ON D.DRIVER = P.DRIVER_ID WHERE RowNum = 1 AND (P.TERMINATION_DATE IS NULL OR P.TERMINATION_DATE > '2021-12-31') ), SOMETHING (DRIVER, Date, DM, RowNum) AS ( SELECT DD.DRIVER, DD.Date, ISNULL(DD.DM, SD.DM) AS DM, DD.RowNum FROM DAILY_DRIVER DD RIGHT JOIN INITIAL_DM SD ON DD.DRIVER = SD.DRIVER WHERE DD.RowNum = 1 UNION ALL SELECT DD.DRIVER, DD.Date, ISNULL(DD.DM, S.DM) AS DM, DD.RowNum FROM DAILY_DRIVER AS DD INNER JOIN SOMETHING AS S ON DD.DRIVER = S.DRIVER AND DD.RowNum = S.RowNum + 1 WHERE DD.RowNum <= DATEDIFF(d,'2022-01-01','2021-11-28') +1 ) SELECT DRIVER, Date, DM FROM SOMETHING ORDER BY DRIVER, Date;
NOTES表样例数据
| DRIVER_ID | NOTES_DATE | NOTES_ID | CATEGORY_1 | CATEGORY_2 | CATEGORY_3 | CATEGORY_4 |
|---|---|---|---|---|---|---|
| ADAVIC | 2/8/2022 | 6838 | null | null | FORDON | TRAINEE |
| ADAVIC | 3/15/2022 | 6920 | null | null | FORDON | REGION03 |
| ADAVIC | 4/11/2022 | 6985 | null | null | null | null |
| ADAVIC | 4/2/2022 | 7009 | null | null | ANDBRI | REGION03 |
| ADAVIC | 4/26/2022 | 7050 | null | null | null | null |
| ALLZAN | 5/3/2022 | 7059 | null | null | HILJAS | TRAINEE |
| ADAVIC | 5/2/2022 | 7078 | null | null | PARJAM | REGION03 |
| ADAVIC | 5/12/2022 | 7099 | TERM | REVIEW | null | null |
| ALLZAN | 6/9/2022 | 7187 | null | null | null | OTR |
| AMUJOH | 7/6/2022 | 7219 | null | null | PARJAM | REGION03 |
| AMUJOH | 7/13/2022 | 7232 | null | null | null | null |
| ALLZAN | 8/15/2022 | 7308 | null | null | null | null |
| ALLZAN | 11/9/2022 | 7500 | TERM | REVIEW | null | null |
预期查询结果
| DRIVER | DATE | DM |
|---|---|---|
| ADAVIC | 4/1/2022 | FORDON |
| ADAVIC | 4/2/2022 | ANDBRI |
| ADAVIC | 4/3/2022 | ANDBRI |
| ADAVIC | 4/4/2022 | ANDBRI |
| Continued | ||
| ADAVIC | 5/1/2022 | ANDBRI |
| ADAVIC | 5/2/2022 | PARJAM |
| ADAVIC | 5/3/2022 | PARJAM |
超时原因排查
- 递归CTE逐行迭代低效:递归CTE是逐行循环关联
DAILY_DRIVER表,4000行结果需要执行4000次关联查询,开销极大。 DAILY_DRIVER数据冗余:该CTE是司机与日期的笛卡尔积,再左关联NOTES表,无索引支撑时每次递归都要全表扫描。- 终止条件逻辑错误:
DATEDIFF(d,'2022-01-01','2021-11-28')返回负数,导致DD.RowNum <= 负数+1的条件无效,无法正确终止递归,甚至可能引发无限循环。
优化方案(非递归实现)
这类"填充每日快照"的场景更适合用窗口函数+日期表关联的方式,避免逐行递归:
-- 1. 提取有效的DM变更记录(过滤CATEGORY_3为空的无效行) WITH DriverDMChanges AS ( SELECT DRIVER_ID, CAST(NOTES_DATE AS DATE) AS ChangeDate, CATEGORY_3 AS DM, -- 获取每条变更的下一次变更日期,确定当前DM的生效区间 LEAD(CAST(NOTES_DATE AS DATE), 1, '2022-11-29') OVER (PARTITION BY DRIVER_ID ORDER BY NOTES_DATE) AS NextChangeDate FROM NOTES WHERE CATEGORY_3 IS NOT NULL ), -- 2. 获取司机2022年初始DM(2022-01-01前最后一次有效变更) InitialDM AS ( SELECT DRIVER_ID, CATEGORY_3 AS DM FROM ( SELECT DRIVER_ID, CATEGORY_3, ROW_NUMBER() OVER (PARTITION BY DRIVER_ID ORDER BY NOTES_DATE DESC) AS rn FROM NOTES WHERE CATEGORY_3 IS NOT NULL AND NOTES_DATE < '2022-01-01' ) t WHERE rn = 1 ), -- 3. 生成所有司机的日期范围 DriverDates AS ( SELECT drivers.DRIVER_ID AS DRIVER, dd.Date FROM DATES dd CROSS JOIN ( SELECT DISTINCT DRIVER_ID FROM NOTES UNION SELECT DRIVER_ID FROM InitialDM ) drivers WHERE dd.Date BETWEEN '2022-01-01' AND '2022-11-28' ) -- 4. 填充每日DM值:优先取变更记录,无变更则用初始DM,空值向前填充 SELECT DRIVER, Date, LAST_VALUE(COALESCE(dc.DM, i.DM)) OVER ( PARTITION BY DRIVER ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS DM FROM DriverDates dd LEFT JOIN DriverDMChanges dc ON dd.DRIVER = dc.DRIVER_ID AND dd.Date >= dc.ChangeDate AND dd.Date < dc.NextChangeDate LEFT JOIN InitialDM i ON dd.DRIVER = i.DRIVER_ID ORDER BY DRIVER, Date;
额外优化建议
- 添加复合索引:在NOTES表上创建索引,加速变更记录的查询:
CREATE NONCLUSTERED INDEX IX_NOTES_DRIVER_DATE_CAT3 ON NOTES (DRIVER_ID, NOTES_DATE) INCLUDE (CATEGORY_3); - 过滤无效数据:提前排除NOTES表中CATEGORY_3为空的行,减少处理数据量。
- 验证日期表:确保DATES表包含目标日期范围内的所有日期,避免结果遗漏。
内容的提问来源于stack exchange,提问作者Matt Nevland




