You need to enable JavaScript to run this app.
优惠活动
大模型
产品
解决方案
定价
更多
文档控制台
免费开始使用

如何高效实现带日期范围的数据无重叠分组与排序?

日期范围无重叠分组优化与性能提升方案

核心需求

将带有日期范围的DateVM数据分组,确保每组内的日期范围无重叠(允许间隔或连续衔接),默认Group值为0,需排序并更新Group字段,优化大数据集下的处理性能,以降低UI展示耗时。

数据结构定义

public class DateVM {
    public int Group { get; set; }
    public string Text { get; set; }
    public string BackColor { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public bool IsSorted {get;set;}
}

示例数据

输入数据

Sample Data
{Group:0, Text:Record1, BackColor:Red, StartDate:10/02/2025, EndDate:16/02/2025}
{Group:0, Text:Record2, BackColor:Green, StartDate:17/02/2025, EndDate:20/02/2025}
{Group:0, Text:Record3, BackColor:Blue, StartDate:17/02/2025, EndDate:20/02/2025}
{Group:0, Text:Record4, BackColor:Red, StartDate:21/02/2025, EndDate:23/02/2025}
{Group:0, Text:Record5, BackColor:Blue, StartDate:21/02/2025, EndDate:23/02/2025}
{Group:0, Text:Record6, BackColor:Green, StartDate:24/02/2025, EndDate:03/03/2025}
{Group:0, Text:Record7, BackColor:Blue, StartDate:24/02/2025, EndDate:02/03/2025}
{Group:0, Text:Record8, BackColor:Red, StartDate:04/03/2025, EndDate:09/03/2025}
{Group:0, Text:Record9, BackColor:Green, StartDate:04/03/2025, EndDate:09/03/2025}
{Group:0, Text:Record10, BackColor:Red, StartDate:19/02/2025, EndDate:20/02/2025}

期望输出

Needed Output
{Group:1, Text:Record1, BackColor:Red, StartDate:10/02/2025, EndDate:16/02/2025}
{Group:1, Text:Record2, BackColor:Green, StartDate:17/02/2025, EndDate:20/02/2025}
{Group:2, Text:Record3, BackColor:Blue, StartDate:17/02/2025, EndDate:20/02/2025}
{Group:1, Text:Record4, BackColor:Red, StartDate:21/02/2025, EndDate:23/02/2025}
{Group:2, Text:Record5, BackColor:Blue, StartDate:21/02/2025, EndDate:23/02/2025}
{Group:1, Text:Record6, BackColor:Green, StartDate:24/02/2025, EndDate:03/03/2025}
{Group:2, Text:Record7, BackColor:Blue, StartDate:24/02/2025, EndDate:02/03/2025}
{Group:1, Text:Record8, BackColor:Red, StartDate:04/03/2025, EndDate:09/03/2025}
{Group:2, Text:Record9, BackColor:Green, StartDate:04/03/2025, EndDate:09/03/2025}
{Group:3, Text:Record10, BackColor:Red, StartDate:19/02/2025, EndDate:20/02/2025}

当前处理流程及问题

排序逻辑

先按StartDate排序:

dateVM.Sort(DatebarVMComparer.CompareByStartDate);

比较器实现(存在性能问题):

public static int CompareByStartDate(DateVM x, DateVM y)
{
    if (x == null)
    {
        return y == null ? 0 : -1;
    }
    if (y == null)
    {
        return 1;
    }
    // 错误:DateVM的StartDate已是DateTime类型,无需重复Parse
    int retval = DateTime.Parse(x.StartDate).CompareTo(DateTime.Parse(y.StartDate));
    return retval;
}

分组逻辑

  1. 第一条记录加入第1组;
  2. 后续记录依次与已处理的所有组的日期范围比较,若重叠则移至下一组,否则加入当前组。

问题:大数据集下,每次分组都需遍历所有组的所有记录,时间复杂度为O(n²),效率极低、资源消耗大。

现有实现代码

public static IList<SomeObjectNonOverlapping> SortToNonOverlappingDates(List<DateVM> bars)
{
    IList<SomeObjectNonOverlapping> response = new List<SomeObjectNonOverlapping>();
    response = bars.Select((t, i) => new SomeObjectNonOverlapping
    {
        Text = t.Text,
        BackColor = t.BackColor,
        StartDate = DateTime.Parse(t.StartDate),
        EndDate = DateTime.Parse(t.EndDate),
        IsSorted = false
    }).ToList();

    var group = 1;
    response.First().Group = group;
    response.First().IsSorted = true;

    for (var i = 1; i < response.Count; i++)
    {
        group = response.Max(x => x.Group);

        for (var g = 1; g <= group; g++)
        {
            if (!response.Where(x => x.IsSorted && x.Group == g).Except(response[i]).Any(range => (response[i].StartDate >= range.StartDate && response[i].StartDate <= range.EndDate)))
            {
                response[i].Group = g;
                response[i].IsSorted = true;
                break;
            }
            else
            {
                group++;  
            }
        }
    }
    return response;
}

性能优化建议

1. 修正排序逻辑,减少不必要的DateTime解析

DateVMStartDateEndDate已是DateTime类型,无需重复调用DateTime.Parse,直接比较即可:

public static int CompareByStartDate(DateVM x, DateVM y)
{
    if (x == null) return y == null ? 0 : -1;
    if (y == null) return 1;
    int retval = x.StartDate.CompareTo(y.StartDate);
    // 若StartDate相同,按EndDate排序,进一步优化后续分组效率
    if (retval == 0)
    {
        retval = x.EndDate.CompareTo(y.EndDate);
    }
    return retval;
}

2. 预排序+维护分组最大结束日期,降低时间复杂度

  • 预排序:先按StartDate升序,StartDate相同则按EndDate升序排序,确保后续记录的StartDate不会早于已处理记录。
  • 维护分组最大EndDate:用字典存储每个分组的最晚结束日期,判断当前记录能否加入某分组时,只需比较当前记录的StartDate是否大于该分组的最大EndDate(因已排序,若当前StartDate大于分组最大EndDate,则与分组内所有记录无重叠)。

优化后的代码:

public static IList<SomeObjectNonOverlapping> SortToNonOverlappingDatesOptimized(List<DateVM> bars)
{
    // 预排序:StartDate升序,StartDate相同则EndDate升序
    var sortedBars = bars.OrderBy(x => x.StartDate).ThenBy(x => x.EndDate).ToList();

    // 转换目标对象,直接复用DateTime类型,避免重复解析
    var response = sortedBars.Select(t => new SomeObjectNonOverlapping
    {
        Text = t.Text,
        BackColor = t.BackColor,
        StartDate = t.StartDate,
        EndDate = t.EndDate,
        IsSorted = true,
        Group = 0
    }).ToList();

    if (!response.Any())
        return response;

    // 存储每个分组的最大结束日期,以及当前最大分组号
    var groupMaxEndDates = new Dictionary<int, DateTime>();
    int currentMaxGroup = 1;

    // 处理第一个元素
    var firstItem = response[0];
    firstItem.Group = currentMaxGroup;
    groupMaxEndDates[currentMaxGroup] = firstItem.EndDate;

    // 处理剩余元素
    for (int i = 1; i < response.Count; i++)
    {
        var currentItem = response[i];
        bool assigned = false;

        // 遍历现有分组,寻找可加入的分组
        foreach (var groupEntry in groupMaxEndDates)
        {
            int groupId = groupEntry.Key;
            DateTime groupMaxEnd = groupEntry.Value;

            // 已排序前提下,当前Item的StartDate >= 所有已处理Item的StartDate
            // 只要当前Item的StartDate > 分组最大EndDate,就无重叠
            if (currentItem.StartDate > groupMaxEnd)
            {
                currentItem.Group = groupId;
                // 更新分组的最大EndDate为当前Item和原最大值的较大者
                if (currentItem.EndDate > groupMaxEnd)
                {
                    groupMaxEndDates[groupId] = currentItem.EndDate;
                }
                assigned = true;
                break;
            }
        }

        // 无可用分组时,新增分组
        if (!assigned)
        {
            currentMaxGroup++;
            currentItem.Group = currentMaxGroup;
            groupMaxEndDates[currentMaxGroup] = currentItem.EndDate;
        }
    }

    return response;
}

3. 避免重复计算最大分组号

原代码每次循环调用response.Max(x => x.Group),时间复杂度为O(n),改用变量currentMaxGroup跟踪当前最大分组号,每次新增分组时自增即可,时间复杂度降为O(1)

C#日期范围处理类库推荐

可以使用TimePeriodLibrary.NET,它提供了专业的时间范围操作API,包括TimeRange类的IntersectsWith方法可快速判断两个时间范围是否重叠。不过针对当前需求,上述优化后的自定义逻辑已足够高效,无需引入第三方依赖。

内容的提问来源于stack exchange,提问作者TUS

火山引擎 最新活动