Microsoft SQL Server 2016,T-SQL:根据各个日期获取数据集的日期范围

2024-04-12

我在 SQL Server 2016 中有一个有趣的情况。我使用 T-SQL 语言。

我有一个名为(#dataset)的数据集:

最后一列称为 ContinuousDates 将始终具有没有间隙的连续日期值,例如 2021 年 1 月 1 日到 2021 年 12 月 31 日。它永远不会有相同 ID 或名称的重复日期,即给定日期的一个人只能有一个数据行。 (在此示例中,我仅显示一个人,ID = 1,Name = X。在我的实际数据中,我有多个人)。

请注意,纽约市在数据集中出现较早,并在最后 4 行中重复。

我需要根据日期范围获取以下数据集:

我尝试在数据集上使用简单的 MINIMUM 和 MAXIMUM,但我意识到有时我可以获得wrong输出,如下:

我尝试使用 RANK() 和 DENSE_RANK() 函数的一些选项,但无法找到解决方案。有人可以为我提供帮助吗?

我这里附有代码:

CREATE TABLE #dataset

(

ID int,
Name varchar(20),
City varchar(20),
ContinuousDates date

)


INSERT INTO #dataset
VALUES(1,'X','NYC','1/1/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/2/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/3/2021')

INSERT INTO #dataset
VALUES(1,'X','SFO','1/4/2021')

INSERT INTO #dataset
VALUES(1,'X','SFO','1/5/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/6/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/7/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/8/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/9/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/10/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/11/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/12/2021')




SELECT *
FROM #dataset
ORDER BY ContinuousDates

为了更好地演示,我有一组新代码:

CREATE TABLE #dataset

(

ID int,
Name varchar(20),
City varchar(20),
ContinuousDates date

)


INSERT INTO #dataset
VALUES(1,'X','NYC','1/1/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/2/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/3/2021')

INSERT INTO #dataset
VALUES(1,'X','SFO','1/4/2021')

INSERT INTO #dataset
VALUES(1,'X','SFO','1/5/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/6/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/7/2021')

INSERT INTO #dataset
VALUES(1,'X','PHY','1/8/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/9/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/10/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/11/2021')

INSERT INTO #dataset
VALUES(1,'X','NYC','1/12/2021')

INSERT INTO #dataset
VALUES(2,'Y','MEL','1/13/2021')

INSERT INTO #dataset
VALUES(3,'Z','SYD','1/14/2021')

INSERT INTO #dataset
VALUES(3,'Z','SYD','1/15/2021')

INSERT INTO #dataset
VALUES(3,'Z','PER','1/16/2021')

INSERT INTO #dataset
VALUES(4,'A',NULL,'1/16/2021')

INSERT INTO #dataset
VALUES(4,'A', NULL,'1/17/2021')



SELECT *
FROM #dataset
ORDER BY ID, ContinuousDates

解决步骤:

  • 编号部分,其中 ID 和名称按日期排序 (row_id)
  • 按日期排序的包含 ID、名称和城市的编号部分 (p_row_id)
  • 计算 row_id - p_row_id

现在,您在一组唯一的值中获得了每个期间的组编号。

您所需要做的就是按号码、ID、姓名和城市进行分组

ID Name City ContinuousDates p_row_id row_id row_id - p_row_id
1 X NYC 2021-01-01 1 1 0
1 X NYC 2021-01-02 2 2 0
1 X NYC 2021-01-03 3 3 0
1 X SFO 2021-01-04 1 4 3
1 X SFO 2021-01-05 2 5 3
1 X PHY 2021-01-06 1 6 5
1 X PHY 2021-01-07 2 7 5
1 X PHY 2021-01-08 3 8 5
1 X NYC 2021-01-09 4 9 5
1 X NYC 2021-01-10 5 10 5
1 X NYC 2021-01-11 6 11 5
1 X NYC 2021-01-12 7 12 5
select
     CD.ID
    ,CD.[Name]
    ,CD.City
    ,min(CD.ContinuousDates) as DateStart
    ,max(CD.ContinuousDates) as DateEnd
from
    (
        select *
            ,row_number() over(partition by CD.ID, CD.[Name], CD.City order by CD.ContinuousDates) as p_row_id
            ,row_number() over(partition by CD.ID, CD.[Name] order by CD.ContinuousDates) as row_id
        from #dataset CD
    ) CD
group by CD.row_id - CD.p_row_id
        ,CD.ID
        ,CD.[Name]
        ,CD.City
order by DateStart

多列模板:

select
     CD.GroupColumn1
    ,CD.GroupColumn2
    ..
    ,CD.Column1
    ,CD.Column2
    ,CD.Column3
    ,CD.Column4
    ..
    ,min(CD.ContinuousDates) as DateStart
    ,max(CD.ContinuousDates) as DateEnd
from
    (
        select *
            ,row_number() over(partition by
                 CD.GroupColumn1
                ,CD.GroupColumn2
                ..
                ,CD.Column1
                ,CD.Column2
                ,CD.Column3
                ,CD.Column4
                ..
                order by CD.ContinuousDates) as p_row_id
            ,row_number() over(partition by
                 CD.GroupColumn1
                ,CD.GroupColumn2
                ..
                order by CD.ContinuousDates) as row_id
        from #dataset CD
    ) CD
group by CD.row_id - CD.p_row_id
        ,CD.GroupColumn1
        ,CD.GroupColumn2
        ..
         CD.Column1
        ,CD.Column2
        ,CD.Column3
        ,CD.Column4
        ..
order by DateStart
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

Microsoft SQL Server 2016,T-SQL:根据各个日期获取数据集的日期范围 的相关文章

随机推荐