这很容易通过一种称为塔比比托桑.
该技术的作用是将每个组的行的位置与整个行集进行比较,以确定同一组中的行是否彼此相邻。
例如,对于您的示例数据,这看起来像:
WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT ID,
NAME,
department,
row_number() OVER (ORDER BY ID) overall_rn,
row_number() OVER (PARTITION BY department ORDER BY ID) department_rn,
row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table;
ID NAME DEPARTMENT OVERALL_RN DEPARTMENT_RN GRP
---------- ------- ---------- ---------- ------------- ----------
1 Michael Marketing 1 1 0
2 Alex Marketing 2 2 0
3 Tom Marketing 3 3 0
4 John Sales 4 1 3
5 Brad Marketing 5 4 1
6 Leo Marketing 6 5 1
7 Kevin Production 7 1 6
在这里,我为整个数据集中的所有行提供了按 id 升序排列的行号(overall_rn
列),并且我为每个部门中的行指定了行号(department_rn
列),再次按 id 升序排列。
现在我已经做到了这一点,我们可以从另一个中减去一个(grp
柱子)。
请注意,对于彼此相邻的部门行,组列中的数字如何保持相同,但每次出现间隙时它都会发生变化。
例如。对于营销部门,第 1-3 行彼此相邻,并且 grp = 0,但营销第 4 行实际上位于整个结果集的第 5 行,因此它现在具有不同的 grp 编号。由于第 5 行营销位于整个集合的第 6 行,因此它与第 4 行营销具有相同的 grp 编号,因此我们知道它们彼此相邻。
一旦我们有了 grp 信息,就可以对部门和新的 grp 列进行聚合查询分组,使用 min 和 max 来查找开始和结束 id:
WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
MIN(ID) start_id,
MAX(ID) end_id
FROM (SELECT ID,
NAME,
department,
row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table)
GROUP BY department, grp;
DEPARTMENT START_ID END_ID
---------- ---------- ----------
Marketing 1 3
Marketing 5 6
Sales 4 4
Production 7 7
注意,我假设 id 列中的间隙并不重要(即,如果没有 id = 6 的行(因此 Leo 和 Kevin 的 id 分别为 7 和 8),那么 Leo 和 Brad 仍会出现在同一列中组,起始 id = 5,结束 id = 7。
如果 id 列中的间隙算作指示新组,那么您可以仅使用 id 来标记整个行集(即无需计算overall_rn;只需使用 id 列即可)。
这意味着您的查询将变为:
WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 8 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
MIN(ID) start_id,
MAX(ID) end_id
FROM (SELECT ID,
NAME,
department,
ID - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table)
GROUP BY department, grp;
DEPARTMENT START_ID END_ID
---------- ---------- ----------
Marketing 1 3
Sales 4 4
Marketing 5 5
Marketing 7 7
Production 8 8