Quick and dirty way: http://sqlfiddle.com/#!1/bd2f6/21
I named my column tstamp instead of your timestamp.
with t as (
    select
        generate_series(mitstamp, matstamp, '15 minutes') as int,
        duration
    from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmp) a,
        (select duration from tmp group by duration) b
)
select
    int as timestampwindowstart,
    t.duration,
    count(tmp.duration)
from
    t
    left join tmp on
        (tmp.tstamp >= t.int and
         tmp.tstamp < (t.int + interval '15 minutes') and
         t.duration = tmp.duration)
group by
    int,
    t.duration
order by
    int,
    t.duration
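Brief explanation:
- Calculate the minimum and maximum timestamps.
- Generate 15-minute intervals between the minimum and the maximum.
- Cross join the result with the unique duration values.
- Left join the original data. The left join is important because it keeps every possible combination in the output, with null wherever no row with that duration exists in the given interval.
- Aggregate the data. count(null)=0, so combinations with no matching rows are reported as 0.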
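The last two steps can be checked in isolation. The following is only a minimal sketch, not part of the original fiddle: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the windows table with its win_start column is a hypothetical, pre-generated replacement for the generate_series output. It shows that count(tmp.duration) skips the null-padded rows produced by the left join, while count(*) would count them.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table holding every (window, duration) combination
    create table windows (win_start text, duration integer);
    create table tmp (tstamp text, duration integer);

    insert into windows values
        ('2013-01-01 00:00:00', 5), ('2013-01-01 00:00:00', 10),
        ('2013-01-01 00:15:00', 5), ('2013-01-01 00:15:00', 10);

    -- raw rows: duration 10 never occurs, and the second window is empty
    insert into tmp values
        ('2013-01-01 00:03:00', 5),
        ('2013-01-01 00:07:00', 5);
""")

rows = conn.execute("""
    select w.win_start,
           w.duration,
           count(tmp.duration) as matched,     -- nulls are not counted
           count(*)            as joined_rows  -- null-padded rows are counted
    from windows w
    left join tmp
           on tmp.tstamp >= w.win_start
          and tmp.tstamp <  datetime(w.win_start, '+15 minutes')
          and tmp.duration = w.duration
    group by w.win_start, w.duration
    order by w.win_start, w.duration
""").fetchall()

for row in rows:
    print(row)
```

Every combination appears in the output, and only the first one has a non-zero matched count. With an inner join instead of a left join, the three empty combinations would disappear from the result entirely.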
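If you have more tables, the algorithm should be applied to their union. Suppose we have three tables, tmp1, tmp2 and tmp3, all with columns tstamp and duration. We can extend the previous solution: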
with
tmpout as (
    select * from tmp1 union all
    select * from tmp2 union all
    select * from tmp3
)
,t as (
    select
        generate_series(mitstamp, matstamp, '15 minutes') as int,
        duration
    from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmpout) a,
        (select duration from tmpout group by duration) b
)
select
    int as timestampwindowstart,
    t.duration,
    count(tmpout.duration)
from
    t
    left join tmpout on
        (tmpout.tstamp >= t.int and
         tmpout.tstamp < (t.int + interval '15 minutes') and
         t.duration = tmpout.duration)
group by
    int,
    t.duration
order by
    int,
    t.duration
You really should get to know the with clause in PostgreSQL. It is an invaluable concept for any data analysis in PostgreSQL.