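This question is related to https://stackoverflow.com/questions/32222889/how-to-calculate-power-consumption-from-power-records/ (how to calculate power consumption from power records). I have a table of device power readings, and I need to calculate the power consumption over a given time span and return the 10 most power-consuming devices. I generated 192 devices and 7,742,208 measurement records (40,324 per device). That is roughly the number of records the devices produce in one month.
With this much data, my current query takes more than 40 seconds to execute. That is too long, because the time span and the number of devices and measurements may be much higher. Should I try an approach other than lag() OVER PARTITION BY to solve this? What other optimizations are possible? I would really appreciate suggestions with code examples.
PostgreSQL version: 9.4
Query with example values: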
SELECT
    t.device_id,
    sum(len_y * (extract(epoch from len_x))) AS total_consumption
FROM (
    SELECT
        m.id,
        m.device_id,
        m.power_total,
        m.created_at,
        m.power_total + lag(m.power_total) OVER (
            PARTITION BY device_id
            ORDER BY m.created_at
        ) AS len_y,
        m.created_at - lag(m.created_at) OVER (
            PARTITION BY device_id
            ORDER BY m.created_at
        ) AS len_x
    FROM
        measurements AS m
    WHERE m.created_at BETWEEN '2015-07-30 13:05:24.403552+00'::timestamp
                           AND '2015-08-27 12:34:59.826837+00'::timestamp
) AS t
GROUP BY t.device_id
ORDER BY total_consumption DESC
LIMIT 10;
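To make the arithmetic concrete, here is a minimal Python sketch of what the query above appears to compute for a single device. The function name `total_consumption` and the three sample readings are made up for illustration and are not part of my data. Note that, as written, the sum is twice the trapezoidal area because there is no division by 2; that does not change the ranking of devices.

```python
from datetime import datetime, timedelta


def total_consumption(samples):
    """Sum (p_i + p_prev) * dt_seconds over consecutive readings of ONE device.

    `samples` is a list of (created_at, power_total) tuples.
    Hypothetical helper that mirrors the SQL; not part of the real schema.
    """
    ordered = sorted(samples)  # corresponds to ORDER BY created_at
    total = 0.0
    # The first reading has no predecessor; in SQL its lag() is NULL and
    # sum() skips it, so pairing consecutive readings is equivalent.
    # (For a device with a single reading SQL returns NULL; this returns 0.0.)
    for (t_prev, p_prev), (t_cur, p_cur) in zip(ordered, ordered[1:]):
        len_y = p_cur + p_prev                    # power_total + lag(power_total)
        len_x = (t_cur - t_prev).total_seconds()  # extract(epoch from interval)
        total += len_y * len_x
    return total


# Illustrative readings: 10, 20 and 30 units of power, one minute apart.
start = datetime(2015, 8, 1, 12, 0, 0)
samples = [
    (start, 10.0),
    (start + timedelta(minutes=1), 20.0),
    (start + timedelta(minutes=2), 30.0),
]

result = total_consumption(samples)
print(result)  # (10 + 20) * 60 + (20 + 30) * 60 = 4800.0
```

The query does this once per device_id partition and then keeps the ten largest totals.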
Table info:
    Column    |           Type           |                         Modifiers
--------------+--------------------------+-----------------------------------------------------------
 id           | integer                  | not null default nextval('measurements_id_seq'::regclass)
 created_at   | timestamp with time zone | default timezone('utc'::text, now())
 power_total  | real                     |
 device_id    | integer                  | not null
Indexes:
    "measurements_pkey" PRIMARY KEY, btree (id)
    "measurements_device_id_idx" btree (device_id)
    "measurements_created_at_idx" btree (created_at)
Foreign-key constraints:
    "measurements_device_id_fkey" FOREIGN KEY (device_id) REFERENCES devices(id)
Query plan:
Limit (cost=1317403.25..1317403.27 rows=10 width=24) (actual time=41077.091..41077.094 rows=10 loops=1)
  -> Sort (cost=1317403.25..1317403.73 rows=192 width=24) (actual time=41077.089..41077.092 rows=10 loops=1)
        Sort Key: (sum((((m.power_total + lag(m.power_total) OVER (?))) * date_part('epoch'::text, ((m.created_at - lag(m.created_at) OVER (?)))))))
        Sort Method: top-N heapsort Memory: 25kB
        -> GroupAggregate (cost=1041700.67..1317399.10 rows=192 width=24) (actual time=25361.013..41076.562 rows=192 loops=1)
              Group Key: m.device_id
              -> WindowAgg (cost=1041700.67..1201314.44 rows=5804137 width=20) (actual time=25291.797..37839.727 rows=7742208 loops=1)
                    -> Sort (cost=1041700.67..1056211.02 rows=5804137 width=20) (actual time=25291.746..30699.993 rows=7742208 loops=1)
                          Sort Key: m.device_id, m.created_at
                          Sort Method: external merge Disk: 257344kB
                          -> Seq Scan on measurements m (cost=0.00..151582.05 rows=5804137 width=20) (actual time=0.333..5112.851 rows=7742208 loops=1)
                                Filter: ((created_at >= '2015-07-30 13:05:24.403552'::timestamp without time zone) AND (created_at <= '2015-08-27 12:34:59.826837'::timestamp without time zone))
Planning time: 0.351 ms
Execution time: 41114.883 ms
Query to generate the test table and data:
CREATE TABLE measurements (
    id serial primary key,
    device_id integer,
    power_total real,
    created_at timestamp
);
INSERT INTO measurements(
    device_id,
    created_at,
    power_total
)
SELECT
    device_id,
    now() + (i * interval '1 minute'),
    random()*(50-1)+1
FROM (
    SELECT
        DISTINCT(device_id),
        generate_series(0,10) AS i
    FROM (
        SELECT
            generate_series(1,5) AS device_id
    ) AS dev_ids
) AS gen_table;