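I haven't been able to get this query to hit an index instead of performing a full scan. I have another query that uses date_part('day', datelocal) against an almost identical table (it holds slightly less data but has the same structure), and that one does hit the index I created on the datelocal column (a timestamp without time zone). The query (this one performs a parallel seq scan on the table and an in-memory quicksort):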
SELECT
    date_part('hour', datelocal) AS hour,
    SUM(CASE WHEN gender LIKE 'male' THEN views ELSE 0 END) AS male,
    SUM(CASE WHEN gender LIKE 'female' THEN views ELSE 0 END) AS female
FROM reportimpression
WHERE datelocal >= '2-1-2019' AND datelocal < '2-28-2019'
GROUP BY date_part('hour', datelocal)
ORDER BY date_part('hour', datelocal)
Here is the other one, which does hit my datelocal index:
SELECT
    date_part('day', datelocal) AS day,
    SUM(CASE WHEN gender LIKE 'male' THEN views ELSE 0 END) AS male,
    SUM(CASE WHEN gender LIKE 'female' THEN views ELSE 0 END) AS female
FROM reportimpressionday
WHERE datelocal >= '2-1-2019' AND datelocal < '2-28-2019'
GROUP BY date_trunc('day', datelocal), date_part('day', datelocal)
ORDER BY date_trunc('day', datelocal)
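I'm banging my head against the wall over this! Any ideas on how to speed up the first query, or at least get it to hit an index? I've tried creating an index on the datelocal field, a composite index on datelocal, gender, and views, and an expression index on date_part('hour', datelocal), but none of them made a difference.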
Schemas:
-- Table Definition ----------------------------------------------
CREATE TABLE reportimpression (
    datelocal timestamp without time zone,
    devicename text,
    network text,
    sitecode text,
    advertisername text,
    mediafilename text,
    gender text,
    agegroup text,
    views integer,
    impressions integer,
    dwelltime numeric
);
-- Indices -------------------------------------------------------
CREATE INDEX reportimpression_datelocal_index ON reportimpression(datelocal timestamp_ops);
CREATE INDEX reportimpression_viewership_index ON reportimpression(datelocal timestamp_ops,views int4_ops,impressions int4_ops,gender text_ops,agegroup text_ops);
CREATE INDEX reportimpression_test_index ON reportimpression(datelocal timestamp_ops,(date_part('hour'::text, datelocal)) float8_ops);
-- Table Definition ----------------------------------------------
CREATE TABLE reportimpressionday (
    datelocal timestamp without time zone,
    devicename text,
    network text,
    sitecode text,
    advertisername text,
    mediafilename text,
    gender text,
    agegroup text,
    views integer,
    impressions integer,
    dwelltime numeric
);
-- Indices -------------------------------------------------------
CREATE INDEX reportimpressionday_datelocal_index ON reportimpressionday(datelocal timestamp_ops);
CREATE INDEX reportimpressionday_detail_index ON reportimpressionday(datelocal timestamp_ops,views int4_ops,impressions int4_ops,gender text_ops,agegroup text_ops);
EXPLAIN (ANALYZE, BUFFERS) output:
Finalize GroupAggregate (cost=999842.42..999859.67 rows=3137 width=24) (actual time=43754.700..43754.714 rows=24 loops=1)
  Group Key: (date_part('hour'::text, datelocal))
  Buffers: shared hit=123912 read=823290
  I/O Timings: read=81228.280
  -> Sort (cost=999842.42..999843.99 rows=3137 width=24) (actual time=43754.695..43754.698 rows=48 loops=1)
        Sort Key: (date_part('hour'::text, datelocal))
        Sort Method: quicksort Memory: 28kB
        Buffers: shared hit=123912 read=823290
        I/O Timings: read=81228.280
        -> Gather (cost=999481.30..999805.98 rows=3137 width=24) (actual time=43754.520..43777.558 rows=48 loops=1)
              Workers Planned: 1
              Workers Launched: 1
              Buffers: shared hit=123912 read=823290
              I/O Timings: read=81228.280
              -> Partial HashAggregate (cost=998481.30..998492.28 rows=3137 width=24) (actual time=43751.649..43751.672 rows=24 loops=2)
                    Group Key: date_part('hour'::text, datelocal)
                    Buffers: shared hit=123912 read=823290
                    I/O Timings: read=81228.280
                    -> Parallel Seq Scan on reportimpression (cost=0.00..991555.98 rows=2770129 width=17) (actual time=13.097..42974.126 rows=2338145 loops=2)
                          Filter: ((datelocal >= '2019-02-01 00:00:00'::timestamp without time zone) AND (datelocal < '2019-02-28 00:00:00'::timestamp without time zone))
                          Rows Removed by Filter: 6792750
                          Buffers: shared hit=123912 read=823290
                          I/O Timings: read=81228.280
Planning time: 0.185 ms
Execution time: 43777.701 ms
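For anyone trying to reproduce this: in the plan above the filter keeps roughly a quarter of the rows each worker scans (2338145 kept against 6792750 removed), which may be why the planner prefers a sequential scan. One way to check whether it can use the datelocal index at all is to discourage sequential scans for a single transaction and compare the resulting plan and timings. This is only a diagnostic sketch, not something reflected in the output above. It assumes a session where changing planner settings is acceptable, and the setting is not meant to stay enabled.

```sql
-- Diagnostic sketch only: SET LOCAL is reverted when the transaction ends.
BEGIN;

-- Makes sequential scans look very expensive to the planner;
-- it discourages them rather than forbidding them outright.
SET LOCAL enable_seqscan = off;

-- Same query as the first one in this post, unchanged.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    date_part('hour', datelocal) AS hour,
    SUM(CASE WHEN gender LIKE 'male' THEN views ELSE 0 END) AS male,
    SUM(CASE WHEN gender LIKE 'female' THEN views ELSE 0 END) AS female
FROM reportimpression
WHERE datelocal >= '2-1-2019' AND datelocal < '2-28-2019'
GROUP BY date_part('hour', datelocal)
ORDER BY date_part('hour', datelocal);

ROLLBACK;
```

If the forced index plan turns out slower than the sequential scan, the planner's choice is probably reasonable for this date range; if it is clearly faster, the cost settings or table statistics may be worth a look.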