I have heard many times that postgres handles exists queries faster than left joins.
http://archives.postgresql.org/pgsql-performance/2002-12/msg00185.php
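That is certainly true when a single table is being aggregated.
But in our case there is more than one table, and building the same query with exists makes postgres hang forever: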
explain
SELECT count(DISTINCT "groups".id) AS count_all
FROM "groups"
WHERE (exists(
select * from products p where groups.id = p.group_id AND exists(
select * from products_categories pc where p.id = pc.product_id AND pc.category_id in (2,3))) AND groups.id != 3)
result:
Aggregate  (cost=26413436.66..26413436.67 rows=1 width=4)
  ->  Seq Scan on groups  (cost=0.00..26413403.84 rows=13126 width=4)
        Filter: ((id <> 3) AND (subplan))
        SubPlan
          ->  Index Scan using index_products_on_group_id on products p  (cost=0.00..1006.13 rows=1 width=1483)
                Index Cond: ($1 = group_id)
                Filter: (subplan)
                SubPlan
                  ->  Seq Scan on products_categories pc  (cost=0.00..498.49 rows=1 width=8)
                        Filter: ((category_id = ANY ('{2,3}'::integer[])) AND ($0 = product_id))
Is this the root cause of the extremely long execution time?
Is it some kind of configuration problem?
Thanks,
Bogdan.
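Well, for every row in "groups", postgresql does a full sequential scan of products_categories (the innermost SubPlan in the plan above), which is not good. It is not necessarily a configuration issue, but perhaps the query could be stated without nested subqueries, like this?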
SELECT count(DISTINCT "groups".id) AS count_all
FROM "groups"
WHERE exists(
    select 1 from products p
    join products_categories pc on pc.product_id = p.id
    where groups.id = p.group_id
    and pc.category_id in (2,3)
) and groups.id <> 3
Also, does products_categories have an index on product_id?
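If that index turns out to be missing, something along these lines might help. This is only a sketch: the index name is made up for illustration, and whether the planner actually switches away from the sequential scan on products_categories would have to be confirmed by running explain again.

```sql
-- Hypothetical index name; the thread does not say whether such an index already exists.
CREATE INDEX index_products_categories_on_product_id
    ON products_categories (product_id);

-- Refresh the planner statistics for the table before re-checking the plan.
ANALYZE products_categories;
```

With an index on product_id, the lookup for each product can be an index probe instead of a scan of the whole table, which is where the plan above spends its estimated cost.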