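Fortunately, I found that I could achieve this by giving the planner an (implicit) hint: making the column order of the view's other partitions/windows more homogeneous, even though this is not semantically necessary.
The following change now returns what I originally expected (usage of the index):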
...
window
-- w_pyl as (partition by x.pot, x.year, x.loc) -- showstopper (from above)
w_pyl as (partition by x.pot, x.loc, x.year), -- speedy
w_pl as (partition by x.pot, x.loc order by x.year)
The result, with execution about 1000 times faster:
Limit (cost=1.25..308.02 rows=100 width=512)
-> WindowAgg (cost=1.25..284794.82 rows=93138 width=408)
-> WindowAgg (cost=1.25..282000.68 rows=93138 width=408)
-> Nested Loop Left Join (cost=1.25..278508.01 rows=93138 width=408)
Join Filter: ...
-> Nested Loop Left Join (cost=0.83..214569.60 rows=93138 width=392)
-> Index Scan using pot_location_year_idx__p_l_y on pot_location_year x (cost=0.42..11665.49 rows=93138 width=306)
-> Index Scan using ... (cost=0.41..2.17 rows=1 width=140)
Index Cond: ...
-> Index Scan using ... (cost=0.41..0.67 rows=1 width=126)
Index Cond: ...
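To make the window change above concrete, here is a minimal sketch of how such a query might look. Only the table `pot_location_year`, its alias `x`, the index name, and the columns `pot`, `loc`, `year` (as written in the window clause) come from the post. The index column order is an assumption inferred from the `__p_l_y` suffix, and the `amount` column and the two window function calls are purely illustrative.

```sql
-- Assumption: the suffix __p_l_y suggests this column order;
-- the real index definition is not shown in the post.
create index pot_location_year_idx__p_l_y
    on pot_location_year (pot, loc, year);

-- Hypothetical query; "amount" is an illustrative column.
select x.pot, x.loc, x.year,
       sum(x.amount) over w_pyl as amount_pyl,
       lag(x.amount) over w_pl  as amount_prev_year
from pot_location_year x
window
    -- Both windows list their columns in the order (pot, loc, year),
    -- the same order as the assumed index, so the planner may be able
    -- to feed both WindowAgg steps from one index scan without a sort.
    w_pyl as (partition by x.pot, x.loc, x.year),
    w_pl  as (partition by x.pot, x.loc order by x.year);
```

Semantically, `partition by x.pot, x.year, x.loc` defines the same groups; only the column order in the clause differs, which is why the reordering acts as a hint rather than a change in results.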
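Update 2014-10-09:
Tom Lane-2 (one of the major Postgres developers) wrote this in 2013-02, with regard to pg 9.2.2, about another (likely related) window function problem that I am also facing here:
... there is not that much intelligence
in the system about window functions, as yet. So you will have to write
out the query directly and put the WHERE clause at the lower level, if
you want this optimization to happen.
So some more (debatable) general thoughts on the subject of window functions, data warehouse functionality and so on may be considered here:
The statement above is a good one. It reinforces my assumption that, when deciding on an Oracle->Postgres migration in general projects and DWH environments, the risk of spending more time and money on it is quite high. (Even though the functionality examined may seem sufficient.)
I prefer Postgres to Oracle in important areas, such as the syntax and clarity of the code and other things (I guess even the source code, and hence the maintainability in all its aspects, is better), but Oracle is clearly the more advanced player in resource optimization, support, and tooling once you deal with more complex database functionality beyond typical CRUD management.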
I guess the open source Postgres (as well as the EnterpriseDB top-ups) will catch up in those areas in the long run, but it will take them at least 10 years, and maybe only if it is pushed heavily by big, altruistic¹ global players like Google etc.
1: altruistic in the sense that, if the pushed areas stay "free", the benefit for those companies must surely lie somewhere else (maybe with some advertisement rows added randomly; I guess we could live with that here and there ;))
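Update 2014-10-13:
As linked in my previous update (2014-10-09), the optimization problems and their workarounds continue in a very similar way (after the fix above) when you want to query the view above with a constraint/filter (here on pot_id):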
explain select * from foo where pot_id = '12345' fetch first 100 rows only
...
Limit (cost=1.25..121151.44 rows=100 width=211)
-> Subquery Scan on foo (cost=1.25..279858.20 rows=231 width=211)
Filter: ((foo.pot_id)::text = '12345'::text)
-> WindowAgg (cost=1.25..277320.53 rows=203013 width=107)
-> WindowAgg (cost=1.25..271230.14 rows=203013 width=107)
-> Nested Loop Left Join (cost=1.25..263617.16 rows=203013 width=107)
-> Merge Left Join (cost=0.83..35629.02 rows=203013 width=91)
Merge Cond: ...
-> Index Scan using pot_location_year_idx__p_l_y on pot_location_year x (cost=0.42..15493.80 rows=93138 width=65)
-> Materialize (cost=0.41..15459.42 rows=33198 width=46)
-> Index Scan using ... (cost=0.41..15376.43 rows=33198 width=46)
-> Index Scan using ... (cost=0.42..1.11 rows=1 width=46)
Index Cond: ...
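As suggested in the link above, if you want to "push down" the constraint/filter below the window aggregation, you have to do it explicitly in the view itself. That is efficient for this type of query, giving another speed-up by a factor of 1000 for the first 100 rows: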
create view foo as
...
where pot_id='12345'
...
...
Limit (cost=1.25..943.47 rows=100 width=211)
-> WindowAgg (cost=1.25..9780.52 rows=1039 width=107)
-> WindowAgg (cost=1.25..9751.95 rows=1039 width=107)
-> Nested Loop Left Join (cost=1.25..9715.58 rows=1039 width=107)
-> Nested Loop Left Join (cost=0.83..1129.47 rows=1039 width=91)
-> Index Scan using pot_location_year_idx__p_l_y on pot_location_year x (cost=0.42..269.77 rows=106 width=65)
Index Cond: ((pot_id)::text = '12345'::text)
-> Index Scan using ... (cost=0.41..8.10 rows=1 width=46)
Index Cond: ...
-> Index Scan using ... (cost=0.42..8.25 rows=1 width=46)
Index Cond: ...
After some more view parameterization effort² this approach will help speed up certain queries that constrain those columns, but it is still quite inflexible with regard to more general usage of the foo view and query optimization.
2: You can "parameterize such a view" by putting it (its SQL) in a (set-returning) table function (the equivalent of a pipelined table function in Oracle). Further details regarding this may be found in the forum link above.
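As a rough illustration of footnote 2, the view's SQL could be wrapped in a set-returning SQL function that takes the filter value as a parameter. This is only a sketch under assumptions: the function name `foo_by_pot`, the parameter `p_pot_id`, the `amount` column, the result column types, and the single window function are all hypothetical stand-ins for the real (elided) view body. Referring to the parameter by name requires PostgreSQL 9.2 or later.

```sql
-- Hypothetical sketch of a "parameterized view".
-- foo_by_pot, p_pot_id, amount and the column types are illustrative only.
create or replace function foo_by_pot(p_pot_id text)
returns table (pot_id text, loc text, year int, amount_prev_year numeric)
language sql
stable
as $$
    select x.pot_id, x.loc, x.year,
           lag(x.amount) over w_pl
    from pot_location_year x
    -- The filter sits below the window step, so only the matching
    -- rows are read from the index and passed to the window function.
    where x.pot_id = p_pot_id
    window w_pl as (partition by x.pot_id, x.loc order by x.year);
$$;

-- Usage: replaces "select * from foo where pot_id = '12345'".
select * from foo_by_pot('12345') fetch first 100 rows only;
```

The trade-off is the one already noted: the function is fast for filters on its parameters, but every additional filter column needs its own parameter (or its own function), so it is less flexible than a plain view.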