Postgres: poor optimization/planning of window-based queries (partition by(, group by?)) - a 1000x speedup

2023-12-01

We are running Postgres 9.3.5 (07/2014). We have a fairly complex data warehouse/reporting setup (ETL, materialized views, indexing, aggregations, analytical functions, ...).

What I found now may be hard to implement in the optimizer(?), but it makes a huge difference in performance (the code below is only an example closely resembling our queries, stripped of unnecessary complexity):

create view foo as
select
  sum(s.plan) over w_pyl as pyl_plan,      -- money planned to spend in this pot/loc/year
  sum(s.booked) over w_pyl as pyl_booked,  -- money already booked in this pot/loc/year

  -- money already booked in this pot/loc the years before (stored as sum already)
  last_value(s.booked_prev_years) over w_pl as pl_booked_prev_years,    

  -- update 2014-10-08: maybe the following additional selected columns
  -- may be implementation-/test-relevant since they could potentially be determined
  -- by sorting within the partition:
  min(s.id) over w_pyl,
  max(s.id) over w_pyl,

  -- ... anything could follow here ...
  x.*,
  s.*
from
  pot_location_year x  -- may be some materialized view or (cache/regular) table
  left outer join  spendings s 
    on (s.pot = x.pot and s.loc = x.loc and s.year = x.year)
window
  w_pyl  as (partition by  x.pot, x.year, x.loc),
  w_pl   as (partition by  x.pot, x.loc  order by x.year)
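For reference, here is a minimal, hypothetical sketch of the two underlying relations; the column names and types are inferred from the view definition above, and the real tables surely carry more attributes:

```sql
-- Hypothetical sketch; names/types inferred from the view above.
create table pot_location_year (
  pot    text,
  loc    text,
  year   int,
  pot_id text       -- filtered on in the later examples
  -- ... more attributes ...
);

create table spendings (
  id                int,
  pot               text,
  loc               text,
  year              int,
  plan              numeric,  -- money planned to spend
  booked            numeric,  -- money already booked
  booked_prev_years numeric   -- prior years' bookings, stored pre-summed
);
```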

We have these two relevant indexes:

pot_location_year_idx__p_y_l  -- on pot, year, loc
pot_location_year_idx__p_l_y  -- on pot, loc, year
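In DDL terms, the two indexes presumably look like this (a sketch; the actual statements in our setup may differ):

```sql
create index pot_location_year_idx__p_y_l on pot_location_year (pot, year, loc);
create index pot_location_year_idx__p_l_y on pot_location_year (pot, loc, year);
```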

Now we run EXPLAIN on some test queries:

explain select * from foo fetch first 100 rows only

This shows us something performing really badly: the p_y_l index is used, so the result set has to be sorted a second time, unnecessarily :-( (the outermost WindowAgg/Sort steps sort by p_l_y, since that ordering is required for our last_value(..) as pl_booked_prev_years):

 Limit  (cost=289687.87..289692.12 rows=100 width=512)
   ->  WindowAgg  (cost=289687.87..292714.85 rows=93138 width=408)
         ->  Sort  (cost=289687.87..289920.71 rows=93138 width=408)
               Sort Key: x.pot, x.loc, x.year
               ->  WindowAgg  (cost=1.25..282000.68 rows=93138 width=408)
                     ->  Nested Loop Left Join  (cost=1.25..278508.01 rows=93138 width=408)
                           Join Filter: ...
                           ->  Nested Loop Left Join  (cost=0.83..214569.60 rows=93138 width=392)
                                 ->  Index Scan using pot_location_year_idx__p_y_l on pot_location_year x  (cost=0.42..11665.49 rows=93138 width=306)
                                 ->  Index Scan using ...  (cost=0.41..2.17 rows=1 width=140)
                                       Index Cond: ...
                           ->  Index Scan using ...  (cost=0.41..0.67 rows=1 width=126)
                                 Index Cond: ...

So the obvious question is why the planner does not choose the existing p_l_y index instead, which would avoid sorting twice.


Luckily, I found that I can (implicitly) hint the planner to do this by making the column order of the view's partitions/windows more homogeneous, although this is not semantically required.

The following change now returns what I originally expected (usage of the p_l_y index):

...
window
  -- w_pyl  as (partition by  x.pot, x.year, x.loc)  -- showstopper (from above)
     w_pyl  as (partition by  x.pot, x.loc, x.year), -- speedy
     w_pl   as (partition by  x.pot, x.loc  order by x.year)

The result: execution roughly 1000x faster:

 Limit  (cost=1.25..308.02 rows=100 width=512)
   ->  WindowAgg  (cost=1.25..284794.82 rows=93138 width=408)
         ->  WindowAgg  (cost=1.25..282000.68 rows=93138 width=408)
               ->  Nested Loop Left Join  (cost=1.25..278508.01 rows=93138 width=408)
                     Join Filter: ...
                     ->  Nested Loop Left Join  (cost=0.83..214569.60 rows=93138 width=392)
                           ->  Index Scan using pot_location_year_idx__p_l_y on pot_location_year x  (cost=0.42..11665.49 rows=93138 width=306)
                           ->  Index Scan using ...  (cost=0.41..2.17 rows=1 width=140)
                                 Index Cond: ...
                     ->  Index Scan using ...  (cost=0.41..0.67 rows=1 width=126)
                           Index Cond: ...

Update 2014-10-09:

Tom Lane-2 (one of the main Postgres developers) wrote this in the context of another (likely related) window function problem I was facing, regarding pg 9.2.2, in 2013-02:

... there's not that much intelligence in the system about window functions, as of yet. So you'd have to write out the query and put the WHERE clause at the lower level, if you want this optimization.
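Applied to our case, "writing out the query" means inlining the view body and placing the WHERE clause below the window aggregation, roughly like this (a sketch with an abbreviated select list; the pot_id = '12345' filter mirrors the example further down):

```sql
select *
from (
  select
    sum(s.plan)   over w_pyl as pyl_plan,
    sum(s.booked) over w_pyl as pyl_booked,
    x.*,
    s.*
  from pot_location_year x
    left outer join spendings s
      on (s.pot = x.pot and s.loc = x.loc and s.year = x.year)
  where x.pot_id = '12345'  -- filter applied BELOW the window aggregation
  window
    w_pyl as (partition by x.pot, x.loc, x.year)
) sub
fetch first 100 rows only;
```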

So here are some more (debatable) general thoughts of mine on the topics of window functions, DWH functionality, and so on:

The statement above reinforces my assumption that, when deciding on an Oracle->Postgres migration in general projects and DWH environments, the risk of spending considerably more time and money on it is quite high. (Even though the investigated feature set may look sufficient.)

Compared to Oracle, I prefer Postgres in important areas such as the syntax and clarity of the code, among other things (I guess even the source code, and thus maintainability in every respect, is better), but Oracle is clearly the more advanced player in resource optimization, support, and tooling once you deal with more complex database functionality beyond typical CRUD management.

I guess the open source Postgres (as well as the EnterpriseDB top-ups) will catch up in those areas in the long run, but it will take them at least 10 years, and maybe only if it is pushed heavily by big, altruistic1 global players like Google etc.

1: altruistic in the sense that, if the pushed areas stay "free", the benefit for those companies must surely lie somewhere else (maybe with some advertisement rows added randomly - I guess we could live with that here and there ;))


Update 2014-10-13:

As linked in my previous update (2014-10-09), the optimization problem and its workaround play out in a very similar way (after the fix above) when you want to query the view above with a constraint/filter (here on pot_id):

explain select * from foo where pot_id = '12345' fetch first 100 rows only

...

 Limit  (cost=1.25..121151.44 rows=100 width=211)
   ->  Subquery Scan on foo  (cost=1.25..279858.20 rows=231 width=211)
         Filter: ((foo.pot_id)::text = '12345'::text)
         ->  WindowAgg  (cost=1.25..277320.53 rows=203013 width=107)
               ->  WindowAgg  (cost=1.25..271230.14 rows=203013 width=107)
                     ->  Nested Loop Left Join  (cost=1.25..263617.16 rows=203013 width=107)
                           ->  Merge Left Join  (cost=0.83..35629.02 rows=203013 width=91)
                                 Merge Cond: ...
                                 ->  Index Scan using pot_location_year_idx__p_l_y on pot_location_year x  (cost=0.42..15493.80 rows=93138 width=65)
                                 ->  Materialize  (cost=0.41..15459.42 rows=33198 width=46)
                                       ->  Index Scan using ...  (cost=0.41..15376.43 rows=33198 width=46)
                           ->  Index Scan using ...  (cost=0.42..1.11 rows=1 width=46)
                                 Index Cond: ...

As suggested in the link above, if you want to "push down" the constraint/filter below the window aggregation, you have to do it explicitly in the view itself. For this kind of query that works, again with another 1000x speedup factor for the first 100 rows:

 create view foo as
 ...
 where pot_id='12345'
 ...

...

 Limit  (cost=1.25..943.47 rows=100 width=211)
   ->  WindowAgg  (cost=1.25..9780.52 rows=1039 width=107)
         ->  WindowAgg  (cost=1.25..9751.95 rows=1039 width=107)
               ->  Nested Loop Left Join  (cost=1.25..9715.58 rows=1039 width=107)
                     ->  Nested Loop Left Join  (cost=0.83..1129.47 rows=1039 width=91)
                           ->  Index Scan using pot_location_year_idx__p_l_y on pot_location_year x (cost=0.42..269.77 rows=106 width=65)
                                 Index Cond: ((pot_id)::text = '12345'::text)
                           ->  Index Scan using ...  (cost=0.41..8.10 rows=1 width=46)
                                 Index Cond: ...
                     ->  Index Scan using ...  (cost=0.42..8.25 rows=1 width=46)
                           Index Cond: ...

After some more view parameterization effort2, this approach helps speed up certain queries constraining those columns, but it is still quite inflexible regarding more general foo-view usage and query optimization.

2: You can "parameterize" such a view by putting it (its SQL) in a (set-returning) table function (the Postgres equivalent of an Oracle pipelined table function). Further details regarding this may be found in the forum link above.
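A minimal sketch of such a parameterized view as a set-returning SQL function; foo_fn and p_pot_id are hypothetical names, and the select list simply mirrors the view above:

```sql
create or replace function foo_fn(p_pot_id text)
returns setof foo  -- reuse the view's row type
as $$
  select
    sum(s.plan)   over w_pyl as pyl_plan,
    sum(s.booked) over w_pyl as pyl_booked,
    last_value(s.booked_prev_years) over w_pl as pl_booked_prev_years,
    min(s.id) over w_pyl,
    max(s.id) over w_pyl,
    x.*,
    s.*
  from pot_location_year x
    left outer join spendings s
      on (s.pot = x.pot and s.loc = x.loc and s.year = x.year)
  where x.pot_id = p_pot_id   -- constraint pushed below the window aggregation
  window
    w_pyl as (partition by x.pot, x.loc, x.year),
    w_pl  as (partition by x.pot, x.loc order by x.year)
$$ language sql stable;

-- usage:
-- select * from foo_fn('12345') fetch first 100 rows only;
```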
