I have monthly sales data like this:
Company Month Sales
Adidas 2018-09 100
Adidas 2018-08 95
Adidas 2018-07 120
Adidas 2018-06 155
...and so on
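I need to add another column giving the median of Sales over the past 12 months
(or over as many months as are available, if there are fewer than 12).
In Python I worked out how to do this with a for loop,
but I don't know how to do it in BigQuery.
Thank you!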
Here is one approach that may work:
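-- Median of an array: sort the elements, then take the middle one
-- (odd length) or the mean of the two middle ones (even length).
CREATE TEMP FUNCTION MEDIAN(arr ANY TYPE) AS ((
  SELECT
    IF(
      MOD(ARRAY_LENGTH(arr), 2) = 0,
      (arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2) - 1)] + arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]) / 2,
      arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]
    )
  -- The subquery re-aggregates the input in sorted order under the same name arr.
  FROM (SELECT ARRAY_AGG(x ORDER BY x) AS arr FROM UNNEST(arr) AS x)
));
SELECT
  Company,
  Month,
  -- Frame = the current row plus up to 11 preceding rows for the same company.
  -- It counts rows, so it spans 12 calendar months only if no month is missing.
  MEDIAN(
    ARRAY_AGG(Sales) OVER (PARTITION BY Company ORDER BY Month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
  ) AS trailing_median
FROM (
  SELECT 'Adidas' AS Company, '2018-09' AS Month, 100 AS Sales UNION ALL
  SELECT 'Adidas', '2018-08', 95 UNION ALL
  SELECT 'Adidas', '2018-07', 120 UNION ALL
  SELECT 'Adidas', '2018-06', 155
);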
The result is:
+---------+---------+-----------------+
| Company | Month   | trailing_median |
+---------+---------+-----------------+
| Adidas  | 2018-06 | 155.0           |
| Adidas  | 2018-07 | 137.5           |
| Adidas  | 2018-08 | 120.0           |
| Adidas  | 2018-09 | 110.0           |
+---------+---------+-----------------+
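To sanity-check these numbers outside BigQuery, the same trailing window can be reproduced in plain Python. This is only a rough sketch and not the asker's original loop: the function name `trailing_median` and its `window` parameter are made up for illustration, and it assumes, like the query, that months are 'YYYY-MM' strings and that the window counts rows rather than calendar months.

```python
from statistics import median


def trailing_median(rows, window=12):
    """Median of sales over the current row and up to window-1 earlier rows.

    rows: iterable of (company, month, sales), month as a 'YYYY-MM' string.
    Hypothetical helper, written only to mirror the query's window frame.
    """
    by_company = {}
    for company, month, sales in rows:
        by_company.setdefault(company, []).append((month, sales))

    out = []
    for company, series in by_company.items():
        series.sort()  # 'YYYY-MM' strings sort chronologically
        for i, (month, _) in enumerate(series):
            start = max(0, i - window + 1)  # fewer rows at the start of a series
            values = [sales for _, sales in series[start:i + 1]]
            out.append((company, month, float(median(values))))
    return out


rows = [
    ("Adidas", "2018-09", 100),
    ("Adidas", "2018-08", 95),
    ("Adidas", "2018-07", 120),
    ("Adidas", "2018-06", 155),
]
for row in trailing_median(rows):
    print(row)
```

On the four sample rows this gives 155.0, 137.5, 120.0 and 110.0 for 2018-06 through 2018-09, which matches the table above.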