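I have two tables. Table A contains the date, the ISBN (the book), and the demand (the demand on that date). Table B contains the date, the ISBN (the book), and the SalesRank.
Sample data is as follows:
DailyBookFile has 150k records for each date, starting from 2010 (i.e. 150k * 365 days * 8 years rows). The same goes for the SalesRank table, which has about 500k records for each date.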
DailyBookFile
Date Isbn13 CurrentModifiedDemandTotal
20180122 9780955153075 13
20180122 9780805863567 9
20180122 9781138779396 1
20180122 9780029001516 9
20180122 9780470614150 42
SalesRank
importdate ISBN13 SalesRank
20180122 9780029001516 69499
20180122 9780470614150 52879
20180122 9780805863567 832429
20180122 9780955153075 44528
20180122 9781138779396 926435
Required Output
Date Avg_Rank Book_Group
20180122 385154 Elite
20180121 351545 Elite
20180120 201545 Elite
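For each day, I want to take the 200 books with the highest CurrentModifiedDemandTotal and compute their average sales rank.
Since I am new to SQL, I have not been able to find a solution.
I started by taking yesterday's top 200 by CurrentModifiedDemandTotal and getting their average rank over the last year.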
SELECT DBF.Filedate AS [Date],
       AVG(AMA.SalesRank) AS Avg_Rank,
       'Elite' AS Book_Group
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA
        ON (DBF.Isbn13 = AMA.ISBN13
            AND DBF.FileDate = AMA.importdate)
WHERE DBF.Isbn13 IN (SELECT TOP 200 Isbn13
                     FROM [ODS].[wholesale].[DailyBookFile]
                     WHERE FileDate = 20180122
                       AND CAST(CurrentModifiedDemandTotal AS int) > 200)
  AND DBF.Filedate > 20170101
GROUP BY DBF.Filedate;
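But that result is not what I want. What I want now is, for each day, the ISBNs with the top 200 CurrentModifiedDemandTotal on that day and their average rank. I tried this: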
DECLARE @i int;
SET @i = 20180122;
WHILE (SELECT DISTINCT(DBF.Filedate)
       FROM [ODS].[wholesale].[DailyBookFile] AS DBF
       WHERE DBF.Filedate = @i) IS NOT NULL
BEGIN
    SELECT DBF.Filedate AS [Date],
           AVG(AMA.SalesRank) AS Avg_Rank,
           'Elite' AS Book_Group
    FROM [ODS].[wholesale].[DailyBookFile] AS DBF
    INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA
            ON DBF.Isbn13 = AMA.ISBN13
           AND DBF.FileDate = AMA.importdate
    WHERE DBF.Isbn13 IN (SELECT TOP 200 Isbn13
                         FROM [ODS].[wholesale].[DailyBookFile]
                         WHERE FileDate = @i
                           AND CAST(CurrentModifiedDemandTotal AS int) > 500)
      AND DBF.Filedate = @i
    GROUP BY DBF.Filedate;
    SET @i = @i + 1;
END
With this, each SELECT returns its result in a separate result window. Is there any way to get all the results in a single table?
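One possible way to collect the per-day rows, sketched here under assumptions rather than as a tested solution: have each loop iteration insert into a temporary table and select from that table once at the end. The table variables @DailyBookFile and @SalesRank and the temp table #DailyAvgRank are hypothetical stand-ins filled with the sample rows above, and their column types are guesses. The sketch also orders the TOP subquery by demand and drops the `> 500` filter, since none of the sample rows would pass it.

```sql
-- Stand-ins for the real tables, filled with the sample rows from the question.
-- Names and column types are assumptions, not the real schema.
DECLARE @DailyBookFile TABLE (FileDate int, Isbn13 bigint, CurrentModifiedDemandTotal int);
DECLARE @SalesRank     TABLE (importdate int, ISBN13 bigint, SalesRank int);

INSERT INTO @DailyBookFile (FileDate, Isbn13, CurrentModifiedDemandTotal) VALUES
    (20180122, 9780955153075, 13),
    (20180122, 9780805863567, 9),
    (20180122, 9781138779396, 1),
    (20180122, 9780029001516, 9),
    (20180122, 9780470614150, 42);

INSERT INTO @SalesRank (importdate, ISBN13, SalesRank) VALUES
    (20180122, 9780029001516, 69499),
    (20180122, 9780470614150, 52879),
    (20180122, 9780805863567, 832429),
    (20180122, 9780955153075, 44528),
    (20180122, 9781138779396, 926435);

-- One table that receives a row from every loop iteration.
CREATE TABLE #DailyAvgRank ([Date] int, Avg_Rank int, Book_Group varchar(10));

DECLARE @i int = 20180122;

WHILE EXISTS (SELECT 1 FROM @DailyBookFile WHERE FileDate = @i)
BEGIN
    INSERT INTO #DailyAvgRank ([Date], Avg_Rank, Book_Group)
    SELECT DBF.FileDate, AVG(AMA.SalesRank), 'Elite'
    FROM @DailyBookFile AS DBF
    INNER JOIN @SalesRank AS AMA
            ON DBF.Isbn13 = AMA.ISBN13
           AND DBF.FileDate = AMA.importdate
    WHERE DBF.FileDate = @i
      AND DBF.Isbn13 IN (SELECT TOP (200) Isbn13
                         FROM @DailyBookFile
                         WHERE FileDate = @i
                         ORDER BY CurrentModifiedDemandTotal DESC)
    GROUP BY DBF.FileDate;

    -- Plain integer arithmetic: 20180131 + 1 is 20180132, not 20180201,
    -- so a loop like this stops at the first month boundary.
    SET @i = @i + 1;
END;

-- A single result set instead of one per iteration.
SELECT [Date], Avg_Rank, Book_Group
FROM #DailyAvgRank
ORDER BY [Date] DESC;

DROP TABLE #DailyAvgRank;
```

On the five sample rows this should return a single row for 20180122 with Avg_Rank 385154, which matches the first line of the required output.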
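P.S. The list of top 200 books changes every day according to CurrentModifiedDemandTotal. I want to take the average of their sales rank for that same day.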
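A set-based alternative that avoids the loop might look like the sketch below: number the books within each day by descending demand using ROW_NUMBER(), keep the first 200 per day, then join to the rank table and average. This is a common "top N per group" pattern, not something taken from the question. The table variables, the @TopN variable, the Ranked CTE, and the DemandPos column are hypothetical names, and the column types are assumptions.

```sql
-- Stand-ins for the real tables, filled with the sample rows from the question.
-- Names and column types are assumptions, not the real schema.
DECLARE @DailyBookFile TABLE (FileDate int, Isbn13 bigint, CurrentModifiedDemandTotal int);
DECLARE @SalesRank     TABLE (importdate int, ISBN13 bigint, SalesRank int);

INSERT INTO @DailyBookFile (FileDate, Isbn13, CurrentModifiedDemandTotal) VALUES
    (20180122, 9780955153075, 13),
    (20180122, 9780805863567, 9),
    (20180122, 9781138779396, 1),
    (20180122, 9780029001516, 9),
    (20180122, 9780470614150, 42);

INSERT INTO @SalesRank (importdate, ISBN13, SalesRank) VALUES
    (20180122, 9780029001516, 69499),
    (20180122, 9780470614150, 52879),
    (20180122, 9780805863567, 832429),
    (20180122, 9780955153075, 44528),
    (20180122, 9781138779396, 926435);

DECLARE @TopN int = 200;  -- how many books to keep per day

WITH Ranked AS (
    -- Position of each book within its own day, highest demand first.
    -- Ties are broken arbitrarily by ROW_NUMBER().
    SELECT DBF.FileDate,
           DBF.Isbn13,
           ROW_NUMBER() OVER (PARTITION BY DBF.FileDate
                              ORDER BY DBF.CurrentModifiedDemandTotal DESC) AS DemandPos
    FROM @DailyBookFile AS DBF
)
SELECT R.FileDate         AS [Date],
       AVG(AMA.SalesRank) AS Avg_Rank,
       'Elite'            AS Book_Group
FROM Ranked AS R
INNER JOIN @SalesRank AS AMA
        ON AMA.ISBN13 = R.Isbn13
       AND AMA.importdate = R.FileDate   -- rank from the same day as the demand
WHERE R.DemandPos <= @TopN
GROUP BY R.FileDate
ORDER BY R.FileDate DESC;
```

Because the top 200 are chosen separately inside each FileDate partition, the list of books is free to change from day to day, and every day comes back as one row of a single result set. Against the real tables, the table variables would be replaced by [ODS].[wholesale].[DailyBookFile] and [ODS].[MarketplaceMonitor].[SalesRank], and a date filter such as `FileDate > 20170101` inside the CTE would likely be needed to limit the number of rows scanned.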