I have a pandas DataFrame with a MultiIndex that looks like the following:
import pandas as pd
rows = [('One', 'One', 'One', '20120105', 1, 'Text1'),
('One', 'One', 'One', '20120107', 2, 'Text2'),
('One', 'One', 'One', '20120110', 3, 'Text3'),
('One', 'One', 'Two', '20120104', 4, 'Text4'),
('One', 'Two', 'One', '20120109', 5, 'Text5'),
('Two', 'Three', 'Four', '20120111', 6, 'Text6')]
cols = ['Type', 'Subtype', 'Subsubtype', 'Date', 'Number', 'Text']
df = pd.DataFrame.from_records(rows, columns=cols)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index(['Type', 'Subtype', 'Subsubtype'])
end_date = max(df['Date'])
print(df)
                              Date  Number   Text
Type Subtype Subsubtype
One  One     One        2012-01-05       1  Text1
             One        2012-01-07       2  Text2
             One        2012-01-10       3  Text3
             Two        2012-01-04       4  Text4
     Two     One        2012-01-09       5  Text5
Two  Three   Four       2012-01-11       6  Text6
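I want to upsample the data so that every combination of the Type-Subtype-Subsubtype index gets one row per day, from the earliest date for which that combination has data through end_date = max(df['Date']).
An example of what I want: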
                              Date  Number   Text
Type Subtype Subsubtype
One  One     One        2012-01-05       1  Text1
             One        2012-01-06       1  Text1
             One        2012-01-07       2  Text2
             One        2012-01-08       2  Text2
             One        2012-01-09       2  Text2
             One        2012-01-10       3  Text3
             One        2012-01-11       3  Text3
             Two        2012-01-04       4  Text4
             Two        2012-01-05       4  Text4
             Two        2012-01-06       4  Text4
             Two        2012-01-07       4  Text4
             Two        2012-01-08       4  Text4
             Two        2012-01-09       4  Text4
             Two        2012-01-10       4  Text4
             Two        2012-01-11       4  Text4
     Two     One        2012-01-09       5  Text5
             One        2012-01-10       5  Text5
             One        2012-01-11       5  Text5
Two  Three   Four       2012-01-11       6  Text6
I have looked through similar questions but could not find anything that works. Any help is greatly appreciated.
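For reference, here is a minimal sketch of one way the desired result might be produced. It is not a definitive solution. It assumes that the missing days should be forward-filled from the most recent row (as the example output suggests) and that dates are unique within each Type-Subtype-Subsubtype combination. The names `keys`, `pieces`, `filled`, and `result` are only illustrative.

```python
import pandas as pd

rows = [('One', 'One', 'One', '20120105', 1, 'Text1'),
        ('One', 'One', 'One', '20120107', 2, 'Text2'),
        ('One', 'One', 'One', '20120110', 3, 'Text3'),
        ('One', 'One', 'Two', '20120104', 4, 'Text4'),
        ('One', 'Two', 'One', '20120109', 5, 'Text5'),
        ('Two', 'Three', 'Four', '20120111', 6, 'Text6')]
cols = ['Type', 'Subtype', 'Subsubtype', 'Date', 'Number', 'Text']
df = pd.DataFrame.from_records(rows, columns=cols)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index(['Type', 'Subtype', 'Subsubtype'])
end_date = df['Date'].max()

keys = ['Type', 'Subtype', 'Subsubtype']  # index levels that define a group

pieces = []
# sort=False keeps the groups in their order of first appearance.
for key, group in df.reset_index().groupby(keys, sort=False):
    # Daily calendar from this group's earliest date to the global end date.
    daily = pd.date_range(group['Date'].min(), end_date, freq='D')
    filled = (group.set_index('Date')[['Number', 'Text']]
                   .reindex(daily)      # add the missing days as NaN rows
                   .ffill()             # assumption: carry the last row forward
                   .rename_axis('Date')
                   .reset_index())
    # Put the group key back so the MultiIndex can be rebuilt afterwards.
    for name, value in zip(keys, key):
        filled[name] = value
    pieces.append(filled)

result = pd.concat(pieces, ignore_index=True).set_index(keys)
# Reindexing introduces NaN, which turns Number into float; convert it back.
result['Number'] = result['Number'].astype(int)
print(result)
```

Because `reindex` requires a unique index, a group with duplicate dates would first need to be deduplicated or aggregated.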