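I have a pandas dataframe named "df". I would like to drop the seconds and use an index in YYYY-MM-DD HH:MM format only, then group by minute and show the mean value for each minute.
So I want to turn this dataframe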
value
2015-05-03 00:00:00 61.0
2015-05-03 00:00:10 60.0
2015-05-03 00:00:25 60.0
2015-05-03 00:00:30 61.0
2015-05-03 00:00:45 61.0
2015-05-03 00:01:00 61.0
2015-05-03 00:01:10 60.0
2015-05-03 00:01:25 60.0
2015-05-03 00:01:30 61.0
2015-05-03 00:01:45 61.0
2015-05-03 00:02:00 61.0
2015-05-03 00:02:10 60.0
2015-05-03 00:02:25 60.0
2015-05-03 00:02:40 60.0
2015-05-03 00:02:55 60.0
2015-05-03 00:03:00 59.0
2015-05-03 00:03:15 59.0
2015-05-03 00:03:20 59.0
2015-05-03 00:03:35 59.0
2015-05-03 00:03:40 60.0
into this dataframe
value
2015-05-03 00:00 60.6
2015-05-03 00:01 60.6
2015-05-03 00:02 60.2
2015-05-03 00:03 59.2
I have tried code like
df['value'].resample('1Min').mean()
or
df.index.resample('1Min').mean()
but this does not seem to work. Any ideas?
You need to convert the index to a DatetimeIndex first (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DatetimeIndex.html):
df.index = pd.DatetimeIndex(df.index)
# alternative way to convert the index
#df.index = pd.to_datetime(df.index)
print(df['value'].resample('1Min').mean())
# equivalent: resample the frame first, then select the column
#print(df.resample('1Min')['value'].mean())
2015-05-03 00:00:00 60.6
2015-05-03 00:01:00 60.6
2015-05-03 00:02:00 60.2
2015-05-03 00:03:00 59.2
Freq: T, Name: value, dtype: float64
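For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the index starts out as plain strings (which would explain why resample fails before the conversion), and it uses only the first two minutes of the sample data from the question. The names df and per_minute are just for illustration.

```python
import pandas as pd

# First two minutes of the question's sample data; the index is plain strings here (an assumption).
df = pd.DataFrame(
    {"value": [61.0, 60.0, 60.0, 61.0, 61.0, 61.0, 60.0, 60.0, 61.0, 61.0]},
    index=[
        "2015-05-03 00:00:00", "2015-05-03 00:00:10", "2015-05-03 00:00:25",
        "2015-05-03 00:00:30", "2015-05-03 00:00:45", "2015-05-03 00:01:00",
        "2015-05-03 00:01:10", "2015-05-03 00:01:25", "2015-05-03 00:01:30",
        "2015-05-03 00:01:45",
    ],
)

# resample needs a datetime-like index, so convert the strings first.
df.index = pd.to_datetime(df.index)

# One row per minute, holding the mean of that minute's values.
per_minute = df["value"].resample("1min").mean()
print(per_minute)
```

The lowercase alias "1min" is used here because it should be accepted by both older and newer pandas releases.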
Another solution is to set the seconds in the index to 0 with astype:
print (df.groupby([df.index.values.astype('<M8[m]')])['value'].mean())
2015-05-03 00:00:00 60.6
2015-05-03 00:01:00 60.6
2015-05-03 00:02:00 60.2
2015-05-03 00:03:00 59.2
Name: value, dtype: float64
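Both outputs above still print the seconds as :00. If the labels really need to read YYYY-MM-DD HH:MM, one option is to format the index as strings after grouping. The sketch below is only an illustration under a few assumptions: it uses minutes 00:02 and 00:03 of the sample data, wraps the truncated values in pd.DatetimeIndex before grouping (an extra step that is not in the answer above), and the names minute_keys and by_minute are made up for the example. Note that after strftime the index holds strings, not timestamps.

```python
import pandas as pd

# Minutes 00:02 and 00:03 of the question's sample data.
idx = pd.to_datetime([
    "2015-05-03 00:02:00", "2015-05-03 00:02:10", "2015-05-03 00:02:25",
    "2015-05-03 00:02:40", "2015-05-03 00:02:55", "2015-05-03 00:03:00",
    "2015-05-03 00:03:15", "2015-05-03 00:03:20", "2015-05-03 00:03:35",
    "2015-05-03 00:03:40",
])
df = pd.DataFrame(
    {"value": [61.0, 60.0, 60.0, 60.0, 60.0, 59.0, 59.0, 59.0, 59.0, 60.0]},
    index=idx,
)

# Truncate every timestamp to minute precision, i.e. set the seconds to 0.
minute_keys = pd.DatetimeIndex(df.index.values.astype("datetime64[m]"))

# Group rows that fall in the same minute and average them.
by_minute = df.groupby(minute_keys)["value"].mean()

# Optional: turn the labels into 'YYYY-MM-DD HH:MM' strings for display.
by_minute.index = by_minute.index.strftime("%Y-%m-%d %H:%M")
print(by_minute)
```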