I have a file containing the following table:
Name AvailableDate totalRemaining
0 X3321 2018-03-14 13:00:00 200
1 X3321 2018-03-14 14:00:00 200
2 X3321 2018-03-14 15:00:00 200
3 X3321 2018-03-14 16:00:00 200
4 X3321 2018-03-14 17:00:00 193
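I want to return a DataFrame containing all records that fall within a specific time period, regardless of the actual date.
I followed the example here:
Filter pandas dataframe by time https://stackoverflow.com/questions/35052691/filter-pandas-dataframe-by-time
But when I run the following: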
## setup
import pandas as pd
import numpy as np
### Step 2
### Check available slots
file2 = r'C:\Users\user\Desktop\Files\data.xlsx'
slots = pd.read_excel(file2,na_values='')
## filter the preferred ones
slots['nextAvailableDate'] = pd.to_datetime((slots['nextAvailableDate']))
slots['times'] = pd.to_datetime((slots['nextAvailableDate']))
slots = slots[slots['times'].between('21:00:00', '02:00:00')]
This returns an empty DataFrame, as does this variant:
slots = slots[slots['times'].dt.strftime('%H:%M:%S').between('21:00:00', '02:00:00')]
Is there a way to do this correctly without creating a separate time column? How should I approach this problem?
My goal:
Name AvailableDate totalRemaining
0 X3321 2018-03-14 21:00:00 200
1 X3321 2018-03-14 22:00:00 200
2 X3321 2018-03-14 23:00:00 200
3 X3321 2018-03-14 00:00:00 200
4 X3321 2018-03-14 01:00:00 193
For every day that appears in the dataset.
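A likely reason the string-based attempt comes back empty: `Series.between(left, right)` evaluates `(s >= left) & (s <= right)`, so a window whose start is later than its end (21:00 to 02:00) can never match. In the first attempt the bare time strings are probably also parsed as timestamps on the current date, which would give the same empty result. The sketch below uses a small made-up Series (`stamps` is a hypothetical name, not the asker's data) to show the difference between the `between` call and an explicit wrap-around condition.

```python
import pandas as pd

# Hypothetical sample values, loosely mirroring the question's data.
stamps = pd.Series(pd.to_datetime([
    "2018-03-14 21:00:00",
    "2018-03-14 23:00:00",
    "2018-03-14 01:00:00",
    "2018-03-14 13:00:00",
]))

# Time-of-day rendered as zero-padded text, so string comparison orders correctly.
as_text = stamps.dt.strftime("%H:%M:%S")

# between() means (s >= left) & (s <= right); with left > right nothing can match.
empty_mask = as_text.between("21:00:00", "02:00:00")

# A window that crosses midnight is the union of two ranges, hence "or".
wrap_mask = (as_text >= "21:00:00") | (as_text <= "02:00:00")

print(empty_mask.any())
print(stamps[wrap_mask])
```

The `|` mask keeps the 21:00, 23:00 and 01:00 rows and drops 13:00, without adding a column to the frame.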
I think you need between_time http://pandas.pydata.org/pandas-docs/version/0.18/generated/pandas.DataFrame.between_time.html, which works with a DatetimeIndex
created by set_index http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html; to get the column back, add reset_index http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reset_index.html with reindex http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html to restore the original column order:
print (slots)
Name AvailableDate totalRemaining
0 X3321 2018-03-14 21:00:00 200
1 X3321 2018-03-14 20:00:00 200
2 X3321 2018-03-14 22:00:00 200
3 X3321 2018-03-14 23:00:00 200
4 X3321 2018-03-14 00:00:00 200
5 X3321 2018-03-14 01:00:00 193
6 X3321 2018-03-14 13:00:00 200
7 X3321 2018-03-14 14:00:00 200
8 X3321 2018-03-14 15:00:00 200
9 X3321 2018-03-14 16:00:00 200
10 X3321 2018-03-14 17:00:00 193
slots['AvailableDate'] = pd.to_datetime(slots['AvailableDate'])
# between_time needs a DatetimeIndex; it handles windows that cross midnight
df = (slots.set_index('AvailableDate')
           .between_time('21:00:00', '02:00:00')
           .reset_index()
           .reindex(columns=slots.columns))  # slots.columns, since df is not defined yet
print (df)
Name AvailableDate totalRemaining
0 X3321 2018-03-14 21:00:00 200
1 X3321 2018-03-14 22:00:00 200
2 X3321 2018-03-14 23:00:00 200
3 X3321 2018-03-14 00:00:00 200
4 X3321 2018-03-14 01:00:00 193
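If you would rather not move the column into the index and back, one alternative (not part of the answer above) is `DatetimeIndex.indexer_between_time`, which returns the integer positions of the matching rows. The sketch below assumes a small hand-built frame in place of the Excel file; `positions` and `late_slots` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the Excel file.
slots = pd.DataFrame({
    "Name": ["X3321"] * 6,
    "AvailableDate": pd.to_datetime([
        "2018-03-14 21:00:00",
        "2018-03-14 20:00:00",
        "2018-03-14 23:00:00",
        "2018-03-14 00:00:00",
        "2018-03-14 01:00:00",
        "2018-03-14 13:00:00",
    ]),
    "totalRemaining": [200, 200, 200, 200, 193, 200],
})

# Wrap the column in a DatetimeIndex only to compute positions;
# the frame itself keeps its columns and their order.
positions = pd.DatetimeIndex(slots["AvailableDate"]).indexer_between_time(
    "21:00:00", "02:00:00"
)

# Select by position, then renumber the rows from zero.
late_slots = slots.iloc[positions].reset_index(drop=True)
print(late_slots)
```

Because the selection is positional, no `reindex` is needed to restore the column order, and no helper column is created.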