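This is a follow-up that extends a previous question:
I want to iterate through an xlsx using Pandas containing timestamps and get downtime https://stackoverflow.com/questions/60808781/i-want-to-iterate-through-an-xlsx-using-pandas-containing-timestamps-and-get-d/60811346?noredirect=1#comment107789909_60811346
I have built a dictionary that holds the downtime timestamps, with the date as the key.
The problem now is that if there are two separate downtimes on the same day, they do not get separate entries. Instead they are appended to the same downtime list, so an entry looks like this:
Timestamp('2019-10-18 00:00:00'): ['00:20:00', '00:30:00', '00:20:00', '00:40:00', '05:50:00', '05:60:00']
To clarify the purpose: I take the first and last elements of an entry to get the start and end time of the downtime for a given date, and then report the total hours.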
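As a minimal sketch of that first/last calculation (the helper name downtime_span and the sample list are hypothetical, not taken from my actual script), the idea is roughly:

```python
import datetime

def downtime_span(times, fmt="%H:%M:%S"):
    """Return (start, end, duration) for a list of 'HH:MM:SS' strings.

    Assumes the list is non-empty and sorted in time order.
    """
    start = datetime.datetime.strptime(times[0], fmt)
    end = datetime.datetime.strptime(times[-1], fmt)
    return times[0], times[-1], end - start

# Hypothetical entry, shaped like one value of the dictionary
sample = ['00:20:00', '00:30:00', '00:40:00']
start, end, duration = downtime_span(sample)
print(start, end, duration)  # 00:20:00 00:40:00 0:20:00
```

This only gives a sensible total when the list holds a single downtime. If two separate downtimes share one list, the first and last elements span the gap between them, which is why the entries need to be split.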
To split such an entry into two separate entries, I used this:
df1 = pd.DataFrame.from_dict(result, orient='index')
print(df)
df1 = df1.fillna('0')
df1 = df1.replace(to_replace=0, value='0')
for i in df1.index:
    print(i)
    for j in range(len(df1.loc[i]) - 3):
        if (df1.loc[i][j+1] is not '0' and df1.loc[i][j] is not '0'):
            # The error occurs here: there are 72 rows in total, but not all of them
            # are filled for every timestamp, so the empty ones remain NoneType and
            # need to be ignored.
            x = (datetime.datetime.strptime(df1.loc[i][j+1], "%H:%M:%S")) - (datetime.datetime.strptime(df1.loc[i][j], "%H:%M:%S"))
            if (x > datetime.timedelta(seconds=600)):
                print(df1.loc[i][j], " ", df1.loc[i][j+1])
                print(i, "fixed")
                # This gives the complete appended dictionary with two new entries,
                # but the Date column is missing for them.
                z = list(df1.loc[i][:j])
                y = list(df1.loc[i][j+1:])
                z = {i: z}
                y = {i: y}
                df2 = pd.DataFrame.from_dict(z, orient='index')
                df3 = pd.DataFrame.from_dict(y, orient='index')
                df1 = df1.drop(i)
                df1 = pd.concat([df2, df1], ignore_index=False, sort=False)
                df1 = pd.concat([df3, df1], ignore_index=False, sort=False)
                df1 = df1.fillna(0)
                df1 = df1.replace(to_replace='0', value=0)
                break
        else:
            break
The dictionary I get from the earlier processing is stored under the name result.
The error I get when running the code above is:
Traceback (most recent call last):
File "path", line 85, in <module>
x=(datetime.datetime.strptime(df1.loc[i][j+1],"%H:%M:%S"))-(datetime.datetime.strptime(df1.loc[i][j],"%H:%M:%S"))
TypeError: strptime() argument 1 must be str, not int
As you can see, I added an if condition so that the loop does not go past the non-zero elements, but the error still occurs.
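One possible explanation, stated as an assumption because the full script is not shown: after the first split, the code refills the frame with the integer 0 (fillna(0) and replace(to_replace='0', value=0)), so on later rows the padded cells are int 0 rather than the string '0'. An int 0 passes the is not '0' check and is then handed to strptime, which would produce exactly this TypeError. A sketch that avoids padding values altogether is to split each list in the dictionary directly, before building any DataFrame. The function name split_on_gaps, the string date key, and the sample times below are hypothetical:

```python
import datetime

def split_on_gaps(times, max_gap_seconds=600, fmt="%H:%M:%S"):
    """Split a time-ordered list of 'HH:MM:SS' strings into groups,
    starting a new group whenever two consecutive entries are more
    than max_gap_seconds apart."""
    if not times:
        return []
    groups = [[times[0]]]
    for prev, cur in zip(times, times[1:]):
        gap = (datetime.datetime.strptime(cur, fmt)
               - datetime.datetime.strptime(prev, fmt))
        if gap > datetime.timedelta(seconds=max_gap_seconds):
            groups.append([cur])      # gap too large: start a new downtime
        else:
            groups[-1].append(cur)    # same downtime continues
    return groups

# Hypothetical data; the real dictionary uses Timestamp keys
result = {"2019-10-18": ['00:20:00', '00:30:00', '00:40:00', '05:50:00', '05:55:00']}
split = {day: split_on_gaps(times) for day, times in result.items()}
print(split)
# {'2019-10-18': [['00:20:00', '00:30:00', '00:40:00'], ['05:50:00', '05:55:00']]}
```

Because each date maps to a list of groups, the date is kept for every downtime, and the first and last element of each group give its start and end.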