Assuming you want to compute the number of days between the dates, here is one solution:
import datetime as dt
import pandas as pd
diff = (pd.to_datetime(df.finish_date) - pd.to_datetime(df.start_date)).dt.days
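To make the one-liner above easy to try, here is a minimal self-contained sketch. The DataFrame contents are made up for illustration; only the column names `start_date` and `finish_date` come from the snippet above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    "start_date": ["2020-12-01 00:00:00", "2020-12-04 05:15:00"],
    "finish_date": ["2020-12-15 00:00:00", "2020-12-05 17:15:00"],
})

start = pd.to_datetime(df["start_date"])
finish = pd.to_datetime(df["finish_date"])

# Subtracting two datetime Series gives a timedelta Series;
# .dt.days keeps only the whole-day component of each value.
diff_days = (finish - start).dt.days
print(diff_days.tolist())  # [14, 1]
```

The second row spans 1 day and 12 hours, so `.dt.days` reports 1 and the half day is dropped.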
EDIT
Another option is
Start = pd.to_datetime(df.start_date)
End = pd.to_datetime(df.finish_date)
End.subtract(Start)
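As a quick check that `Series.subtract` and the `-` operator agree, here is a small sketch with made-up values (the variable names are lowercase versions of the ones above):

```python
import pandas as pd

# Hypothetical start/end values, chosen only for illustration.
start = pd.Series(pd.to_datetime(["2020-12-01 00:00:00", "2020-12-04 05:15:00"]))
end = pd.Series(pd.to_datetime(["2020-12-15 00:00:00", "2020-12-05 17:15:00"]))

delta = end.subtract(start)   # element-wise end - start, as a timedelta Series

# Both spellings produce the same timedelta values.
same = delta.equals(end - start)
print(same)
print(delta.dt.days.tolist())
```

Unlike `.dt.days`, the result of `subtract` keeps the full timedelta, including hours, minutes and seconds.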
Example: here I chose to compute the difference between the dates in df and now.
metric_id device_id timestamp cpu_5min vol_max
0 device_1 2020-12-04 05:15:00 116.0 734.0
1 device_1 2020-12-04 05:30:00 213.0 325.0
2 device_1 2020-12-04 05:35:00 427.0 668.0
3 device_2 2020-12-04 05:15:00 540.0 NaN
4 device_2 2020-12-04 05:30:00 127.0 NaN
5 device_2 2020-12-04 05:35:00 654.0 NaN
and
df['tDATE'] = pd.to_datetime(df['timestamp'])
df['DIFF1'] = (df['tDATE'] - dt.datetime.now()).dt.days  # method 1: whole days only
df['DIFF2'] = df['tDATE'].subtract(dt.datetime.now())  # method 2: full timedelta
returns
metric_id device_id timestamp cpu_5min vol_max \
0 device_1 2020-12-04 05:15:00 116.0 734.0
1 device_1 2020-12-04 05:30:00 213.0 325.0
2 device_1 2020-12-04 05:35:00 427.0 668.0
3 device_2 2020-12-04 05:15:00 540.0 NaN
4 device_2 2020-12-04 05:30:00 127.0 NaN
5 device_2 2020-12-04 05:35:00 654.0 NaN
metric_id tDATE difd DIFF1 DIFF2
0 2020-12-04 05:15:00 -14 -14 -14 days +22:13:26.627607
1 2020-12-04 05:30:00 -14 -14 -14 days +22:28:26.627607
2 2020-12-04 05:35:00 -14 -14 -14 days +22:33:26.627607
3 2020-12-04 05:15:00 -14 -14 -14 days +22:13:26.627607
4 2020-12-04 05:30:00 -14 -14 -14 days +22:28:26.627607
5 2020-12-04 05:35:00 -14 -14 -14 days +22:33:26.627607
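Note that DIFF1 shows -14 for a timedelta of `-14 days +22:13:26`, which is only about 13.07 days in the past: `.dt.days` floors toward negative infinity. If a signed fractional day count is wanted, one option is `total_seconds()`. The sketch below uses a fixed reference time instead of `datetime.now()` so that it is reproducible; that reference value is an assumption, chosen to roughly match the output above.

```python
import pandas as pd

# Assumed fixed "now" (not from the original answer) so the result is reproducible.
now = pd.Timestamp("2020-12-17 07:01:34")
ts = pd.Series(pd.to_datetime(["2020-12-04 05:15:00", "2020-12-04 05:30:00"]))

delta = ts - now

# .dt.days is the floored day component, so negative values round away from zero.
whole_days = delta.dt.days

# total_seconds() / 86400 gives the signed difference in fractional days.
frac_days = delta.dt.total_seconds() / 86400

print(whole_days.tolist())
print(frac_days.round(3).tolist())
```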
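EDIT: working with timestamps
From your comment below, since you are working with timestamps in this format, the example above obviously needs some preparation first. Please note that this is why it is important to provide enough information when asking a question (for instance, what type of data you are working with). This matters even more when it comes to dates, because there are several formats. Here is an example using the date format you gave in your comment: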
import pandas as pd
Date = '2020-09-03T16:18:38.929863799Z'
Date2 = '2020-10-03T16:18:38.929863799Z'
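What you have here are timestamp strings, so your first step is to convert them with pd.to_datetime, which returns pandas Timestamp objects, and then call to_pydatetime() to get Python datetime objects (this method used to be called Timestamp.to_datetime(), which has been deprecated).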
Date = pd.to_datetime(Date)
Date2 = pd.to_datetime(Date2)
DATE_1 = Date.to_pydatetime()
DATE_2 = Date2.to_pydatetime()
After that you can compute the difference
DIFF = pd.to_datetime(DATE_1) - pd.to_datetime(DATE_2)
which gives Timedelta('-30 days +00:00:00')
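As a shorter variant (a sketch, not the approach shown above): pandas `Timestamp` objects can be subtracted directly, so the round trip through `to_pydatetime()` is not strictly required. Python `datetime` only has microsecond resolution, so `to_pydatetime()` drops the trailing nanosecond digits, whereas staying with `Timestamp` keeps them. The lowercase variable names are illustrative.

```python
import pandas as pd

# The trailing "Z" marks UTC, so both results are timezone-aware Timestamps.
date_1 = pd.to_datetime("2020-09-03T16:18:38.929863799Z")
date_2 = pd.to_datetime("2020-10-03T16:18:38.929863799Z")

diff = date_1 - date_2   # a pandas Timedelta
print(diff)              # -30 days +00:00:00
print(diff.days)         # -30
```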