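cut works well with date-time objects, so you can use it to create the 5-minute intervals you want to aggregate over. Here is an example.
First, some sample data: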
set.seed(1)
mydf <- data.frame(P_alex = sample(0:5, 40, replace = TRUE),
                   P_hvh = sample(0:3, 40, replace = TRUE),
                   date = as.POSIXct("2011-06-27 22:00:00") + 60 * 0:39)
list(head(mydf), tail(mydf))
# [[1]]
#   P_alex P_hvh                date
# 1      1     3 2011-06-27 22:00:00
# 2      2     2 2011-06-27 22:01:00
# 3      3     3 2011-06-27 22:02:00
# 4      5     2 2011-06-27 22:03:00
# 5      1     2 2011-06-27 22:04:00
# 6      5     3 2011-06-27 22:05:00
#
# [[2]]
#    P_alex P_hvh                date
# 35      4     1 2011-06-27 22:34:00
# 36      4     3 2011-06-27 22:35:00
# 37      4     3 2011-06-27 22:36:00
# 38      0     1 2011-06-27 22:37:00
# 39      4     3 2011-06-27 22:38:00
# 40      2     3 2011-06-27 22:39:00
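Before aggregating, it can help to look at what cut returns for date-times on its own. The following is a minimal stand-alone sketch: the vector `times`, the variable `bins`, and the explicit `tz = "UTC"` are assumptions made for illustration and are not part of the example data above.

```r
# Eight timestamps, one minute apart (hypothetical values for illustration).
# tz = "UTC" is an assumption so the result does not depend on the local time zone.
times <- as.POSIXct("2011-06-27 22:00:00", tz = "UTC") + 60 * 0:7

# Bin the timestamps into 5-minute intervals
bins <- cut(times, "5 min")

class(bins)   # "factor"
levels(bins)  # one level per interval, labelled by the interval's start time
table(bins)   # number of timestamps that fall into each interval
```

Because the result is an ordinary factor, it can be used as a grouping variable anywhere a factor is accepted.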
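Now, perform the aggregation. In the example below, we aggregate all columns of the original dataset, but drop the "date" variable from the data passed to aggregate (using mydf[setdiff(names(mydf), "date")]).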
# Aggregate all columns by the intervals created with cut.
# For the dataset, we drop the original date column since
# it is no longer needed here. Our function is "sum"
aggregate(. ~ cut(mydf$date, "5 min"),
          mydf[setdiff(names(mydf), "date")],
          sum)
#   cut(mydf$date, "5 min") P_alex P_hvh
# 1     2011-06-27 22:00:00     12    12
# 2     2011-06-27 22:05:00     16     8
# 3     2011-06-27 22:10:00     12     5
# 4     2011-06-27 22:15:00     17     6
# 5     2011-06-27 22:20:00     10     8
# 6     2011-06-27 22:25:00     11     8
# 7     2011-06-27 22:30:00     12     7
# 8     2011-06-27 22:35:00     14    13
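If the column header cut(mydf$date, "5 min") is awkward to work with, one possible variation is to store the intervals in a named column first and aggregate by that name. This is only a sketch of an alternative, not a replacement for the approach above; the column name `interval` and the result name `res` are hypothetical. The sample data is rebuilt so the block runs on its own.

```r
# Rebuild the sample data so this sketch is self-contained
set.seed(1)
mydf <- data.frame(P_alex = sample(0:5, 40, replace = TRUE),
                   P_hvh = sample(0:3, 40, replace = TRUE),
                   date = as.POSIXct("2011-06-27 22:00:00") + 60 * 0:39)

# Store the 5-minute bins in a named column ("interval" is a hypothetical name)
mydf$interval <- cut(mydf$date, "5 min")

# Sum both value columns within each interval
res <- aggregate(cbind(P_alex, P_hvh) ~ interval, data = mydf, FUN = sum)
res
```

No output is shown here because the values drawn by sample after set.seed(1) depend on the R version (the default sampling method changed in R 3.6.0), so the sums may differ from those listed above. The structure is the same either way: one row per 5-minute interval, with columns interval, P_alex and P_hvh.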