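Here is another approach. I tried to keep all of the data in the final output. Note that I modified your data a little for demonstration purposes. In my code, I first arrange the data by `ID` and `Time`. Then I recode `Status` (Y becomes 1 and N becomes 0) in order to create `group`. Here, `group` tells us when `Status` changed: if the same number appears on several consecutive rows, `Status` did not change across them. Next, I calculate the time difference from the previous row (i.e., `gap`) within each ID. Finally, within each group, I set `gap` to NA for every row that is not the group's first row. That is, I turned the unneeded gaps into NA. Note that the first observation for each ID has NA in `gap` as well. `gap` is in seconds.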
ann <- data.frame(ID = c(1, 2, 3, 4, 1, 2, 2, 1, 1, 1, 3),
                  Status = c("Y", "Y", "Y", "Y",
                             "N", "N", "Y", "Y", "Y", "N", "N"),
                  Time = c("2013-07-01 08:07:00", "2013-07-01 08:07:03",
                           "2013-07-01 08:07:04", "2013-07-01 08:07:06",
                           "2013-07-01 08:07:07", "2013-07-01 08:07:23",
                           "2013-07-01 08:07:34", "2013-07-01 08:07:45",
                           "2013-07-01 08:07:47", "2013-07-01 08:07:56",
                           "2013-07-01 08:07:58"),
                  stringsAsFactors = FALSE)
ann$Time <- as.POSIXct(ann$Time)
#    ID Status                Time
#1    1      Y 2013-07-01 08:07:00
#2    2      Y 2013-07-01 08:07:03
#3    3      Y 2013-07-01 08:07:04
#4    4      Y 2013-07-01 08:07:06
#5    1      N 2013-07-01 08:07:07
#6    2      N 2013-07-01 08:07:23
#7    2      Y 2013-07-01 08:07:34
#8    1      Y 2013-07-01 08:07:45
#9    1      Y 2013-07-01 08:07:47
#10   1      N 2013-07-01 08:07:56
#11   3      N 2013-07-01 08:07:58
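library(dplyr)

ann %>%
    arrange(ID, Time) %>%
    group_by(ID) %>%
    mutate(Status = ifelse(Status == "Y", 1, 0),        # Y -> 1, N -> 0
           group = cumsum(c(TRUE, diff(Status) != 0)),  # new number whenever Status changes
           # seconds since the previous row of the same ID (units fixed explicitly)
           gap = as.numeric(difftime(Time, lag(Time), units = "secs"))) %>%
    group_by(ID, group) %>%
    mutate(gap = ifelse(row_number() != 1, NA, gap))    # keep gap only on the first row of each run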
#    ID Status                Time group gap
#1    1      1 2013-07-01 08:07:00     1  NA
#2    1      0 2013-07-01 08:07:07     2   7
#3    1      1 2013-07-01 08:07:45     3  38
#4    1      1 2013-07-01 08:07:47     3  NA
#5    1      0 2013-07-01 08:07:56     4   9
#6    2      1 2013-07-01 08:07:03     1  NA
#7    2      0 2013-07-01 08:07:23     2  20
#8    2      1 2013-07-01 08:07:34     3  11
#9    3      1 2013-07-01 08:07:04     1  NA
#10   3      0 2013-07-01 08:07:58     2  54
#11   4      1 2013-07-01 08:07:06     1  NA
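
In case the `group` step is not obvious, here is a minimal base R sketch of the same run-numbering idiom on its own. The vector `status` and the name `run_id` are made up for illustration; the values mirror the recoded `Status` of ID 1 above.

```r
# Hypothetical 0/1 status vector, for illustration only
status <- c(1, 0, 1, 1, 0)

# diff() is non-zero wherever a value differs from the one before it.
# Prepending TRUE opens the first run, and cumsum() then numbers the runs.
run_id <- cumsum(c(TRUE, diff(status) != 0))
run_id
# [1] 1 2 3 3 4
```

The third and fourth elements share the number 3 because the status did not change between them, which is why only the first row of each run keeps its `gap`.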