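I'm trying to filter data based on two conditions that depend on consecutive dates.
- I'm looking for values of 2 or less on 5 or more consecutive dates,
- with a "cushion period" of values from 2 to 5 for at most 3 consecutive days.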
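It would look something like this (sorry for the poor Excel attempt):
Days 1 through 10 would be included, and day 11 would not. Days 6 through 8 would count as the "cushion period". I hope that makes sense!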
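Right now I can get only the cushion period (in the reprex), but I don't know how to add the starting and ending conditions so that runs of 5 consecutive dates with values of 2 or less are included (those 5 days could be broken up by a cushion period in between, but I feel that might complicate things).
Any help would be greatly appreciated!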
For my reprex (below), the dates that should end up in the final df are shown in blue (1/1/2000 to 1/9/2000, and 1/22/2000 to 1/30/2000), while the grey dates should not.
Reprex:
library("dplyr")
#Goal: include all values with values of 2 or less for 5 consecutive days and allow for a "cushion" period of values of 2 to 5 for up to 3 days
data <- data.frame(
  Date = c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05",
           "2000-01-06", "2000-01-07", "2000-01-08", "2000-01-09", "2000-01-10",
           "2000-01-11", "2000-01-12", "2000-01-13", "2000-01-14", "2000-01-15",
           "2000-01-16", "2000-01-17", "2000-01-18", "2000-01-19", "2000-01-20",
           "2000-01-21", "2000-01-22", "2000-01-23", "2000-01-24", "2000-01-25",
           "2000-01-26", "2000-01-27", "2000-01-28", "2000-01-29", "2000-01-30"),
  Value = c(2,3,4,5,2,2,1,0,1,8,7,9,4,5,2,3,4,5,7,2,6,0,2,1,2,0,3,4,0,1))
head(data)
#Goal: values should include dates from 1/1/2000 to 1/9/2000, and 1/22/2000 to 1/30/2000
#I am able to subset the "cushion period" but I'm not sure how to add the starting and ending conditions for it
attempt1 <- data %>%
  group_by(group_id = as.integer(gl(n(), 3, n()))) %>%
  filter(Value <= 5 & Value >= 3) %>%
  ungroup() %>%
  select(-group_id)
head(attempt1)
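
For what it's worth, here is a rough sketch of one way the rule could be read, using run lengths from `rle()`. It is only an interpretation, not a confirmed solution. The assumptions: a "low" day has `Value <= 2`, a "cushion" day has `Value > 2 & Value <= 5`, a cushion run only counts if it lasts at most 3 days, and a stretch of low/cushion days is kept only if it contains at least one unbroken run of 5 low days. The helper `run_len()` and the columns `low`, `cushion`, `ok`, `block` and `low_run` are names made up for this sketch, and `Date` is rebuilt as a real `Date` sequence rather than strings.

```r
library(dplyr)

# Same values as the reprex; Date rebuilt as a Date sequence (assumption)
data <- data.frame(
  Date  = as.Date("2000-01-01") + 0:29,
  Value = c(2,3,4,5,2,2,1,0,1,8,7,9,4,5,2,3,4,5,7,2,6,0,2,1,2,0,3,4,0,1)
)

# Hypothetical helper: for each element, the length of the run it belongs to
run_len <- function(x) {
  r <- rle(x)
  rep(r$lengths, r$lengths)
}

kept <- data %>%
  mutate(
    low     = Value <= 2,
    cushion = Value > 2 & Value <= 5,
    # a day is allowed if it is low, or part of a cushion run of 3 days or fewer
    ok      = low | (cushion & run_len(cushion) <= 3),
    # label each consecutive stretch of allowed / not-allowed days
    block   = cumsum(ok != lag(ok, default = FALSE)),
    # length of the low run each low day sits in (0 for non-low days)
    low_run = if_else(low, run_len(low), 0L)
  ) %>%
  group_by(block) %>%
  # keep allowed stretches that contain at least 5 consecutive low days
  filter(all(ok), max(low_run) >= 5) %>%
  ungroup() %>%
  select(Date, Value)

kept
```

On the reprex data this keeps 2000-01-01 to 2000-01-09 and 2000-01-22 to 2000-01-30. It does not handle the case where the 5 low days are themselves split by a cushion period, and it would also keep a cushion run sitting at the very start or end of a stretch, so it may need adjusting depending on what is actually wanted.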