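(I originally posted a question here https://stackoverflow.com/questions/66478148/create-a-list-of-vectors-from-a-vector-where-n-consecutive-values-are-not-0-in-r/66480056#66480056, but it did not fully cover my problem.)
I have a data frame with a "date" column and a precipitation (rainfall) column: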
date precip
1 1 0.0
2 2 0.0
3 3 12.4
4 4 10.2
5 5 0.0
6 6 13.6
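I want to create an "event" column containing a counter (ID) for each period of consecutive rainfall. A rain event can be defined as a consecutive run of days with precipitation greater than some threshold, e.g. 0.
If we do not allow any short gaps of zero rainfall, "event" would look like this, with a counter for the non-0 periods and NA for the periods without rain: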
date precip event
1 1 0.0 NA
2 2 0.0 NA
3 3 12.4 1
4 4 10.2 1
5 5 0.0 NA
6 6 13.6 2
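For reference, the strict case (no gaps allowed) can be sketched in base R by flagging the first day of each wet run and taking a cumulative sum. This is only a minimal sketch: the names `wet` and `starts` are illustrative, and it assumes `precip` contains no missing values.

```r
# Small example from above; `df` is just an illustrative name
df <- data.frame(date = 1:6, precip = c(0, 0, 12.4, 10.2, 0, 13.6))

wet <- df$precip > 0                       # TRUE on days with rain
starts <- wet & !c(FALSE, head(wet, -1))   # TRUE on the first day of each wet run

# Running count of wet-run starts gives the event ID; dry days become NA
df$event <- ifelse(wet, cumsum(starts), NA_integer_)
df
```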
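In addition, I would like to be able to allow short periods without rain, e.g. of length n = 1 day, inside each run of non-0 values.
For example, in the data frame above, if we allow 1 day with 0 rainfall inside a consecutive rain period (day 5 here), then days 3 to 6 can be defined as a single rain event: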
date precip event
1 1 0.0 NA
2 2 0.0 NA
3 3 12.4 1
4 4 10.2 1
5 5 0.0 1 # <- gap of 1 day with no rain: OK
6 6 13.6 1
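One way the gap rule might be expressed is with `rle()` and `inverse.rle()`: encode the wet/dry runs, relabel short dry runs that have rain on both sides as wet, then number the resulting runs. This is a sketch under assumptions, not a definitive solution: `n`, `wet` and `interior` are illustrative names, and `precip` is assumed to have no missing values.

```r
precip <- c(0, 0, 12.4, 10.2, 0, 13.6)
n <- 1   # longest dry spell (in days) tolerated inside an event

wet <- precip > 0
r <- rle(wet)                 # runs of TRUE (wet) and FALSE (dry)
k <- length(r$values)

# Only dry runs that are neither the first nor the last run are flanked
# by rain, so leading and trailing dry spells are never absorbed.
interior <- seq_len(k) > 1 & seq_len(k) < k
r$values[!r$values & r$lengths <= n & interior] <- TRUE
in_event <- inverse.rle(r)    # back to one value per day

# Number the runs of TRUE; days outside any event stay NA
starts <- in_event & !c(FALSE, head(in_event, -1))
event <- ifelse(in_event, cumsum(starts), NA_integer_)
event
```

Setting `n <- 0` in this sketch leaves every dry day as a break, which reproduces the strict definition.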
A slightly larger toy dataset:
structure(list(date = 1:31, precip = c(0, 0, 12.3999996185303,
10.1999998092651, 0, 13.6000003814697, 16.6000003814697, 21.5,
7.59999990463257, 0, 0, 0, 0.699999988079071, 0, 0, 0, 5.40000009536743,
0, 1, 35.4000015258789, 11.5, 16.7000007629395, 13.5, 13.1000003814697,
11.8000001907349, 1.70000004768372, 0, 15.1000003814697, 12.8999996185303,
3.70000004768372, 24.2999992370605)), row.names = c(NA, -31L), class = "data.frame")
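Now I am really stuck. I tried some odd things like the code below (just a start), but I don't think I will figure it out on my own, and I would be very grateful for any help.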
# this is far from being any helpful, but just to show the direction I was heading...
# the threshold could be 0 to mirror the example above...
rainfall_event = function(df,
                          daily_thresh = .2,
                          n = 1) {
  for (i in 1:nrow(df)) {
    zero_index = 1
    if (df[i,]$precip < daily_thresh) {
      # every time you encounter a value below the threshold count the 0s
      zero_counter = 0
      while (df[i,]$precip < daily_thresh) {
        zero_counter = zero_counter + 1
        if (i != nrow(df)) {
          i = i + 1
          zero_index = zero_index + 1
        } else{
          break
        }
      }
      if (zero_counter > n) {
        df[zero_index:zero_index + zero_counter,][["event"]] = NA
      }
    } else{
      event_counter = 1
      while (df[i, ]$precip > daily_thresh) {
        df[["event"]] = event_counter
        if (i != nrow(rainfall_one_slide)) {
          i = i + 1
        } else{
          break
        }
      }
    }
  }
}
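For comparison, here is a hedged sketch of how the loop idea above could be reduced to a single pass that only remembers the last wet day. It keeps the `rainfall_event(df, daily_thresh, n)` signature from the attempt. The helper variables `current_id` and `last_wet` are illustrative, a day counts as wet only when `precip > daily_thresh`, and `precip` is assumed to have no missing values.

```r
rainfall_event <- function(df, daily_thresh = 0.2, n = 1) {
  event <- rep(NA_integer_, nrow(df))
  current_id <- 0L          # ID of the most recent event
  last_wet <- NA_integer_   # row index of the most recent wet day

  for (i in seq_len(nrow(df))) {
    if (df$precip[i] > daily_thresh) {
      gap <- i - last_wet - 1L   # dry days since the previous wet day
      if (is.na(last_wet) || gap > n) {
        current_id <- current_id + 1L   # first rain, or gap too long: new event
      } else if (gap > 0) {
        event[(last_wet + 1L):(i - 1L)] <- current_id   # bridge the short gap
      }
      event[i] <- current_id
      last_wet <- i
    }
  }

  df$event <- event
  df
}

# Usage on the small example, with threshold 0 and a 1-day gap allowed
small <- data.frame(date = 1:6, precip = c(0, 0, 12.4, 10.2, 0, 13.6))
rainfall_event(small, daily_thresh = 0, n = 1)
```

Because gaps are only filled once the next wet day is reached, dry days at the start or end of the series are left as `NA` in this sketch.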