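Possible duplicate:
Replacing NAs with latest non-NA value https://stackoverflow.com/questions/7735647/replacing-nas-with-latest-non-na-value
How can I fill in the missing information in each column using the previous value?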
Date.end Date.beg Pollster Serra.PSDB
2012-06-26 2012-06-25 Datafolha 31.0
2012-06-27 <NA> <NA> NA
2012-06-28 <NA> <NA> NA
2012-06-29 <NA> <NA> NA
2012-06-30 <NA> <NA> NA
2012-07-01 <NA> <NA> NA
2012-07-02 <NA> <NA> NA
2012-07-03 <NA> <NA> NA
2012-07-04 <NA> Ibope 22
2012-07-05 <NA> <NA> NA
2012-07-06 <NA> <NA> NA
2012-07-07 <NA> <NA> NA
2012-07-08 <NA> <NA> NA
2012-07-09 <NA> <NA> NA
2012-07-10 <NA> <NA> NA
2012-07-11 <NA> <NA> NA
2012-07-12 2012-07-09 Veritá 31.4
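I'm not sure this is the best approach; there may well be a package offering exactly this functionality. The method below is probably not the fastest, but it works and is fine for small and medium-sized datasets. I would be cautious about applying it to very large datasets (a million rows or more).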
fillNAByPreviousData <- function(column) {
  # First find the positions of the entries that are NA
  navals <- which(is.na(column))
  # and the positions of the entries that hold data.
  filledvals <- which(!is.na(column))
  # If no two NAs ever followed each other, navals - 1 would give the
  # entries we need. Here, however, we have to find the last filled entry
  # before each NA. A leading NA has no such entry, so it maps to NA_integer_
  # and stays NA. vapply (unlike sapply) also copes with a column without NAs.
  fillup <- vapply(navals, function(x) {
    previous <- filledvals[filledvals < x]
    if (length(previous) > 0) max(previous) else NA_integer_
  }, integer(1))
  # Finally replace the NAs with the values found.
  column[navals] <- column[fillup]
  column
}
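Since the per-element lookup above is the part that becomes slow on large inputs, here is a minimal sketch of a vectorized alternative in base R. The name fill_forward is hypothetical and not part of the original answer. The idea is to take the running maximum of the non-NA positions and index with it, and it assumes leading NAs should stay NA.

```r
# Hypothetical helper (name not from the original answer): carries the last
# non-NA value forward without calling a function once per NA.
fill_forward <- function(column) {
  # Position of each entry, set to 0 where the entry is NA.
  positions <- seq_along(column)
  positions[is.na(column)] <- 0L
  # The running maximum is the index of the most recent non-NA entry.
  last_filled <- cummax(positions)
  # Leading NAs have no earlier value; index them with NA so they stay NA.
  last_filled[last_filled == 0L] <- NA_integer_
  column[last_filled]
}

fill_forward(c(NA, 1, NA, NA, 4, NA))
# [1] NA  1  1  1  4  4
```

Because it only indexes the input, this should keep the class of the column (for example Date or factor), but I have not benchmarked it against the function above.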
Here is an example using a test dataset:
set.seed(123)
test <- 1:20
test[floor(runif(5,1, 20))] <- NA
> test
[1] 1 2 3 4 5 NA 7 NA 9 10 11 12 13 14 NA 16 NA NA 19 20
> fillNAByPreviousData(test)
[1] 1 2 3 4 5 5 7 7 9 10 11 12 13 14 14 16 16 16 19 20
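The question asks about filling every column of a data frame, so here is a rough sketch of how the function might be applied column by column. The data frame polls is a hypothetical four-row excerpt with values copied from the question's table, and the function is a copy of the one from this answer, written with vapply so the snippet runs on its own.

```r
# Copy of the answer's function so that this snippet is self-contained.
fillNAByPreviousData <- function(column) {
  navals <- which(is.na(column))
  filledvals <- which(!is.na(column))
  fillup <- vapply(navals, function(x) {
    previous <- filledvals[filledvals < x]
    if (length(previous) > 0) max(previous) else NA_integer_
  }, integer(1))
  column[navals] <- column[fillup]
  column
}

# Hypothetical excerpt of the question's table (rows are not consecutive days).
polls <- data.frame(
  Date.end   = as.Date(c("2012-06-26", "2012-06-27", "2012-07-04", "2012-07-05")),
  Date.beg   = as.Date(c("2012-06-25", NA, NA, NA)),
  Pollster   = c("Datafolha", NA, "Ibope", NA),
  Serra.PSDB = c(31.0, NA, 22, NA),
  stringsAsFactors = FALSE
)

# polls[] keeps the data frame structure while every column is replaced.
polls[] <- lapply(polls, fillNAByPreviousData)
polls
```

Note that each column is filled independently: the Ibope row has no Date.beg of its own, so it inherits 2012-06-25 from the Datafolha row. Whether that is what you want depends on the data.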