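I am trying to find the fastest way in R to identify the indices of the elements in the Ytimes array that are closest to the given Xtimes values.
So far I have been using a simple for loop, but there must be a better way to do it: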
Xtimes <- c(1,5,8,10,15,19,23,34,45,51,55,57,78,120)
Ytimes <- seq(0,120,length.out = 1000)
YmatchIndex = array(0,length(Xtimes))
for (i in 1:length(Xtimes)) {
  YmatchIndex[i] = which.min(abs(Ytimes - Xtimes[i]))
}
print(Ytimes[YmatchIndex])
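Obligatory Rcpp solution. It takes advantage of the fact that your vectors are sorted and contain no duplicates to turn an O(n^2) search into an O(n) one. It may or may not be practical for your application ;)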
C++:
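#include <Rcpp.h>
#include <cmath>
using namespace Rcpp;

// For each element of Xtimes, return the 1-based index of the closest
// element of Ytimes. Both vectors must be sorted ascending, and successive
// Xtimes values must map to strictly increasing Ytimes indices, because the
// scan through Ytimes never moves backwards.
// [[Rcpp::export]]
IntegerVector closest_pts(NumericVector Xtimes, NumericVector Ytimes) {
  int xsize = Xtimes.size();
  int ysize = Ytimes.size();
  int y_ind = 0;
  double minval = R_PosInf;
  IntegerVector output(xsize);
  for (int x_ind = 0; x_ind < xsize; x_ind++) {
    // Advance while the distance keeps shrinking; the y_ind < ysize check
    // prevents reading past the end of Ytimes.
    while (y_ind < ysize && std::abs(Ytimes[y_ind] - Xtimes[x_ind]) < minval) {
      minval = std::abs(Ytimes[y_ind] - Xtimes[x_ind]);
      y_ind++;
    }
    // y_ind is now one past the closest element (0-based), which is exactly
    // its 1-based R index.
    output[x_ind] = y_ind;
    minval = R_PosInf;
  }
  return output;
}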
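For anyone who wants to try the same linear sweep outside R, here is a minimal standalone sketch. It is not the code from this answer: `closest_sorted` is a hypothetical name, it works on `std::vector<double>` instead of Rcpp types, and it returns 0-based indices. It assumes both inputs are sorted ascending and that the Y values are strictly increasing. Unlike the version above, it compares against the next Y value rather than stepping past the match, so several X values may share the same nearest Y.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original answer).
// For each x, returns the 0-based index of the nearest y.
// Assumes: xs sorted ascending, ys non-empty and strictly increasing.
std::vector<std::size_t> closest_sorted(const std::vector<double>& xs,
                                        const std::vector<double>& ys) {
    assert(!ys.empty());
    std::vector<std::size_t> out;
    out.reserve(xs.size());
    std::size_t j = 0;  // current candidate in ys; never moves backwards
    for (double x : xs) {
        // Advance while the next y is strictly closer to x than the current one.
        while (j + 1 < ys.size() &&
               std::abs(ys[j + 1] - x) < std::abs(ys[j] - x)) {
            ++j;
        }
        out.push_back(j);  // j stays put, so the next x can reuse the same y
    }
    return out;
}
```

Each pointer only moves forward, so the total work is proportional to the combined length of the two vectors. Add 1 to the results to get R-style indices.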
R:
microbenchmark::microbenchmark(
  for_loop = {
    for (i in 1:length(Xtimes)) {
      which.min(abs(Ytimes - Xtimes[i]))
    }
  },
  apply = sapply(Xtimes, function(x){which.min(abs(Ytimes - x))}),
  fndIntvl = {
    Y2 <- c(-Inf, Ytimes + c(diff(Ytimes)/2, Inf))
    Ytimes[ findInterval(Xtimes, Y2) ]
  },
  rcpp = closest_pts(Xtimes, Ytimes),
  times = 100
)
Unit: microseconds
     expr      min      lq     mean   median       uq      max neval cld
 for_loop 3321.840 3422.51 3584.452 3492.308 3624.748 10458.52   100   b
    apply   68.365   73.04  106.909   84.406   93.097  2345.26   100  a
 fndIntvl   31.623   37.09   50.168   42.019   64.595   105.14   100  a
     rcpp    2.431    3.37    5.647    4.301    8.259    10.76   100  a
identical(closest_pts(Xtimes, Ytimes), findInterval(Xtimes, Y2))
# TRUE
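The fndIntvl entry in the benchmark works by placing break points halfway between consecutive Ytimes values and asking which bin each Xtimes value falls into. The sketch below shows that same idea in standalone C++, as an illustration only: `closest_by_midpoints` is a hypothetical name, `std::upper_bound` stands in for `findInterval`, and the result is 0-based. It assumes the Y values are non-empty and strictly increasing; the X values do not need to be sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: nearest-neighbour lookup via midpoint break points.
// Assumes: ys non-empty and strictly increasing; xs may be in any order.
std::vector<std::size_t> closest_by_midpoints(const std::vector<double>& xs,
                                              const std::vector<double>& ys) {
    assert(!ys.empty());
    // mids[k] lies halfway between ys[k] and ys[k + 1].
    std::vector<double> mids(ys.size() - 1);
    for (std::size_t k = 0; k + 1 < ys.size(); ++k) {
        mids[k] = ys[k] + (ys[k + 1] - ys[k]) / 2.0;
    }
    std::vector<std::size_t> out;
    out.reserve(xs.size());
    for (double x : xs) {
        // The number of midpoints <= x is the 0-based index of the nearest y.
        auto it = std::upper_bound(mids.begin(), mids.end(), x);
        out.push_back(static_cast<std::size_t>(it - mids.begin()));
    }
    return out;
}
```

Each lookup is a binary search, so this costs O(log n) per X value rather than a full scan. An X value that sits exactly on a midpoint is assigned to the upper of its two neighbours.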