如果我有一个向量v
姓名:
John Murray Lisa Mike Joe Ann
0.0832090 0.0475580 -0.2797860 0.1086225 0.0104590 -0.0028250
时间复杂度是多少v['Joe']
versus v[4]
?我猜前者会花费 O(log n) 因为它应该涉及二分搜索,但我仍然不确定后者是否是 O(1) 。
另外,结果是否可以推广到以下情况:v
是列表/数据框而不是原子向量?
好像是大约O(n),即在名称查找的情况下进行矢量扫描。您对使用索引查找的 O(1) 的猜想似乎是合理的......
# Unique names for longish vector
nms <- apply( expand.grid( letters , letters , letters , letters ) , 1 , paste , collapse = "" )
length(nms)
#[1] 456976
length(unique(nms))
#[1] 456976
# Start of names
head(nms)
#[1] "aaaa" "baaa" "caaa" "daaa" "eaaa" "faaa"
# End of names
tail(nms)
#[1] "uzzz" "vzzz" "wzzz" "xzzz" "yzzz" "zzzz"
# Large named vector
x <- setNames( runif( 456976 ) , nms )
# Small named vector
y <- setNames( runif(26) , letters )
# Timing information
require( microbenchmark )
bm <- microbenchmark( x['daaa'] , x[4] , x['vzzz'] , x[456972] , y['d'] , y[4] )
print( bm , order = 'median' , unit = 'relative' , digits = 3 )
#Unit: relative
# expr min lq median uq max neval
# x[456972] NaN 1.00e+00 1.00 1.00 1.000 100
# x[4] Inf 1.00e+00 1.33 1.07 0.957 100
# y[4] NaN 5.01e-01 1.33 1.14 0.191 100
# y["d"] Inf 1.00e+00 2.00 1.25 0.265 100
# x["vzzz"] Inf 6.57e+04 44412.24 9969.64 3439.154 100
# x["daaa"] Inf 6.59e+04 44582.73 10049.63 1207.337 100
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)