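You can't expect numba to outperform numpy on such a simple vectorized operation. Also, your comparison isn't entirely fair, since the numba function includes the cost of the external function call. If you sum a somewhat larger array, you'll see that the performance of the two converges, and that what you were measuring is just overhead on a very fast operation: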
import numpy as np
import numba as nb

@nb.njit
def asd(x, y):
    return x + y

def asd2(x, y):
    return x + y

u = np.random.random(10000)
w = np.random.random(10000)

%timeit asd(u, w)
%timeit asd2(u, w)
The slowest run took 17796.43 times longer than the fastest. This could mean
that an intermediate result is being cached.
100000 loops, best of 3: 6.06 µs per loop
The slowest run took 29.94 times longer than the fastest. This could mean that
an intermediate result is being cached.
100000 loops, best of 3: 5.11 µs per loop
As for parallelism, for this simple operation you can use nb.vectorize:
@nb.vectorize([nb.float64(nb.float64, nb.float64)], target='parallel')
def asd3(x, y):
    return x + y

u = np.random.random((100000, 10))
w = np.random.random((100000, 10))

%timeit asd(u, w)
%timeit asd2(u, w)
%timeit asd3(u, w)
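But again, if you operate on small arrays, you will see the overhead of thread dispatch. For the array sizes above, I found that the parallel version gave me a 2x speedup.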
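Where numba really shines is in operations that are hard to express in numpy using broadcasting, or where the operation would cause many temporary intermediate arrays to be allocated.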
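As a rough illustration of the temporary-array case, here is a minimal sketch comparing a numpy expression with an explicit loop compiled by numba. The function names `sq_dist_numpy` and `sq_dist_numba` are made up for this example, and the fallback decorator is only there so the snippet still runs where numba is not installed. No timings are claimed here; measure on your own data.

```python
import numpy as np

try:
    import numba as nb
    njit = nb.njit
except ImportError:
    # Assumption for this sketch: fall back to plain Python if numba is missing.
    def njit(func):
        return func

def sq_dist_numpy(x, y):
    # numpy allocates one temporary array for (x - y) and another for the square.
    return np.sum((x - y) ** 2)

@njit
def sq_dist_numba(x, y):
    # A single pass over the data with no intermediate arrays.
    total = 0.0
    for i in range(x.shape[0]):
        d = x[i] - y[i]
        total += d * d
    return total

x = np.random.random(10000)
y = np.random.random(10000)

# Both versions compute the same value up to floating-point rounding.
print(np.isclose(sq_dist_numpy(x, y), sq_dist_numba(x, y)))
```

The first call to `sq_dist_numba` includes compilation time, so call it once before timing it with `%timeit`.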