以下用于逻辑比较的 numba 编译函数性能下降的原因可能是什么:
from numba import njit
t = (True, 'and_', False)
#@njit(boolean(boolean, unicode_type, boolean))
@njit
def f(a,b,c):
if b == 'and_':
out = a&c
elif b == 'or_':
out = a|c
return out
x = f(*t)
%timeit f(*t)
#1.78 µs ± 9.52 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit f.py_func(*t)
#108 ns ± 0.0042 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
要按照答案中的建议大规模测试这一点:
x = np.random.choice([True,False], 1000000)
y = np.random.choice(["and_","or_"], 1000000)
z = np.random.choice([False, True], 1000000)
#using jit compiled f
def f2(x,y,z):
L = x.shape[0]
out = np.empty(L)
for i in range(L):
out[i] = f(x[i],y[i],z[i])
return out
%timeit f2(x,y,z)
#2.79 s ± 86.4 ms per loop
#using pure Python f
def f3(x,y,z):
L = x.shape[0]
out = np.empty(L)
for i in range(L):
out[i] = f.py_func(x[i],y[i],z[i])
return out
%timeit f3(x,y,z)
#572 ms ± 24.3 ms per
我是否遗漏了一些东西,是否有办法编译“快速”版本,因为这将成为执行约 1e6 次循环的一部分。