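One way to vectorize this (taking "vectorized" to mean "avoids a Python-level loop") is to treat the recurrence as a linear signal filter and use scipy.signal.lfilter (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.lfilter.html):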
import numpy as np
import pandas as pd
import scipy.signal
def via_lfilter(arr):
    period = 14
    y0 = 5.0  # initial value
    # Recurrence being implemented:
    # calc_data[idx] = (calc_data[idx - 1] * (period - 1) + df.base.iloc[idx]) / period
    b = [1.0 / period]  # coefficients of the 'original' (input) terms
    a = [1.0, -(period - 1) / period]  # coefficients of the 'computed' (output) terms
    # Initial filter state corresponding to a previous output of y0
    zi = scipy.signal.lfiltic(b, a, [y0], x=arr[1::-1])
    y = np.zeros_like(arr)
    y[0] = y0
    # When zi is passed, lfilter returns a (filtered, final_state) tuple
    result = scipy.signal.lfilter(b, a, arr[1:], axis=0, zi=zi)
    y[1:] = result[0]
    return y
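To see where `b`, `a` and `zi` come from, here is a short derivation. It assumes the difference-equation convention documented for scipy.signal.lfilter, and the symbols $p$ (for `period`), $x$ (for `arr`) and $z$ (the filter state) are introduced here only for illustration:

```latex
% Recurrence from the loop, with p = period, x = input, y = output:
y[n] = \frac{(p-1)\,y[n-1] + x[n]}{p}
\quad\Longrightarrow\quad
y[n] - \frac{p-1}{p}\,y[n-1] = \frac{1}{p}\,x[n]

% lfilter solves  a_0 y[n] + a_1 y[n-1] = b_0 x[n],  so:
b = \left[\tfrac{1}{p}\right], \qquad
a = \left[1,\; -\tfrac{p-1}{p}\right]

% First-order state: the first filtered sample is b_0 x[1] + z, with
z = -a_1\,y_0 = \frac{p-1}{p}\,y_0

% Check with p = 14, y_0 = 5, x[1] = 16:
\frac{16}{14} + \frac{13}{14}\cdot 5 = \frac{81}{14} \approx 5.785714
```

The last line agrees with row 1 of the output table further down.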
But in the real world, I'd just use numba, which was designed precisely to give us the performance benefits of vectorization without the hassle:
import numba
@numba.jit(nopython=True)
def via_numba(arr):
    calc_data = np.zeros_like(arr)
    period = 14
    calc_data[0] = 5.0  # initial value
    for idx in range(1, len(arr)):
        calc_data[idx] = (calc_data[idx - 1] * (period - 1) + arr[idx]) / period
    return calc_data
These give me:
In [238]: df["vect"] = via_lfilter(df.base.values.astype(float))
...: df["via_numba"] = via_numba(df.base.values.astype(float))
...:
...:
In [239]: df
Out[239]:
base calculated vect via_numba
0 15 5.000000 5.000000 5.000000
1 16 5.785714 5.785714 5.785714
2 2 5.515306 5.515306 5.515306
3 16 6.264213 6.264213 6.264213
4 14 6.816769 6.816769 6.816769
5 1 6.401286 6.401286 6.401286
6 18 7.229765 7.229765 7.229765
7 18 7.999068 7.999068 7.999068
8 4 7.713420 7.713420 7.713420
9 7 7.662461 7.662461 7.662461
10 4 7.400857 7.400857 7.400857
11 18 8.157939 8.157939 8.157939
12 19 8.932372 8.932372 8.932372
13 13 9.222916 9.222916 9.222916
14 16 9.706994 9.706994 9.706994
15 11 9.799351 9.799351 9.799351
16 1 9.170826 9.170826 9.170826
17 8 9.087196 9.087196 9.087196
18 1 8.509539 8.509539 8.509539
19 9 8.544572 8.544572 8.544572
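As a sanity check on the table above, the same recurrence can be run as a plain Python loop with no dependencies. This is only a minimal sketch: the helper name `reference_loop` is hypothetical and not part of the original answer, and the input is just the first four `base` values from the table.

```python
def reference_loop(values, period=14, y0=5.0):
    """Plain-Python version of the recurrence, for checking only.

    Hypothetical helper: out[0] is fixed to y0, and every later element is
    (out[idx - 1] * (period - 1) + values[idx]) / period.
    """
    out = [y0]
    for idx in range(1, len(values)):
        out.append((out[idx - 1] * (period - 1) + values[idx]) / period)
    return out


# First four 'base' values from the table above
result = reference_loop([15, 16, 2, 16])
print([round(v, 6) for v in result])
```

The rounded values should line up with rows 0 to 3 of the `calculated`, `vect` and `via_numba` columns.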
And both perform quite reasonably on a larger frame:
In [240]: df = pd.DataFrame({"base": np.random.uniform(1, 100, 10**6)})
In [241]: %timeit via_lfilter(df.base.values.astype(float))
11.4 ms ± 49.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [242]: %timeit via_numba(df.base.values.astype(float))
11 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)