I have been comparing the performance of a C++ map that mimics a Python dictionary against an ordinary Python dict in Cython. I wrote a (simplified) variant of the "fast_dict" implementation in sklearn (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/fast_dict.pyx). Consistent with the note in the fast_dict docstring, creating the map is much slower in the cpp case, but so are lookups (which I expected to be faster).
My implementation basically looks like this (largely taken from the sklearn implementation, but modified to map Int->Int):
    from libcpp.map cimport map as cpp_map
    from cython.operator cimport dereference as deref

    cdef class IntDict:

        def __init__(self, dict orig_dict={}):
            """loads from an ordinary dict directly"""
            for key, val in orig_dict.iteritems():
                self.my_map[key] = val

        def __setitem__(self, int key, int value):
            self.my_map[key] = value

        def __getitem__(self, int key):
            cdef cpp_map[ITYPE_t, ITYPE_t].iterator it = self.my_map.find(key)
            if it == self.my_map.end():
                raise KeyError('%d' % key)
            return deref(it).second
The .pxd file looks like this (again, mostly from sklearn):
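    cimport numpy as np
    from libcpp.map cimport map as cpp_map

    # DTYPE = np.float64 and ITYPE = np.intp are Python-level assignments;
    # they live in the .pyx file, since a .pxd may only contain declarations.
    ctypedef np.float64_t DTYPE_t
    ctypedef np.intp_t ITYPE_t

    cdef class IntDict:
        cdef cpp_map[ITYPE_t, ITYPE_t] my_map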
Here are the tests I devised:
    cpdef load_ord_dict(r=100000):
        cdef dict d = {}
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i] = 0
        return d

    cpdef load_cpp(r=100000):
        cdef IntDict d = IntDict()
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i] = 0
        return d

    cpdef lookup_ord(dict d, r=100000):
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i]

    cpdef lookup_cpp(IntDict d, r=100000):
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i]
Results (tested on a Linux machine; the results on my Mac are similar):
    In [3]: timeit(load_ord_dict())
    100 loops, best of 3: 4.3 ms per loop

    In [4]: timeit(load_cpp())
    10 loops, best of 3: 21.7 ms per loop

    In [5]: d1 = load_ord_dict()

    In [6]: timeit(lookup_ord(d1))
    100 loops, best of 3: 3.95 ms per loop

    In [7]: d2 = load_cpp()

    In [8]: timeit(lookup_cpp(d2))
    100 loops, best of 3: 11.8 ms per loop
Any idea why the cpp version is so slow? Am I doing something wrong here?
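One thing I have not ruled out is that part of the cost comes from the `d[i]` call itself rather than from the C++ container: subscripting an IntDict goes through `__getitem__`, which has to convert the key from a Python object to a C int and box the result back into a Python object. The pure-Python sketch below is only an analogy and not a measurement of my Cython code. `DictWrapper` and `lookup_all` are hypothetical names introduced for illustration; the sketch just shows how one could time a container that sits behind an extra `__getitem__` layer against the bare dict.

```python
import timeit


class DictWrapper:
    """Hypothetical stand-in for IntDict: a container hidden behind __getitem__."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # Every subscript pays for this extra method call before
        # the underlying lookup even starts.
        return self._data[key]


def lookup_all(container, n):
    """Subscript every key in range(n) and return the sum of the values."""
    total = 0
    for i in range(n):
        total += container[i]
    return total


n = 100_000
plain = {i: 0 for i in range(n)}
wrapped = DictWrapper(plain)

# Best of 3 runs, mirroring what IPython's timeit reports.
t_plain = min(timeit.repeat(lambda: lookup_all(plain, n), number=1, repeat=3))
t_wrapped = min(timeit.repeat(lambda: lookup_all(wrapped, n), number=1, repeat=3))
print(f"plain dict: {t_plain * 1e3:.2f} ms, wrapped: {t_wrapped * 1e3:.2f} ms")
```

If the wrapped version is clearly slower here even though both use the same dict underneath, that would suggest the call layer deserves a look in the Cython version as well, for example by timing the lookup from a `cdef` method instead of through `d[i]`.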
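Update: As suggested below, I added the same tests for unordered_map. Lookup performance falls between the dict and the ordered map, while loading is slower than both: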
    from libcpp.unordered_map cimport unordered_map as umap

    cdef class UnOrdIntDict:

        def __init__(self, dict orig_dict={}):
            # exchange cpp_map[..] in pxd with umap[..] in this case
            for key, val in orig_dict.iteritems():
                self.my_map[key] = val

        def __setitem__(self, int key, int value):
            self.my_map[key] = value

        def __getitem__(self, int key):
            """I'm assuming that this works like the ordinary cpp_map"""
            cdef umap[ITYPE_t, ITYPE_t].iterator it = self.my_map.find(key)
            if it == self.my_map.end():
                raise KeyError('%d' % key)
            return deref(it).second
    cpdef load_cpp_unordered(r=100000):
        cdef UnOrdIntDict d = UnOrdIntDict()
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i] = 0
        return d

    cpdef lookup_cpp_unordered(UnOrdIntDict d, r=100000):
        cdef int i, size = len(range(r))
        for i in range(size):
            d[i]
Results:
    In [3]: timeit(load_cpp_unordered())
    10 loops, best of 3: 29.8 ms per loop

    In [4]: timeit(lookup_cpp_unordered(d3))
    100 loops, best of 3: 8.23 ms per loop
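To put the numbers side by side, here is a small Python sketch that converts the timings reported above into a per-operation cost and a slowdown factor relative to the plain dict. The millisecond values are copied from my IPython output; the helper names `ns_per_op` and `slowdown` are just for illustration, and the sketch assumes each timing covers the default 100000 operations.

```python
# Timings copied from the IPython sessions above, in milliseconds,
# each covering 100000 insertions or lookups.
N_OPS = 100_000
timings_ms = {
    "dict":          {"load": 4.3,  "lookup": 3.95},
    "cpp_map":       {"load": 21.7, "lookup": 11.8},
    "unordered_map": {"load": 29.8, "lookup": 8.23},
}


def ns_per_op(ms, n_ops=N_OPS):
    """Convert a total time in milliseconds into nanoseconds per operation."""
    return ms * 1e6 / n_ops


def slowdown(kind, op):
    """Ratio of a container's time to the plain dict's time for one operation."""
    return timings_ms[kind][op] / timings_ms["dict"][op]


for kind, t in timings_ms.items():
    print(
        f"{kind:14s} "
        f"load {ns_per_op(t['load']):6.1f} ns/op ({slowdown(kind, 'load'):.2f}x)  "
        f"lookup {ns_per_op(t['lookup']):6.1f} ns/op ({slowdown(kind, 'lookup'):.2f}x)"
    )
```

Read this way, every variant spends on the order of tens to a few hundred nanoseconds per operation, so a fixed per-call overhead of a few dozen nanoseconds would be enough to account for a large share of the gap.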