python - Dictionary iterator for pool map

2024-04-14

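I am working with a collection of frozensets. For each frozenset in the dictionary "output", I am trying to find its minimal sets. I have 70k frozensets, so I am splitting this dictionary of frozensets into chunks and parallelizing the task. When I pass the dictionary as input to my function, only a key is sent, so I get an error. Can someone help me figure out what is wrong?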

output => {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}

def reduce(prob,result,output):
    print(output)
    for k in output.keys():
        pass  # Function to do something


def reducer(prob,result,output):
    print(output)
    p = Pool(4) #number of processes = number of CPUs
    func2 = partial(reduce,prob,result)
    reduced_values= p.map( func2,output,chunksize=4)
    p.close() # no more tasks
    p.join()  # wrap up current tasks
    return reduced_values

if __name__ == '__main__':
    final = reducer(prob,result,output)

{frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}
frozenset({'rfid', 'zone'}) 
Error : AttributeError: 'frozenset' object has no attribute 'keys'

Update, as requested:

from multiprocessing import Pool
from functools import partial
import itertools

output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}
prob = {'3': 0.3, '1': 0.15, '2': 0.5, '4': 0.05}
result = {'2': {frozenset({'time', 'zone'}), frozenset({'time', 'rfid'})}, '3': {frozenset({'time', 'rfid'}), frozenset({'rfid', 'zone'})}}

def reduce(prob,result,output):
    print(output)
    for k in output.keys():
        for ky,values in result.items():
            if any(k>=l for l in values):
                output[k] += sum((j for i,j in prob.items() if i == ky))
    return output


def reducer(prob,result,output):
    print(output)
    p = Pool(4) #number of processes = number of CPUs
    func2 = partial(reduce,prob,result)
    reduced_values= p.map( func2,output,chunksize=4)
    p.close() # no more tasks
    p.join()  # wrap up current tasks
    return reduced_values

if __name__ == '__main__':
    final = reducer(prob,result,output)


{frozenset({'zone', 'rfid'}): 0, frozenset({'zone'}): 0, frozenset({'time', 'zone'}): 0}
    for k in output.keys():
AttributeError: 'frozenset' object has no attribute 'keys'
frozenset({'zone', 'rfid'})

Full error details from the console:

{frozenset({'zone', 'time'}): 0, frozenset({'zone', 'rfid'}): 0, frozenset({'zone'}): 0}
frozenset({'zone', 'time'})
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "F:\Python34\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "F:\Python34\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Users\Dell\workspace\key_mining\src\variable.py", line 16, in reduce
    for k in output.keys():
AttributeError: 'frozenset' object has no attribute 'keys'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\***\variable.py", line 33, in <module>
    final = reducer(prob,result,output)
  File "C:\***\variable.py", line 27, in reducer
    reduced_values= p.map( func2,output,chunksize=4)
  File "F:\Python34\lib\multiprocessing\pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "F:\Python34\lib\multiprocessing\pool.py", line 599, in get
    raise self._value
AttributeError: 'frozenset' object has no attribute 'keys'

The problem is that you are passing a dict object to map. When map iterates over the items in output, it is effectively doing this:

for key in output:  # When you iterate over a dictionary, you just get the keys.
    func2(key)

So each time func2 is called, all that output contains is a single key (a frozenset) from the dictionary.
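The behaviour can be checked without a pool at all. The sketch below uses the builtin map, which walks its iterable the same way, and a hypothetical helper show_type (not part of the question's code) to report what each call receives:

```python
# Two of the keys from the question's dictionary.
output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0}

def show_type(item):
    """Hypothetical helper: report the type name of whatever map hands over."""
    return type(item).__name__

# Iterating a dict yields its keys, so every call receives a frozenset.
received_from_dict = list(map(show_type, output))

# Iterating output.items() yields (key, value) tuples instead.
received_from_items = list(map(show_type, output.items()))

print(received_from_dict)
print(received_from_items)
```

Calling .keys() on any of those received values raises the AttributeError shown in the traceback, because a frozenset has no such method.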

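Based on your comments above, it seems you want to pass the entire dictionary to func2, but if you do that, you are not actually doing anything in parallel. I think you may have assumed that calling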

pool.map(func2, output, chunksize=4)

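would split the output dictionary into four dictionaries, with each chunk passed to one call of func2. That is not the case. Instead, each key in the dictionary is sent to func2 individually.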

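chunksize only tells the pool how many elements of output to send to each child process at a time via inter-process communication. It is used for internal purposes only; no matter what chunksize you use, func2 will only ever be called with a single element of output.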
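A small experiment makes this visible. The sketch below swaps in multiprocessing.pool.ThreadPool for Pool purely so that it runs without an `if __name__ == '__main__':` guard; it exposes the same map(func, iterable, chunksize) method. The helper describe is hypothetical.

```python
from multiprocessing.pool import ThreadPool  # stand-in for Pool in this sketch

output = {frozenset({'rfid', 'zone'}): 0,
          frozenset({'zone'}): 0,
          frozenset({'zone', 'time'}): 0}

def describe(item):
    """Hypothetical helper: report the type name of the argument received."""
    return type(item).__name__

with ThreadPool(2) as pool:
    # chunksize changes how tasks are batched for the workers,
    # not what each individual call to describe() receives.
    with_small_chunks = pool.map(describe, output, chunksize=1)
    with_large_chunks = pool.map(describe, output, chunksize=4)

print(with_small_chunks)
print(with_large_chunks)
```

Both calls return the same list: one result per dictionary key, each produced from a single frozenset.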

If you actually want to pass chunks of the dictionary, you need to do this:

# Break the output dict into lists of up to 4 (key, value) pairs each
items = list(output.items())
chunksize = 4
chunks = [items[i:i + chunksize] for i in range(0, len(items), chunksize)]
reduced_values = p.map(func2, chunks)

This passes a list of (key, value) tuples from the output dict to func2. Then, inside func2, you can convert the list back into a dictionary:

def reduce(prob,result,output):
    output = dict(output)  # Convert the list of (key, value) pairs back to a dict
    print(output)
    ...
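Putting the pieces together, here is a minimal sketch of the chunk, map, and merge flow, reusing the sample data from the question. The names chunk_items, reduce_chunk, the map_func parameter, and the final merge step are assumptions added for illustration; prob.get(ky, 0) stands in for the question's sum over matching keys, which gives the same value because dict keys are unique.

```python
from functools import partial

def chunk_items(d, chunk_len):
    """Split a dict into lists of at most chunk_len (key, value) pairs."""
    items = list(d.items())
    return [items[i:i + chunk_len] for i in range(0, len(items), chunk_len)]

def reduce_chunk(prob, result, chunk):
    """Process one chunk; mirrors the loop in the question's reduce()."""
    scores = dict(chunk)  # list of (key, value) pairs back to a dict
    for k in scores:
        for ky, values in result.items():
            # k >= subset is the frozenset superset test
            if any(k >= subset for subset in values):
                scores[k] += prob.get(ky, 0)
    return scores

def reducer(prob, result, output, map_func=map, chunk_len=4):
    """map_func is illustrative plumbing: the builtin map runs serially,
    while a pool's map method would distribute the chunks."""
    func2 = partial(reduce_chunk, prob, result)
    merged = {}
    for partial_scores in map_func(func2, chunk_items(output, chunk_len)):
        merged.update(partial_scores)  # chunks hold disjoint keys
    return merged

output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0,
          frozenset({'zone', 'time'}): 0}
prob = {'3': 0.3, '1': 0.15, '2': 0.5, '4': 0.05}
result = {'2': {frozenset({'time', 'zone'}), frozenset({'time', 'rfid'})},
          '3': {frozenset({'time', 'rfid'}), frozenset({'rfid', 'zone'})}}

final = reducer(prob, result, output, chunk_len=2)
print(final)
```

To run the chunks in parallel, passing a pool's map method should work, for example `with Pool(4) as p: final = reducer(prob, result, output, map_func=p.map)` inside the `if __name__ == '__main__':` guard. This relies on reduce_chunk being a module-level function so that it can be pickled and sent to the worker processes.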