spaCy with joblib raises _pickle.PicklingError: Could not pickle the task to send it to the workers

2024-01-15

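I have a large list of sentences (about 7 million), and I want to extract the nouns from each of them.

I used the joblib library to parallelize the extraction, as shown below: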

import spacy
from tqdm import tqdm
from joblib import Parallel, delayed
nlp = spacy.load('en_core_web_sm')

class nouns:

    def get_nouns(self, text):
        doc = nlp(u"{}".format(text))
        return [token.text for token in doc if token.tag_ in ['NN', 'NNP', 'NNS', 'NNPS']]

    def parallelize(self, sentences):
        results = Parallel(n_jobs=1)(delayed(self.get_nouns)(sent) for sent in tqdm(sentences))
        return results

if __name__ == '__main__':
    sentences = ['we went to the school yesterday',
                 'The weather is really cold',
                 'Can we catch the dog?',
                 'How old are you John?',
                 'I like diving and swimming',
                 'Can the world become united?']
    obj = nouns()
    print(obj.parallelize(sentences))

When n_jobs in the parallelize function is greater than 1, I get this long error:

100%|██████████| 6/6 [00:00<00:00, 200.00it/s]
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\reduction.py", line 243, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\reduction.py", line 236, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "C:\Python35\lib\pickle.py", line 408, in dump
    self.save(obj)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 841, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 770, in save_list
    self._batch_appends(obj)
  File "C:\Python35\lib\pickle.py", line 797, in _batch_appends
    save(tmp[0])
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 725, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 718, in save_instancemethod
    self.save_reduce(types.MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "C:\Python35\lib\pickle.py", line 599, in save_reduce
    save(args)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 725, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 395, in save_function
    self.save_function_tuple(obj)
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 594, in save_function_tuple
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 841, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 599, in save_reduce
    save(args)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 740, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 740, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 495, in save
    rv = reduce(self.proto)
  File "stringsource", line 2, in preshed.maps.PreshMap.__reduce_cython__
TypeError: self.c_map cannot be converted to a Python object for pickling
"""

Exception in thread QueueFeederThread:
Traceback (most recent call last):
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\reduction.py", line 243, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\reduction.py", line 236, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "C:\Python35\lib\pickle.py", line 408, in dump
    self.save(obj)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 841, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 770, in save_list
    self._batch_appends(obj)
  File "C:\Python35\lib\pickle.py", line 797, in _batch_appends
    save(tmp[0])
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 725, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 718, in save_instancemethod
    self.save_reduce(types.MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "C:\Python35\lib\pickle.py", line 599, in save_reduce
    save(args)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 725, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 395, in save_function
    self.save_function_tuple(obj)
  File "C:\Python35\lib\site-packages\joblib\externals\cloudpickle\cloudpickle.py", line 594, in save_function_tuple
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 841, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Python35\lib\pickle.py", line 836, in _batch_setitems
    save(v)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 599, in save_reduce
    save(args)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 740, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python35\lib\pickle.py", line 623, in save_reduce
    save(state)
  File "C:\Python35\lib\pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python35\lib\pickle.py", line 740, in save_tuple
    save(element)
  File "C:\Python35\lib\pickle.py", line 495, in save
    rv = reduce(self.proto)
  File "stringsource", line 2, in preshed.maps.PreshMap.__reduce_cython__
TypeError: self.c_map cannot be converted to a Python object for pickling

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python35\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Python35\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\backend\queues.py", line 175, in _feed
    onerror(e, obj)
  File "C:\Python35\lib\site-packages\joblib\externals\loky\process_executor.py", line 310, in _on_queue_feeder_error
    self.thread_wakeup.wakeup()
  File "C:\Python35\lib\site-packages\joblib\externals\loky\process_executor.py", line 155, in wakeup
    self._writer.send_bytes(b"")
  File "C:\Python35\lib\multiprocessing\connection.py", line 183, in send_bytes
    self._check_closed()
  File "C:\Python35\lib\multiprocessing\connection.py", line 136, in _check_closed
    raise OSError("handle is closed")
OSError: handle is closed



The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../playground.py", line 43, in <module>
    print(obj.Paralize(sentences))
  File ".../playground.py", line 32, in Paralize
    results = Parallel(n_jobs=2)(delayed(self.get_nouns)(sent) for sent in tqdm(sentences))
  File "C:\Python35\lib\site-packages\joblib\parallel.py", line 934, in __call__
    self.retrieve()
  File "C:\Python35\lib\site-packages\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\Python35\lib\site-packages\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Python35\lib\concurrent\futures\_base.py", line 405, in result
    return self.__get_result()
  File "C:\Python35\lib\concurrent\futures\_base.py", line 357, in __get_result
    raise self._exception
_pickle.PicklingError: Could not pickle the task to send it to the workers.

What is wrong with my code?


I had the same issue. I solved it by changing the backend from loky to threading in Parallel.
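A minimal sketch of that backend change follows. To stay runnable without downloading a spaCy model, it uses a hypothetical `FakeModel` class (not part of spaCy or joblib) that holds a `threading.Lock`, which cannot be pickled, in place of the `nlp` object; `get_capitalized` is likewise an illustrative stand-in for `get_nouns`. With `backend="threading"` the workers are threads in the same process, so the model is shared rather than pickled and sent to worker processes.

```python
import threading

from joblib import Parallel, delayed


class FakeModel:
    """Hypothetical stand-in for the spaCy pipeline.

    It holds a lock, which cannot be pickled, much like the
    PreshMap object named at the bottom of the traceback.
    """

    def __init__(self):
        self._lock = threading.Lock()

    def __call__(self, text):
        # Pretend "processing": split the text into whitespace tokens.
        with self._lock:
            return text.split()


model = FakeModel()


def get_capitalized(text):
    # Illustrative stand-in for get_nouns: keep capitalized tokens.
    return [tok for tok in model(text) if tok[:1].isupper()]


def parallelize(sentences, n_jobs=2):
    # backend="threading" runs the tasks in threads of this process,
    # so neither the function nor the model has to be pickled.
    return Parallel(n_jobs=n_jobs, backend="threading")(
        delayed(get_capitalized)(sent) for sent in sentences
    )


sentences = ['How old are you John?', 'The weather is really cold']
results = parallelize(sentences)
print(results)
```

In the original code the same change would be `Parallel(n_jobs=2, backend="threading")`. Whether threads actually speed things up depends on how much of the work releases the GIL, so it is worth timing on a sample before running all 7 million sentences.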
