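A "pickle data stream" is the generic description for "what pickle.dump and pickle.load do". A data stream is, for example, a file from which data can be read sequentially. It is a pickle data stream when the stream contains data produced or consumed by pickle.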
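Pickle streams have a concept of internal references: if the same object occurs multiple times in a stream, it is stored only once and merely referenced afterwards. However, such a reference can only point to something already stored in the stream. It cannot point to an object outside of the stream, such as the original object. Conceptually, the content of a pickle data stream is a copy of its original data.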
import pickle
bar = (1, 2)
foo = {1: 1, 2: (1, 1), 'bar': bar}
with open('foo.pkl', 'wb') as out_stream:  # open a data stream...
    pickle.dump((bar, foo), out_stream)  # ...for pickle data
with open('foo.pkl', 'rb') as in_stream:
    bar2, foo2 = pickle.load(in_stream)
assert bar2 is foo2['bar'] # internal identity is preserved
assert bar is not bar2 # external identity is broken
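The fact that internal references are scoped to a single stream can be checked without touching the file system. The following is a minimal sketch using in-memory streams via pickle.dumps and pickle.loads; a list is used instead of a tuple so that every load is guaranteed to build a new object.

```python
import pickle

bar = [1, 2]
foo = {"bar": bar}

# one stream: the shared list is stored once and only referenced afterwards
bar_a, foo_a = pickle.loads(pickle.dumps((bar, foo)))
assert bar_a is foo_a["bar"]

# two separate streams: each stream holds its own copy of the list
bar_b = pickle.loads(pickle.dumps(bar))
foo_b = pickle.loads(pickle.dumps(foo))
assert bar_b == foo_b["bar"]      # the values still match...
assert bar_b is not foo_b["bar"]  # ...but the identity is not shared
```

Splitting the data across two streams is enough to lose the shared identity, because a reference in one stream cannot point into another.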
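Persistent IDs can be used to refer to something that is not in the stream, for example the original object, a global database handle, something in another stream, or similar. Conceptually, persistent IDs merely allow other code to handle pickling/unpickling. What a persistent ID is and how it is implemented depends on the problem to solve, however.
Defining and using persistent IDs is not difficult. It does require some orchestration and bookkeeping, though. A very simple example looks like this: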
import pickle
# some object to persist
# usually, one would have some store or bookkeeping in place
bar = (1, 2)
# The create/load implementation of the persistent id
# extends pickling/unpickling
class PersistentPickler(pickle.Pickler):
    def persistent_id(self, obj):
        """Return a persistent id for the `bar` object only"""
        return "it's a bar" if obj is bar else None

class PersistentUnpickler(pickle.Unpickler):
    def persistent_load(self, pers_id):
        """Return the object identified by the persistent id"""
        if pers_id == "it's a bar":
            return bar
        raise pickle.UnpicklingError("This is just an example for one persistent object!")

# we can now dump and load the persistent object
foo = {'bar': bar}
with open("foo.pkl", "wb") as out_stream:
    PersistentPickler(out_stream).dump(foo)
with open("foo.pkl", "rb") as in_stream:
    foo2 = PersistentUnpickler(in_stream).load()
assert foo2 is not foo # regular objects are not persistent
assert foo2['bar'] is bar # persistent object identity is preserved
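With more than one persistent object, the "store or bookkeeping" mentioned in the comments could look roughly like the sketch below. The names Connection, STORE and KEY_BY_IDENTITY are made up for illustration; Connection is only a stand-in for a resource that must not be copied, such as a database handle.

```python
import io
import pickle

class Connection:
    """Hypothetical stand-in for a resource that must not be copied."""
    def __init__(self, dsn):
        self.dsn = dsn

    def __reduce__(self):
        # regular pickling of a connection is refused outright
        raise pickle.PicklingError("Connection must not be copied")

# hypothetical store: persistent id -> object, plus a reverse lookup by identity
STORE = {"main-db": Connection("db://main"), "cache": Connection("db://cache")}
KEY_BY_IDENTITY = {id(obj): key for key, obj in STORE.items()}

class StorePickler(pickle.Pickler):
    def persistent_id(self, obj):
        """Return the store key for stored objects, None for everything else"""
        return KEY_BY_IDENTITY.get(id(obj))

class StoreUnpickler(pickle.Unpickler):
    def persistent_load(self, pers_id):
        """Look up the object identified by the persistent id"""
        try:
            return STORE[pers_id]
        except KeyError:
            raise pickle.UnpicklingError(f"unknown persistent id: {pers_id!r}") from None

job = {"db": STORE["main-db"], "query": "SELECT 1"}
buffer = io.BytesIO()
StorePickler(buffer).dump(job)
buffer.seek(0)
job2 = StoreUnpickler(buffer).load()
assert job2 is not job               # the dict itself is copied
assert job2["db"] is STORE["main-db"]  # the connection is only referenced
```

The reverse lookup by id() only works because the store keeps its objects alive; a real implementation would need to decide who owns the objects and when entries may be dropped.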
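As a real-world example, my old cpy2py module (https://github.com/maxfischer2781/cpy2py) uses pickle to exchange data between different interpreters. For regular value-like objects, this means serializing in one interpreter and deserializing in another. For some special stateful objects, this means exchanging only a persistent ID that uniquely identifies the object across all connected interpreters.
There is some bookkeeping involved, but you can think of the persistent ID (https://github.com/maxfischer2781/cpy2py/blob/master/cpy2py/proxy/tracker.py) in this case as a tuple of (process_id, object_id, object_type). The owning interpreter can use this ID to look up the real object, while other interpreters can create a placeholder object instead. The point in this case is that the state is not stored and copied, but merely referenced.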
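To make the tuple idea concrete, here is a rough single-process sketch. It is not how cpy2py is actually implemented; Stateful, Placeholder, REGISTRY, TrackingPickler and TrackingUnpickler are hypothetical names, and using os.getpid() and id() for the first two tuple fields is an assumption made for illustration only.

```python
import io
import os
import pickle

class Stateful:
    """Hypothetical stateful object that should be referenced, never copied."""
    def __init__(self, name):
        self.name = name

class Placeholder:
    """Hypothetical stand-in created by interpreters that do not own the object."""
    def __init__(self, pers_id):
        self.pers_id = pers_id

# hypothetical bookkeeping of the owning process: persistent id -> real object
REGISTRY = {}

class TrackingPickler(pickle.Pickler):
    def persistent_id(self, obj):
        """Replace stateful objects by a (process_id, object_id, object_type) tuple"""
        if isinstance(obj, Stateful):
            pers_id = (os.getpid(), id(obj), type(obj).__name__)
            REGISTRY[pers_id] = obj  # also keeps the object alive, so id() stays valid
            return pers_id
        return None

class TrackingUnpickler(pickle.Unpickler):
    def persistent_load(self, pers_id):
        """Return the real object in the owning process, a placeholder elsewhere"""
        process_id, _object_id, _object_type = pers_id
        if process_id != os.getpid():
            return Placeholder(pers_id)
        try:
            return REGISTRY[pers_id]
        except KeyError:
            raise pickle.UnpicklingError(f"unknown persistent id: {pers_id!r}") from None

worker = Stateful("worker")
buffer = io.BytesIO()
TrackingPickler(buffer).dump({"worker": worker, "count": 3})
buffer.seek(0)
restored = TrackingUnpickler(buffer).load()
assert restored["worker"] is worker  # the owner gets the real object back

# an id owned by some other process only yields a placeholder
foreign_id = (os.getpid() + 1, 0, "Stateful")
foreign = TrackingUnpickler(io.BytesIO()).persistent_load(foreign_id)
assert isinstance(foreign, Placeholder)
```

Non-string persistent IDs such as this tuple require a binary pickle protocol, which is the default in Python 3. In a real multi-interpreter setup the placeholder would forward attribute access and calls to the owner, which is where most of the bookkeeping goes.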