I have trained a TextVectorization layer (see below) and I want to save it to disk so that I can reload it next time. I tried pickle and joblib.dump(), but neither works.
import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

text_dataset = tf.data.Dataset.from_tensor_slices(text_clean)
vectorizer = TextVectorization(max_tokens=100000, output_mode='tf-idf', ngrams=None)
vectorizer.adapt(text_dataset.batch(1024))
The error generated is as follows:
InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array
How can I save it?
Instead of pickling the object, pickle its configuration and weights. Later, unpickle them, use the config to recreate the object, and load the saved weights into it. Official documentation here: https://keras.io/guides/serialization_and_saving/.
Code
import pickle

import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

text_dataset = tf.data.Dataset.from_tensor_slices([
    "this is some clean text",
    "some more text",
    "even some more text"])

# Fit a TextVectorization layer
vectorizer = TextVectorization(max_tokens=10, output_mode='tf-idf', ngrams=None)
vectorizer.adapt(text_dataset.batch(1024))

# Vector for word "this"
print(vectorizer("this"))

# Pickle the config and weights
pickle.dump({'config': vectorizer.get_config(),
             'weights': vectorizer.get_weights()},
            open("tv_layer.pkl", "wb"))

print("*" * 10)

# Later you can unpickle and use
# `config` to create the object and
# `weights` to load the trained weights.
from_disk = pickle.load(open("tv_layer.pkl", "rb"))
new_v = TextVectorization.from_config(from_disk['config'])
# You have to call `adapt` with some dummy data (bug in Keras)
new_v.adapt(tf.data.Dataset.from_tensor_slices(["xyz"]))
new_v.set_weights(from_disk['weights'])
# Let's see the vector for word "this"
print(new_v("this"))
Output:
tf.Tensor(
[[0. 0. 0. 0. 0.91629076 0.
0. 0. 0. 0. ]], shape=(1, 10), dtype=float32)
**********
tf.Tensor(
[[0. 0. 0. 0. 0.91629076 0.
0. 0. 0. 0. ]], shape=(1, 10), dtype=float32)