Huggingface BERT TPU fine-tuning works in Colab but not on GCP

2024-04-14

I am trying to fine-tune a Huggingface Transformers BERT model on a TPU. It works in Colab, but fails when I switch to a paid TPU on GCP. The Jupyter notebook code is as follows:

import tensorflow as tf
import transformers
from transformers import TFBertModel

# Cell 1: loading the model outside any strategy scope works.
model = transformers.TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

# Cell 2: connect to the TPU and build the strategy.
# This also works. Got a bunch of startup messages from the TPU, all good.
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
    tpu='[My TPU]',
    zone='us-central1-a',
    project='[My Project]'
)
tf.config.experimental_connect_to_cluster(cluster_resolver)
tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
tpu_strategy = tf.distribute.experimental.TPUStrategy(cluster_resolver)

# Cell 3: loading the model inside the strategy scope generates the error below (long).
# The same line works in Colab.
with tpu_strategy.scope():
    model = TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

Here is the error message:

NotFoundError                             Traceback (most recent call last)
<ipython-input-14-2cfc1a238903> in <module>
      1 with tpu_strategy.scope():
----> 2     model = TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    309             return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
    310 
--> 311         ret = model(model.dummy_inputs, training=False)  # build the network with dummy inputs
    312 
    313         assert os.path.isfile(resolved_archive_file), "Error retrieving file {}".format(resolved_archive_file)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, **kwargs)
    688 
    689     def call(self, inputs, **kwargs):
--> 690         outputs = self.bert(inputs, **kwargs)
    691         return outputs
    692 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, training)
    548 
    549         embedding_output = self.embeddings([input_ids, position_ids, token_type_ids, inputs_embeds], training=training)
--> 550         encoder_outputs = self.encoder([embedding_output, extended_attention_mask, head_mask], training=training)
    551 
    552         sequence_output = encoder_outputs[0]

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    365                 all_hidden_states = all_hidden_states + (hidden_states,)
    366 
--> 367             layer_outputs = layer_module([hidden_states, attention_mask, head_mask[i]], training=training)
    368             hidden_states = layer_outputs[0]
    369 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    341         hidden_states, attention_mask, head_mask = inputs
    342 
--> 343         attention_outputs = self.attention([hidden_states, attention_mask, head_mask], training=training)
    344         attention_output = attention_outputs[0]
    345         intermediate_output = self.intermediate(attention_output)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    290         input_tensor, attention_mask, head_mask = inputs
    291 
--> 292         self_outputs = self.self_attention([input_tensor, attention_mask, head_mask], training=training)
    293         attention_output = self.dense_output([self_outputs[0], input_tensor], training=training)
    294         outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    222 
    223         batch_size = shape_list(hidden_states)[0]
--> 224         mixed_query_layer = self.query(hidden_states)
    225         mixed_key_layer = self.key(hidden_states)
    226         mixed_value_layer = self.value(hidden_states)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/layers/core.py in call(self, inputs)
   1142         outputs = gen_math_ops.mat_mul(inputs, self.kernel)
   1143     if self.use_bias:
-> 1144       outputs = nn.bias_add(outputs, self.bias)
   1145     if self.activation is not None:
   1146       return self.activation(outputs)  # pylint: disable=not-callable

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/nn_ops.py in bias_add(value, bias, data_format, name)
   2756     else:
   2757       return gen_nn_ops.bias_add(
-> 2758           value, bias, data_format=data_format, name=name)
   2759 
   2760 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py in bias_add(value, bias, data_format, name)
    675       try:
    676         return bias_add_eager_fallback(
--> 677             value, bias, data_format=data_format, name=name, ctx=_ctx)
    678       except _core._SymbolicException:
    679         pass  # Add nodes to the TensorFlow graph.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py in bias_add_eager_fallback(value, bias, data_format, name, ctx)
    703     data_format = "NHWC"
    704   data_format = _execute.make_str(data_format, "data_format")
--> 705   _attr_T, _inputs_T = _execute.args_to_matching_eager([value, bias], ctx)
    706   (value, bias) = _inputs_T
    707   _inputs_flat = [value, bias]

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/eager/execute.py in args_to_matching_eager(l, ctx, default_dtype)
    265         dtype = ret[-1].dtype
    266   else:
--> 267     ret = [ops.convert_to_tensor(t, dtype, ctx=ctx) for t in l]
    268 
    269   # TODO(slebedev): consider removing this as it leaks a Keras concept.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/eager/execute.py in <listcomp>(.0)
    265         dtype = ret[-1].dtype
    266   else:
--> 267     ret = [ops.convert_to_tensor(t, dtype, ctx=ctx) for t in l]
    268 
    269   # TODO(slebedev): consider removing this as it leaks a Keras concept.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1312 
   1313     if ret is None:
-> 1314       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1315 
   1316     if ret is NotImplemented:

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _tensor_conversion_mirrored(var, dtype, name, as_ref)
   1174 # allowing instances of the class to be used as tensors.
   1175 def _tensor_conversion_mirrored(var, dtype=None, name=None, as_ref=False):
-> 1176   return var._dense_var_to_tensor(dtype=dtype, name=name, as_ref=as_ref)  # pylint: disable=protected-access
   1177 
   1178 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _dense_var_to_tensor(self, dtype, name, as_ref)
    908     if _enclosing_tpu_context() is None:
    909       return super(TPUVariableMixin, self)._dense_var_to_tensor(
--> 910           dtype=dtype, name=name, as_ref=as_ref)
    911     # pylint: enable=protected-access
    912     elif dtype is not None and dtype != self.dtype:

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _dense_var_to_tensor(self, dtype, name, as_ref)
   1164     assert not as_ref
   1165     return ops.convert_to_tensor(
-> 1166         self.get(), dtype=dtype, name=name, as_ref=as_ref)
   1167 
   1168   def _clone_with_new_values(self, new_values):

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in get(self, device)
    835   def get(self, device=None):
    836     if (_enclosing_tpu_context() is None) or (device is not None):
--> 837       return super(TPUVariableMixin, self).get(device=device)
    838     else:
    839       raise NotImplementedError(

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in get(self, device)
    320         device = distribute_lib.get_update_device()
    321         if device is None:
--> 322           return self._get_cross_replica()
    323     device = device_util.canonicalize(device)
    324     return self._device_map.select_for_device(self._values, device)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _get_cross_replica(self)
   1136     replica_id = self._device_map.replica_for_device(device)
   1137     if replica_id is None:
-> 1138       return array_ops.identity(self.primary)
   1139     return array_ops.identity(self._values[replica_id])
   1140 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/util/dispatch.py in wrapper(*args, **kwargs)
    178     """Call target, and fall back on dispatchers if there is a TypeError."""
    179     try:
--> 180       return target(*args, **kwargs)
    181     except (TypeError, ValueError):
    182       # Note: convert_to_eager_tensor currently raises a ValueError, not a

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/array_ops.py in identity(input, name)
    265     # variables. Variables have correct handle data when graph building.
    266     input = ops.convert_to_tensor(input)
--> 267   ret = gen_array_ops.identity(input, name=name)
    268   # Propagate handle data for happier shape inference for resource variables.
    269   if hasattr(input, "_handle_data"):

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_array_ops.py in identity(input, name)
   3824         pass  # Add nodes to the TensorFlow graph.
   3825     except _core._NotOkStatusException as e:
-> 3826       _ops.raise_from_not_ok_status(e, name)
   3827   # Add nodes to the TensorFlow graph.
   3828   _, _, _op, _outputs = _op_def_library._apply_op_helper(

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/framework/ops.py in raise_from_not_ok_status(e, name)
   6604   message = e.message + (" name: " + name if name is not None else "")
   6605   # pylint: disable=protected-access
-> 6606   six.raise_from(core._status_to_exception(e.code, message), None)
   6607   # pylint: enable=protected-access
   6608 

/usr/local/lib/python3.5/dist-packages/six.py in raise_from(value, from_value)

NotFoundError: '_MklMatMul' is neither a type of a primitive operation nor a name of a function registered in binary running on n-aa2fcfb7-w-0. One possible root cause is the client and server binaries are not built with the same version. Please make sure the operation or function is registered in the binary running in this process. [Op:Identity]

I posted this on the Huggingface GitHub (https://github.com/huggingface/transformers/issues/2572), and they suggested that the TPU server version may not match the TPU client version, but (a) I don't know how to check that, and (b) I don't know what to do about it. Suggestions are appreciated.
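
For the "how do I check" part, here is a rough sketch of one way to compare the two versions. It is not a confirmed fix. The client version is what `tf.__version__` reports in the notebook. The TPU-side version is assumed here to be the value shown in the `tensorflowVersion` field of `gcloud compute tpus describe '[My TPU]' --zone=us-central1-a`; that field name and the helper names `major_minor` and `versions_match` are assumptions for illustration, not something taken from the issue thread.

```python
import re


def major_minor(version):
    """Return (major, minor) parsed from a string such as '2.1.0' or '2.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def versions_match(client_version, tpu_version):
    """True when the client and TPU versions agree on major and minor numbers.

    Hypothetical helper: it only compares version numbers and cannot detect
    differences in how the two binaries were built.
    """
    return major_minor(client_version) == major_minor(tpu_version)
```

In the notebook this would be called as `versions_match(tf.__version__, '<value copied from the TPU description>')`. Matching version numbers alone may not be sufficient, since the error message talks about how the client and server binaries were built.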
