Running the BART code

2023-05-16

A series of problems I ran into while running BART from Hugging Face.

1. Cannot connect to Hugging Face

Solution 1:

Download the model with git or wget.

Result: failed.

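Solution 2:

Download the model from the official site, upload it to the server, and change the model-loading path in the code to the local path.
For the download steps, refer to this blog post.
I don't know why the local path simply ended up concatenated onto the existing path, but it doesn't matter. I'm not sure what I did in the end, but the path resolved and the model loaded successfully. Then another error appeared: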
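
A minimal sketch of Solution 2, under several assumptions: the checkpoint is taken to be `facebook/bart-base` (the post does not say which BART model was used), the file list is what a BART checkpoint typically ships, and `./bart-base` in the usage note is a hypothetical upload directory. `file_urls`, `missing_files` and `load_local_bart` are illustrative helpers, not part of transformers.

```python
from pathlib import Path

# Assumption: the checkpoint name; the post does not say which BART model was used.
REPO_ID = "facebook/bart-base"
# Assumption: files a BART checkpoint typically needs (config, tokenizer files, weights).
REQUIRED_FILES = ("config.json", "vocab.json", "merges.txt", "pytorch_model.bin")


def file_urls(repo_id=REPO_ID, files=REQUIRED_FILES):
    """Build direct download URLs to fetch on a machine that can reach the Hub."""
    return [f"https://huggingface.co/{repo_id}/resolve/main/{name}" for name in files]


def missing_files(model_dir, files=REQUIRED_FILES):
    """Return the required files that are not present in the local model directory."""
    root = Path(model_dir)
    return [name for name in files if not (root / name).is_file()]


def load_local_bart(model_dir):
    """Load tokenizer and model from a local directory instead of the Hub."""
    # Imported here so the helpers above work without transformers installed.
    from transformers import BartForConditionalGeneration, BartTokenizer

    absent = missing_files(model_dir)
    if absent:
        raise FileNotFoundError(f"{model_dir} is missing: {absent}")
    # Resolve to an absolute path so it cannot be mistaken for a Hub model id.
    local_path = str(Path(model_dir).resolve())
    tokenizer = BartTokenizer.from_pretrained(local_path)
    model = BartForConditionalGeneration.from_pretrained(local_path)
    return tokenizer, model
```

After uploading the files, `load_local_bart("./bart-base")` would replace the call that used the Hub model name. Passing an absolute path is one way to avoid the kind of path confusion described above.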

2. NVIDIA driver version too old

I'll sort this out in the afternoon; I'm tired.

RuntimeError: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

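It seems I can't simply install the latest PyTorch, because it doesn't match the CUDA version (I think), so I need to find out how to install a specific PyTorch version. Also: yesterday I apparently deleted PyTorch by accident from another environment that already had CUDA set up. Once I find out how to install PyTorch 1.1.0, remember to reinstall it in that environment.
BART should require PyTorch >= 1.6.0.
The install commands for matching CUDA, PyTorch and torchvision versions are listed on this page: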
https://pytorch.org/get-started/previous-versions/#conda-3
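
The number in the error message appears to be the highest CUDA version the installed driver supports, encoded as 1000 * major + 10 * minor, so 10010 would mean CUDA 10.1. A small sketch, under that assumption, for checking whether a given PyTorch build can run on this driver. `decode_cuda_version` and `build_is_supported` are hypothetical helper names; the build's CUDA version is the kind of string that `torch.version.cuda` reports.

```python
def decode_cuda_version(encoded):
    """Split an encoded CUDA version such as 10010 into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def build_is_supported(driver_encoded, build_cuda):
    """True if a PyTorch build compiled for `build_cuda` (e.g. "10.2") can run on the driver."""
    major, minor = (int(part) for part in build_cuda.split(".")[:2])
    return (major, minor) <= decode_cuda_version(driver_encoded)


print(decode_cuda_version(10010))         # (10, 1)
print(build_is_supported(10010, "10.2"))  # False: the build is newer than the driver supports
print(build_is_supported(10010, "10.1"))  # True
```

With a driver at CUDA 10.1, the page above would then be searched for a PyTorch >= 1.6.0 command built against CUDA 10.1 or older, unless the driver itself is upgraded.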

3. Not enough GPU memory

Check GPU memory usage:

nvidia-smi

Try switching to card 3.
The statement that selects which GPU to use:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '3'  # set this before CUDA is initialized

It still fails, though. My guess is that none of the cards has enough free memory left.
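
One way to check that guess is to read the free memory of every card from `nvidia-smi` and pick the emptiest one instead of hard-coding card 3. This is a sketch, not what the original code does: `parse_free_memory`, `query_free_memory` and `pick_gpu` are hypothetical helpers, and it assumes `nvidia-smi` is on the PATH.

```python
import os
import subprocess


def parse_free_memory(csv_text):
    """Parse 'index, free_MiB' lines into a list of (index, free_MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, free = (field.strip() for field in line.split(","))
        gpus.append((int(index), int(free)))
    return gpus


def query_free_memory():
    """Ask nvidia-smi for the free memory of every GPU, in MiB."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.free", "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_free_memory(output)


def pick_gpu(gpus):
    """Return the index of the GPU with the most free memory."""
    return max(gpus, key=lambda gpu: gpu[1])[0]


# Usage (run before torch initializes CUDA):
#   os.environ["CUDA_VISIBLE_DEVICES"] = str(pick_gpu(query_free_memory()))
```

Once a single card is exposed this way, PyTorch numbers it as device 0, which would be consistent with the "GPU 0" in the traceback further down even though card 3 was selected.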
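This code already contains the statements that disable gradient tracking, so there is nothing more to add on that front. Blog posts on adding those statements: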
https://www.cnblogs.com/dyc99/p/12664126.html
https://blog.csdn.net/weixin_43760844/article/details/113462431

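I'll try again tomorrow morning, once nobody else is using the cards.
A card is free now, but it still fails. I'll go read the paper first.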

load reference!
  0%|                                                                 | 0/843 [00:00<?, ?it/s]/home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/transformers/tokenization_utils_base.py:2217: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
  FutureWarning,
epoch 1, step 0, loss 5.0370:   0%|                           | 1/843 [00:02<05:19,  2.63it/s]Traceback (most recent call last):
  File "main.py", line 217, in <module>
    train_one_epoch(config, train_dataloader, model, optimizer, criterion, epoch + 1)
  File "main.py", line 85, in train_one_epoch
    loss.backward()
  File "/home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 1.08 GiB (GPU 0; 11.91 GiB total capacity; 10.66 GiB already allocated; 377.06 MiB free; 10.95 GiB reserved in total by PyTorch)
Exception raised from malloc at /opt/conda/conda-bld/pytorch_1595629403081/work/c10/cuda/CUDACachingAllocator.cpp:272 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x4d (0x7f44c686d77d in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x20626 (0x7f44c6ac5626 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x214f4 (0x7f44c6ac64f4 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: <unknown function> + 0x21b81 (0x7f44c6ac6b81 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #4: at::native::empty_cuda(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) + 0x249 (0x7f44c99d5c79 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: <unknown function> + 0xd25dc9 (0x7f44c79f8dc9 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #6: <unknown function> + 0xd3fbf7 (0x7f44c7a12bf7 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #7: <unknown function> + 0xe450dd (0x7f44f9b2c0dd in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0xe453f7 (0x7f44f9b2c3f7 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #9: at::empty(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) + 0xfa (0x7f44f9c36e7a in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: at::native::empty_like(at::Tensor const&, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) + 0x49e (0x7f44f98b509e in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0xfe3521 (0x7f44f9cca521 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #12: <unknown function> + 0x101ecc3 (0x7f44f9d05cc3 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::empty_like(at::Tensor const&, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) + 0x101 (0x7f44f9c19f91 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: at::Tensor at::native::(anonymous namespace)::host_softmax_backward<at::native::(anonymous namespace)::LogSoftMaxBackwardEpilogue, true>(at::Tensor const&, at::Tensor const&, long, bool) + 0x16c (0x7f44c9123eac in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #15: at::native::log_softmax_backward_cuda(at::Tensor const&, at::Tensor const&, long, at::Tensor const&) + 0x8d (0x7f44c90ff17d in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #16: <unknown function> + 0xd13a40 (0x7f44c79e6a40 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #17: <unknown function> + 0xe6f636 (0x7f44f9b56636 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #18: at::_log_softmax_backward_data(at::Tensor const&, at::Tensor const&, long, at::Tensor const&) + 0x119 (0x7f44f9be4aa9 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #19: <unknown function> + 0x2c217ff (0x7f44fb9087ff in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #20: <unknown function> + 0xe6f636 (0x7f44f9b56636 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #21: at::_log_softmax_backward_data(at::Tensor const&, at::Tensor const&, long, at::Tensor const&) + 0x119 (0x7f44f9be4aa9 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #22: torch::autograd::generated::LogSoftmaxBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x1d7 (0x7f44fb7844b7 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #23: <unknown function> + 0x30d1017 (0x7f44fbdb8017 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #24: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&) + 0x1400 (0x7f44fbdb3860 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #25: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&) + 0x451 (0x7f44fbdb4401 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #26: torch::autograd::Engine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) + 0x89 (0x7f44fbdac579 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #27: torch::autograd::python::PythonEngine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) + 0x4a (0x7f45000db99a in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #28: <unknown function> + 0xc9039 (0x7f4502c0e039 in /home/tianminghui/anaconda3/envs/bart/lib/python3.7/site-packages/torch/lib/../../../.././libstdc++.so.6)
frame #29: <unknown function> + 0x76ba (0x7f4525bb76ba in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #30: clone + 0x6d (0x7f45258ed41d in /lib/x86_64-linux-gnu/libc.so.6)

epoch 1, step 0, loss 5.0370:   0%|                           | 1/843 [00:03<48:19,  3.44s/it
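
The last line of the Python traceback already contains the numbers that explain the failure. A small sketch, assuming the message format shown in the log above (other PyTorch versions may word it differently), that extracts them and compares the requested allocation with what was free. `parse_oom` is a hypothetical helper.

```python
import re

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<tried>[\d.]+) (?P<tried_unit>[MG]iB).*?"
    r"(?P<total>[\d.]+) (?P<total_unit>[MG]iB) total capacity; "
    r"(?P<allocated>[\d.]+) (?P<allocated_unit>[MG]iB) already allocated; "
    r"(?P<free>[\d.]+) (?P<free_unit>[MG]iB) free"
)


def parse_oom(message):
    """Extract the memory figures (converted to MiB) from a CUDA OOM message."""
    match = OOM_PATTERN.search(message)
    if match is None:
        raise ValueError("not a recognised CUDA out-of-memory message")
    return {
        key: float(match.group(key)) * UNIT_TO_MIB[match.group(key + "_unit")]
        for key in ("tried", "total", "allocated", "free")
    }


# The message copied from the traceback above.
message = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1.08 GiB "
    "(GPU 0; 11.91 GiB total capacity; 10.66 GiB already allocated; "
    "377.06 MiB free; 10.95 GiB reserved in total by PyTorch)"
)
figures = parse_oom(message)
shortfall = figures["tried"] - figures["free"]
print(f"requested {figures['tried']:.0f} MiB, free {figures['free']:.0f} MiB, "
      f"short by {shortfall:.0f} MiB")
```

Here the backward pass asked for about 1.08 GiB while only about 377 MiB was free, and 10.66 GiB was already allocated by this same process. That suggests the shortage comes from the training job itself rather than from other users of the card, so a smaller batch size or a shorter maximum sequence length would be the first things to try.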
