An Overview of T5 [Hands-on Code]

2023-10-31

0. Preface

This article gives an overview of the T5 pre-trained model and runs it on several tasks; the complete notebook is linked in the reference at the end.

1. Header

import torch
from torch import nn
import torch.nn.functional as F
import transformers
# from transformers_utils import get_params
from transformers import pipeline
# ~/.cache/huggingface/hub
from transformers import AutoTokenizer, AutoConfig, AutoModel
# ~/.cache/huggingface/datasets
from datasets import load_dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from IPython.display import Image

# trainable parameters of the model
def get_params(model):
    model_parameters = filter(lambda p: p.requires_grad, model.parameters())  # keep only parameters that require gradients
    params = sum([np.prod(p.size()) for p in model_parameters])  # np.prod gives each tensor's element count (e.g. shape (2, 3) -> 6); sum over all tensors
    return params

# default: 100
mpl.rcParams['figure.dpi'] = 150
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

2. Summary

Image('t5.png')

[Figure: the T5 text-to-text framework, showing example prompts for translation, CoLA, STS-B, and summarization]
As the figure shows, the tasks T5 handles in a single text-to-text format include: 1. translation; 2. judging whether a sentence is acceptable (CoLA); 3. computing the similarity between two sentences (STS-B); 4. summarization.

  • CoLA: Linguistic Acceptability
    • CoLA (The Corpus of Linguistic Acceptability) is an English acceptability-judgment dataset released in 2018 by researchers at New York University. It serves as a benchmark for evaluating whether NLP models can judge the grammatical acceptability of English sentences.

    • The dataset contains 10,657 English sentences drawn from published linguistics literature. Each sentence is labeled as acceptable or unacceptable: acceptable sentences are grammatically well formed, while unacceptable ones contain problems such as syntactic errors or semantic anomalies.

    • CoLA is a typical binary classification task that probes a model's grasp of English syntax and, to some extent, semantics; many of the unacceptable sentences require quite subtle grammatical judgments.

    • CoLA is widely used in research on language understanding and syntactic/semantic analysis, and (as part of the GLUE benchmark) is a common yardstick for evaluating and improving NLP models.

  • STSB: Semantic Textual Similarity Benchmark
    • STS-B (The Semantic Textual Similarity Benchmark) is a dataset and benchmark for evaluating how well NLP models capture the semantic similarity of sentence pairs. It was compiled from the SemEval Semantic Textual Similarity shared tasks (2012–2017) and is also part of GLUE.

    • The dataset contains roughly 8,600 sentence pairs drawn from sources such as news headlines, image captions, and forum posts. Each pair is annotated with a similarity score from 0 to 5 (0 means the two sentences are unrelated, 5 means they are semantically equivalent).

    • The model must predict a similarity score for each pair, so STS-B is a regression problem: the output is a floating-point score rather than a class label.

    • STS-B is one of the standard benchmarks for sentence- and text-similarity tasks and has been used to evaluate a wide range of NLP models; it is directly relevant to applications such as question answering and text matching. The sketch below shows how both CoLA and STS-B are cast in T5's text-to-text format.
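
In the text-to-text framing, both tasks become plain string-to-string problems: the task name is a prefix on the input and the label is spelled out as the target string. The formats below are a small illustrative sketch based on the conventions in the T5 paper (they are not taken from the original notebook).

# Hypothetical illustration of T5-style text-to-text encodings for CoLA and STS-B.
cola_input  = "cola sentence: The book was written by John."
cola_target = "acceptable"           # binary label written out as text

stsb_input  = ("stsb sentence1: A man is playing a guitar. "
               "sentence2: A person plays an instrument.")
stsb_target = "3.8"                  # similarity score in [0, 5], rendered as a string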

3. T5 model

  • vocabulary size: 32128

model     params   attention dim (d_kv * heads) -> d_model   encoder/decoder layers
t5-small  61M      512   (64*8)    -> 512                    6
t5-base   223M     768   (64*12)   -> 768                    12
t5-large  738M     1024  (64*16)   -> 1024                   24
t5-3b     2.85B    4096  (128*32)  -> 1024                   24
t5-11b    11B      16384 (128*128) -> 1024                   24
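
The table can be cross-checked against the Hugging Face configs. The short sketch below is an added illustration (not part of the original notebook) that reads the relevant fields for each checkpoint.

# Added check: the "attention dim -> d_model" column equals d_kv * num_heads -> d_model.
from transformers import AutoConfig

for name in ['t5-small', 't5-base', 't5-large', 't5-3b', 't5-11b']:
    cfg = AutoConfig.from_pretrained(name)
    inner = cfg.d_kv * cfg.num_heads
    print(f'{name}: {inner} ({cfg.d_kv}*{cfg.num_heads}) -> {cfg.d_model}, '
          f'{cfg.num_layers} encoder / {cfg.num_decoder_layers} decoder layers')
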
# t5-small
# t5-base
# t5-large
# t5-3b
# tb-11b
model_ckpt = 't5-3b'
# AutoModel loads the bare T5Model here; T5ForConditionalGeneration (section 3.2)
# adds an extra hidden_state -> vocab_size projection (the lm_head) on top of it.
model = AutoModel.from_pretrained(model_ckpt)
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
config = AutoConfig.from_pretrained(model_ckpt)
config

T5Config {
  "_name_or_path": "t5-3b",
  "architectures": [
    "T5WithLMHeadModel"
  ],
  "d_ff": 16384,
  "d_kv": 128,
  "d_model": 1024,
  "decoder_start_token_id": 0,
  "dense_act_fn": "relu",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": false,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 24,
  "num_heads": 32,
  "num_layers": 24,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.28.0",
  "use_cache": true,
  "vocab_size": 32128
}

From task_specific_params we can see the built-in task prefixes: 1. summarize:, 2. translate English to German:, 3. translate English to French:, 4. translate English to Romanian:.
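
These prefixes can also be read programmatically from the config object loaded above (a small added check):

# Added check: list the task prefixes stored in task_specific_params.
for task, params in config.task_specific_params.items():
    print(task, '->', repr(params['prefix']))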

model

T5Model(
(shared): Embedding(32128, 1024)
(encoder): T5Stack(
(embed_tokens): Embedding(32128, 1024)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
(relative_attention_bias): Embedding(32, 32)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-23): 23 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(decoder): T5Stack(
(embed_tokens): Embedding(32128, 1024)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
(relative_attention_bias): Embedding(32, 32)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-23): 23 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)

format(get_params(model), ',')

'2,851,598,336'

The trainable parameter count is about 2.85B, which matches the t5-3b name.
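
As an added breakdown (reusing the get_params helper from the header), the count can be split by sub-module. Note that the 32128 x 1024 token embedding is shared between the encoder and decoder, so it appears in each of the sub-counts below.

# Added check: rough parameter split; the shared embedding is counted inside both stacks.
for name, module in [('shared', model.shared), ('encoder', model.encoder), ('decoder', model.decoder)]:
    print(name, format(get_params(module), ','))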

3.1 Forward

input_ids = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Studies show that", return_tensors="pt").input_ids  
# preprocess: Prepend decoder_input_ids with start token which is pad token for T5Model.
# This is not needed for torch's T5ForConditionalGeneration as it does this internally using labels arg.
decoder_input_ids = model._shift_right(decoder_input_ids)
input_ids

tensor([[6536, 43, 118, 2008, 24, 293, 53, 3, 9, 1782, 19, 207,
21, 25, 1]])
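
To make the preprocessing step explicit, here is a small added check (not in the original notebook) of what _shift_right does; T5's decoder start token is the pad token, id 0.

# Added check: _shift_right prepends the decoder start token (pad id 0)
# and drops the last position, keeping the sequence length unchanged.
raw_ids = tokenizer("Studies show that", return_tensors="pt").input_ids
print(raw_ids)
print(model._shift_right(raw_ids))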

model.eval()
# forward pass
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
last_hidden_states = outputs.last_hidden_state
last_hidden_states

tensor([[[-0.1611, -0.0524,  0.2812,  ..., -0.0113, -0.5561, -0.1034],
         [-0.0441,  0.0494,  0.0101,  ...,  0.2337,  0.1868,  0.0204],
         [-0.1586, -0.0830, -0.0067,  ...,  0.1704,  0.0040,  0.1689],
         [-0.0349, -0.0160,  0.0020,  ...,  0.1688, -0.0871,  0.1037]]],
       grad_fn=<MulBackward0>)

def t5_forward(model, input_ids, decoder_input_ids):
    # run the encoder alone, then feed its last hidden states to the decoder
    encoder_outputs = model.encoder(input_ids=input_ids)
    hidden_states = encoder_outputs[0]
    decoder_outputs = model.decoder(input_ids=decoder_input_ids,
                                    encoder_hidden_states=hidden_states)
    return decoder_outputs.last_hidden_state
t5_forward(model, input_ids, decoder_input_ids)

tensor([[[-0.1611, -0.0524,  0.2812,  ..., -0.0113, -0.5561, -0.1034],
         [-0.0441,  0.0494,  0.0101,  ...,  0.2337,  0.1868,  0.0204],
         [-0.1586, -0.0830, -0.0067,  ...,  0.1704,  0.0040,  0.1689],
         [-0.0349, -0.0160,  0.0020,  ...,  0.1688, -0.0871,  0.1037]]],
       grad_fn=<MulBackward0>)

As shown, the two results are identical: a T5 forward pass is simply the encoder followed by the decoder.
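
As an added check (not in the original notebook), torch.allclose confirms the equivalence programmatically:

# Added check: the manual encoder -> decoder pass matches the direct forward call.
with torch.no_grad():
    direct = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).last_hidden_state
    manual = t5_forward(model, input_ids, decoder_input_ids)
print(torch.allclose(direct, manual))  # expected: True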

3.2 Pre-training tasks

  • Unsupervised denoising training
    • an MLM-style (masked language modeling) objective
    • applied as span masking: contiguous spans are replaced by sentinel tokens and reconstructed in the target
  • Supervised training
    • seq2seq tasks (e.g., translation, summarization) driven by a task prefix

from transformers import T5ForConditionalGeneration
model = T5ForConditionalGeneration.from_pretrained(model_ckpt)
model

T5ForConditionalGeneration(
(shared): Embedding(32128, 1024)
(encoder): T5Stack(
(embed_tokens): Embedding(32128, 1024)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
(relative_attention_bias): Embedding(32, 32)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-23): 23 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(decoder): T5Stack(
(embed_tokens): Embedding(32128, 1024)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
(relative_attention_bias): Embedding(32, 32)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-23): 23 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=1024, out_features=4096, bias=False)
(k): Linear(in_features=1024, out_features=4096, bias=False)
(v): Linear(in_features=1024, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=1024, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=1024, out_features=16384, bias=False)
(wo): Linear(in_features=16384, out_features=1024, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(lm_head): Linear(in_features=1024, out_features=32128, bias=False)
)

As the printout shows, T5ForConditionalGeneration adds one final layer on top of T5Model: (lm_head): Linear(in_features=1024, out_features=32128, bias=False), the language-model head that projects decoder hidden states onto the vocabulary.
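
As a quick added check (reusing input_ids and decoder_input_ids from section 3.1, which are still in scope), the forward pass now returns vocabulary logits rather than hidden states:

# Added check: the lm_head maps each decoder position to a distribution over 32128 tokens.
with torch.no_grad():
    logits = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).logits
print(logits.shape)  # expected: torch.Size([1, 4, 32128]) for this example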

# Unsupervised denoising training
# span corruption: the <extra_id_N> sentinels mark masked spans in the input,
# and the target spells out each dropped span followed by the next sentinel
input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

1.9458732604980469

# Supervised training
# seq2seq

input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
labels = tokenizer("Das Haus ist wunderbar.", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

0.9009745717048645
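
As an added sanity check (not in the original post), the same loss can be reproduced by hand, which makes explicit what passing labels does internally: the decoder inputs are the labels shifted right, and the loss is token-level cross entropy over the vocabulary.

# Added check: reproduce the loss manually (no label padding here, so no -100 masking is needed).
decoder_input_ids = model._shift_right(labels)
logits = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).logits
manual_loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
print(manual_loss.item())  # should match the loss printed above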

3.2.1 Multiple sentence pairs (batched training)

# the following 2 hyperparameters are task-specific
max_source_length = 512
max_target_length = 128

# Suppose we have the following 2 training examples:
input_sequence_1 = "Welcome to NYC"
output_sequence_1 = "Bienvenue à NYC"

input_sequence_2 = "HuggingFace is a company"
output_sequence_2 = "HuggingFace est une entreprise"

# encode the inputs
task_prefix = "translate English to French: "
input_sequences = [input_sequence_1, input_sequence_2]

encoding = tokenizer(
    [task_prefix + sequence for sequence in input_sequences],
    padding="longest",
    max_length=max_source_length,
    truncation=True,
    return_tensors="pt",
)

input_ids, attention_mask = encoding.input_ids, encoding.attention_mask

# encode the targets
target_encoding = tokenizer(
    [output_sequence_1, output_sequence_2],
    padding="longest",
    max_length=max_target_length,
    truncation=True,
    return_tensors="pt",
)
labels = target_encoding.input_ids

# replace padding token id's of the labels by -100 so it's ignored by the loss
labels[labels == tokenizer.pad_token_id] = -100

# forward pass
loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
loss.item()

0.19245588779449463
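
To turn this forward pass into an actual fine-tuning step, here is a minimal added sketch (the optimizer and learning rate are illustrative assumptions) that wraps it in a single optimizer update; in practice this would run inside a dataloader loop.

# Added sketch of a single training step.
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=1e-4)
model.train()
loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
model.eval()  # restore eval mode for the inference examples below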

3.3 Running the tasks

input_ids = tokenizer.encode("translate English to German: Hello, my dog is cute", return_tensors="pt") 
result = model.generate(input_ids)
tokenizer.decode(result[0])

' Hallo, mein Hund ist süß'
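
The call above relies on generate's defaults (greedy decoding with a short default max length). As an added variant, the beam-search settings stored in task_specific_params can be passed explicitly:

# Added sketch: explicit beam search, mirroring the translation settings in the config.
outputs = model.generate(input_ids, num_beams=4, max_length=300, early_stopping=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))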

# inference
input_ids = tokenizer(
    "summarize: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pretraining objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.", 
    return_tensors="pt"
).input_ids  # Batch size 1
outputs = model.generate(input_ids, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

transfer learning has emerged as a powerful technique in natural language processing (NLP) in this paper, we explore the landscape of transfer learning techniques for NLP. we introduce a unified framework that converts every language problem into a text-to-text format.
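
Finally, the CoLA- and STS-B-style prompts sketched in section 2 can be sent through the same checkpoint. This is an added illustration (the prefix format follows the T5 paper, not the original notebook), so treat its outputs as indicative rather than guaranteed.

# Added illustration: acceptability and similarity prompts in text-to-text form.
for prompt in [
    "cola sentence: The car drove me to store the.",
    "stsb sentence1: A man is playing a guitar. sentence2: A person plays an instrument.",
]:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_length=8)
    print(tokenizer.decode(out[0], skip_special_tokens=True))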

Reference:
https://github.com/chunhuizhang/bert_t5_gpt/blob/main/tutorials/09_t5_overall.ipynb
