[Keras] Using the loss as a layer of the model

2023-05-16

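When writing a complex loss in Keras, a common approach is to make the loss a layer of the model. The model's inputs then include both the original input and y_true, and the model's output is the loss itself. The Keras version of the YOLOv3 code, for example, implements the loss as a model layer because its loss computation is fairly involved. Below is a simple example that uses a neural network to classify the Iris dataset.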

Required packages

import keras.backend as K
from keras import layers, Model
import numpy as np
from keras.utils import plot_model, to_categorical

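First load and split the dataset. Iris has 4 features, and the last column is the label (1 is subtracted so that labels start from 0). The dataset is attached at the end of this post.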

data = np.loadtxt('Iris.txt')
np.random.shuffle(data)
x_train, y_train = data[:100, :4], data[:100, 4]-1  # (100,4) (100,)
x_valid, y_valid = data[100:, :4], data[100:, 4]-1  # (50,4) (50,)
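To make the shuffle, split, and label shift easier to check, here is a small sketch that runs the same steps on six rows copied from the Iris.txt listing at the end of this post. The in-memory text, the fixed seed, the `_demo` names, and the split point `n_train` are assumptions for illustration only; the real code reads the whole file and splits at row 100.

```python
import io
import numpy as np

# A few rows copied from the Iris.txt listing at the end of the post
# (4 features, then a 1-based class label).
sample_text = """\
5.1 3.5 1.4 0.2 1
4.9 3.0 1.4 0.2 1
7.0 3.2 4.7 1.4 2
6.4 3.2 4.5 1.5 2
6.3 3.3 6.0 2.5 3
5.8 2.7 5.1 1.9 3
"""

np.random.seed(0)  # fixed seed is an assumption, only for repeatability
data = np.loadtxt(io.StringIO(sample_text))
np.random.shuffle(data)  # shuffles the rows in place

n_train = 4  # hypothetical split point for this tiny sample
x_train_demo, y_train_demo = data[:n_train, :4], data[:n_train, 4] - 1
x_valid_demo, y_valid_demo = data[n_train:, :4], data[n_train:, 4] - 1

print(x_train_demo.shape, y_train_demo.shape)
print(x_valid_demo.shape, y_valid_demo.shape)
print(sorted(set((data[:, 4] - 1).tolist())))  # labels now start at 0
```

Because the file is sorted by class, shuffling before the split matters: without it, the last 50 rows used for validation would all belong to one class.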

Convert the labels to one-hot encoding

y_train = to_categorical(y_train, num_classes=3) # (100,3)
y_valid = to_categorical(y_valid, num_classes=3) # (50,3)
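For readers who want to see what the one-hot conversion produces, the sketch below reproduces it in plain NumPy. The `one_hot` helper is a hypothetical stand-in written for this illustration, not a Keras function.

```python
import numpy as np

def one_hot(labels, num_classes):
    """NumPy stand-in for to_categorical (hypothetical helper, not part of Keras)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]

labels = np.array([0., 2., 1., 0.])  # 0-based labels, as after the "-1" shift
encoded = one_hot(labels, num_classes=3)
print(encoded.shape)
print(encoded)
```

Each row has a single 1 in the column of its class, which is what lets the loss pick out the log-probability of the true class with a simple multiply and sum.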

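Build the prediction network. It outputs a 3-dimensional vector with one score per class. The last layer is linear, so these scores are logits that become the 3 class probabilities after a softmax.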

def net():
    inputs = layers.Input(shape=[4, ], name='feature_input')
    dense1 = layers.Dense(32, activation='relu', name='dense1')(inputs)
    out = layers.Dense(3, activation='linear', name='out1')(dense1)
    return Model(inputs, out)

Next, write the custom loss function. The args passed in are [y_true, y_pred]; the function implements a simple cross-entropy.

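def my_loss(args, batch_size, C):
    """
    args: [y_true, y_pred], each of shape (batch_size, C)
    batch_size: batch size used to scale the loss
    C: number of classes
    """
    y_true, y_pred = args[0], args[1]
    # the last layer of the network is linear, so convert logits to probabilities first
    soft_pred = K.softmax(y_pred)
    # a small epsilon avoids log(0)
    log_pred = K.log(soft_pred + 1e-7)
    loss = -K.sum(y_true*log_pred, axis=1)
    batch_tensor = K.cast(batch_size, dtype='float32')
    return loss/batch_tensor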
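The cross-entropy itself can be checked by hand. The following is a minimal NumPy sketch, assuming a softmax is applied to the logits before the log (as the later version of my_loss in this post does). The helper names and the demo values are made up for illustration, and the division by batch_size is left out so the per-sample values are easy to read.

```python
import numpy as np

def softmax(logits):
    # subtract the row max for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_per_sample(y_true, logits, eps=1e-7):
    """-sum_c y_true[c] * log(softmax(logits)[c]) for each row."""
    probs = softmax(logits)
    return -np.sum(y_true * np.log(probs + eps), axis=1)

y_true_demo = np.array([[1., 0., 0.],
                        [0., 1., 0.]])
logits_demo = np.array([[0., 0., 0.],    # uniform scores: loss is close to log(3)
                        [0., 10., 0.]])  # confident and correct: loss is close to 0
losses = cross_entropy_per_sample(y_true_demo, logits_demo)
print(losses)
```

Taking the log of raw linear outputs would fail as soon as a score is zero or negative, which is why the softmax step comes first.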

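Build the training model. loss_body is the loss layer, and arguments passes in the extra parameters it needs ahead of time. train_model is the training model: it takes two inputs, the original input and y_true, and outputs the loss. When compiling, keep the order in lambda y_true, y_pred: as written and return y_pred (the output of the loss layer); otherwise the model cannot track the gradients and raises an error. When calling fit, y must be given a dummy array with as many samples as x (zeros or anything else). It has no effect, but it must be provided. When saving, keep the prediction model rather than the training model, and use the prediction model for inference.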

# build model
predict_body = net()
y_true = layers.Input(shape=[3, ])  # one hot label
loss_body = layers.Lambda(my_loss, output_shape=(1,), name='my_loss',
                          arguments={'batch_size': 16, 'C': 3})([y_true, predict_body.output])

train_model = Model(inputs=[predict_body.input, y_true], outputs=loss_body)
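# keep the (y_true, y_pred) order: Keras passes the dummy target first and the
# loss layer's output second; returning y_pred keeps the gradient path intact
train_model.compile(optimizer='adam',
                    loss={'my_loss': lambda y_true, y_pred: y_pred})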
plot_model(train_model, show_shapes=True)
train_model.fit(x=[x_train, y_train], y=np.zeros(len(x_train)), batch_size=16, epochs=20,
                validation_data=([x_valid, y_valid], np.zeros(len(x_valid))))
predict_body.save_weights('predict.h5')  # save the weights of the prediction model
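Since the prediction model ends in a linear layer, its outputs are logits rather than probabilities. The sketch below shows, in plain NumPy, one way to turn such outputs into probabilities and class indices. The `logits_to_classes` helper and the `fake_logits` array are hypothetical; in practice the array would come from something like predict_body.predict(x_valid).

```python
import numpy as np

def logits_to_classes(logits):
    """Return (probabilities, predicted class index) for a batch of logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    return probs, probs.argmax(axis=1)

# Made-up logits standing in for the output of the prediction model
fake_logits = np.array([[2.0, 0.5, -1.0],
                        [-0.3, 0.1, 3.2]])
probs, classes = logits_to_classes(fake_logits)
print(probs.sum(axis=1))  # each row sums to 1
print(classes)            # index of the largest score in each row
```

If only the class index is needed, the argmax of the raw logits gives the same result, because softmax preserves the ordering of the scores.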

[Figure: train_model as drawn by plot_model]
[Figure: output at the start of training]

This has one drawback, though: during training we can only see the loss, not the accuracy. The code can be modified as follows:

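Turn the model into a multi-output model: besides the loss (out2), the outputs also include the prediction model's output (out1). The prediction output is used only to compute accuracy and does not take part in the loss. Because Keras sums the losses of all outputs, the loss for out1 simply returns 0, and the metric is the accuracy of out1. Since accuracy needs y_true again, pass [np.zeros(len(x_train)), y_train] as y in fit (and [np.zeros(len(x_valid)), y_valid] for validation): the zeros go to the loss, and the labels go to the metric that computes accuracy.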

import keras.backend as K
from keras import layers, Model
import numpy as np
from keras.utils import plot_model, to_categorical


def net():
    inputs = layers.Input(shape=[4, ], name='feature_input')
    dense1 = layers.Dense(32, activation='relu', name='dense1')(inputs)
    out = layers.Dense(3, activation='linear', name='out1')(dense1)
    return Model(inputs, out)


def my_loss(args, batch_size, C):
    """
    args: [y_true, y_pred], (batch_size, C)
    batch_size: 
    C: classes num.
    """
    y_true, y_pred = args[0], args[1]
    soft_pred = K.softmax(y_pred)
    log_pred = K.log(soft_pred+0.0000001)
    loss = -K.sum(y_true*log_pred, axis=1)
    batch_tensor = K.cast(batch_size, dtype=float)
    return loss/batch_tensor


data = np.loadtxt('Iris.txt')
np.random.shuffle(data)
x_train, y_train = data[:100, :4], data[:100, 4]-1  # (100,4) (100,)
x_valid, y_valid = data[100:, :4], data[100:, 4]-1  # (50,4) (50,)

y_train = to_categorical(y_train, num_classes=3)
y_valid = to_categorical(y_valid, num_classes=3)

# build model
predict_body = net()
y_true = layers.Input(shape=[3, ], name='y_true')  # one hot label
loss_body = layers.Lambda(my_loss, output_shape=(1,), name='out2',
                          arguments={'batch_size': 16, 'C': 3})([y_true, predict_body.output])

train_model = Model(inputs=[predict_body.input, y_true],
                    outputs=[loss_body, predict_body.output])
train_model.compile(optimizer='adam',
                    loss={'out2': lambda y_true, y_pred: y_pred,
                          'out1': lambda y_true, y_pred: K.constant(0)},
                    metrics={'out1': 'accuracy'})
train_model.fit(x=[x_train, y_train],
                y=[np.zeros(len(x_train)), y_train], batch_size=16, epochs=50,
                validation_data=(  # (inputs, targets) tuple; some Keras versions treat a list as inputs only
                    [x_valid, y_valid],
                    [np.zeros(len(x_valid)), y_valid]))
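To see how the two outputs combine, here is a rough NumPy sketch of what gets reported for one batch: the total loss is the batch mean of out2 plus the constant 0 from out1, and the accuracy compares the argmax of out1 with the argmax of the one-hot labels. All values are made up, and `categorical_accuracy` is a hand-written helper that is assumed to mirror what the 'accuracy' metric resolves to for a 3-way one-hot target.

```python
import numpy as np

def categorical_accuracy(y_true_onehot, y_pred_scores):
    """Fraction of rows whose argmax matches the one-hot label."""
    hits = y_true_onehot.argmax(axis=1) == y_pred_scores.argmax(axis=1)
    return float(np.mean(hits))

# Made-up values for one batch of 4 samples
out2_demo = np.array([0.05, 0.02, 0.10, 0.03])  # per-sample values from the loss layer
out1_demo = np.array([[2.0, 0.1, 0.3],
                      [0.2, 1.5, 0.1],
                      [0.9, 0.8, 0.1],
                      [0.1, 0.2, 3.0]])          # logits from the prediction output
labels_demo = np.eye(3)[[0, 1, 1, 2]]            # one-hot labels fed as the out1 target

out2_loss = out2_demo.mean()  # lambda y_true, y_pred: y_pred, averaged over the batch
out1_loss = 0.0               # lambda y_true, y_pred: K.constant(0)
total_loss = out2_loss + out1_loss
acc = categorical_accuracy(labels_demo, out1_demo)
print(total_loss, acc)
```

The third sample is the only miss (its largest score is for class 0 while its label is class 1), so three of the four predictions count as correct.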

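Now the accuracy shows up during training.
[Figure: training output with accuracy]
This approach is not very elegant, but it does make the metric visible.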

Iris.txt
5.1 3.5 1.4 0.2 1
4.9 3.0 1.4 0.2 1
4.7 3.2 1.3 0.2 1
4.6 3.1 1.5 0.2 1
5.0 3.6 1.4 0.2 1
5.4 3.9 1.7 0.4 1
4.6 3.4 1.4 0.3 1
5.0 3.4 1.5 0.2 1
4.4 2.9 1.4 0.2 1
4.9 3.1 1.5 0.1 1
5.4 3.7 1.5 0.2 1
4.8 3.4 1.6 0.2 1
4.8 3.0 1.4 0.1 1
4.3 3.0 1.1 0.1 1
5.8 4.0 1.2 0.2 1
5.7 4.4 1.5 0.4 1
5.4 3.9 1.3 0.4 1
5.1 3.5 1.4 0.3 1
5.7 3.8 1.7 0.3 1
5.1 3.8 1.5 0.3 1
5.4 3.4 1.7 0.2 1
5.1 3.7 1.5 0.4 1
4.6 3.6 1.0 0.2 1
5.1 3.3 1.7 0.5 1
4.8 3.4 1.9 0.2 1
5.0 3.0 1.6 0.2 1
5.0 3.4 1.6 0.4 1
5.2 3.5 1.5 0.2 1
5.2 3.4 1.4 0.2 1
4.7 3.2 1.6 0.2 1
4.8 3.1 1.6 0.2 1
5.4 3.4 1.5 0.4 1
5.2 4.1 1.5 0.1 1
5.5 4.2 1.4 0.2 1
4.9 3.1 1.5 0.1 1
5.0 3.2 1.2 0.2 1
5.5 3.5 1.3 0.2 1
4.9 3.1 1.5 0.1 1
4.4 3.0 1.3 0.2 1
5.1 3.4 1.5 0.2 1
5.0 3.5 1.3 0.3 1
4.5 2.3 1.3 0.3 1
4.4 3.2 1.3 0.2 1
5.0 3.5 1.6 0.6 1
5.1 3.8 1.9 0.4 1
4.8 3.0 1.4 0.3 1
5.1 3.8 1.6 0.2 1
4.6 3.2 1.4 0.2 1
5.3 3.7 1.5 0.2 1
5.0 3.3 1.4 0.2 1
7.0 3.2 4.7 1.4 2
6.4 3.2 4.5 1.5 2
6.9 3.1 4.9 1.5 2
5.5 2.3 4.0 1.3 2
6.5 2.8 4.6 1.5 2
5.7 2.8 4.5 1.3 2
6.3 3.3 4.7 1.6 2
4.9 2.4 3.3 1.0 2
6.6 2.9 4.6 1.3 2
5.2 2.7 3.9 1.4 2
5.0 2.0 3.5 1.0 2
5.9 3.0 4.2 1.5 2
6.0 2.2 4.0 1.0 2
6.1 2.9 4.7 1.4 2
5.6 2.9 3.6 1.3 2
6.7 3.1 4.4 1.4 2
5.6 3.0 4.5 1.5 2
5.8 2.7 4.1 1.0 2
6.2 2.2 4.5 1.5 2
5.6 2.5 3.9 1.1 2
5.9 3.2 4.8 1.8 2
6.1 2.8 4.0 1.3 2
6.3 2.5 4.9 1.5 2
6.1 2.8 4.7 1.2 2
6.4 2.9 4.3 1.3 2
6.6 3.0 4.4 1.4 2
6.8 2.8 4.8 1.4 2
6.7 3.0 5.0 1.7 2
6.0 2.9 4.5 1.5 2
5.7 2.6 3.5 1.0 2
5.5 2.4 3.8 1.1 2
5.5 2.4 3.7 1.0 2
5.8 2.7 3.9 1.2 2
6.0 2.7 5.1 1.6 2
5.4 3.0 4.5 1.5 2
6.0 3.4 4.5 1.6 2
6.7 3.1 4.7 1.5 2
6.3 2.3 4.4 1.3 2
5.6 3.0 4.1 1.3 2
5.5 2.5 4.0 1.3 2
5.5 2.6 4.4 1.2 2
6.1 3.0 4.6 1.4 2
5.8 2.6 4.0 1.2 2
5.0 2.3 3.3 1.0 2
5.6 2.7 4.2 1.3 2
5.7 3.0 4.2 1.2 2
5.7 2.9 4.2 1.3 2
6.2 2.9 4.3 1.3 2
5.1 2.5 3.0 1.1 2
5.7 2.8 4.1 1.3 2
6.3 3.3 6.0 2.5 3
5.8 2.7 5.1 1.9 3
7.1 3.0 5.9 2.1 3
6.3 2.9 5.6 1.8 3
6.5 3.0 5.8 2.2 3
7.6 3.0 6.6 2.1 3
4.9 2.5 4.5 1.7 3
7.3 2.9 6.3 1.8 3
6.7 2.5 5.8 1.8 3
7.2 3.6 6.1 2.5 3
6.5 3.2 5.1 2.0 3
6.4 2.7 5.3 1.9 3
6.8 3.0 5.5 2.1 3
5.7 2.5 5.0 2.0 3
5.8 2.8 5.1 2.4 3
6.4 3.2 5.3 2.3 3
6.5 3.0 5.5 1.8 3
7.7 3.8 6.7 2.2 3
7.7 2.6 6.9 2.3 3
6.0 2.2 5.0 1.5 3
6.9 3.2 5.7 2.3 3
5.6 2.8 4.9 2.0 3
7.7 2.8 6.7 2.0 3
6.3 2.7 4.9 1.8 3
6.7 3.3 5.7 2.1 3
7.2 3.2 6.0 1.8 3
6.2 2.8 4.8 1.8 3
6.1 3.0 4.9 1.8 3
6.4 2.8 5.6 2.1 3
7.2 3.0 5.8 1.6 3
7.4 2.8 6.1 1.9 3
7.9 3.8 6.4 2.0 3
6.4 2.8 5.6 2.2 3
6.3 2.8 5.1 1.5 3
6.1 2.6 5.6 1.4 3
7.7 3.0 6.1 2.3 3
6.3 3.4 5.6 2.4 3
6.4 3.1 5.5 1.8 3
6.0 3.0 4.8 1.8 3
6.9 3.1 5.4 2.1 3
6.7 3.1 5.6 2.4 3
6.9 3.1 5.1 2.3 3
5.8 2.7 5.1 1.9 3
6.8 3.2 5.9 2.3 3
6.7 3.3 5.7 2.5 3
6.7 3.0 5.2 2.3 3
6.3 2.5 5.0 1.9 3
6.5 3.0 5.2 2.0 3
6.2 3.4 5.4 2.3 3
5.9 3.0 5.1 1.8 3
