Computing the gradient norms of the parts of a composite loss function

2024-04-26

Suppose I have the following loss function:

loss_a = tf.reduce_mean(my_loss_fn(model_output, targets))
loss_b = tf.reduce_mean(my_other_loss_fn(model_output, targets))
loss_final = loss_a + tf.multiply(alpha, loss_b)

Visualizing the gradient norm of loss_final can be done like this:

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
grads_and_vars = optimizer.compute_gradients(loss_final)
grads, _ = list(zip(*grads_and_vars))
norms = tf.global_norm(grads)
gradnorm_s = tf.summary.scalar('gradient_norm', norms)  # spaces in summary names get sanitized, so use an underscore
train_op = optimizer.apply_gradients(grads_and_vars, name='train_op')

However, I would like to plot the gradient norms of loss_a and loss_b separately. What is the most efficient way to do this? Do I have to call compute_gradients(..) on both loss_a and loss_b separately, then add the two gradients together before passing them to optimizer.apply_gradients(..)? I know this is mathematically correct because of the sum rule, but it seems cumbersome, and I also don't know how to implement the gradient summation correctly. Also, loss_final is simple here because it is just a sum. What if loss_final were more complicated, e.g. a division?
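
For concreteness, this is roughly the summation I have in mind (a hypothetical sketch on my part; the list comprehension and the weighting by alpha are my own guesses, and I am not sure it is correct):

grads_a = optimizer.compute_gradients(loss_a)
grads_b = optimizer.compute_gradients(loss_b)
# Assumes both lists cover the same variables in the same order
# and that no gradient is None:
summed_grads_and_vars = [(ga + tf.multiply(alpha, gb), var)
                         for (ga, var), (gb, _) in zip(grads_a, grads_b)]
train_op = optimizer.apply_gradients(summed_grads_and_vars)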

I am using TensorFlow 0.12.


You are right that combining the gradients yourself can get messy. Instead, just compute the gradients of each loss as well as of the final loss. Because TensorFlow optimizes the directed acyclic graph (DAG) (https://stackoverflow.com/questions/2283757/can-someone-explain-in-simple-terms-to-me-what-a-directed-acyclic-graph-is) before compilation, this does not result in duplicated work. This also answers your last question: it works no matter how loss_final combines the two losses (a sum, a division, or anything else differentiable), since tf.gradients differentiates the full expression.

For example:

import tensorflow as tf

with tf.name_scope('inputs'):
    W = tf.Variable(dtype=tf.float32, initial_value=tf.random_normal((4, 1), dtype=tf.float32), name='W')
    x = tf.random_uniform((6, 4), dtype=tf.float32, name='x')

with tf.name_scope('outputs'):
    y = tf.matmul(x, W, name='y')

def my_loss_fn(output, targets, name):
    return tf.reduce_mean(tf.abs(output - targets), name=name)

def my_other_loss_fn(output, targets, name):
    return tf.sqrt(tf.reduce_mean((output - targets) ** 2), name=name)

def get_tensors(loss_fn):
    loss = loss_fn(y, targets, 'loss')
    # tf.gradients returns a list with one tensor per variable in its
    # second argument (here just W); tf.norm stacks that list into a
    # single tensor before taking the norm.
    grads = tf.gradients(loss, W, name='gradients')
    norm = tf.norm(grads, name='norm')
    return loss, grads, norm

targets = tf.random_uniform((6, 1))

with tf.name_scope('a'):
    loss_a, grads_a, norm_a = get_tensors(my_loss_fn)

with tf.name_scope('b'):
    loss_b, grads_b, norm_b = get_tensors(my_other_loss_fn)

with tf.name_scope('combined'):
    loss = tf.add(loss_a, loss_b, name='loss')
    grad = tf.gradients(loss, W, name='gradients')

with tf.Session() as sess:
    tf.global_variables_initializer().run(session=sess)

    writer = tf.summary.FileWriter('./tensorboard_results', sess.graph)
    res = sess.run([norm_a, norm_b, grad])

    print(*res, sep='\n')

Edit: In response to your comment... you can inspect the DAG of a TensorFlow model with TensorBoard (https://www.tensorflow.org/get_started/graph_viz). I have updated the code above so that it stores the graph.

Run tensorboard --logdir $PWD/tensorboard_results in a terminal and navigate to the URL printed on the command line (usually http://localhost:6006/). Then click on the GRAPHS tab to see the DAG. You can recursively expand the tensors, ops, and namespaces to view sub-graphs, and thus see the individual operations and their inputs.
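
If you also want the two norms plotted over training (as with the gradient_norm summary in your snippet), here is a minimal sketch, reusing the norm_a, norm_b, and writer names from the example above:

summ_a = tf.summary.scalar('norm_a', norm_a)
summ_b = tf.summary.scalar('norm_b', norm_b)
merged = tf.summary.merge([summ_a, summ_b])

# During training, evaluate the merged summary and write it with a step index:
#   summary = sess.run(merged)
#   writer.add_summary(summary, global_step=step)

The two norms then show up in TensorBoard's SCALARS tab alongside the graph.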
