By default, gradients are only retained for leaf variables. Gradients of non-leaf (intermediate) variables are not retained for later inspection. This is done by design, to save memory.
- Soumith Chintala
See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
Option 1:
Call y.retain_grad()
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x          # intermediate (non-leaf) variable
z = y * y
y.retain_grad()    # ask autograd to keep y's gradient after backward()
z.backward()
print(y.grad)
# Variable containing:
#  8
# [torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
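For reference, the same idea on a recent PyTorch release (where Variable has been merged into Tensor) might look like the following. This is a minimal sketch, assuming PyTorch >= 0.4:

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x          # intermediate tensor; its grad is normally discarded
z = y * y
y.retain_grad()    # keep the gradient on the non-leaf tensor y
z.backward()
print(y.grad)      # tensor([8.])  since dz/dy = 2*y = 2*4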
Option 2:
Register a hook, which is essentially a function called whenever the gradient is computed. From inside it you can save the gradient, assign it somewhere, print it, and so on...
from __future__ import print_function
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(print) ## this can be anything you need it to be
z.backward()
output:
Variable containing:
 8
[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
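If you want to keep the gradient around rather than just print it, the hook can close over a container. A minimal sketch (the grads dict, the save_grad helper, and the 'y' key are illustrative names, not part of the PyTorch API), using the same old Variable API as above:

from __future__ import print_function
import torch
from torch.autograd import Variable

grads = {}                      # container for captured gradients

def save_grad(name):
    def hook(grad):
        grads[name] = grad      # stash the gradient under a chosen key
    return hook

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(save_grad('y'))
z.backward()
print(grads['y'])               # same value as y.grad above: 8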
See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7