The PyTorch RNN implementation (http://pytorch.org/docs/master/nn.html?highlight=rnn#torch.nn.RNN) has two bias terms, b_ih and b_hh.
Why is that? Is it any different from using a single bias? If so, how? Does it affect performance or efficiency?
Actually, the previously accepted answer is wrong. The second bias parameter exists only for CuDNN compatibility. See the comment in the source code itself (https://pytorch.org/docs/master/_modules/torch/nn/modules/rnn.html#RNNBase):
    class RNNBase(Module):
        ...
        def __init__(self, ...):
            ...
            w_ih = Parameter(torch.empty((gate_size, layer_input_size), **factory_kwargs))
            w_hh = Parameter(torch.empty((gate_size, real_hidden_size), **factory_kwargs))
            b_ih = Parameter(torch.empty(gate_size, **factory_kwargs))
            # Second bias vector included for CuDNN compatibility. Only one  <--- this
            # bias vector is needed in standard definition.                  <--- comment
            b_hh = Parameter(torch.empty(gate_size, **factory_kwargs))
            ...
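To see why only one bias is needed mathematically, note that the tanh-RNN update is h' = tanh(W_ih x + b_ih + W_hh h + b_hh): the two bias vectors are simply summed, so replacing them with a single bias b = b_ih + b_hh gives an identical result. A minimal numerical sketch (using NumPy with made-up sizes, not the actual PyTorch internals):

```python
import numpy as np

# Hypothetical dimensions for illustration only.
rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3
w_ih = rng.standard_normal((hidden_size, input_size))
w_hh = rng.standard_normal((hidden_size, hidden_size))
b_ih = rng.standard_normal(hidden_size)
b_hh = rng.standard_normal(hidden_size)
x = rng.standard_normal(input_size)   # one input vector
h = rng.standard_normal(hidden_size)  # previous hidden state

# Update with two separate biases, as in PyTorch's RNN:
# h' = tanh(W_ih x + b_ih + W_hh h + b_hh)
h_two_bias = np.tanh(w_ih @ x + b_ih + w_hh @ h + b_hh)

# Same update with one merged bias b = b_ih + b_hh
h_one_bias = np.tanh(w_ih @ x + w_hh @ h + (b_ih + b_hh))

# The two formulations are numerically identical.
assert np.allclose(h_two_bias, h_one_bias)
```

So keeping both parameters adds no expressive power; it only mirrors the CuDNN weight layout so that PyTorch can hand its parameters to CuDNN kernels directly.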