I am trying to build an attention model with a Bi-LSTM on top of word embeddings. I came across How to add an attention mechanism in Keras? (https://stackoverflow.com/questions/42918446/how-to-add-an-attention-mechanism-in-keras), https://github.com/philipperemy/keras-attention-mechanism/blob/master/attention_lstm.py and https://github.com/keras-team/keras/issues/4962.
However, I am confused about how to implement Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification.
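For reference, the attention described in that paper (Zhou et al., 2016), as I read it, is:

M = \tanh(H)
\alpha = \mathrm{softmax}(w^\top M)
r = H \alpha^\top
h^* = \tanh(r)

where H is the matrix of Bi-LSTM hidden states, w is a learned vector, and r is the attention-weighted sentence representation. I am trying to map each line of my code onto these equations.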
So far I have:
from keras.layers import Input, Embedding, Dense, LSTM, Bidirectional
from keras.layers import Activation, Flatten, RepeatVector, Permute, Lambda, merge  # `merge` is the Keras 1.x API; Keras 2 uses multiply()
from keras import backend as K

max_length = 100  # maximum sentence length in tokens

_input = Input(shape=[max_length], dtype='int32')
# get the embedding layer
embedded = Embedding(
    input_dim=30000,          # vocabulary size
    output_dim=300,           # embedding dimension
    input_length=max_length,
    trainable=False,
    mask_zero=False
)(_input)
activations = Bidirectional(LSTM(20, return_sequences=True))(embedded)  # (batch, max_length, 40)
# compute importance for each step
attention = Dense(1, activation='tanh')(activations)  # (batch, max_length, 1)
I am confused here: which equation in the paper does this line correspond to?
attention = Flatten()(attention)              # (batch, max_length)
attention = Activation('softmax')(attention)  # attention weights over the timesteps
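My guess is that Dense(1, activation='tanh') plays the role of w^\top M: per timestep it computes roughly \tanh(W h_t + b), folding the paper's tanh and the projection into a single layer, and the softmax then produces \alpha. I am not sure whether this is exactly equivalent to the paper, though.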
What does RepeatVector do here?
attention = RepeatVector(40)(attention)  # 40 = 2 * 20 (the Bi-LSTM output size), not 20 -> (batch, 40, max_length)
attention = Permute([2, 1])(attention)   # (batch, max_length, 40), same shape as `activations`
sent_representation = merge([activations, attention], mode='mul')  # element-wise weighting; in Keras 2 this is multiply([activations, attention])
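Partially answering my own RepeatVector question, here is a minimal shape sketch of what I believe RepeatVector and Permute are doing (assuming max_length = 100 and the 40-dimensional Bi-LSTM output; the variable names are just for illustration):

from keras.layers import Input, RepeatVector, Permute
from keras.models import Model

a = Input(shape=(100,))          # the softmaxed attention weights, one scalar per timestep
r = RepeatVector(40)(a)          # (None, 40, 100): the weight vector copied 40 times
p = Permute([2, 1])(r)           # (None, 100, 40): lines up with `activations` for the element-wise mul
print(Model(a, p).output_shape)  # (None, 100, 40)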
Now, again, I am not sure why this line is here.
sent_representation = Lambda(lambda xin: K.sum(xin, axis=-2), output_shape=(40,))(sent_representation)  # sum over the time axis -> (batch, 40)
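My current reading (which may be wrong) is that the mul merge plus this Lambda compute the attention-weighted sum of the hidden states over time, i.e. roughly r = \sum_t \alpha_t h_t from the paper. A NumPy sketch of that interpretation:

import numpy as np

H = np.random.rand(100, 40)            # Bi-LSTM outputs for one sentence: (timesteps, 2 * lstm_units)
alpha = np.random.rand(100)            # attention weights, one per timestep
alpha /= alpha.sum()                   # pretend these came out of the softmax
r = (H * alpha[:, None]).sum(axis=0)   # element-wise weighting, then sum over time -> shape (40,)
print(r.shape)                         # (40,)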
Since I have two classes, the final softmax is:
probabilities = Dense(2, activation='softmax')(sent_representation)