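*Update at the bottom
I am trying to use recall on 2 of 3 classes as a metric, i.e. classes B and C out of classes A, B and C.
(The underlying reason is that my data is highly imbalanced across classes [~90% is class A], so when I use accuracy I get ~90% simply by predicting class A every time.)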
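To make the imbalance concrete, here is a minimal plain-Python sketch. The labels are made up for illustration (0 = A, 1 = B, 2 = C, with 90% of samples in class A) and the `accuracy` and `recall` helpers are hypothetical, not part of TensorFlow. It shows a model that always predicts A scoring high on accuracy while having zero recall on B and C.

```python
# Hypothetical labels: 0 = A, 1 = B, 2 = C, with 90% of samples in class A.
y_true = [0] * 90 + [1] * 5 + [2] * 5
# A degenerate "model" that always predicts class A.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, class_id):
    """True positives / actual positives for one class (0.0 if the class is absent)."""
    actual = sum(t == class_id for t in y_true)
    true_pos = sum(t == class_id and p == class_id for t, p in zip(y_true, y_pred))
    return true_pos / actual if actual else 0.0


baseline_accuracy = accuracy(y_true, y_pred)
baseline_recalls = [recall(y_true, y_pred, c) for c in (0, 1, 2)]
print(baseline_accuracy, baseline_recalls)
```

Recall on B and C is the quantity the metrics in my compile call are meant to track.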
model.compile(
    loss='sparse_categorical_crossentropy',  # or categorical_crossentropy
    optimizer=opt,
    metrics=[tf.keras.metrics.Recall(class_id=1, name='recall_1'),
             tf.keras.metrics.Recall(class_id=2, name='recall_2')]
)
history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])
This throws an error:
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (None, 3) and (None, 1) are incompatible
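My reading of the two shapes in the message (an assumption on my part): `(None, 3)` matches the model output, one score per class, while `(None, 1)` matches the sparse integer labels, one class id per sample. The plain-Python sketch below uses made-up values and hypothetical helpers (`one_hot`, `shape`) to show how one-hot encoding the labels makes the two shapes agree.

```python
def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows, e.g. 2 -> [0, 0, 1] for 3 classes."""
    return [[1 if c == label else 0 for c in range(num_classes)] for label in labels]


def shape(rows):
    """(number of rows, number of columns) of a rectangular list of lists."""
    return (len(rows), len(rows[0]))


# Sparse labels: one integer class id per sample.
sparse_labels = [[0], [2], [1], [0]]
# Model output: one softmax score per class for each sample.
predictions = [
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
    [0.7, 0.2, 0.1],
]
# One-hot labels have the same number of columns as the predictions.
dense_labels = one_hot([row[0] for row in sparse_labels], num_classes=3)

print(shape(predictions), shape(sparse_labels), shape(dense_labels))
```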
The model summary is:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 120, 32)           19328
_________________________________________________________________
dropout (Dropout)            (None, 120, 32)           0
_________________________________________________________________
batch_normalization (BatchNo (None, 120, 32)           128
_________________________________________________________________
lstm_1 (LSTM)                (None, 120, 32)           8320
_________________________________________________________________
dropout_1 (Dropout)          (None, 120, 32)           0
_________________________________________________________________
batch_normalization_1 (Batch (None, 120, 32)           128
_________________________________________________________________
lstm_2 (LSTM)                (None, 32)                8320
_________________________________________________________________
dropout_2 (Dropout)          (None, 32)                0
_________________________________________________________________
batch_normalization_2 (Batch (None, 32)                128
_________________________________________________________________
dense (Dense)                (None, 32)                1056
_________________________________________________________________
dropout_3 (Dropout)          (None, 32)                0
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 99
=================================================================
Total params: 37,507
Trainable params: 37,315
Non-trainable params: 192
Note that the model works fine, without errors, if I use:
metrics=['accuracy']
but this https://github.com/tensorflow/tensorflow/issues/37104 and this https://github.com/tensorflow/tensorflow/issues/42383 made me think that a sparse-categorical version of Recall() has not been implemented, along the lines of tf.metrics.SparseCategoricalAccuracy().
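So I turned to a custom metric, which sent me down a rabbit hole of other problems, as I am highly illiterate when it comes to classes and decorators.
I botched this together from a custom metric example (I have no idea how to use sample_weight, so I commented it out to come back to later):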
import tensorflow as tf
from sklearn.metrics import classification_report

class RelevantRecall(tf.keras.metrics.Metric):
    def __init__(self, name="Relevant_Recall", **kwargs):
        super(RelevantRecall, self).__init__(name=name, **kwargs)
        self.joined_recall = self.add_weight(name="B/C Recall", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        report_dictionary = classification_report(y_true, y_pred, output_dict=True)
        # if sample_weight is not None:
        #     sample_weight = tf.cast(sample_weight, "float32")
        #     values = tf.multiply(values, sample_weight)
        # self.joined_recall.assign_add(tf.reduce_sum(values))
        self.joined_recall.assign_add((float(report_dictionary['1.0']['recall']) + float(report_dictionary['2.0']['recall'])) / 2)

    def result(self):
        return self.joined_recall

    def reset_states(self):
        # The state of the metric will be reset at the start of each epoch.
        self.joined_recall.assign(0.0)
model.compile(
    loss='sparse_categorical_crossentropy',  # or categorical_crossentropy
    optimizer=opt,
    metrics=[RelevantRecall()]
)
history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])
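The aim of this is to return a single metric, (recall(b) + recall(c)) / 2. I imagine returning both recalls separately, as in metrics=[recall(b), recall(c)], would be better, but I can't get even the former to work.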
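For reference, the target quantity can be written out in plain Python. This is only a sketch of the bookkeeping, not a tf.keras.metrics.Metric: the class name `StreamingMeanRecall` and the sample batches are hypothetical, and predictions are assumed to be per-class scores reduced with argmax. It keeps running counts of true positives and actual positives per class and divides only in `result()`. As far as I can tell, the `assign_add` in my attempt above instead sums the per-batch averages across batches.

```python
class StreamingMeanRecall:
    """Mean recall over selected classes, accumulated across batches."""

    def __init__(self, class_ids):
        self.class_ids = tuple(class_ids)
        self.reset_state()

    def reset_state(self):
        # Running counts per selected class; cleared at the start of each epoch.
        self.true_pos = {c: 0 for c in self.class_ids}
        self.actual_pos = {c: 0 for c in self.class_ids}

    def update_state(self, y_true, y_scores):
        for label, scores in zip(y_true, y_scores):
            if label not in self.actual_pos:
                continue  # ignore classes we are not tracking (e.g. class A)
            predicted = max(range(len(scores)), key=scores.__getitem__)  # argmax
            self.actual_pos[label] += 1
            self.true_pos[label] += int(predicted == label)

    def result(self):
        # Per-class recall = TP / actual positives; 0.0 if the class never appeared.
        recalls = [
            self.true_pos[c] / self.actual_pos[c] if self.actual_pos[c] else 0.0
            for c in self.class_ids
        ]
        return sum(recalls) / len(recalls)


metric = StreamingMeanRecall(class_ids=(1, 2))
# Two made-up batches of (true labels, per-class scores).
metric.update_state([0, 1, 2], [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1]])
metric.update_state([1, 2, 0], [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])
mean_recall = metric.result()
print(mean_recall)
```

Keeping counts rather than per-batch recalls means the final value does not depend on how the data happens to be split into batches.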
I get a tensor bool error: OperatorNotAllowedInGraphError: using a 'tf.Tensor' as a Python 'bool' is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
Googling this led me to add @tf.function above my custom metric class.
That led to an old-style vs. new-style class type error:
super(RelevantRecall, self).__init__(name=name, **kwargs)
TypeError: super() argument 1 must be type, not Function
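I don't see how I caused this, since the class does have a base object?
As I said, I am quite new to all aspects of this, so any help on how to implement (and how best to implement) a metric that uses only a selection of the predicted classes would be much appreciated.
OR
If I am going about this entirely wrong, please let me know or point me to the right resource.
Ideally I would like to go with the earlier approach of using tf.keras.metrics.Recall(class_id=1, ...), as it seems the neatest way if it worked.
I am able to get the recall for each class when using a similar function in the callbacks part of the model, but this seems more intensive, as I have to run model.predict on the val/test data at the end of each epoch.
It is also unclear whether this even tells the model to focus on improving the selected classes (i.e. the difference between implementing it in a metric vs. a callback).
Callback code: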
import tensorflow as tf
from sklearn.metrics import classification_report
from tensorflow.keras.callbacks import Callback

class MetricsCallback(Callback):
    def __init__(self, test_data, y_true):
        # Should be the label encoding of your classes
        self.y_true = y_true
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs=None):
        # Here we get the probabilities - longer process
        y_pred = self.model.predict(self.test_data)
        # Here we get the actual classes
        y_pred = tf.argmax(y_pred, axis=1)
        report_dictionary = classification_report(self.y_true, y_pred, output_dict=True)
        print("\n")
        print(f"Accuracy: {report_dictionary['accuracy']} - Holds: {report_dictionary['0.0']['recall']} - Sells: {report_dictionary['1.0']['recall']} - Buys: {report_dictionary['2.0']['recall']}")
        # Mean of the recalls for classes 1 and 2
        self._data = (float(report_dictionary['1.0']['recall']) + float(report_dictionary['2.0']['recall'])) / 2
        return

metrics_callback = MetricsCallback(test_data=validation_x, y_true=validation_y)
history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint, metrics_callback])
Update 19/07/2021
- I have resorted to using categorical_crossentropy for loss instead of sparse_categorical_crossentropy.
- One-hot encoded my class/target arrays.
- Used the tf recall: tf.keras.metrics.Recall(class_id=1, name='recall_1')
I now use the code below.
train_y = tf.one_hot(train_y, 3)
validation_y = tf.one_hot(validation_y, 3)
test_y = tf.one_hot(test_y, 3)
model.compile(
    loss='categorical_crossentropy',
    optimizer=opt,
    metrics=[tf.keras.metrics.Recall(class_id=1, name='No'),
             tf.keras.metrics.Recall(class_id=2, name='Yes')]
) #tf.keras.metrics.Recall(class_id=0, name='Wait')
history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])
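Thanks to Abhishek Prajapat.
This achieves the same overall goal, and because there are only a few mutually exclusive classes, the difference in performance is probably very small.
But with a large number of mutually exclusive classes, I still have no solution that achieves the same goal with sparse_categorical_crossentropy.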