Freezing layers during training with a CustomCallback() class

2024-04-30

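I am training a custom CNN model in TensorFlow and want to freeze some of its layers at specific epochs while training is still running. I can already freeze layers, but only by training for a few epochs, setting the trainable attribute of the layers I want to freeze to False, recompiling the model, and then starting training again.

I tried to do this with a CustomCallback() class that freezes certain layers at certain epochs, but it does not seem to work. According to the TensorFlow documentation, after changing a layer's .trainable attribute you have to compile the model for the change to take effect, but doing so inside the callback raises "TypeError: 'NoneType' object is not callable".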

Here is my code:

Load the libraries

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.utils import Sequence
from tensorflow.keras.models import load_model

Load the dataset

#Load dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
#Normalize
X_train, X_test = X_train/255.0, X_test/255.0

Build the model

cnn = models.Sequential([
    
    layers.Conv2D(filters = 32, kernel_size = (1,1), padding = "same", activation = "relu", input_shape = (32,32,3)),
    layers.Conv2D(filters = 64, kernel_size = (3,3), padding = "same", activation = "relu"),
    layers.MaxPool2D(pool_size = (2,2)),
    
    layers.Conv2D(filters = 64, kernel_size = (3,3), padding = "same", activation = "relu"),
    layers.Conv2D(filters = 128, kernel_size = (5,5), padding = "same", activation = "relu"),
    layers.MaxPool2D(pool_size = (2,2)),
    
    layers.Flatten(),
    layers.Dense(64, activation = "relu"),
    layers.Dense(128, activation = "relu"),
    layers.Dense(64, activation = "relu"),
    layers.Dense(10, activation = "softmax")  
])

Create the custom callback class

class CustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs = None):
        if epoch == 5:
            cnn.layers[0].trainable, cnn.layers[1].trainable, cnn.layers[2].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
        elif epoch == 10:
            cnn.layers[3].trainable, cnn.layers[4].trainable, cnn.layers[5].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
        elif epoch == 15:
            cnn.layers[6].trainable, cnn.layers[7].trainable, cnn.layers[8].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

Define the optimizer and compile

#Define the optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate = 0.001)

#Compile the model
cnn.compile(optimizer = optimizer , loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

Train the model

results = cnn.fit(X_train, y_train, epochs = 20, validation_data = (X_test, y_test), batch_size = 1024, callbacks = [CustomCallback()])

This raises "TypeError: 'NoneType' object is not callable". If I do not recompile the model after freezing the layers, no error appears, but all layers keep being updated in every epoch.


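As pointed out, the model must be recompiled for a change in a layer's trainable state to take effect. So I trained the model for 5 epochs and saved the weights to a file. Then I set layer 7 to not trainable, recompiled the model, loaded the saved weights back into it, and ran 5 more epochs. At the end of those epochs I compared layer 7's weights with the ones I had loaded, and they were identical. The code, starting right after the model is compiled, looks like this: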

print('{0:^8s}{1:^80s}{2:^12s}'. format('Layer', 'Layer Description', 'Trainable'))
for i, layer in enumerate(cnn.layers):    
    print( '{0:^8s}{1:^80s}{2:^12s}'. format(str(i), str(layer), str(layer.trainable)))

This simply prints information about each layer in the model, as shown in the output below:

Layer                                 Layer Description                                 Trainable  
   0            <keras.layers.convolutional.Conv2D object at 0x00000261CCB7A370>            True    
   1            <keras.layers.convolutional.Conv2D object at 0x00000261E55F4700>            True    
   2            <keras.layers.pooling.MaxPooling2D object at 0x00000261E55F4970>            True    
   3            <keras.layers.convolutional.Conv2D object at 0x00000261E567B160>            True    
   4            <keras.layers.convolutional.Conv2D object at 0x00000261E567B280>            True    
   5            <keras.layers.pooling.MaxPooling2D object at 0x00000261E55F44C0>            True    
   6            <keras.layers.core.flatten.Flatten object at 0x00000261E567B700>            True    
   7              <keras.layers.core.dense.Dense object at 0x00000261E567BD30>              True    
   8              <keras.layers.core.dense.Dense object at 0x00000261E5680070>              True    
   9              <keras.layers.core.dense.Dense object at 0x00000261E56802B0>              True    
   10             <keras.layers.core.dense.Dense object at 0x00000261E56805B0>              True    

I then trained the model for 5 epochs and printed out the weights and biases. The code is below:

history=cnn.fit(x=train_gen,   epochs=5, verbose=1,   validation_data=valid_gen,
                   validation_steps=None,  shuffle=True,  initial_epoch=0) # train the model
weights_and_biases=cnn.layers[7].get_weights()
weights=weights_and_biases[0]
print ('shape of weights is= ',weights.shape) # has 64 nodes receiving 131072 inputs from the flatten layer
biases=weights_and_biases[1]
print ('shape of biases is- ',biases.shape)
first_10_weights=weights[0][0:10]
print (first_10_weights)
first_10_biases=biases[0:10]
print (first_10_biases)

The printout of the weights and biases at the end of the 5th epoch is shown below:

shape of weights is=  (131072, 64)
shape of biases is-  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]

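Then I saved the weights to a file, changed layer 7 to not trainable, and recompiled the model. After compiling, I loaded the saved weights into the model and printed the weights and biases again to make sure they had loaded correctly. The code is below: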

filepath=r'C:\DATASETS\spiders\run1.h5' # save the weights at the end of 5 epochs to this file
cnn.save_weights(filepath, overwrite=True, save_format=None, options=None) # save the weights
cnn.layers[7].trainable=False # make layer 7 not trainable
cnn.compile(optimizer = optimizer , loss = "categorical_crossentropy", metrics = ["accuracy"]) # compile the model
cnn.load_weights(filepath, by_name=False, skip_mismatch=False, options=None) # load the model with the saved weights
weights_and_biases=cnn.layers[7].get_weights() #get the weights to make sure they are the same as at the end of epoch 5
weights=weights_and_biases[0] # print out the weights
print ('shape of weights is= ',weights.shape) # has 64 nodes receiving 131072 inputs from the flatten layer
biases=weights_and_biases[1]
print ('shape of biases is- ',biases.shape)
first_10_weights=weights[0][0:10]
print (first_10_weights)
first_10_biases=biases[0:10]
print (first_10_biases)

The printout is shown below and is as expected:

shape of weights is=  (131072, 64)
shape of biases is-  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]

I then trained for another 5 epochs. At the end of those epochs I printed the layer 7 weights, which should not have changed. The code is shown below:

history=cnn.fit(x=train_gen,   epochs=5, verbose=1,   validation_data=valid_gen,
                   validation_steps=None,  shuffle=True,  initial_epoch=0) # train the model
weights_and_biases=cnn.layers[7].get_weights()
weights=weights_and_biases[0]
print ('shape of weights is= ',weights.shape) # has 64 nodes receiving 131072 inputs from the flatten layer
biases=weights_and_biases[1]
print ('shape of biases is- ',biases.shape)
first_10_weights=weights[0][0:10]
print (first_10_weights)
first_10_biases=biases[0:10]
print (first_10_biases)

The printout below shows that the weights and biases did not change:

shape of weights is=  (131072, 64)
shape of biases is-  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]
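Comparing the first ten printed values by eye only covers a small slice of the (131072, 64) weight matrix. A minimal sketch of a fuller check follows, assuming the weights come from `layer.get_weights()` (which returns a list of NumPy arrays). The helper name `weights_unchanged` is hypothetical and not part of Keras.

```python
import numpy as np


def weights_unchanged(before, after):
    """Return True if two get_weights()-style lists hold identical arrays."""
    if len(before) != len(after):
        return False
    # np.array_equal also returns False when the shapes differ.
    return all(np.array_equal(a, b) for a, b in zip(before, after))


# Intended use with the model from this answer (not executed here):
#   before = cnn.layers[7].get_weights()   # taken right after load_weights()
#   cnn.fit(...)                            # the second 5 epochs
#   print(weights_unchanged(before, cnn.layers[7].get_weights()))
```

Exact equality is used on purpose: a frozen layer should not receive any update at all, so even a tiny difference would indicate that the layer is still being trained.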

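So the process is: build and compile your model, run N epochs, save the weights to a file, change the trainable state of the layers, recompile the model, load the saved weights, and continue training.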
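The steps above can be sketched as a loop that runs one `fit()` call per phase instead of recompiling inside a callback. This is only an illustration under stated assumptions: `train_in_phases`, `freeze_at`, and `make_compile_kwargs` are hypothetical names, the model is assumed to expose the usual Keras `layers`, `compile()`, and `fit()` interface, and the save/load step from the answer is left out for brevity.

```python
def train_in_phases(model, x, y, freeze_at, total_epochs,
                    make_compile_kwargs, **fit_kwargs):
    """Train `model`, freezing more layers at the epochs listed in `freeze_at`.

    freeze_at: dict mapping a 0-based epoch index to the layer indices that
        should be frozen from that epoch onward, e.g. {5: [0, 1, 2]}.
    make_compile_kwargs: zero-argument callable returning the keyword
        arguments for model.compile() (assumption: called once per phase so
        each phase can get a fresh optimizer).
    """
    # Split [0, total_epochs) into phases at every freeze epoch.
    boundaries = sorted(e for e in freeze_at if 0 < e < total_epochs)
    starts = [0] + boundaries
    ends = boundaries + [total_epochs]

    histories = []
    for start, end in zip(starts, ends):
        # Freeze the layers scheduled for the start of this phase.
        for index in freeze_at.get(start, []):
            model.layers[index].trainable = False
        # Recompile outside of fit() so the trainable change takes effect.
        model.compile(**make_compile_kwargs())
        # In Keras, `epochs` is the index of the final epoch, not a count,
        # so initial_epoch=start and epochs=end runs (end - start) epochs.
        histories.append(
            model.fit(x, y, initial_epoch=start, epochs=end, **fit_kwargs)
        )
    return histories
```

With the model from the question, a call might look like `train_in_phases(cnn, X_train, y_train, {5: [0, 1, 2], 10: [3, 4, 5], 15: [6, 7, 8]}, 20, lambda: dict(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="sparse_categorical_crossentropy", metrics=["accuracy"]), validation_data=(X_test, y_test), batch_size=1024)`. Creating a new optimizer per phase is a design choice of this sketch and resets the optimizer state at each boundary. To follow the answer exactly, add `save_weights()` before and `load_weights()` after the `compile()` call.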
