Regularization and Dropout
When we train a model for 10 epochs, it sees the same image 10 times. If an image contains a distinctive detail, such as a logo, the model may memorize it and start declaring everything with that logo a t-shirt. On the validation data this leads to many misclassifications, i.e. worse generalization. Instead, the model should pay attention to more general features, such as shapes, rather than specific details.
What if we could randomly hide a part of the image? That is the main idea behind dropout. In dropout, however, we don't hide a part of the image itself but a part of a layer's input, so the same idea is applied to the inner layers of the network.
Let's assume one fully connected dense layer with four inputs and three outputs. Dropout means that we freeze a part of this layer: the frozen part is not used and not updated in the current iteration, and in the next iteration a different part is frozen. By doing this we force the neural network to focus on the bigger picture (shapes instead of details). At prediction time nothing is frozen, so the output layer still sees all parts of the network, including the ones that were dropped during training.
Regularization means that we introduce something that prevents the neural network from overfitting to patterns that might not generalize. A droprate of 0.5 means that in each iteration we freeze 50% of the layer. Dropout does not change the dimensionality of the layer.
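To see what this looks like in practice, here is a small toy example (not part of the original notes) using Keras' Dropout layer. During training roughly half of the values are zeroed out and the remaining ones are scaled by 1 / (1 - rate); at prediction time the layer does nothing, and the shape of the output never changes.

import numpy as np
from tensorflow import keras

x = np.ones((1, 8), dtype='float32')
dropout = keras.layers.Dropout(0.5)

# training=True: roughly 50% of the entries become 0, the rest are scaled to 2.0
print(dropout(x, training=True).numpy())

# training=False (prediction): the input passes through unchanged
print(dropout(x, training=False).numpy())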
We’ll look at these points:
- Regularizing by freezing a part of the network
- Adding dropout to our model
- Experimenting with different dropout rates
def make_model(learning_rate=0.01, size_inner=100, droprate=0.5):
    base_model = Xception(
        weights='imagenet',
        include_top=False,
        input_shape=(150, 150, 3)
    )

    base_model.trainable = False

    #########################################

    inputs = keras.Input(shape=(150, 150, 3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    # dropout goes between the inner dense layer and the output layer
    drop = keras.layers.Dropout(droprate)(inner)
    outputs = keras.layers.Dense(10)(drop)
    model = keras.Model(inputs, outputs)

    #########################################

    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    model.compile(
        optimizer=optimizer,
        loss=loss,
        metrics=['accuracy']
    )

    return model
The downside of dropout is that the model needs more iterations to learn. Therefore we increase the number of epochs from 10 to 30.
learning_rate = 0.001
size = 100

scores = {}

for droprate in [0.0, 0.2, 0.5, 0.8]:
    print(droprate)

    model = make_model(
        learning_rate=learning_rate,
        size_inner=size,
        droprate=droprate
    )

    history = model.fit(train_ds, epochs=30, validation_data=val_ds)
    scores[droprate] = history.history

    print()
    print()
# Output:
# 0.0
# Epoch 1/30
# 96/96 [==============================] - 128s 1s/step - loss: 0.9759 - accuracy: 0.6659 - val_loss: 0.6152 - val_accuracy: 0.7859
# Epoch 2/30
# 96/96 [==============================] - 102s 1s/step - loss: 0.5078 - accuracy: 0.8233 - val_loss: 0.6221 - val_accuracy: 0.7859
# Epoch 3/30
# 96/96 [==============================] - 104s 1s/step - loss: 0.3430 - accuracy: 0.8908 - val_loss: 0.5946 - val_accuracy: 0.7977
# ...
# Epoch 29/30
# 96/96 [==============================] - 103s 1s/step - loss: 0.0145 - accuracy: 0.9971 - val_loss: 0.7599 - val_accuracy: 0.8299
# Epoch 30/30
# 96/96 [==============================] - 103s 1s/step - loss: 0.0158 - accuracy: 0.9974 - val_loss: 0.7727 - val_accuracy: 0.8182
# 0.5
# Epoch 1/30
# 96/96 [==============================] - 110s 1s/step - loss: 1.2975 - accuracy: 0.5593 - val_loss: 0.7498 - val_accuracy: 0.7537
# Epoch 2/30
# 96/96 [==============================] - 101s 1s/step - loss: 0.8512 - accuracy: 0.7053 - val_loss: 0.6457 - val_accuracy: 0.8006
# Epoch 3/30
# 96/96 [==============================] - 102s 1s/step - loss: 0.7035 - accuracy: 0.7487 - val_loss: 0.5934 - val_accuracy: 0.8123
# ...
# Epoch 29/30
# 96/96 [==============================] - 100s 1s/step - loss: 0.6695 - accuracy: 0.7402 - val_loss: 0.5769 - val_accuracy: 0.8211
# Epoch 30/30
# 96/96 [==============================] - 100s 1s/step - loss: 0.6575 - accuracy: 0.7324 - val_loss: 0.5613 - val_accuracy: 0.8240
for droprate, hist in scores.items():
    plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))

plt.ylim(0.78, 0.86)
plt.legend()

hist = scores[0.0]
plt.plot(hist['val_accuracy'], label=0.0)
hist = scores[0.2]
plt.plot(hist['val_accuracy'], label=0.2)
plt.legend()
#plt.plot(hist['accuracy'], label=('val=%s' % droprate))
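Reading exact numbers off the plot can be tricky, so as a complement we could also compare the dropout rates numerically. This is a small sketch that is not part of the original notes; averaging over the last 5 epochs is an arbitrary choice to smooth out the epoch-to-epoch noise.

import numpy as np

for droprate, hist in scores.items():
    # mean validation accuracy over the last 5 epochs
    print(droprate, np.mean(hist['val_accuracy'][-5:]))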
