ML Zoomcamp 2023 – Deep Learning – Part 6

This part is divided into two sections. This article covers the first one, in which we read image data using the ImageDataGenerator. The second article covers the second one, in which we train an Xception model.

  1. Transfer Learning – Part 1/2
    1. Reading data with ImageDataGenerator

Transfer Learning – Part 1/2

This time we want to use a network that has already been trained on ImageNet. The filters it learned are quite generic, so they can be reused for many purposes. The convolutional part of the model takes an image and converts it into a vector representation. This part (everything except the dense layers) is generic, and we don’t really change it for our task. Training the filters in the convolutional layers from scratch is very difficult because it requires a lot of images.

After the conversion into a vector representation there is a set of dense layers that make the final prediction. These dense layers are specific to the dataset the model was trained on. In this case that is ImageNet, which has 1,000 different classes that we don’t need for our task. So while the vector representation is very useful, the dense layers are not. That means we keep the convolutional layers and train new dense layers on top of them. This is the idea behind transfer learning.
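As a rough preview of that idea, the sketch below keeps the pretrained convolutional part and puts a new dense layer on top. The helper name build_transfer_model and its parameters are hypothetical, and the choice of a single dense layer after global average pooling is an assumption made for illustration; the second article covers the actual training setup.

```python
from tensorflow import keras


def build_transfer_model(num_classes=10, input_shape=(150, 150, 3), weights="imagenet"):
    """Hypothetical helper: frozen Xception base plus a new dense head."""
    # include_top=False drops the ImageNet-specific dense layers and keeps
    # only the convolutional part that produces the feature maps.
    base_model = keras.applications.Xception(
        weights=weights,
        include_top=False,
        input_shape=input_shape,
    )
    # Freeze the pretrained filters so that only the new layers are trained.
    base_model.trainable = False

    inputs = keras.Input(shape=input_shape)
    features = base_model(inputs, training=False)
    # Collapse the feature maps into one vector per image.
    vectors = keras.layers.GlobalAveragePooling2D()(features)
    # New dense layer for our own classes; it outputs raw scores (logits).
    outputs = keras.layers.Dense(num_classes)(vectors)
    return keras.Model(inputs, outputs)
```

Calling build_transfer_model() with the default arguments downloads the ImageNet weights on first use. Passing weights=None builds the same architecture with random weights, which is only useful for checking shapes.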

Reading data with ImageDataGenerator

We use an image size of 150×150 to experiment faster: a 299×299 image has roughly four times as many pixels, so training on it is correspondingly more expensive. That means we train on smaller images first and retrain on the larger images at the end. batch_size is the number of images that are read at once. With a batch size of 32, the shape of one batch will be (32, 150, 150, 3). The batch then goes through the convolutional layers, which gives us 32 vectors and, at the end, 32 predictions.
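The "four times" figure can be checked with a few lines of arithmetic. This is only a pixel count; actual training time also depends on the hardware and the model, so treat the ratio as a rough guide.

```python
# Pixels per channel for the two image sizes.
small = 150 * 150
large = 299 * 299
ratio = large / small

print(small, large, round(ratio, 2))
# 22500 89401 3.97

# Shape of one batch: (batch_size, height, width, channels).
batch_size = 32
batch_shape = (batch_size, 150, 150, 3)
values_per_batch = batch_size * 150 * 150 * 3
print(batch_shape, values_per_batch)
# (32, 150, 150, 3) 2160000
```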

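from tensorflow.keras.preprocessing.image import ImageDataGenerator
# preprocess_input has to match the pretrained model; Xception is the model used in this series
from tensorflow.keras.applications.xception import preprocess_input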

train_gen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_ds = train_gen.flow_from_directory(
    './clothing-dataset-small/train',
    target_size=(150, 150),
    batch_size=32
)

# Output: Found 3068 images belonging to 10 classes.

For now we see that 3,068 images were found, belonging to 10 classes. To see which classes there are, we can use:

train_ds.class_indices

# Output: 
# {'dress': 0,
#  'hat': 1,
#  'longsleeve': 2,
#  'outwear': 3,
#  'pants': 4,
#  'shirt': 5,
#  'shoes': 6,
#  'shorts': 7,
#  'skirt': 8,
#  't-shirt': 9}

The names are inferred from the folder structure. That means everything that is inside the folder t-shirt is put under the t-shirt class.

# !ls -l clothing-dataset-small/train
!ls clothing-dataset-small/train

# Output: dress  hat  longsleeve	outwear  pants	shirt  shoes  shorts  skirt  t-shirt
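To make the folder-to-class mapping concrete, here is a minimal sketch that rebuilds such a dictionary from a directory. It assumes that indices are assigned in alphabetical order of the folder names, which matches the class_indices output above; it is an illustration, not the library's own code, and the four folder names are just a subset chosen for the example.

```python
import os
import tempfile

# Folders are created in arbitrary order to show that sorting decides the index.
class_names = ["t-shirt", "dress", "shoes", "hat"]

with tempfile.TemporaryDirectory() as root:
    for name in class_names:
        os.makedirs(os.path.join(root, name))

    # Keep sub-directories only and sort them by name.
    subdirs = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_indices = {name: index for index, name in enumerate(subdirs)}

print(class_indices)
# {'dress': 0, 'hat': 1, 'shoes': 2, 't-shirt': 3}
```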

Then we can look at what this dataset generates. train_ds is an iterator, so we can call next() on it to get the next batch (a for-loop does this internally).

next(train_ds)

# Output:
# (array([[[[ 0.03529418, -0.09803921, -0.30196077],
#                  [ 0.05098045, -0.08235294, -0.2862745 ],
#                  [ 0.06666672, -0.06666666, -0.27058822],
#                    ...,
#                  [ 0.07450986, -0.05882353, -0.26274508],
#                  [ 0.02745104, -0.10588235, -0.3098039 ],
#                  [ 0.04313731, -0.09019607, -0.29411763]],
#        
#                 [[ 0.04313731, -0.09019607, -0.29411763],
#                  [ 0.06666672, -0.06666666, -0.27058822],
#                  [ 0.082353  , -0.05098039, -0.25490195],
#                 ...,
#                  [ 0.09803927, -0.03529412, -0.23921567],
#                  [ 0.14509809,  0.01176476, -0.19215685],
#                  [ 0.09803927, -0.03529412, -0.23921567]],
#       
#                ...,
#       ...
#                 [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
#                 [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
#                 [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
#                 [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
#                 [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]], dtype=float32))

This returns two things: the features X, which are the images, and y, which are the labels.

X, y = next(train_ds)
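The relation between next() and a for-loop can be tried out without any image files. The generator toy_batches below is a hypothetical stand-in for train_ds that yields random data with the same shapes; it is not how ImageDataGenerator is implemented.

```python
import numpy as np


def toy_batches(num_images=70, batch_size=32, num_classes=10):
    """Hypothetical stand-in for train_ds: yields (X, y) batches of random data."""
    rng = np.random.default_rng(0)
    for start in range(0, num_images, batch_size):
        size = min(batch_size, num_images - start)
        X = rng.uniform(-1, 1, size=(size, 150, 150, 3)).astype("float32")
        labels = rng.integers(0, num_classes, size=size)
        # One-hot encode the integer labels.
        y = np.eye(num_classes, dtype="float32")[labels]
        yield X, y


batches = toy_batches()

# Fetch the first batch manually.
X_toy, y_toy = next(batches)
print(X_toy.shape, y_toy.shape)
# (32, 150, 150, 3) (32, 10)

# A for-loop (here inside a comprehension) calls next() for us until the data runs out.
remaining = [xb.shape[0] for xb, _ in batches]
print(remaining)
# [32, 6]
```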

The output of X is nothing new. A batch of images is loaded and then the preprocess_input function is applied to it. Looking at the shape of X, we see that it is exactly what we expect.

X

# Output:
# array([[[[ 0.30980396,  0.20784318,  0.13725495],
#            [ 0.30980396,  0.16078436,  0.20784318],
#            [ 0.22352946,  0.04313731,  0.10588241],
#        ...,
# ...
#            [ 0.30980396,  0.06666672, -0.64705884],
#         ...,
#            [ 0.32549024,  0.05882359, -0.70980394],
#            [ 0.3176471 ,  0.07450986, -0.6392157 ],
#            [ 0.3176471 ,  0.09019613, -0.6313726 ]]]], dtype=float32)

X.shape

# Output: (32, 150, 150, 3)
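The values in X lie between -1 and 1 rather than between 0 and 255. Assuming preprocess_input is the Xception version, it rescales each pixel to that range. The function scale_to_minus_one_one below is a hypothetical re-implementation of that scaling for illustration, not the library function itself.

```python
import numpy as np


def scale_to_minus_one_one(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype("float32") / 127.5 - 1.0


# Darkest value, a mid-range value, and the brightest value.
raw = np.array([0, 132, 255], dtype="uint8")
scaled = scale_to_minus_one_one(raw)
print(scaled)
```

Under this assumption a raw pixel value of 132 maps to roughly 0.0353, which is the first number in the batch printed earlier.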

One-hot encoding is used for the labels. As we have seen before, the last column with index 9 is for t-shirts. So among the first five examples we have 3 t-shirts, 1 pants, and 1 shoes. This is how we do multi-class classification. You can also think of this as 10 different binary variables and fitting 10 different models. In reality it is only one model, but conceptually you can think of it this way.

train_ds.class_indices

# Output: 
# {'dress': 0,
#  'hat': 1,
#  'longsleeve': 2,
#  'outwear': 3,
#  'pants': 4,
#  'shirt': 5,
#  'shoes': 6,
#  'shorts': 7,
#  'skirt': 8,
#  't-shirt': 9}

y[:5]

# Output:
# array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
#             [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
#             [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
#             [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
#             [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]], dtype=float32)
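Reading the class names off the one-hot rows by hand gets tedious. A short sketch of how to decode them is shown here; the arrays are copied from the outputs above, and the names index_to_class and y_first5 are introduced only for this example.

```python
import numpy as np

class_indices = {
    "dress": 0, "hat": 1, "longsleeve": 2, "outwear": 3, "pants": 4,
    "shirt": 5, "shoes": 6, "shorts": 7, "skirt": 8, "t-shirt": 9,
}
# Invert the mapping so that we can look up a name by its index.
index_to_class = {index: name for name, index in class_indices.items()}

y_first5 = np.array(
    [
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
        [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    ],
    dtype="float32",
)

# The position of the 1 in each row is the class index.
class_ids = y_first5.argmax(axis=1)
names = [index_to_class[int(i)] for i in class_ids]
print(names)
# ['t-shirt', 't-shirt', 't-shirt', 'pants', 'shoes']
```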

Let’s do the same for validation data.

val_gen = ImageDataGenerator(preprocessing_function=preprocess_input)

val_ds = val_gen.flow_from_directory(
    './clothing-dataset-small/validation',
    target_size=(150, 150),
    batch_size=32,
    shuffle=False
)

# Output: Found 341 images belonging to 10 classes.

The output is very similar to what we saw before. It finds 341 images belonging to the same 10 classes.
