Exploring Transfer Learning for Image Classification with MobileNet
Chapter 1: Introduction to Transfer Learning
In today's rapidly advancing world of data science, companies are adopting an ever-growing range of methods and algorithms, integrating them into their operations for cost efficiency, customer interaction, and product innovation. A significant area of focus within this domain is computer vision, particularly image classification, where developing a model from the ground up can be resource-intensive and time-consuming.
This article aims to introduce you to the concept and application of transfer learning using Python. With transfer learning, you can leverage pre-existing convolutional neural network (CNN) models for image classification, achieving commendable performance without the need to build a model from scratch. Read on to discover and try this tutorial for yourself!
Chapter 1.1: Personal Experience in Image Classification
Reflecting on my time at university, I recall participating in a national data mining competition two years ago, where I encountered an image classification challenge. As a Statistics major, the complexity of this subject was daunting at first. However, I pursued knowledge through online resources and collaborated with friends from the computer science program.
My team was unaware of transfer learning, so we built a CNN model from scratch, training it for only ten epochs on 800 images across five classes. With just three hours for image preprocessing, model development, and presentation preparation, we ended up ranking 4th out of 10, achieving a mere 16.5% mean average precision (mAP).
After the competition, I spoke with the winning team and learned they had employed transfer learning, which significantly contributed to their success.
Chapter 1.2: Understanding Transfer Learning
Transfer learning involves utilizing a pre-trained model, which has been trained on a substantial dataset, to address a different yet related problem. This process can be likened to the relationship between a teacher and a student, where the pre-trained model imparts knowledge to help solve a new challenge.
Transfer learning gained traction with the ImageNet competition launched in 2010, which provides over 1.2 million training images, 50,000 validation images, and 100,000 test images, categorized into 1,000 classes. Each year, the top-performing models are recognized, with CoAtNet-7 reaching an impressive top-1 accuracy of 90.88% in 2021.
Chapter 1.3: Types of Transfer Learning Methods
There are three primary methods of transfer learning, each suitable for different scenarios:
- Fixed Feature Extractor: The pre-trained convolutional base is used as a frozen feature extractor. Its weights stay fixed, the original fully connected classifier is removed, and a new classifier is trained on top.
- Fine-tuning: The architecture of the pre-trained model is preserved and its weights serve as the starting point, but the feature-extraction layers continue training on the new data.
- Hybrid Approach: A combination of the two, where some layers are frozen while others are trained.
Selecting the appropriate transfer learning method depends on the nature of the problem and the data at hand; the sketch below shows how each option translates into code.
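To make the three options concrete, here is a minimal Keras sketch. It is illustrative only: num_classes, the cut-off of 100 layers, and the classifier head are placeholder assumptions, not part of this tutorial's dataset.
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2

num_classes = 5  # hypothetical number of target classes
base_model = MobileNetV2(input_shape=(160, 160, 3), include_top=False, weights='imagenet')

# Option 1 - Fixed feature extractor: freeze the entire convolutional base.
base_model.trainable = False

# Option 2 - Fine-tuning: keep training the pre-trained weights,
# typically with a small learning rate.
# base_model.trainable = True

# Option 3 - Hybrid: freeze the early layers, train the later ones.
# base_model.trainable = True
# for layer in base_model.layers[:100]:  # 100 is an arbitrary example cut-off
#     layer.trainable = False

# In all three cases, attach a new classifier head for the target task.
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])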
Chapter 1.4: Choosing the Right Method
To determine the best method for your scenario, consider the following quadrants:
- Quadrant 1: Large dataset with low similarity – better to build a model from scratch.
- Quadrant 2: Large dataset with high similarity – consider training some layers while freezing others.
- Quadrant 3: Small dataset with low similarity – the trickiest case; as in Quadrant 2, train some layers while freezing the rest.
- Quadrant 4: Small dataset with high similarity – implement the fixed feature extractor method.
The greater the similarity between your data and that of the pre-trained model, the more layers can be frozen.
Chapter 2: The Case for MobileNet
Returning to my university experience, the winning team utilized MobileNet as their pre-trained model. They achieved an mAP of approximately 85% on images of leaf types, making MobileNet an excellent starting point for this tutorial.
According to M. Hollemans (2018), MobileNet offers several advantages:
- Lightweight architecture, reducing computational demands.
- Effective for deployment on mobile devices.
However, these benefits come with a trade-off in raw accuracy: the best reported top-1 accuracy for MobileNet V2 on ImageNet is around 74.70%, well below the 90.88% of CoAtNet-7 mentioned earlier.
Chapter 2.1: Implementing Transfer Learning
To begin, download the dataset consisting of two classes (cats and dogs) with 8,000 training images and 2,000 validation images. The Jupyter notebook containing the implementation details is available at the end of this article.
Organize your files and folders as follows:
TRANSFER LEARNING
├── data
│   ├── train
│   │   ├── cats
│   │   │   ├── cat.1.jpg
│   │   │   ├── cat.2.jpg
│   │   │   ├── ...
│   │   │   └── cat.100.jpg
│   │   └── dogs
│   │       ├── dog.1.jpg
│   │       ├── dog.2.jpg
│   │       ├── ...
│   │       └── dog.100.jpg
│   └── validation
│       ├── cats
│       │   ├── cat.4001.jpg
│       │   ├── cat.4002.jpg
│       │   ├── ...
│       │   └── cat.4030.jpg
│       └── dogs
│           ├── dog.4001.jpg
│           ├── dog.4002.jpg
│           ├── ...
│           └── dog.4030.jpg
└── Transfer Learning - Feature Extraction.ipynb
If Keras and TensorFlow are not installed on your machine, consider using Google Colab, where you can upload the dataset to Google Drive.
Note: This hands-on tutorial utilizes Google Colab.
Chapter 2.2: Setting Up the Environment
import os
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotnine  # needed for plotnine.options used below
from plotnine import *

import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
# Note: in TensorFlow >= 2.6 these preprocessing layers live directly in tensorflow.keras.layers
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation, Rescaling
from tensorflow.keras.applications import MobileNetV2

warnings.filterwarnings('ignore', category=FutureWarning)
Mount your Google Drive and set the directory paths as follows:
from google.colab import drive
drive.mount('/content/gdrive')
root_path = 'gdrive/MyDrive/TRANSFER LEARNING/'
train_dir = os.path.join(root_path, 'data/train')
validation_dir = os.path.join(root_path, 'data/validation')
Load images using the image_dataset_from_directory method:
BATCH_SIZE = 32
IMG_SIZE = (160, 160)

# Build tf.data datasets directly from the directory structure;
# the subfolder names (cats, dogs) become the class labels
train_dataset = image_dataset_from_directory(
    train_dir,
    shuffle=True,
    batch_size=BATCH_SIZE,
    image_size=IMG_SIZE
)
validation_dataset = image_dataset_from_directory(
    validation_dir,
    shuffle=True,
    batch_size=BATCH_SIZE,
    image_size=IMG_SIZE
)
Visualize the loaded images to confirm successful import:
class_names = train_dataset.class_names

# Show a 3x3 grid of sample images with their class labels
plt.figure(figsize=(10, 10))
for images, labels in train_dataset.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype('uint8'))
        plt.title(class_names[labels[i]])
        plt.axis('off')
Chapter 2.3: Preparing the Data
Given the high similarity of the dog and cat classes to the ImageNet database (Quadrant 4 in the earlier scheme), we can proceed with the fixed feature extractor method. Next, we carve a testing set out of the validation images, sizing it according to our needs:
val_batches = tf.data.experimental.cardinality(validation_dataset)
test_dataset = validation_dataset.take(val_batches // 5)        # 20% of the validation batches
validation_dataset = validation_dataset.skip(val_batches // 5)  # keep the remaining 80%

print('Number of validation batches: {}'.format(
    tf.data.experimental.cardinality(validation_dataset)))
print('Number of test batches: {}'.format(
    tf.data.experimental.cardinality(test_dataset)))
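Optionally, you can prefetch batches so that data loading overlaps with training; this is a common tf.data performance tweak rather than a required step:
AUTOTUNE = tf.data.AUTOTUNE
train_dataset = train_dataset.prefetch(buffer_size=AUTOTUNE)
validation_dataset = validation_dataset.prefetch(buffer_size=AUTOTUNE)
test_dataset = test_dataset.prefetch(buffer_size=AUTOTUNE)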
To prepare the images for MobileNet, we need to scale the pixel values from [0, 255] to the [-1, 1] range the model expects:
rescale = Rescaling(scale=1./127.5, offset=-1)  # maps [0, 255] to [-1, 1]
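As an aside, Keras also ships a ready-made preprocessing function for this model that performs the same [0, 255] to [-1, 1] mapping; either approach should give equivalent results:
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
# preprocess_input(images) rescales pixel values to [-1, 1], matching the Rescaling layer above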
Load the MobileNet model as follows:
IMG_SHAPE = IMG_SIZE + (3,)  # add the channel dimension: (160, 160, 3)
base_model = MobileNetV2(
    input_shape=IMG_SHAPE,
    include_top=False,  # drop the original ImageNet classifier head
    weights='imagenet'  # initialize with weights pre-trained on ImageNet
)
base_model.trainable = False  # freeze the convolutional base (fixed feature extractor)
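As a quick optional sanity check, you can confirm that freezing the base left no trainable variables in it:
print(len(base_model.trainable_variables))  # expected output: 0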
Chapter 2.4: Building the Model
Define the model architecture, including data preprocessing and the feature extraction layer:
inputs = tf.keras.Input(shape=(160, 160, 3))
x = RandomFlip('horizontal')(inputs)  # data augmentation: random horizontal flips
x = RandomRotation(0.2)(x)            # data augmentation: random rotations
x = rescale(x)                        # rescale pixels to [-1, 1]
x = base_model(x, training=False)     # run the frozen base in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)  # collapse feature maps into a single vector
x = tf.keras.layers.Dropout(0.2)(x)   # regularization against overfitting
outputs = tf.keras.layers.Dense(1)(x) # single logit for binary classification
model = tf.keras.Model(inputs, outputs)
Compile the model with Adam optimizer and binary cross-entropy loss:
base_learning_rate = 0.0001
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),  # from_logits=True because the Dense head outputs raw logits
    metrics=['accuracy']
)
model.summary()
Chapter 3: Training the Model
Train the model using the training dataset, setting a modest number of epochs:
initial_epochs = 10
history = model.fit(
    train_dataset,
    epochs=initial_epochs,
    validation_data=validation_dataset
)
The history object returned by fit() records the model's training and validation metrics for every epoch.
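You can list the recorded metrics before plotting them:
print(history.history.keys())
# dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])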
Chapter 3.1: Evaluating Model Performance
Plot the accuracy and loss for both training and validation datasets:
accuracy_train = history.history['accuracy']
accuracy_val = history.history['val_accuracy']
loss_train = history.history['loss']
loss_val = history.history['val_loss']
# Build a long-format data frame (one row per split/epoch pair) for plotnine
data_type = ['Training'] * initial_epochs + ['Validation'] * initial_epochs
df = pd.DataFrame({
    'Data': data_type,
    'Epoch': list(range(1, initial_epochs + 1)) * 2,
    'Accuracy': accuracy_train + accuracy_val,
    'Loss': loss_train + loss_val
})
df['Epoch'] = df['Epoch'].astype('category')  # treat epoch as discrete for plotting
df.head()
Visualize the training and validation accuracy:
plotnine.options.figure_size = (10, 4.8)
(
    ggplot(data=df) +
    geom_line(aes(x='Epoch', y='Accuracy', color='Data', group='Data'), size=1.5, show_legend=True) +
    scale_color_manual(name=' ', values=['#80797c', '#981220'], labels=['Training', 'Validation']) +
    labs(title='Training and Validation Accuracy') +
    xlab('Number of Epochs') +
    ylab('Accuracy') +
    theme_minimal()
)
Similarly, visualize the loss metrics:
plotnine.options.figure_size = (10, 4.8)
(
    ggplot(data=df) +
    geom_line(aes(x='Epoch', y='Loss', color='Data', group='Data'), size=1.5, show_legend=True) +
    scale_color_manual(name=' ', values=['#80797c', '#981220'], labels=['Training', 'Validation']) +
    labs(title='Training and Validation Loss') +
    xlab('Number of Epochs') +
    ylab('Loss') +
    theme_minimal()
)
Chapter 3.2: Testing the Model
Evaluating the model on the testing set yields an accuracy of 98.18%, indicating that the model correctly classifies the vast majority of unseen images.
loss, accuracy = model.evaluate(test_dataset)
print('Test accuracy: {}'.format(accuracy))
To visualize the model's predictions on the testing set, you can use the following code:
# Take one batch from the test set and generate predictions
image_batch, label_batch = test_dataset.as_numpy_iterator().next()
predictions = model.predict_on_batch(image_batch).flatten()
predictions = tf.nn.sigmoid(predictions)         # convert raw logits to probabilities
predictions = tf.where(predictions < 0.5, 0, 1)  # threshold at 0.5 to get class labels

plt.figure(figsize=(10, 10))
for i in range(9):
    text_pred = 'cats' if predictions[i] == 0 else 'dogs'
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(image_batch[i].astype('uint8'))
    plt.title('Prediction: {}'.format(text_pred))
    plt.axis('off')
Chapter 4: Additional Resources
To deepen your understanding of transfer learning and MobileNet, check out the following video tutorials:
- A hands-on guide to using transfer learning with MobileNet V2 for image classification.
- A tutorial on image classification using MobileNet V3 with transfer learning techniques.