Transfer Learning with KERAS

Learning is a never-ending process, but it's just as important to put previously gained knowledge to work in new experiments.

Deep Patel
4 min readJul 21, 2023

Transfer Learning, as the name suggests, is a technique for reusing previously gained knowledge to train new, similar models. It can be regarded as a shortcut for solving both machine learning and deep learning problems, and many see it as an important direction for the future of machine learning.

Machine learning expert Andrew Ng has said of transfer learning: “Transfer Learning leads to Industrialisation”.

Transfer learning in machine learning is directly inspired by the way humans learn new things. We human beings always use our prior knowledge to perform new tasks. Some real-life examples:

  • Knowing how to ride a bicycle makes it easier to learn to ride a motorcycle.
  • Knowing what a deer looks like makes it easier to recognize an antelope (antelopes are considered the sister group to deer).

In this blog, we will be covering the following topics:

  1. The need for transfer learning.
  2. How transfer learning works, with a few examples of transfer learning in deep learning with KERAS.
  3. When to use transfer learning?

The need for transfer learning

Applied to an appropriate task, transfer learning is a boon in many ways. It:

  • Reduces the effort of building a complete model from scratch
  • Works effectively even with a small dataset
  • Saves time
  • Saves the money that would otherwise go into high-cost GPUs for retraining the model

As you can see, transfer learning has many benefits, but don't get too excited: it also has some limitations, which are discussed in the last part of this story. First, let's look at how it works and how to implement it.

How does transfer learning work?


Transfer learning mainly works with two approaches:

Develop model approach — In this approach, you build a complete model (call it A) from scratch on a huge dataset. You then extract the features learned by A and use them to train another model, B. Sometimes model A may require some changes before reuse. This approach is used when we have a large dataset and expect to reuse the resulting model for other similar experiments in the future.
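As a rough illustration of the develop model approach, the sketch below (the layer sizes and class counts are hypothetical, not from this post) trains a model A on a source task and then reuses all of its layers except the classification head as a frozen feature extractor for a new model B:

from tensorflow import keras

# Model A: built and trained from scratch on a large source dataset (hypothetical shapes).
inputs = keras.Input(shape=(150, 150, 3))
x = keras.layers.Conv2D(32, 3, activation='relu')(inputs)
x = keras.layers.MaxPooling2D()(x)
x = keras.layers.Conv2D(64, 3, activation='relu')(x)
features = keras.layers.GlobalAveragePooling2D()(x)
source_outputs = keras.layers.Dense(100, activation='softmax')(features)  # 100 source classes
model_a = keras.Model(inputs, source_outputs)
# model_a.compile(...) and model_a.fit(large_source_dataset, ...) would run here.

# Model B: reuse A's layers up to the feature vector, freeze them, and add a new head.
for layer in model_a.layers[:-1]:
    layer.trainable = False                    # keep the transferred knowledge fixed
target_outputs = keras.layers.Dense(5, activation='softmax')(features)   # 5 target classes
model_b = keras.Model(inputs, target_outputs)

Here model A plays the same role that the large published pre-trained models play in the second approach below.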

Pre-trained model approach — Here, a pre-trained model is chosen from a pool of available models. Many research institutions release models trained on large and challenging datasets, and these form the pool of candidates to choose from. Just like in the first approach, the selected model can then be used for task B, possibly with slight modifications.

Most such models are developed for competitions and then made public under a license. They are rigorously trained on high-end GPUs, often on classification tasks with up to 1,000 classes, which makes them highly effective starting points.

Famous examples of pre-trained models used for image classification (all of them available directly through keras.applications, as sketched after the list):

  • Oxford VGG Model
  • Google Inception Model
  • Microsoft ResNet Model
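
A minimal sketch of loading these families from keras.applications (using the default input shapes; the variable names are just for illustration):

from tensorflow import keras

# Each call downloads the architecture together with ImageNet-trained weights.
vgg = keras.applications.VGG16(weights='imagenet')
inception = keras.applications.InceptionV3(weights='imagenet')
resnet = keras.applications.ResNet50(weights='imagenet')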

Pre-trained models for classifying text data: these models efficiently map words to a high-dimensional continuous vector space in which words with similar meanings have similar vector representations (a minimal sketch of reusing such vectors in Keras follows the list below).

  • Google’s word2vec Model
  • Stanford’s GloVe Model
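
As a rough sketch of how such word vectors are reused in Keras (the GloVe file path and the tiny vocabulary are assumptions for illustration, not part of this post), the pre-trained vectors can be loaded into a frozen Embedding layer:

import numpy as np
from tensorflow import keras

vocab = {'deer': 1, 'antelope': 2}   # toy word index; index 0 is reserved for padding
embedding_dim = 100

# Parse a downloaded GloVe file: each line is a word followed by its vector.
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:   # assumed local path
    for line in f:
        word, *values = line.split()
        embeddings_index[word] = np.asarray(values, dtype='float32')

# Build an embedding matrix aligned with our own vocabulary.
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, idx in vocab.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[idx] = vector

# Freeze the pre-trained vectors, just as convolutional layers are frozen for images.
embedding_layer = keras.layers.Embedding(
    len(vocab) + 1,
    embedding_dim,
    embeddings_initializer=keras.initializers.Constant(embedding_matrix),
    trainable=False)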

Now let’s build an actual model

  • Setup

import numpy as np
import tensorflow as tf
from tensorflow import keras

  • Import Pre-trained model — Import a pre-trained model from Keras (here, Xception with weights pre-trained on ImageNet; ImageNet is a large dataset of over 14 million images). The number of neurons in the last layer must equal the number of classes in your own task, so we set include_top=False when importing the model to drop the original ImageNet classifier, which has 1,000 classes.
base_model = keras.applications.Xception(
    weights='imagenet',       # Load weights pre-trained on ImageNet.
    input_shape=(150, 150, 3),
    include_top=False)        # Do not include the ImageNet classifier at the top.
  • Load training data — ImageDataGenerator builds an iterator that loads images from disk and feeds them to the model in batches.
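A minimal sketch of such an iterator, assuming images are stored under a hypothetical data/train folder with one sub-directory per class (the path and parameters are not from this post):

train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,           # scale pixel values to [0, 1]
    validation_split=0.2)     # hold back 20% of the images for validation

train_generator = train_datagen.flow_from_directory(
    'data/train',             # hypothetical folder: one sub-directory per class
    target_size=(150, 150),   # match the base model's input_shape
    batch_size=32,
    class_mode='binary',      # the single-unit Dense head below expects binary labels
    subset='training')

validation_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    subset='validation')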

Refer to Keras Documentation for more information on this.

  • Compile and Fit — The final step is to compile the model with a new input layer and output layer and fit it. Remember to set base_model.trainable = False, since the base model has already been trained with lots of data and GPU power.
# First freeze the base model before adding new layers on top.
base_model.trainable = False

# Create a new model on top.
inputs = keras.Input(shape=(150, 150, 3))
# Make sure the base_model runs in inference mode here by passing
# `training=False`. This is important for fine-tuning, as shown in the
# sketch below, because it keeps the batch-normalization layers in inference mode.
x = base_model(inputs, training=False)
# Convert features of shape `base_model.output_shape[1:]` to vectors.
x = keras.layers.GlobalAveragePooling2D()(x)
# A Dense classifier with a single unit (binary classification).
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

# Train the new model on the data.
model.compile(optimizer=keras.optimizers.Adam(),
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=[keras.metrics.BinaryAccuracy()])
model.fit(new_dataset, epochs=20, callbacks=..., validation_data=...)
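
The training=False comment above alludes to fine-tuning. This post stops at feature extraction, but a minimal sketch of that optional follow-up step (the learning rate and epoch count here are assumptions) is to unfreeze the base model and briefly retrain the whole network with a very low learning rate:

# Optional fine-tuning: unfreeze the base model and retrain end-to-end.
base_model.trainable = True

# Recompile with a very low learning rate so the pre-trained weights change only slightly.
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=[keras.metrics.BinaryAccuracy()])

# Because base_model was called with training=False, its batch-normalization
# layers stay in inference mode, which keeps fine-tuning stable.
model.fit(new_dataset, epochs=10)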

But transfer learning cannot always be used.

When to use transfer learning?

Transfer learning is worth using when you can expect one or more of the following benefits on the new task:

  1. Higher start. The initial skill of the model, before any further training, is higher than it would be without transfer.
  2. Higher slope. The model improves faster during training, i.e. the learning curve is steeper.
  3. Higher asymptote. The final accuracy the model converges to is better than that of a model trained without transfer learning.
Image from “Transfer Learning”

So transfer learning is not always a success, but on problems where you don't have much data, it can enable you to develop skilful models that you simply could not build otherwise.

This was all about Transfer Learning. Thanks for reading, and I hope you liked it.
