Modifying the DeepLab code to train on your own dataset for object segmentation in images.
I work as a Research Scientist at FlixStock, focusing on Deep Learning solutions to generate and/or edit images. We identify coherent regions belonging to various objects in an image using Semantic Segmentation.
DeepLab is an ideal solution for Semantic Segmentation. The code is available in TensorFlow.
In this article, I will share how we can train a DeepLab semantic segmentation model on our own dataset in TensorFlow. But before we begin…
What is DeepLab?
DeepLab is one of the most promising techniques for semantic image segmentation with Deep Learning. Semantic segmentation is understanding an image at the pixel level, then assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.
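To make this concrete, here is a minimal sketch of what a semantic segmentation output looks like: a 2-D array with one class label per pixel. The 4×4 label map and its values are made up for illustration.
import numpy as np
# A hypothetical 4x4 label map produced by a segmentation model:
# 0 = background, 1 = foreground object
label_map = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
], dtype=np.uint8)
# Every pixel with the same label belongs to the same class.
print((label_map == 1).sum(), "pixels belong to the foreground class")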
Installation
The DeepLab implementation in TensorFlow is available in the tensorflow/models repository on GitHub, under research/deeplab.
Preparing the Dataset
Before you create your own dataset and train DeepLab, you should be very clear about what you want to do with it. Here are the two scenarios:
- Training the model from scratch: you are free to have any number of object classes (number of labels) for segmentation, but this requires a very long training time.
- Using a pre-trained model: you are still free to have any number of object classes. Use the pre-trained model and only update your classifier weights with transfer learning (see the note after this list). This takes far less training time than the first scenario.
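A practical note on the second scenario: if your number of classes differs from the checkpoint you fine-tune from (as it will for our two-class PQR dataset versus the 21-class PASCAL checkpoint), the final classification layer cannot be restored from the checkpoint. At the time of writing, DeepLab’s train.py provides the flags --initialize_last_layer=false and --last_layers_contain_logits_only=true for exactly this case; adding them to the training command shown later lets you reuse all pre-trained weights except the final logits.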
Let us name your new dataset “PQR”. Create a new folder “PQR” as:
tensorflow/models/research/deeplab/datasets/PQR
To start, all you need is input images and their pre-segmented images as ground truth for training. Input images need to be colour images and the segmented images need to be colour-indexed images. Refer to the PASCAL VOC dataset for an example; a quick way to check your annotation format is sketched below.
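One quick way to tell the two formats apart is the PIL image mode: colour-indexed (palettised) images open in mode 'P', while plain RGB annotations open in mode 'RGB'. A minimal check, with a placeholder file name:
from PIL import Image
# Placeholder path: point this at one of your own annotation images.
mask = Image.open("SegmentationClass/pqr_000032.png")
# 'P' means colour-indexed (what the DeepLab tooling expects);
# 'RGB' means you still need the conversion script shown later.
print(mask.mode)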
Create a folder named “dataset” inside “PQR”, alongside a “tfrecord” folder that will hold the converted data. The resulting directory structure is:
+ PQR
  + dataset
    - JPEGImages
    - SegmentationClass
    - ImageSets
  + tfrecord
JPEGImages
This folder contains all the input colour images, in *.jpg format.
SegmentationClass
This folder contains the semantic segmentation annotation image for each colour input image; these annotations are the ground truth for semantic segmentation.
These images should be colour-indexed: each colour index represents a unique class (with a unique colour), and the mapping from colours to indices is known as a colour map.
Note: files in the “SegmentationClass” folder must have the same names as their counterparts in the “JPEGImages” folder, so that each image is paired with its segmentation file.
ImageSets
This folder contains:
- train.txt: list of image names for the training set
- val.txt: list of image names for the validation set
- trainval.txt: list of image names for the combined training and validation set
A sample *.txt file looks something like this:
pqr_000032
pqr_000039
pqr_000063
pqr_000068
pqr_000121
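If you don’t already have these lists, a small script can generate them. Here is a minimal sketch, assuming the folder layout above and an arbitrary 85/15 train/val split; the paths and the split ratio are assumptions you should adapt.
import os
import random
# Assumed paths, matching the directory structure above.
image_dir = "./PQR/dataset/JPEGImages"
sets_dir = "./PQR/dataset/ImageSets"
# Image names without extensions, shuffled for a random split.
names = sorted(os.path.splitext(f)[0] for f in os.listdir(image_dir))
random.seed(0)
random.shuffle(names)
split = int(0.85 * len(names))  # assumed 85/15 train/val split
subsets = {
    "train": names[:split],
    "val": names[split:],
    "trainval": names,
}
os.makedirs(sets_dir, exist_ok=True)
for subset, subset_names in subsets.items():
    with open(os.path.join(sets_dir, subset + ".txt"), "w") as f:
        f.write("\n".join(sorted(subset_names)) + "\n")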
Remove the colour-map in the ground truth annotations
If your segmentation annotation images are RGB images instead of colour-indexed images, here is a Python script that will convert them.
from PIL import Image
from tqdm import tqdm
import numpy as np
import os
# The palette (colour map) describes each (R, G, B): label pair.
palette = {(0, 0, 0): 0,    # background
           (255, 0, 0): 1}  # foreground class
def convert_from_color_segmentation(arr_3d):
    # Convert an RGB annotation of shape (H, W, 3) into a 2-D array of labels.
    arr_2d = np.zeros((arr_3d.shape[0], arr_3d.shape[1]), dtype=np.uint8)
    for c, i in palette.items():
        m = np.all(arr_3d == np.array(c).reshape(1, 1, 3), axis=2)
        arr_2d[m] = i
    return arr_2d
label_dir = './PQR/dataset/SegmentationClass/'
new_label_dir = './PQR/dataset/SegmentationClassRaw/'
if not os.path.isdir(new_label_dir):
    print("creating folder: ", new_label_dir)
    os.mkdir(new_label_dir)
else:
    print("Folder already exists. Delete the folder and re-run the code!!!")
label_files = os.listdir(label_dir)
for l_f in tqdm(label_files):
    arr = np.array(Image.open(label_dir + l_f))
    arr = arr[:, :, 0:3]  # drop the alpha channel, if present
    arr_2d = convert_from_color_segmentation(arr)
    Image.fromarray(arr_2d).save(new_label_dir + l_f)
Here, the palette defines the “RGB : label” pairs. In this sample code, (0, 0, 0):0 is the background and (255, 0, 0):1 is the foreground class. Note that new_label_dir is the location where the raw, single-channel segmentation masks are stored.
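It is worth sanity-checking the conversion before moving on. The converted masks should be single-channel and contain only the label values defined in the palette; the file name below is a placeholder.
import numpy as np
from PIL import Image
# Placeholder path: any converted mask from SegmentationClassRaw.
arr = np.array(Image.open("./PQR/dataset/SegmentationClassRaw/pqr_000032.png"))
print(arr.shape)       # should be (height, width), i.e. single channel
print(np.unique(arr))  # should only contain palette labels, e.g. [0 1]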
Next, the task is to convert the image dataset to TFRecords. Make a new copy of the script file ./dataset/download_and_convert_voc2012.sh as ./dataset/convert_pqr.sh. Below is the modified script.
CURRENT_DIR=$(pwd)
WORK_DIR="./PQR"
PQR_ROOT="${WORK_DIR}/dataset"
SEG_FOLDER="${PQR_ROOT}/SegmentationClass"
SEMANTIC_SEG_FOLDER="${PQR_ROOT}/SegmentationClassRaw"
# Build TFRecords of the dataset.
OUTPUT_DIR="${WORK_DIR}/tfrecord"
mkdir -p "${OUTPUT_DIR}"
IMAGE_FOLDER="${PQR_ROOT}/JPEGImages"
LIST_FOLDER="${PQR_ROOT}/ImageSets"
echo "Converting PQR dataset..."
python ./build_new_pqr_data.py \
--image_folder="${IMAGE_FOLDER}" \
--semantic_segmentation_folder="${SEMANTIC_SEG_FOLDER}" \
--list_folder="${LIST_FOLDER}" \
--image_format="jpg" \
--output_dir="${OUTPUT_DIR}"
The converted dataset will be saved at ./deeplab/datasets/PQR/tfrecord. (Here, build_new_pqr_data.py is our copy of the repository’s build_voc2012_data.py, which reads the lists in ImageSets and packs each image-annotation pair into TFRecord shards.)
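To confirm the conversion worked, you can count the records in the generated shards. Below is a minimal sketch using the TF 1.x API (which is what this DeepLab code base targets); the glob pattern is an assumption based on the output directory above.
import glob
import tensorflow as tf
# Assumed location and naming of the generated shards.
shards = glob.glob("./PQR/tfrecord/train-*.tfrecord")
total = sum(1 for shard in shards
            for _ in tf.python_io.tf_record_iterator(shard))
print("train records:", total)  # should match the number of names in train.txt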
Defining the dataset description
Open the file segmentation_dataset.py present in the research/deeplab/datasets/ folder. Add the following code segment defining the description for your PQR dataset.
_PQR_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 11111,     # number of images listed in train.txt
        'trainval': 22222,  # number of images listed in trainval.txt
        'val': 11111,       # number of images listed in val.txt
    },
    num_classes=2,     # number of classes in your dataset, including background
    ignore_label=255,  # pixels with this value (e.g. white object borders) are ignored during training
)
Then register the new dataset by adding it to the _DATASETS_INFORMATION dictionary in the same file, as shown below:
_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'pqr': _PQR_SEG_INFORMATION,
}
Training
In order to train the model on your dataset, you need to run the train.py file in the research/deeplab/ folder. We have written a script file, train-pqr.sh, to do this task for you.
cd ..
# Set up the working environment.
CURRENT_DIR=$(pwd)
WORK_DIR="${CURRENT_DIR}/deeplab"
DATASET_DIR="datasets"
# Set up the working directories.
PQR_FOLDER="PQR"
EXP_FOLDER="exp/train_on_trainval_set"
INIT_FOLDER="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/${EXP_FOLDER}/init_models"
TRAIN_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/${EXP_FOLDER}/train"
DATASET="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/tfrecord"
mkdir -p "${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/exp"
mkdir -p "${TRAIN_LOGDIR}"
NUM_ITERATIONS=20000
python "${WORK_DIR}"/train.py \
--logtostderr \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=4 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=true \
--tf_initial_checkpoint="${INIT_FOLDER}/deeplabv3_pascal_train_aug/model.ckpt" \
--train_logdir="${TRAIN_LOGDIR}" \
--dataset_dir="${DATASET}"
Here, we have used the xception_65 model variant for local training. You can set the number of training iterations through the NUM_ITERATIONS variable, and point --tf_initial_checkpoint to the location where you have downloaded (or pre-trained) the model checkpoint (*.ckpt). After training, the final trained model can be found in the TRAIN_LOGDIR directory.
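While those iterations run, you can watch the loss curves by pointing TensorBoard (installed alongside TensorFlow) at the training log directory:
tensorboard --logdir="${TRAIN_LOGDIR}"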
Finally, run the above script from the …/research/deeplab directory.
sh ./train-pqr.sh
Voilà! You have successfully trained DeepLab on your dataset.
In the coming months, I will be sharing more of my experiences with Images & Deep Learning. Stay tuned and don’t forget to spare some claps if you like this article. It will encourage me immensely.