Tensorflow Object Detection : Training with Custom Dataset (Windows)

Tom Jubb

Feb 16, 20205 min read

In this tutorial, I will explain how to get the Tensorflow object detection API training one of its many object detector CNN models on a custom dataset. This can be a quick way to get some trained neural networks on your own dataset. Generally we want to re-train a model when the data is sufficiently different to the data it was originally trained on; which in the Tensorflow API is mostly RGB photographs of various objects.

This is fiddly to get working so I thought it would be worth creating a tutorial. This will cover Windows 10 (64-bit) systems.

We focus on one particular model (faster-RCNN with resent50); but any other model can be used with very little change. The code for this tutorial can be found on my GitHub.

Prerequisites

Familiarity with Python, Anaconda is assumed. The main prerequisite to this tutorial is to have the Tensorflow object detection API installed. I have written a post on this topic.

You should make sure you have the following things :

A conda environment with tensorflow version above 1.14, in this tutorial it will be called "tf_detect".
Jupyter notebooks installed on the above conda environment.

Step 1 : Dependencies

In the previous tutorial we created a Conda environment as well as a root folder which has the source code for the object detection API. On my system this root folder is;

Let's start there; open Anaconda prompt

This should install all the packages we need. We will create a folder structure to hold al the various files

The purpose of these folders will be explained as we go.

The main leg work for us now is to bundle up the image data into a format that Tensorflow recognises, a so called "record" format. I found the simplest way to do this is using a Jupyter notebook.

Step 2 : Set up the Image Data

Now we need to gather the data we are using. I will assume the data is in .jpg form; and all of your jpg images should be placed in the images folder

You also need to store the ground truth of the objects in the images. I find an easy way to do this is to simply have a text file which has one object per row, along with the filename of the file containing the object, the coordinates of the bounding box, and the name of the class of object. Something like below;

We need to create another text file which contains the names of the classes. The format of this one is fixed so copy from below and adapt to the number of classes in your data

Save the resulting file as

Step 3 : Setup Config and Model Checkpoint

The use of the object detection API requires some files to be downloaded. First we need the config file which contains all the hyperparameters for the model, as well as various other metadata. This allows the training to be modified to suit your own problem. These config files can be found in the source code

Pick your model and copy the relevant config file into the training/ folder. For this tutorial we are using

Now enter the relevant information into this file. Below are the parameters that must be changed;

(the full file can be found from the linked GitHub repo, and you can just copy that, but make sure you change the filepaths).

Note that the number of classes should be the number of classes from the .pbtxt file plus one to account for the "background" class.

The model checkpoint contains multiple files including pre-trained weights and therefore is quite a large size folder. It needs to be downloaded from the model zoo. Download the zip file for the relevant model from

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

For our tutorial it will be this one. Place the unzipped folder in

Note : that the config file points to "model.ckpt", even though this specific file does not exist. Don't worry about it; tensorflow uses this file to go and find the various files it needs to load the pretrained weights.

Step 4 : Prepare the Data

Now we need to convert the jpg image data (along with their ground truth object labels) into the expected format of Records. This tutorial deals with a small number of images for simplicity; in reality you will be dealing with thousands or tens of thousands and you should consider sharding.

I have provided a Jupyter notebook with some code to help create the .record files for the training and test data. You can see the code step by step below but if you wish, you can simply download the notebook and skip to step 5.

You can scroll through the above to see how this works. You only need to modify the variables in cell 2 to your own file paths.

Step 5 : Train the Model

let's quickly recap the folders and files we have accumulated by now

We are now ready to train the model. To do that copy the file

into the tf_training_examples folder. We can train the model by running this file; we can do this either fully in the Anaconda Prompt, or using PyCharm

5.1 Pycharm

I find this the easiest way if you already use PyCharm, since all the arguments and setup is done once and is then saved in the PyCharm state. The method using the prompt requires several commands each time you want to train. This tutorial is not about using PyCharm (I will write one though!), so if you are not comfortable with it already, use the prompt and skip this section.

There are several steps needed:

Set the python interpreter to the correct environment
Add the slim folder to the list of sources

Add the following parameters to the run configuration

This should be all that's needed to run the training through the console.

5.2 Anaconda Prompt

First we want to work from the correct environment which was created in the previous post. Open up an Anaconda Prompt and activate our environment;

Now we want to add the slim folder to the python path variable; so that python knows about it. This is necessary as the training code requires access to some of the modules in the slim folder.

This can be done using the following command;

You can check that this step has been successful using python (through the prompt, since if you use a different prompt the PYTHONPATH will be reset). So remember that the new PYTHONPATH only exists in this prompt.

This will print a list of all the folders that python "knows about". As long as the slim folder is there we are ready to go.

Now we can run the training with a couple of parameters.

Part 6 : Concluding Hints

Either way you run the training, you should start seeing output. Depending on the processing power, you should start to see the training begin after a while with a periodically updated loss. My output shows a LOT of warning messages but it seems most of these aren't critical and the training progresses nonetheless.

If you've followed this guide to the letter; I hope that this works, but there are so many possible ways in which your code bugs out. If that happens, check out a couple of the blog posts linked at the end and see if you can figure out how to fix the problem.

Many of my crashes happened because there was a mistake in my config file; in this scenario the error messages aren't very helpful. So, always check the config file for mistakes first.

This example contains very few images; and is therefore not designed to give good results (just simply get you to the point of being able to). Now you can extend the code from this tutorial, have fun!

References

The official tensorflow guide:

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html

Hello

About Me

Let’s Chat