An Edge TPU demo project

Stan Callewaert · Published in ML6team
Apr 3, 2019 · 6 min read


In our last blogpost, the dark secrets of how the Edge TPU works were unveiled. In this blogpost, we’ll use the Edge TPU to create our very own demo project!

The goal of this blogpost is to give you a step-by-step guide on how to perform object detection on the Edge TPU. By the end, we will be able to detect a set of tools: screwdrivers, cutters and pliers. The same approach can, however, be used to detect whatever kind of objects you want.

All of that runs on that small device called the Edge TPU, at lightning-fast speed! We will also need a Pi camera¹ and a little know-how, which can be acquired in this blogpost.

Extracting our training data

In the beginning, there were only videos of tools.

We start from videos of the tools we want to perform object detection on. If you want to make your own videos and make the object detection robust, you should try to vary the angle, lighting, background, … as much as possible. For this demo project, however, the goal was to detect these objects on a white background.

To be able to train our object detection model later, we’ll need images that are annotated with the coordinates and the label of the tool.

First, the images should be extracted from the videos using ffmpeg:
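
A minimal sketch of such a command (the video filename and the frame counts below are placeholders you should adapt to your own footage):

  # Extract NUMBER_OF_IMAGES evenly spaced frames from a VIDEO_LENGTH_IN_SECONDS-second video
  NUMBER_OF_IMAGES=50
  VIDEO_LENGTH_IN_SECONDS=100
  ffmpeg -i tools.mp4 -vf fps=$NUMBER_OF_IMAGES/$VIDEO_LENGTH_IN_SECONDS image%04d.jpg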

This will create files named image0001.jpg, image0002.jpg, … These are frames taken at an interval of “VIDEO_LENGTH_IN_SECONDS/NUMBER_OF_IMAGES” seconds.

Try to keep NUMBER_OF_IMAGES small enough that the extracted images differ significantly from each other. Two consecutive frames of a video are nearly identical, since they are only milliseconds apart.

At the same time, keep NUMBER_OF_IMAGES big enough to end up with at least around 50 images to train the object detection model on.

Recording longer videos helps to resolve this tension, as the time between the extracted images grows if we keep NUMBER_OF_IMAGES the same.

Annotating our training data

Next, we annotated our images of tools using the open-source LabelImg software.

This is how to spin it up:
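
A typical way to get it running locally (assuming a Linux machine with Python 3 and Qt5; the image and class-file paths below are placeholders, and the LabelImg README covers other platforms):

  # Grab the LabelImg sources and build its Qt resources
  git clone https://github.com/tzutalin/labelImg.git
  cd labelImg
  sudo apt-get install pyqt5-dev-tools
  pip3 install lxml
  make qt5py3
  # Start annotating: pass the image folder and a file listing the predefined classes
  python3 labelImg.py ../images ../classes.txt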

Let the labeling begin!

Converting our training data to TFRecords

Once all images are labeled, each image has a matching XML file containing its label data. Next, we will create a TFRecords file which combines our images with their label data. Creating that TFRecords file can once again be done with an open-source library:
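
The exact invocation depends on the library you pick; as an illustration only (the script name and flags below are hypothetical), the conversion boils down to pairing every image with its XML annotations and writing the result to a single file:

  # Hypothetical conversion script: pairs image0001.jpg with image0001.xml, and so on,
  # and maps each label to the numeric class id that label_map.pbtxt will also use
  python data_to_tfrecords.py \
    --image_dir=images/train \
    --annotations_dir=annotations/train \
    --output_path=train.tfrecords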

Once we’ve generated this TFRecords file, we can repeat the same process (starting from test videos) to build a test set, so we can check our accuracy after training the object detection model.

Choosing a Tensorflow object detection model

Our data is ready! We can now choose our object detection model. The available Tensorflow detection models can be found in the Tensorflow detection model zoo. For the rest of this blogpost we’ll be using the ssd_mobilenet_v2_quantized_coco model:

The MobileNetV2 architecture. Credits for this image go to Google.

Once we download the ssd_mobilenet_v2_quantized_coco model from the Tensorflow detection model zoo, we get a pipeline.config file and model.ckpt files which we’ll use later in this blogpost.
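
At the time of writing, the quantized model could be downloaded and unpacked roughly like this (check the model zoo page for the exact archive name, which may have changed since):

  # Download and unpack the quantized SSD MobileNetV2 model (archive name may differ)
  wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz
  tar -xzf ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz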

Configuring our object detection model

Before MobileNet is trained, it should be configured. Configuring a Tensorflow detection model is only a matter of creating and modifying some text files.

We’ll start off by creating a label_map.pbtxt file. This file maps numeric class ids to our labels (we already defined this mapping in the data_to_tfrecords code block above, so make sure it is consistent). We’ve used the following label_map.pbtxt file:
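
With the three classes of this demo it looks roughly as follows (the id ordering here is only an example; what matters is that it matches the ids written into the TFRecords):

  item {
    id: 1
    name: 'screwdriver'
  }
  item {
    id: 2
    name: 'cutter'
  }
  item {
    id: 3
    name: 'plier'
  }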

We now have the necessary files to start modifying our pipeline.config file (found in the archive downloaded from the Tensorflow detection model zoo), which orchestrates the whole training process. Since we are going to perform our training on Google Cloud, we should upload all the files we’ve created so far to a Google Cloud Storage bucket. These files are:

  • train.tfrecords
  • test.tfrecords
  • label_map.pbtxt
  • The model.ckpt files
  • The pipeline.config file, once we’ve finished it

To train our model, we should modify the following lines in pipeline.config:
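
The snippet below sketches the kind of fields involved; it is only an excerpt of the full config, and the gs:// paths are placeholders for your own bucket:

  model {
    ssd {
      num_classes: 3  # plier, cutter and screwdriver
      ...
    }
  }
  train_config {
    batch_size: 24
    fine_tune_checkpoint: "gs://YOUR_BUCKET/data/model.ckpt"
    ...
  }
  train_input_reader {
    label_map_path: "gs://YOUR_BUCKET/data/label_map.pbtxt"
    tf_record_input_reader {
      input_path: "gs://YOUR_BUCKET/data/train.tfrecords"
    }
  }
  eval_input_reader {
    label_map_path: "gs://YOUR_BUCKET/data/label_map.pbtxt"
    tf_record_input_reader {
      input_path: "gs://YOUR_BUCKET/data/test.tfrecords"
    }
  }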

Notice that our demo has 3 classes (plier, cutter and screwdriver), that we train our network with a batch size of 24, and that the first 8 layers of the model are frozen (transfer learning).

Training our object detection model

Just before we start training our model, we shouldn’t forget to install the Tensorflow Object Detection API. Instructions on how to do this can be found in installation.md.
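
In short, this comes down to cloning the tensorflow/models repository, compiling the protobuf definitions and putting the research folders on the PYTHONPATH (see installation.md for the full, up-to-date steps):

  # Clone the Object Detection API and make it importable
  git clone https://github.com/tensorflow/models.git
  cd models/research
  protoc object_detection/protos/*.proto --python_out=.
  export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim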

Next, we should package the Object Detection API, pycocotools and TF Slim, so that the Tensorflow detection model can be trained in the cloud. This can be done with the following commands:
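
These are run from the models/research directory and mirror the packaging steps in the Object Detection API’s cloud-training documentation of that era:

  # Package pycocotools, the Object Detection API and TF Slim for Cloud ML Engine
  bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
  python setup.py sdist
  (cd slim && python setup.py sdist)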

Training our model is now as simple as executing the gcloud command below:
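
A sketch of such a job submission, based on the Object Detection API’s TPU-training documentation of the time (the bucket name, region and runtime version are placeholders to adapt to your own project):

  gcloud ml-engine jobs submit training object_detection_tools_`date +%s` \
    --job-dir=gs://YOUR_BUCKET/train \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_tpu_main \
    --runtime-version 1.12 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
    --model_dir=gs://YOUR_BUCKET/train \
    --tpu_zone us-central1 \
    --pipeline_config_path=gs://YOUR_BUCKET/data/pipeline.config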

As we can notice in the command above, we are training on a cloud TPU. The reason is that, although a cloud TPU has a higher price per hour, it can perform many more multiply-add operations per second (discover why in our previous blogpost), which makes it cheaper to fully train our network.

Once the training has finished, we can let our trained object detection model loose on a couple of test images. Below are some results:

If the results look promising — like the ones above — we can move on to the final step.

Converting the object detection model for Edge TPU

We now have a .ckpt object detection model. It has to be converted into a quantized TFLite model, however, before it can be deployed on the Edge TPU.

The following commands create a quantized TFLite model:
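
A sketch of the conversion, based on the Object Detection API’s TensorFlow Lite export documentation (the checkpoint number and paths are placeholders):

  # Export a TFLite-compatible frozen graph from the trained checkpoint
  python object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=pipeline.config \
    --trained_checkpoint_prefix=model.ckpt-XXXX \
    --output_directory=tflite_model \
    --add_postprocessing_op=true

  # Convert the frozen graph into a fully quantized TFLite flatbuffer
  tflite_convert \
    --graph_def_file=tflite_model/tflite_graph.pb \
    --output_file=tflite_model/detect.tflite \
    --inference_type=QUANTIZED_UINT8 \
    --input_arrays=normalized_input_image_tensor \
    --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
    --input_shapes=1,300,300,3 \
    --mean_values=128 \
    --std_dev_values=128 \
    --change_concat_input_ranges=false \
    --allow_custom_ops

Before it runs on the Edge TPU, the resulting detect.tflite still has to be compiled for the device with Google’s Edge TPU compiler.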

Once we’ve created the model file, we can deploy and use it on the Edge TPU with the following result:

Conclusion

We started from only a couple of videos and ended with a quantized TFLite object detection model running on the Edge TPU. All of this happened with only a minimal amount of code. As you can see, the model detects the cutter with a confidence of 99–100%. Not only are the predictions very accurate, they are also very fast thanks to the Edge TPU. We believe that this demo project is a starting point for solving lots of manufacturing and retail use cases.

About ML6

We are a team of AI experts and the fastest growing AI company in Belgium. With offices in Ghent, Amsterdam, Berlin and London, we build and implement self-learning systems across different sectors to help our clients operate more efficiently. We do this by staying on top of research and innovation, and by applying our expertise in practice. To find out more, please visit www.ml6.eu.

¹ The Pi camera is only necessary for our demo project since it processes videos. The Pi camera can be plugged into the Edge TPU, just like it can be plugged into a Raspberry Pi.
