Train YOLOv4 with Specific Categories of Images in MS COCO dataset

Mar 17, 2021

Introduction

YOLOv4 is a popular and convenient tool for object detection, but the authors only provide a pretrained model trained on the whole MS COCO dataset, which covers 80 categories of objects.

Usually we only need a few of those 80 classes. Say we’re only interested in cars, motorcycles and pedestrians in a street scene video: the other 77 categories are unnecessary. Using the pretrained YOLOv4 model without retraining it costs extra memory and tends to give worse performance on the classes we care about.

In this article, I will explain how to download only specific categories of images from the MS COCO dataset and train a YOLOv4 model on them.

Method

Clone Darknet Repository

First, clone the darknet repository from here. This is the official repository of YOLOv4, maintained by AlexeyAB. Follow its README to compile darknet in your environment.

Clone COCO YOLO Parser

Then, “COCO YOLO Parser” comes in handy. This tool has two purposes:

Download only specific categories of images from the COCO dataset using the COCO API. You can first take a look at the list of categories in COCO and choose the classes you’re interested in.
Convert the annotations from the COCO format to the format YOLOv4 accepts.

For more detailed usage, please refer to the GitHub README.
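To make the conversion step concrete, here is a minimal sketch of what such a tool has to do. It is not the parser's actual code. It assumes a dict shaped like a COCO instances JSON file (with "images", "annotations" and "categories" lists), where each box is stored as [x_min, y_min, width, height] in pixels, and it writes YOLO label lines of the form "class x_center y_center width height" with coordinates normalized to the image size. The function names coco_to_yolo_bbox and build_yolo_labels are made up for this illustration.

```python
def coco_to_yolo_bbox(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to a
    YOLO box (x_center, y_center, width, height) normalized to 0..1."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def build_yolo_labels(coco, wanted_names):
    """Return {image file name: [YOLO label lines]} for every image that
    contains at least one object of a wanted category."""
    # COCO category ids are sparse, so map them to contiguous class
    # indices 0..N-1, following the order of wanted_names.
    name_to_id = {c["name"]: c["id"] for c in coco["categories"]}
    class_index = {name_to_id[n]: i for i, n in enumerate(wanted_names)}

    images = {img["id"]: img for img in coco["images"]}
    labels = {}
    for ann in coco["annotations"]:
        if ann["category_id"] not in class_index:
            continue  # skip categories we are not interested in
        img = images[ann["image_id"]]
        cls = class_index[ann["category_id"]]
        xc, yc, w, h = coco_to_yolo_bbox(ann["bbox"], img["width"], img["height"])
        line = f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
        labels.setdefault(img["file_name"], []).append(line)
    return labels
```

The order of wanted_names decides the class indices, so it should match the order of the names in obj.names.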

Set YOLO Configuration

After downloading all training images and generating labels for them, you need to place them in the right directories and set up the YOLO configuration. For more details, please take a look at this tutorial.

Download the pretrained weights of YOLOv4 (yolov4.conv.137) from here. Starting from pretrained weights often makes your customized model converge faster and perform better.
Copy yolov4-custom.cfg to yolov4-obj.cfg, and modify the following settings.
• batch and subdivisions should be set to fit your GPU memory, or you’ll run out of memory (OOM)
• max_batches should be max(classes*2000, 6000)
• steps should be 80% and 90% of max_batches
• classes=80 in each yolo layer should be changed to the number of target categories
• filters=255 in the convolutional layer right before each yolo layer should be changed to (classes+5)*3
Move obj.names generated by COCO YOLO parser to darknet/data/.
Create obj.data and put it in darknet/data/. You can copy the content from the darknet README.
Create a folder named obj in darknet/data/, and put all images and labels into it.
Place train.txt generated by COCO YOLO parser in darknet/data/.
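The arithmetic in the list above is easy to get wrong by hand, so here is a small sketch that derives the class-dependent values and prints an obj.data body. The helper names yolo_cfg_values and obj_data_text are hypothetical. The obj.data keys follow the example in the darknet README, and the default valid path (data/test.txt) is the README's example value, an assumption here because this article only produces train.txt.

```python
def yolo_cfg_values(num_classes):
    """Derive the class-dependent cfg values from the rules listed above."""
    max_batches = max(num_classes * 2000, 6000)
    # Integer arithmetic keeps the 80% and 90% marks exact.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    filters = (num_classes + 5) * 3
    return {
        "classes": num_classes,
        "max_batches": max_batches,
        "steps": steps,
        "filters": filters,
    }


def obj_data_text(num_classes, train="data/train.txt", valid="data/test.txt",
                  names="data/obj.names", backup="backup/"):
    """Build the body of obj.data. The valid default is an assumption taken
    from the darknet README example; point it at your own validation list."""
    return (
        f"classes = {num_classes}\n"
        f"train = {train}\n"
        f"valid = {valid}\n"
        f"names = {names}\n"
        f"backup = {backup}\n"
    )


if __name__ == "__main__":
    # Three target classes, e.g. car, motorcycle and person.
    print(yolo_cfg_values(3))
    print(obj_data_text(3))
```

For three classes this gives max_batches=6000, steps=4800,5400 and filters=24.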

Start Training

Command
$ ./darknet detector train <data file> <cfg file> <initial weights>
For example:
$ ./darknet detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137
Some optional parameters:
-map: calculate mAP during the training process
-dont_show: don’t display the output window (useful if you’re training on a remote machine)
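Before launching a long training run, it can be worth checking that every image listed in train.txt actually has a label file. The sketch below is not part of darknet or of COCO YOLO parser, and find_missing_labels is a hypothetical helper. It assumes the layout used in this article, where each label file sits next to its image in data/obj/ with the same name and a .txt extension, and where the paths in train.txt are relative to the darknet directory.

```python
from pathlib import Path


def find_missing_labels(train_list, base_dir="."):
    """Return the entries of train_list whose image or label file is missing.

    Each non-empty line of train_list is treated as an image path, either
    absolute or relative to base_dir. The label is expected next to the
    image, with the same name and a .txt extension.
    """
    missing = []
    for line in Path(train_list).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        image = Path(base_dir) / line
        label = image.with_suffix(".txt")
        if not image.is_file() or not label.is_file():
            missing.append(line)
    return missing


if __name__ == "__main__":
    # Run from the darknet directory.
    for entry in find_missing_labels("data/train.txt"):
        print("missing image or label:", entry)
```

An empty result means every listed image has a label file. It does not check that the label contents are valid.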

Result

Training produces a chart image of the loss (and of mAP when -map is used).

The trained weights file will be stored in darknet/backup/, or in the path you’ve set in obj.data.

Conclusion

COCO YOLO parser helps you download only the classes of images you’re interested in and convert their annotations to the format YOLO accepts, so that retraining YOLO is convenient. By following this article, you can easily train a customized YOLO model on specific categories of images from the COCO dataset.