Train YOLOv5 with your own data

We want our own object detection model

The object detection model YOLOv5 is easy to train.

By training it on data you prepare yourself, you can detect whatever objects you like.


The data consists of

  • Images
  • Annotations: text files corresponding to the images

Put the images and the text files in separate folders, and give each image and its text file the same base name.
For example, create an annotation text file called image1.txt that corresponds to image1.jpg.

In the annotation text file, write one line per object in the following form.

class x_center y_center width height

Box coordinates must be in normalized xywh format (0 to 1).
If the box is in pixels, divide x_center and width by the image width, and y_center and height by the image height.
Class numbers start at 0.
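The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example box are my own assumptions, and it assumes your annotation tool gives you pixel corner coordinates (x_min, y_min, x_max, y_max).

```python
def pixel_box_to_yolo(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel corner box to normalized (x_center, y_center, width, height)."""
    x_center = (x_min + x_max) / 2 / img_w   # x values are divided by the image width
    y_center = (y_min + y_max) / 2 / img_h   # y values are divided by the image height
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return x_center, y_center, width, height


def yolo_label_line(class_id, box):
    """Format one object as a label line: class x_center y_center width height."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Hypothetical example: a box from (100, 50) to (300, 150) in a 640x480 image, class 0
box = pixel_box_to_yolo(100, 50, 300, 150, img_w=640, img_h=480)
print(yolo_label_line(0, box))  # prints "0 0.312500 0.208333 0.312500 0.208333"
```

Each object in an image becomes one such line in the text file that shares the image's name.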

Draw the boxes on the images and create the label text files using an annotation service like the one below.

After annotation is finished, arrange the files in the following directory structure.

train is the data used for training, and val is the data used to check how well the model is learning.
I think it is common to split the whole dataset at about 8:2.
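As a rough sketch of the 8:2 split, the helper below copies image and label pairs into train and val folders. The function name and the output layout (images/train, images/val, labels/train, labels/val) are my assumptions, based on the layout I understand YOLOv5 to expect, so adjust them to match your own structure.

```python
import random
import shutil
from pathlib import Path


def split_dataset(image_dir, label_dir, out_dir, val_ratio=0.2, seed=0):
    """Copy image/label pairs into images/{train,val} and labels/{train,val}."""
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)
    # Keep only images that have a label file with the same base name
    # (image1.jpg -> image1.txt); unlabeled images are skipped in this sketch.
    images = sorted(
        p for p in image_dir.iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        and (label_dir / f"{p.stem}.txt").exists()
    )
    random.Random(seed).shuffle(images)      # fixed seed keeps the split reproducible
    n_val = int(len(images) * val_ratio)     # about 20% goes to val by default
    splits = {"val": images[:n_val], "train": images[n_val:]}
    for split, files in splits.items():
        (out_dir / "images" / split).mkdir(parents=True, exist_ok=True)
        (out_dir / "labels" / split).mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy2(img, out_dir / "images" / split / img.name)
            shutil.copy2(label_dir / f"{img.stem}.txt",
                         out_dir / "labels" / split / f"{img.stem}.txt")
    return {split: len(files) for split, files in splits.items()}
```

For example, split_dataset("images", "labels", "dataset") with 100 labeled images would put 80 pairs in train and 20 in val.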

Training is said to be faster when the dataset is stored locally on the training machine than when it is read from a mounted Google Drive or similar.

Create a file (**.yaml) that describes the training configuration. It contains:

・ Image paths
・ Number of classes
・ Array of class names
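A minimal sketch of such a configuration file, generated from Python so the class count always matches the class list. The key names (train, val, nc, names) follow the sample dataset files in the YOLOv5 repository as I understand them, so please check them against your version; the paths and class names are hypothetical.

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Render a minimal dataset config: image paths, class count, class names."""
    # Simple quoting only; class names containing quotes would need real YAML escaping.
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(class_names)}\n"
        f"names: [{names}]\n"
    )


# Hypothetical paths and class names
print(dataset_yaml("dataset/images/train", "dataset/images/val", ["cat", "dog"]))
```

Writing the returned text to a file such as data.yaml gives you the configuration file to pass to the training script. The order of the class names must match the class numbers used in the annotation files.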

Start the training script with the following:

The arguments are:

・ Image size (width and height of the square input)
・ Batch size (number of samples fed to the model at one time)
・ Epochs (how many passes over the training data to run)
・ Path to the data configuration file (the file created above)
・ Pre-trained weights (weights trained on a common dataset)

* To train from scratch with random weights instead of pre-trained weights, you can specify the arguments --weights '' --cfg yolov5s.yaml, but training from random weights is not recommended.
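To show how the arguments fit together, here is a sketch that assembles the training command. The flag names follow train.py in the YOLOv5 repository as I understand it; build_train_command is a hypothetical helper, and the default values and data.yaml are only examples.

```python
import shlex
import sys


def build_train_command(data_yaml, img_size=640, batch_size=16, epochs=10,
                        weights="yolov5s.pt", cfg=None):
    """Assemble the argument list for YOLOv5's train.py (run from the repository root)."""
    cmd = [
        sys.executable, "train.py",
        "--img", str(img_size),           # image size (square)
        "--batch-size", str(batch_size),  # samples per batch
        "--epochs", str(epochs),          # number of training loops
        "--data", data_yaml,              # data configuration file
        "--weights", weights,             # pre-trained weights, or "" for random weights
    ]
    if cfg is not None:
        cmd += ["--cfg", cfg]             # model definition, needed when weights is ""
    return cmd


# Fine-tune from pre-trained weights (recommended)
print(shlex.join(build_train_command("data.yaml")))

# Train from random weights (not recommended)
print(shlex.join(build_train_command("data.yaml", weights="", cfg="yolov5s.yaml")))
```

The printed line can be pasted into a terminal or a Colab cell, or the list can be passed to subprocess.run(cmd, check=True) from inside the cloned repository.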

Pre-trained weights are available from the official repository.

You can choose from several models depending on processing speed, accuracy, and model size.

The image size can be any size (square), but it seems better to resize to the model's default size (the size in the table above). Even if the original images are scaled down, you can get reasonable results if you have enough data.

With 10 epochs on Colab, training takes only a few minutes.
If you have enough data, even this much training detects objects well.

The results are saved in
runs/train/exp.

This directory holds the weights and the training log (precision, recall, mAP, etc.).
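When you train several times, the results directory changes. The sketch below finds the most recent one, assuming (as I understand YOLOv5's behavior) that the first run is saved as exp and later runs as exp2, exp3, and so on. latest_run is a hypothetical helper, not part of YOLOv5.

```python
import re
from pathlib import Path


def latest_run(train_root="runs/train"):
    """Return the exp* directory with the highest run number, or None if there is none."""
    root = Path(train_root)
    if not root.is_dir():
        return None
    runs = []
    for d in root.iterdir():
        m = re.fullmatch(r"exp(\d*)", d.name)
        if d.is_dir() and m:
            # Assumption: the first run is "exp", later runs are "exp2", "exp3", ...
            runs.append((int(m.group(1) or 1), d))
    # Compare run numbers numerically so that exp10 sorts after exp2
    return max(runs, key=lambda r: r[0])[1] if runs else None
```

In my understanding the trained weights are under weights/best.pt and weights/last.pt inside that directory, so latest_run() / "weights" / "best.pt" would point at the best checkpoint of the most recent run.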

To run inference with the trained model, run the inference script with the trained weights as follows:
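A sketch of the inference command, built the same way as a list of arguments. The flag names follow detect.py in the YOLOv5 repository as I understand it; build_detect_command is a hypothetical helper, and the weights path, test.jpg, and threshold are example values.

```python
import shlex
import sys


def build_detect_command(weights, source, img_size=640, conf_thres=0.25):
    """Assemble the argument list for YOLOv5's detect.py (run from the repository root)."""
    return [
        sys.executable, "detect.py",
        "--weights", weights,             # trained weights, e.g. from runs/train/exp
        "--source", source,               # image, video, or folder to run detection on
        "--img", str(img_size),           # same image size as used for training
        "--conf-thres", str(conf_thres),  # minimum confidence for a detection to be kept
    ]


# Hypothetical paths: weights from the first training run and a test image
print(shlex.join(build_detect_command("runs/train/exp/weights/best.pt", "test.jpg")))
```

Raising the confidence threshold reduces false detections at the cost of missing some objects, so it is worth trying a few values on your own images.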

The procedure for exporting the model and using it on iOS is described in the following article.


I’m a freelance engineer.
Work inquiries
Please feel free to contact me with a brief description of what you want to develop.

I make apps that use Core ML and ARKit.
I post information related to machine learning and AR.




