yolov5_reduced_classes

Reducing the number of Yolov5 classes in custom model to increase detection accuracy.

https://github.com/alexeydzyubap/yolov5_reduced_classes

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: AlexeyDzyubaP
  • License: agpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 9.85 MB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 5
  • Releases: 0
Created about 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License Citation

README.md

About

Reducing the number of YOLOv5 classes in a custom model to increase detection accuracy.

Introduction

The aim of this project is to train a YOLOv5 model on a custom dataset with a reduced number of classes to achieve an increase in accuracy.

The dataset used in this work is a filtered COCO dataset with 3 classes: person, pet (cat + dog), and vehicle (car + bus + truck).
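The class merging described above can be written down as a small lookup table. This is only an illustrative sketch: the names `MERGED_CLASSES` and `build_lookup` are hypothetical, and the class index order (0 = person, 1 = pet, 2 = vehicle) is an assumption, since the README does not state it.

```python
# Merged class -> original COCO category names, as listed in the README.
# NOTE: the index order (person=0, pet=1, vehicle=2) is an assumption.
MERGED_CLASSES = {
    "person": ["person"],
    "pet": ["cat", "dog"],
    "vehicle": ["car", "bus", "truck"],
}


def build_lookup(merged):
    """Return {coco_name: (class_index, merged_name)} for every kept category."""
    lookup = {}
    for index, (merged_name, coco_names) in enumerate(merged.items()):
        for coco_name in coco_names:
            lookup[coco_name] = (index, merged_name)
    return lookup


LOOKUP = build_lookup(MERGED_CLASSES)
print(LOOKUP["dog"])    # (1, 'pet')
print(LOOKUP["truck"])  # (2, 'vehicle')
```

Any COCO category that is not in the table (for example `bicycle`) is simply absent from `LOOKUP` and would be dropped during filtering.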

Results comparison with the base model

Let's look at detection examples and compare them to the results of the base model, image by image. On the test images, the two models differ only slightly in most cases (small changes in confidence score and in bounding box size and position). However, in some cases the custom model appears to have an advantage.

In each comparison, the base model is on the LEFT and the custom model is on the RIGHT.

Contrast

The custom model seems to show better results than the base YOLOv5 in cases where the object doesn't stand out against the background. A good example is a vehicle driver or passenger obscured by the reflection on a window. With the added complication of dim lighting inside the vehicle, we get a barely recognizable upper-body silhouette.

[2 screenshots]

Partial visibility

In cases where only part of an object is visible, the custom model also shows promising results. This is especially apparent when the visible part is the upper body of a person or a pet.

[4 screenshots]

However, there are cases where the base model performed better when the visible part is the lower body (for example, when only legs are visible).

[1 screenshot]

Crowd detection

This is, in my opinion, the most interesting result of this experiment. The COCO dataset contains many pictures of crowded places, such as sporting events or popular public places. From these examples the model seems to develop a "subclass" of the person class for cases where many people are close to each other.

[4 screenshots]

In some cases this allows the model to detect multiple people when detecting them one by one is nearly impossible (for example, a stand full of spectators at a sporting event shot from a long distance). This could prove quite useful if we could reliably separate "crowd" detections from regular person detections.

One solution might be to separate the two by the size of the bounding box. In most applications of a YOLO-type model, objects are rarely shot from a close distance, so the bounding box of a person-type object usually takes up no more than 10-20% of the image. If we detect a person-type object whose bounding box covers more than a set percentage of the image, we can relabel it as a crowd-type object. This method could be further improved by comparing the largest bounding box with the others in the same image, or by counting the number of other person-type objects inside the bounding box.
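The bounding-box heuristic proposed above could look roughly like the following. This is a minimal sketch of the idea, not code from the repository: the detection format (a dict with `label` and a pixel `box` of `(x1, y1, x2, y2)`), the function names, and the default threshold of 0.2 (the upper end of the 10-20% range mentioned above) are all assumptions.

```python
def box_area_fraction(box, image_w, image_h):
    """Fraction of the image covered by box = (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1) / (image_w * image_h)


def people_inside(box, detections):
    """Count other person detections whose box center lies inside `box`."""
    x1, y1, x2, y2 = box
    count = 0
    for det in detections:
        if det["label"] != "person" or det["box"] == box:
            continue
        cx = (det["box"][0] + det["box"][2]) / 2
        cy = (det["box"][1] + det["box"][3]) / 2
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            count += 1
    return count


def relabel_crowds(detections, image_w, image_h, area_threshold=0.2):
    """Relabel person boxes covering more than `area_threshold` of the image."""
    relabeled = []
    for det in detections:
        label = det["label"]
        fraction = box_area_fraction(det["box"], image_w, image_h)
        if label == "person" and fraction > area_threshold:
            label = "crowd"
        relabeled.append({**det, "label": label})
    return relabeled


# Hypothetical detections for a 640x480 image.
detections = [
    {"label": "person", "box": (0, 100, 640, 400)},   # covers 62.5% of the image
    {"label": "person", "box": (50, 150, 90, 250)},   # covers about 1.3%
    {"label": "vehicle", "box": (0, 0, 640, 480)},    # large, but not a person
]
result = relabel_crowds(detections, image_w=640, image_h=480)
print([d["label"] for d in result])                     # ['crowd', 'person', 'vehicle']
print(people_inside(detections[0]["box"], detections))  # 1
```

The `people_inside` count is one way to implement the suggested refinement: a large person box that also contains several smaller person detections is more likely to be a crowd than a single close-up.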

Validation time

During validation, the custom model showed a reduction in prediction time of ~22% (67.8 ms vs. 87.1 ms per image). Prediction time was calculated as preprocess time + inference time + NMS time.

| Model | Preprocess | Inference | NMS |
| --- | --- | --- | --- |
| Custom model | 1.1 ms | 65.5 ms | 1.2 ms |
| YOLOv5 | 1.1 ms | 76.0 ms | 10.0 ms |

Validation log output (custom model, then base YOLOv5):

Speed: 1.1ms pre-process, 65.5ms inference, 1.2ms NMS per image at shape (32, 3, 640, 640)

Speed: 1.1ms pre-process, 76.0ms inference, 10.0ms NMS per image at shape (32, 3, 640, 640)
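The ~22% figure can be reproduced from the timings reported above. The snippet below is just the arithmetic; the variable names are illustrative.

```python
# Per-image timings in milliseconds, taken from the validation output above.
timings_ms = {
    "custom": {"preprocess": 1.1, "inference": 65.5, "nms": 1.2},
    "yolov5": {"preprocess": 1.1, "inference": 76.0, "nms": 10.0},
}

# Prediction time = preprocess + inference + NMS.
totals = {name: sum(parts.values()) for name, parts in timings_ms.items()}

# Relative reduction of the custom model compared to the base model.
reduction = (totals["yolov5"] - totals["custom"]) / totals["yolov5"]

for name, total in totals.items():
    print(f"{name}: {total:.1f} ms")
print(f"reduction: {reduction:.1%}")  # reduction: 22.2%
```

Most of the difference comes from inference (10.5 ms) and NMS (8.8 ms); preprocessing time is identical for both models.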

Validation metrics

Here is a comparison of the validation results with the base YOLOv5 model. For the base model, the result for each super category is the average over its classes, weighted by the number of instances.

Custom model results:

| Class | Images | Instances | P | R | mAP50 | mAP50-95 |
| --- | --- | --- | --- | --- | --- | --- |
| person | 40504 | 88153 | 0.736 | 0.653 | 0.718 | 0.45 |
| pet | 40504 | 3621 | 0.81 | 0.654 | 0.739 | 0.488 |
| vehicle | 40504 | 20379 | 0.743 | 0.54 | 0.624 | 0.373 |

YOLOv5 model results:

| Class | Images | Instances | P | R | mAP50 | mAP50-95 |
| --- | --- | --- | --- | --- | --- | --- |
| person | 40504 | 88153 | 0.712 | 0.641 | 0.694 | 0.415 |
| pet | 40504 | 3621 | 0.72 | 0.62 | 0.68 | 0.444 |
| vehicle | 40504 | 20379 | 0.617 | 0.504 | 0.538 | 0.318 |

Full Base YOLO model

| Class | Images | Instances | P | R | mAP50 | mAP50-95 |
| --- | --- | --- | --- | --- | --- | --- |
| person | 40504 | 88153 | 0.712 | 0.641 | 0.694 | 0.415 |
| cat | 40504 | 1669 | 0.74 | 0.682 | 0.735 | 0.464 |
| dog | 40504 | 1952 | 0.703 | 0.566 | 0.633 | 0.426 |
| car | 40504 | 15014 | 0.607 | 0.511 | 0.539 | 0.301 |
| bus | 40504 | 2027 | 0.767 | 0.654 | 0.722 | 0.534 |
| truck | 40504 | 3338 | 0.573 | 0.384 | 0.424 | 0.263 |
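The instance-weighted averaging used for the base model's "pet" and "vehicle" rows can be checked with a few lines of Python. This is a sketch of the calculation as described in the text, using the per-class numbers from the full base model table; the helper name `weighted_average` is illustrative.

```python
# Per-class base YOLOv5 results, copied from the full base model table.
base_results = {
    "cat":   {"instances": 1669,  "P": 0.74,  "R": 0.682, "mAP50": 0.735, "mAP50-95": 0.464},
    "dog":   {"instances": 1952,  "P": 0.703, "R": 0.566, "mAP50": 0.633, "mAP50-95": 0.426},
    "car":   {"instances": 15014, "P": 0.607, "R": 0.511, "mAP50": 0.539, "mAP50-95": 0.301},
    "bus":   {"instances": 2027,  "P": 0.767, "R": 0.654, "mAP50": 0.722, "mAP50-95": 0.534},
    "truck": {"instances": 3338,  "P": 0.573, "R": 0.384, "mAP50": 0.424, "mAP50-95": 0.263},
}


def weighted_average(classes, metric):
    """Average `metric` over `classes`, weighted by instance count."""
    total = sum(base_results[c]["instances"] for c in classes)
    weighted = sum(base_results[c][metric] * base_results[c]["instances"] for c in classes)
    return weighted / total


for group, classes in {"pet": ["cat", "dog"], "vehicle": ["car", "bus", "truck"]}.items():
    row = {m: round(weighted_average(classes, m), 3) for m in ("P", "R", "mAP50", "mAP50-95")}
    print(group, row)
```

The results agree with the merged base model rows to within rounding of the third decimal place (for example, precision of about 0.720 for pet and 0.617 for vehicle).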

Here we can see an improvement in every metric for every class. The improvement is more noticeable for the non-person classes. A possible reason is the class imbalance of the dataset, in which the "person" class is overrepresented.

Model creation

1. Creating a custom dataset

To create the custom dataset, `custom_dataset/filter.py` is used with a COCO labels file (JSON). Example: `python filter.py --input_json c:\users\you\annotations\instances_train2017.json --output_json c:\users\you\annotations\filtered.json --categories person dog cat`. Then, using `custom_dataset/coco2yolo.ipynb`, the filtered COCO-format labels are converted into YOLO-format labels (JSON -> txt).
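The core of the JSON -> txt conversion is a change of box format. A COCO annotation stores `bbox` as `[x_min, y_min, width, height]` in pixels, while a YOLO label line is `class x_center y_center width height` with coordinates normalized to the image size. The function below is a minimal sketch of that step only; it is not the code from the notebook, and the function name and the example class index are assumptions.

```python
def coco_to_yolo_line(bbox, image_w, image_h, class_index):
    """Convert one COCO bbox to a YOLO label line.

    bbox is [x_min, y_min, width, height] in pixels (COCO convention).
    The result is 'class x_center y_center width height', normalized to [0, 1].
    """
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / image_w
    y_center = (y_min + h / 2) / image_h
    return f"{class_index} {x_center:.6f} {y_center:.6f} {w / image_w:.6f} {h / image_h:.6f}"


# Hypothetical annotation: a 200x100 px box at (100, 50) in a 640x480 image,
# assigned to class index 1 (assumed here to be "pet").
line = coco_to_yolo_line([100, 50, 200, 100], image_w=640, image_h=480, class_index=1)
print(line)  # 1 0.312500 0.208333 0.312500 0.208333
```

In a full conversion, one such line is written per annotation into a `.txt` file named after the image, and the class index comes from the merged class mapping rather than the original COCO category id.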

2. Training yolov5 model using custom dataset

To train the YOLOv5 model, a dataset config file (dataset.yaml) must be created. It defines 1) the dataset root directory path and the relative paths to the train / val / test image directories (or .txt files with image paths) and 2) a class names dictionary. After that, the yolov5n model was trained for 200 epochs.
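To illustrate the structure of the dataset config file, the sketch below builds the text of a `dataset.yaml` with the keys described above (`path`, `train`, `val`, optional `test`, and `names`). The directory paths and the class index order are placeholders, not the values used in this project.

```python
def dataset_yaml(root, train, val, names, test=None):
    """Build the text of a YOLOv5-style dataset config file."""
    lines = [
        f"path: {root}",    # dataset root directory
        f"train: {train}",  # train images, relative to path
        f"val: {val}",      # validation images, relative to path
    ]
    if test is not None:
        lines.append(f"test: {test}")  # optional test images
    lines.append("names:")             # class index -> class name
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Placeholder paths; the class order is an assumption.
config_text = dataset_yaml(
    root="../datasets/coco_filtered",
    train="images/train",
    val="images/val",
    names=["person", "pet", "vehicle"],
)
print(config_text)
```

The resulting text would be saved as `dataset.yaml` and passed to YOLOv5's `train.py` through its `--data` option.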

3. Validation for custom trained and original models

Conclusion

Owner

  • Login: AlexeyDzyubaP
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
preferred-citation:
  type: software
  message: If you use YOLOv5, please cite it as below.
  authors:
  - family-names: Jocher
    given-names: Glenn
    orcid: "https://orcid.org/0000-0001-5950-6979"
  title: "YOLOv5 by Ultralytics"
  version: 7.0
  doi: 10.5281/zenodo.3908559
  date-released: 2020-5-29
  license: AGPL-3.0
  url: "https://github.com/ultralytics/yolov5"
