Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: hendripermana2021
- License: agpl-3.0
- Language: Python
- Default Branch: main
- Size: 86.6 MB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 5
- Releases: 0
Metadata Files
README.md
SPARK (Smart Monitoring Parking)
Project Description
SPARK (Smart Monitoring Parking) is a project that aims to develop an intelligent system that can detect and analyze empty parking spaces. In dense urban environments, finding a parking space can be a challenging and time-consuming task. SPARK is designed to provide an efficient, real-time solution to this problem by leveraging computer vision and artificial intelligence. The project uses YOLO for object detection, with several modifications made to the architecture aimed at producing a model that is more efficient, more precise, and lighter.
YOLOV5-ti-lite is a version of YOLOV5 from TI for efficient edge deployment. This naming convention is chosen to avoid conflict with future release of YOLOV5-lite models from Ultralytics.
Here is a brief description of changes that were made to get yolov5-ti-lite from yolov5:
- YOLOv5 introduces a Focus layer as the very first layer of the network, replacing the first few heavy convolution layers present in YOLOv3. It reduces the complexity of the network by 7% and training time by 15%. However, the slice operations in the Focus layer are not embedded-friendly, so we replace it with a lightweight convolution layer. Here is a pictorial description of the changes from YOLOv3 to YOLOv5 to YOLOv5-ti-lite:

* SiLU activation is not well supported on embedded devices. It is also not quantization-friendly because of its unbounded nature; the same was observed for the hSwish activation function while [quantizing EfficientNet](https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html). Hence, SiLU activation is replaced with ReLU.
* GhostNet and BottleneckCSP modules are used to create a model that is not only efficient and light, but also maintains an optimal level of detection accuracy.
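The two changes above can be sketched in plain Python. This is a minimal illustration of why they help, not the project's code: the SiLU formula and the Focus slicing pattern are standard, while the shape helpers and sample sizes are assumptions for demonstration only.

```python
import math

def silu(x):
    # SiLU (swish): x * sigmoid(x). It is smooth, emits negative values,
    # and grows without bound for large x, which complicates the fixed
    # output ranges used by integer quantization on embedded targets.
    return x / (1.0 + math.exp(-x))

def relu(x):
    # ReLU clips all negatives to zero: its output range [0, inf) maps
    # more cleanly onto a quantized range.
    return max(0.0, x)

# SiLU produces negative outputs; ReLU never does.
assert silu(-1.0) < 0.0
assert relu(-1.0) == 0.0

def focus_slice_shape(c, h, w):
    # The Focus layer stacks the four pixel-parity slices
    # x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2],
    # x[..., 1::2, 1::2]: spatial size halves, channels quadruple.
    return (4 * c, h // 2, w // 2)

def strided_conv_shape(c_out, h, w, stride=2):
    # A plain stride-2 convolution (with "same" padding) reaches the
    # same spatial resolution without the slice ops, which is friendlier
    # to embedded runtimes.
    return (c_out, h // stride, w // stride)

# Both paths yield a feature map at half resolution, e.g. for a
# hypothetical 3x640x640 input:
print(focus_slice_shape(3, 640, 640))    # (12, 320, 320)
print(strided_conv_shape(12, 640, 640))  # (12, 320, 320)
```

Because the output shapes match, the lightweight convolution can stand in for the Focus layer without disturbing the rest of the network.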
Contributor
| Full Name | Affiliation | Email | LinkedIn | Role |
| --- | --- | --- | --- | --- |
| M. Haswin Anugrah Pratama | Startup Campus, AI Track | ... | link | Supervisor |
| Muhammad Fathurrahman | Universitas Negeri Semarang | fathur.031207@gmail.com | link | Team Lead |
| Arya Gagasan | Universitas Negeri Jakarta | aryagagas56@gmail.com | link | Team Member |
| Fiya Niswatus Sholihah | Universitas Airlangga | fiyaniswatussholihah@gmail.com | link | Team Member |
| Laily Farkhah Adhimah | Universitas Amikom Purwokerto | lailyfarkhaha@gmail.com | link | Team Member |
| Hendri Permana Putra | STMIK Triguna Dharma | hendripermana60@gmail.com | link | Team Member |
| Muhammad Adib Ardianto | Universitas Muria Kudus | Adibardianto21@gmail.com | link | Team Member |
Setup
Prerequisite Packages (Dependencies)
- pandas==2.1.0
- openai==0.28.0
- google-cloud-aiplatform==1.34.0
- google-cloud-bigquery==3.12.0
- matplotlib>=3.2.2
- numpy
- opencv-python
- Pillow>=7.1.2
- PyYAML>=5.3.1
- requests>=2.23.0
- scipy
- tqdm>=4.64.0
- protobuf<4.21.3 # https://github.com/ultralytics/yolov5/issues/8012
- seaborn>=0.11.0
- ipython # interactive notebook
- psutil # system utilization
- thop # FLOPs computation
- streamlit
- wget
- ffmpeg-python
- streamlit_webrtc
- torch
Environment
| Component | Specification |
| --- | --- |
| CPU | HP 240 G7 Notebook intel i5, 8-core CPU |
| GPU | Intel(R) UHD Graphics |
| ROM | 512 GB |
| RAM | 8 GB |
| OS | Windows 11 |
| Another Environment | Google Colab for Train |
Dataset
The dataset was assembled through a collective effort involving photos from various sources, including images found on the internet, frames from YouTube videos, and photos taken directly with a camera. To give an overview, we include an example of the collected images, whether obtained online or shot directly. Annotation was performed with Roboflow.
Example Image for Dataset






Results
Model Performance
Fundamentally, the official YOLOv5 already achieves outstanding performance and is very powerful. However, our experiments show that by adjusting the layer configurations and parameters of the head and backbone, we achieved significant improvements in accuracy and more efficient runtime performance. This was done primarily by tailoring the changes to our specific task, car detection. We directed the modifications at the Focus and SPP (Spatial Pyramid Pooling) modules, aiming to ensure that our model provides optimal results and handles the task efficiently.
Parameter tuning added to enhance the model (added to the configuration YAML)

Training Configuration Parameters
Optimizer
- type: Specifies the optimizer type, and in this case, it's set to Adam.
- lr (Learning Rate): Determines the step size at each iteration while moving toward a minimum of the loss function.
- weight_decay: Controls the amount of regularization applied to the model by penalizing large weights.
Learning Rate Scheduler
- type: Specifies the learning rate scheduler type. Here, it employs Cosine Annealing with Warm Restarts.
- T_0 (Initial Restart Period): Represents the number of epochs before a restart in the learning rate schedule.
- T_mult (Multiplier for Next Restart Period): Multiplier applied to the initial restart period for subsequent restarts.
- eta_min (Minimum Learning Rate): Defines the minimum value the learning rate can reach.
Data Augmentation
- train_mosaic: A boolean indicating whether to utilize the mosaic data augmentation technique during training. Mosaic combines multiple images to form a single training sample.
- train_mixup: Determines the probability of applying mixup, a data augmentation technique that interpolates images and labels with other images and labels.
Weight Initialization
- initialize: Specifies the weight initialization method used at the start of training.
- type: Indicates the initialization method. In this case, it's set to kaiming_uniform, which refers to He initialization.
These parameters play a crucial role in shaping the training process and influencing the performance of the deep learning model. Adjusting them carefully can lead to better convergence and generalization.
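As a minimal sketch of how these parameters interact, the snippet below evaluates the cosine-annealing-with-warm-restarts formula from `lr`, `T_0`, `T_mult`, and `eta_min`. The numeric values and the dictionary layout are illustrative assumptions, not the project's actual YAML configuration.

```python
import math

# Illustrative values only; the project's real settings live in its YAML.
config = {
    "optimizer": {"type": "Adam", "lr": 0.001, "weight_decay": 5e-4},
    "scheduler": {"type": "CosineAnnealingWarmRestarts",
                  "T_0": 10, "T_mult": 2, "eta_min": 1e-6},
    "augmentation": {"train_mosaic": True, "train_mixup": 0.1},
}

def lr_at_epoch(epoch, lr, T_0, T_mult, eta_min):
    """Cosine annealing with warm restarts:
    lr_t = eta_min + (lr - eta_min) * (1 + cos(pi * T_cur / T_i)) / 2,
    where the cycle length T_i starts at T_0 and is multiplied by
    T_mult after every restart."""
    T_i, T_cur = T_0, epoch
    while T_cur >= T_i:          # locate the current restart cycle
        T_cur -= T_i
        T_i *= T_mult
    return eta_min + (lr - eta_min) * (1 + math.cos(math.pi * T_cur / T_i)) / 2

opt, sch = config["optimizer"], config["scheduler"]
# At each restart boundary (epoch 0, then T_0, then T_0 + T_0*T_mult, ...)
# the learning rate jumps back to its peak value.
assert math.isclose(lr_at_epoch(0, opt["lr"], sch["T_0"], sch["T_mult"], sch["eta_min"]), opt["lr"])
assert math.isclose(lr_at_epoch(10, opt["lr"], sch["T_0"], sch["T_mult"], sch["eta_min"]), opt["lr"])
# Midway through a cycle the rate has decayed toward eta_min.
assert lr_at_epoch(5, opt["lr"], sch["T_0"], sch["T_mult"], sch["eta_min"]) < opt["lr"]
```

With `T_mult` = 2, the second cycle runs for 20 epochs, the third for 40, so later cycles anneal more slowly, which is the usual motivation for a multiplier greater than 1.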
Change Head and Backbone
| YOLOv5 Official | YOLOv5-lite (Modification) |
| --- | --- |


1. Metrics
| model | epoch | learning_rate | batch_size | optimizer | val_loss | val_precision | val_recall | mAP@0.5 | mAP@0.5:0.95 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| yolov5s-lite | 100 | 0.001 | 64 | Adam | 0.0059 | 92.82% | 89.85% | 93.44% | 67.74% |
| yolov5s-lite | 300 | 0.001 | 64 | Adam | 0.0037 | 94.25% | 90.18% | 95.29% | 75.96% |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2. Ablation Study
In this segment, we detail the series of layers applied during our experiments, with the main goal of achieving the most optimal accuracy. As a first step, we explain the terms and symbols used in this section, so that readers have a clear understanding before the detailed layer configurations:
- f = filter
- k = kernel
- s = stride
- p = parameter
Backbone
| no. ablation | model | layer 1 | layer 2 | layer 3 | layer 4 | layer 5 | layer 6 | layer 7 | layer 8 | layer 9 | layer 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1. | yolov5s | Focus(k=3, f=64) | Conv(k=3, f=128, s=2) | C3(f=128) | Conv(f=256, k=3, s=2) | C3(f=256) | Conv(f=512, k=3, s=2) | C3(f=512) | Conv(f=1024, k=3, s=2) | SPP(f=1024, k=(5, 9, 13)) | C3(f=1024, p=False) |
| 2. | yolov5s | Focus(k=3, f=64) | Conv(k=3, f=128, s=2) | BottleneckCSP(f=128) | Conv(f=256, k=3, s=2) | GhostBottleneck(f=256) | GhostConv(f=512, k=3, s=2) | GhostBottleneck(f=512) | GhostConv(f=1024, k=3, s=2) | SPP(f=1024, k=(5, 9, 13)) | BottleneckCSP(f=1024, p=False) |
| last | yolov5s | Focus(k=3, f=64) | Conv(k=3, f=128, s=2) | BottleneckCSP(f=128) | Conv(f=256, k=3, s=2) | BottleneckCSP(f=256) | GhostConv(f=512, k=3, s=2) | BottleneckCSP(f=512) | GhostConv(f=1024, k=3, s=2) | SPP(f=1024, k=(5, 9, 13)) | BottleneckCSP(f=1024, p=False) |
Head
| no. ablation | model | layer 1 | layer 2 | layer 3 | layer 4 | layer 5 | layer 6 | layer 7 | layer 8 | layer 9 | layer 10 | layer 11 | layer 12 | layer 13 | layer 14 | layer 15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1. | yolov5s | Conv, [512, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | C3, [512, False] | Conv, [256, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | C3, [256, False] | Conv, [256, 3, 2] | Concat, [1] | C3, [512, False] | Conv, [512, 3, 2] | Concat, [1] | C3, [1024, False] | [17, 20, 23], 1, Detect, [nc, anchors] |
| 2. | yolov5s | Conv, [512, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | BottleneckCSP, [512, False] | Conv, [256, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | BottleneckCSP, [256, False] | Conv, [256, 3, 2] | Concat, [1] | GhostBottleneck, [512, False] | Conv, [512, 3, 2] | Concat, [1] | GhostBottleneck, [1024, False] | [17, 20, 23], 1, Detect, [nc, anchors] |
| last | yolov5s | Conv, [512, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | BottleneckCSP, [512, False] | Conv, [256, 1, 1] | nn.Upsample, [None, 2, 'nearest'] | Concat, [1] | BottleneckCSP, [256, False] | GhostConv, [256, 3, 2] | Concat, [1] | BottleneckCSP, [512, False] | GhostConv, [512, 3, 2] | Concat, [1] | BottleneckCSP, [1024, False] | [17, 20, 23], 1, Detect, [nc, anchors] |
Precision and Validate
The following are the results obtained from each test, from the first ablation/experiment to the final model, at 300 epochs:

| no. ablation | model | val_precision | val_recall |
| --- | --- | --- | --- |
| 1. | yolov5s-lite | 93.10% | 88.85% |
| 2. | yolov5s-lite | 89.68% | 86.57% |
| last | yolov5s-lite | 96.25% | 97.18% |
3. Training/Validation Curve
* Training any model using this repo will apply the above changes by default. The same commands as the official repository can be used to train models from scratch, e.g.:
python train.py --data coco.yaml --cfg yolov5s6.yaml --weights '' --batch-size 64
- Results after training:

| Epochs 100 | Epochs 300 |
| --- | --- |


Testing
python detect.py --weights /content/YOLO_TSIXV/runs/train/weights/best.pt --img 256 --conf 0.4 --source pathImages
- Image detection results:
Deployment (Optional)
The next stage is deployment, where the project we have developed is published online. We chose Streamlit as the framework for the deployment phase because it has a number of advantages that make it a popular choice, especially for projects that emphasize presentation and user interaction. Some of the advantages of Streamlit include:
1. Easy to use
2. Quick for prototyping applications
3. Interactive and real-time
4. Support for data science and machine learning
5. Light and minimalist
For the final deployment, our team created two views for two different user roles, ADMIN and USER. The following are the results of the deployment we have worked on:

| ADMIN INTERFACE | USER INTERFACE |
| --- | --- |




Supporting Documents
Presentation Deck
Business Model Canvas
The Business Model Canvas (BMC) is a framework used to visualize, evaluate, and develop business models with the aim of delivering value from social, financial, and other perspectives. We have therefore included a BMC in our project as a business model that adds value not only financially but also socially. It also serves as an identification tool to assess the extent to which the business model we have designed provides significant value for business development.
Click to see our BMC
Short Video
To provide further background on why we decided to create this project, we are sharing the explanation video link :)
- Link: Our Background Video
References
- Link: Official YOLOV5 repository
- Link: yolov5-improvements-and-evaluation, Roboflow
- Link: Focus layer in YOLOV5
- Link: CrossStagePartial Network
- Link: CSPNet: A new backbone that can enhance learning capability of cnn
- Link: Path aggregation network for instance segmentation
- Link: Efficientnet-lite quantization
- Link: YOLOv5 Training video from Texas Instruments
Additional Comments
The model development process on YOLOv5s has improved precision and accuracy, but we recognize the potential to optimize these models further. We believe that by applying well-tested custom convolutional layers, both in the backbone and the head, we can achieve further improvements in prediction quality.

The integration of GhostConv and BottleneckCSP became an integral part of our approach to achieving a lighter model. Our main focus was to ensure the model performs well on car detection, the central task of our project. Although the model has provided satisfactory results for car detection, we realize that this implementation may be less effective on more complex object detection tasks. Therefore, further exploration of these layers is expected to provide a deeper understanding of their potential and limitations.
Citation
@inproceedings{ghostnet,
title={GhostNet: More Features from Cheap Operations},
author={Han, Kai and Wang, Yunhe and Tian, Qi and Guo, Jianyuan and Xu, Chunjing and Xu, Chang},
booktitle={CVPR},
year={2020}
}
@inproceedings{tinynet,
title={Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets},
author={Han, Kai and Wang, Yunhe and Zhang, Qiulin and Zhang, Wei and Xu, Chunjing and Zhang, Tong},
booktitle={NeurIPS},
year={2020}
}
@inproceedings{han2022vig,
title={Vision GNN: An Image is Worth Graph of Nodes},
author={Kai Han and Yunhe Wang and Jianyuan Guo and Yehui Tang and Enhua Wu},
booktitle={NeurIPS},
year={2022}
}
@article{tang2022ghostnetv2,
title={GhostNetV2: Enhance Cheap Operation with Long-Range Attention},
author={Tang, Yehui and Han, Kai and Guo, Jianyuan and Xu, Chang and Xu, Chao and Wang, Yunhe},
journal={arXiv preprint arXiv:2211.12905},
year={2022}
}
License
For academic and non-commercial use only.
Acknowledgement
This project entitled "SPARK (Smart Monitoring Parking)" is supported and funded by Startup Campus Indonesia and Indonesian Ministry of Education and Culture through the "Kampus Merdeka: Magang dan Studi Independen Bersertifikasi (MSIB)" program.
Owner
- Login: hendripermana2021
- Kind: user
- Location: Medan
- Company: Pesantren Daarul Istiqlal
- Repositories: 3
- Profile: https://github.com/hendripermana2021
My name is Hendri Permana Putra. I am still learning to be a great programmer, especially in website design.
Citation (CITATION.cff)
cff-version: 1.2.0
preferred-citation:
type: software
message: If you use YOLOv5, please cite it as below.
authors:
- family-names: Jocher
given-names: Glenn
orcid: "https://orcid.org/0000-0001-5950-6979"
title: "YOLOv5 by Ultralytics"
version: 7.0
doi: 10.5281/zenodo.3908559
date-released: 2020-05-29
license: AGPL-3.0
url: "https://github.com/ultralytics/yolov5"
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- actions/checkout v4 composite
- actions/setup-python v4 composite
- slackapi/slack-github-action v1.24.0 composite
- actions/checkout v4 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- actions/checkout v4 composite
- docker/build-push-action v5 composite
- docker/login-action v3 composite
- docker/setup-buildx-action v3 composite
- docker/setup-qemu-action v3 composite
- actions/first-interaction v1 composite
- actions/checkout v4 composite
- nick-invision/retry v2 composite
- actions/stale v8 composite
- pytorch/pytorch 2.0.0-cuda11.7-cudnn8-runtime build
- gcr.io/google-appengine/python latest build
- Pillow >=10.0.1
- PyYAML >=5.3.1
- gitpython >=3.1.30
- matplotlib >=3.3
- numpy >=1.22.2
- opencv-python >=4.1.1
- pandas >=1.1.4
- psutil *
- requests >=2.23.0
- scipy >=1.4.1
- seaborn >=0.11.0
- setuptools >=65.5.1
- thop >=0.1.1
- torchvision >=0.9.0
- tqdm >=4.64.0
- ultralytics >=8.0.147
- Flask ==2.3.2
- gunicorn ==19.10.0
- pip ==23.3
- werkzeug >=3.0.1