https://github.com/ai-llm2/yolo3d-lightning

YOLO3D: 3D Object Detection with YOLO

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
○
codemeta.json file
○
.zenodo.json file
○
DOI references
✓
Academic publication links
Links to: arxiv.org
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (12.4%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

YOLO3D: 3D Object Detection with YOLO

Basic Info

Host: GitHub
Owner: AI-LLM2
Default Branch: main
Size: 6.02 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Fork of ruhyadi/yolo3d-lightning

Created almost 4 years ago · Last pushed almost 4 years ago

https://github.com/AI-LLM2/yolo3d-lightning/blob/main/



# YOLO3D: 3D Object Detection with YOLO












##   Cautions
> This repository currently under development

##   Demo


![demo](./docs/assets/demo.gif)



##   Introduction

YOLO3D is inspired by [Mousavian et al.](https://arxiv.org/abs/1612.00496) in their paper **3D Bounding Box Estimation Using Deep Learning and Geometry**. YOLO3D uses a different approach, as the detector uses [YOLOv5](https://github.com/ultralytics/yolov5) which previously used Faster-RCNN, and Regressor uses ResNet18/VGG11 which was previously VGG19.

##   Quickstart
> YOLO3D use hydra as the config manager; please follow [official website](https://hydra.cc/) or [ashleve/lightning-hydra-template](https://github.com/ashleve/lightning-hydra-template).

###   Inference
You can use pretrained weight from [Release](https://github.com/ruhyadi/yolo3d-lightning/releases), you can download it using script `get_weights.py`:
```bash
# download pretrained model
python script/get_weights.py \
  --tag v0.1 \
  --dir ./weights
```
Inference with `inference.py`:
```bash
python inference.py \
  source_dir="./data/demo/images" \
  detector.model_path="./weights/detector_yolov5s.pt" \
  regressor_weights="./weights/regressor_resnet18.pt"
```

##  Training
There are two models that will be trained here: **detector** and **regressor**. For now, the detector model that can be used is only **YOLOv5**, while the regressor model can use all models supported by **Torchvision**.

###  Dataset Preparation
For now, YOLO3D only supports the [KITTI dataset](http://www.cvlibs.net/datasets/kitti/). Going forward, we will try to add support to the Lyft and nuScene datasets.
#### 1. Download KITTI Dataset
You can download KITTI dataset from [official website](http://www.cvlibs.net/datasets/kitti/). After that, extract dataset to `data/KITTI`. Since we will be using two models, it is highly recommended to rename `images_2` to `images`.

```bash
.
 data
  KITTI
      calib
      images # original images_2
      labels_2
```

#### 2. Generate YOLO Labels

The kitti label format on `labels` is different from the format required by the YOLO model. Therefore, we have to create a YOLO format from a KITTI format. The author has provided a `script/kitti_to_yolo.py` that can be used.

```bash
python script/kitti_to_yolo.py \
  --dataset_path ./data/KITTI \
  --classes car, van, truck, pedestrian, cyclist \
  --img_width 1224 \
  --img_height 370
```
The script will generate a `labels` folder containing the labels for each image in YOLO format.

```bash
.
 data
  KITTI
      calib
      images    # original images_2
|        labels_2  # kitti labels
      labels    # yolo labels
```

The next thing is to generate a sets of images/labels training and validation, these sets are also used as partitions to divide the dataset. The author has provided a `script/generate_sets.py` that can be used.

```bash
python script/generate_sets.py \
  --images_path ./data/KITTI/images \
  --dump_dir ./data/KITTI \
  --postfix _yolo \
  --train_size 0.8 \
  --is_yolo
```

###  Training Detector Model
> Right now author just use YOLOv5 model

For YOLOv5 training on a **single GPU**, you can use the command below:

```bash
cd yolov5
python train.py \
    --data ../configs/detector/yolov5_kitti.yaml \
    --weights yolov5s.pt \
    --img 640 
```

As for training on **multiple GPUs**, you can use the command below:

```bash
cd yolov5
python -m torch.distributed.launch \
    --nproc_per_node 4 train.py \
    --epochs 10 \
    --batch 64 \
    --data ../configs/detector/yolov5_kitti.yaml \
    --weights yolov5s.pt \
    --device 0,1,2,3
```

###  Training Regessor Model
>  Under development

You can use all the models available on **Torchvision** by adding some configuration to `src/models/components/base.py`. The current author has provided **ResNet18** and **VGG11** which can be used directly.

```bash
python src/train.py \
  experiment=sample
```

##   Acknowledgement

- [YOLOv5 by Ultralytics](https://github.com/ultralytics/yolov5)
- [skhadem/3D-BoundingBox](https://github.com/skhadem/3D-BoundingBox)
- [Mousavian et al.](https://arxiv.org/abs/1612.00496)
```
@misc{mousavian20173d,
      title={3D Bounding Box Estimation Using Deep Learning and Geometry}, 
      author={Arsalan Mousavian and Dragomir Anguelov and John Flynn and Jana Kosecka},
      year={2017},
      eprint={1612.00496},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```