diffbev

Official PyTorch implementation for a conditional diffusion probability model in BEV perception

https://github.com/jiayuzou2020/diffbev

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.6%) to scientific vocabulary

Keywords

3d-detection bev-perception diffusion-models semantic-segmentation

Last synced: 6 months ago

Repository

Official PyTorch implementation for a conditional diffusion probability model in BEV perception

Basic Info

Host: GitHub
Owner: JiayuZou2020
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://arxiv.org/abs/2303.08333
Size: 5 MB

Statistics

Stars: 245
Watchers: 6
Forks: 12
Open Issues: 13
Releases: 0

Topics

3d-detection bev-perception diffusion-models semantic-segmentation

Created almost 3 years ago · Last pushed almost 3 years ago

Metadata Files

Readme License Citation

DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception

Conditional diffusion probability model for BEV perception


Abstract

BEV perception is of great importance in the field of autonomous driving, serving as the cornerstone of planning, control, and motion prediction. The quality of the BEV feature strongly affects the performance of BEV perception. However, because of noise in camera parameters and LiDAR scans, the BEV representation we obtain usually contains harmful noise. Diffusion models naturally have the ability to denoise noisy samples toward the ideal data, which motivates us to use a diffusion model to obtain a better BEV representation. In this work, we propose an end-to-end framework, named DiffBEV, to exploit the potential of diffusion models to generate a more comprehensive BEV representation. To the best of our knowledge, we are the first to apply a diffusion model to BEV perception. In practice, we design three types of conditions to guide the training of the diffusion model, which denoises the coarse samples and refines the semantic feature progressively. Moreover, a cross-attention module is leveraged to fuse the context of the BEV feature and the semantic content of the conditional diffusion model. DiffBEV achieves 25.9% mIoU on the nuScenes dataset, which is 6.2% higher than the best-performing existing approach. Quantitative and qualitative results on multiple benchmarks demonstrate the effectiveness of DiffBEV in BEV semantic segmentation and 3D object detection tasks.

[Figure: framework]

Dataset

Download Datasets From Official Websites

Extensive experiments are conducted on the nuScenes, [KITTI Raw](https://www.cvlibs.net/datasets/kitti/rawdata.php), [KITTI Odometry](https://www.cvlibs.net/datasets/kitti/evalodometry.php), and [KITTI 3D Object](https://www.cvlibs.net/datasets/kitti/eval3dobject.php) benchmarks.

Prepare Depth Maps

Follow the script to generate depth maps for the KITTI datasets. The depth maps of the KITTI datasets are available on Google Drive and Baidu Net Disk. We also provide a script to generate depth maps for the nuScenes dataset. Replace the dataset path in the script according to your dataset directory.
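For readers unfamiliar with depth maps, one common way to build a sparse depth map from a LiDAR scan is to project the points onto the image plane with a pinhole camera model. The sketch below illustrates that idea only; it is not the repository's script, which may work differently. The function name `project_to_depth_map` and the parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels) are assumptions made for this example.

```python
import math


def project_to_depth_map(points_cam, fx, fy, cx, cy, width, height):
    """Project 3D points given in the camera frame onto a sparse depth map.

    Hypothetical helper, not part of DiffBEV. Returns a height x width
    list of lists; 0.0 marks pixels that no point projects onto.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0:
            # Points behind the camera cannot be projected.
            continue
        # Pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy.
        u = int(math.floor(fx * x / z + cx))
        v = int(math.floor(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # When several points hit the same pixel, keep the nearest one.
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth
```

In practice the LiDAR points first have to be transformed into the camera frame with the dataset's extrinsic calibration, which this sketch leaves out.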

Dataset Processing

After downloading these datasets, we need to generate the annotations in BEV. Follow the instructions below to get the corresponding annotations.

nuScenes

Run the script make_nuscenes_labels to get the BEV annotation for the nuScenes benchmark. Please follow the instructions here to generate the BEV annotation (ann_bev_dir) for KITTI datasets.

KITTI Datasets

Follow the instruction to get the BEV annotations for KITTI Raw, KITTI Odometry, and KITTI 3D Object datasets.

The datasets' structure is organized as follows.

data
├── nuscenes ├── img_dir ├── train ├── val ├── ann_bev_dir ├── train ├── val ├── train_depth ├── val_depth ├── calib.json
├── kitti_processed
    ├── kitti_raw ├── img_dir ├── train ├── val ├── ann_bev_dir ├── train ├── val ├── train_depth ├── val_depth ├── calib.json
    ├── kitti_odometry ├── img_dir ├── train ├── val ├── ann_bev_dir ├── train ├── val ├── train_depth ├── val_depth ├── calib.json
    ├── kitti_object ├── img_dir ├── train ├── val ├── ann_bev_dir ├── train ├── val ├── train_depth ├── val_depth ├── calib.json
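Before training, it can help to confirm that a dataset directory contains the expected entries. The following is a minimal sketch using only the standard library; `missing_entries` and `EXPECTED_ENTRIES` are hypothetical names, not part of DiffBEV. The default list covers only the train and val folders of img_dir and ann_bev_dir; add the relative paths of train_depth, val_depth, and calib.json as they appear in your own copy of the data.

```python
from pathlib import Path

# Assumed default entries, relative to one dataset root such as data/nuscenes.
EXPECTED_ENTRIES = [
    "img_dir/train",
    "img_dir/val",
    "ann_bev_dir/train",
    "ann_bev_dir/val",
]


def missing_entries(dataset_root, expected=EXPECTED_ENTRIES):
    """Return the relative paths from `expected` that do not exist."""
    root = Path(dataset_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_entries("data/nuscenes")` returns an empty list when all four folders are present, and the names of the absent ones otherwise.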

Prepare Calibration Files

For the camera parameters of each dataset, we write them into the corresponding calib.json file. For each dataset, we upload the calib.json to Google Drive and Baidu Net Disk.
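The schema of calib.json is not described here, so the sketch below is purely illustrative. It assumes, hypothetically, that the file maps a sample name to a 3x3 camera intrinsic matrix stored as nested lists, and it checks that shape after loading. Inspect the downloaded file and adapt the check to its real structure; `load_intrinsics` is a made-up helper name.

```python
import json


def load_intrinsics(calib_path):
    """Load a calibration file and check each entry is a 3x3 matrix.

    Assumed (hypothetical) schema: {"<sample name>": [[...], [...], [...]]}.
    """
    with open(calib_path, "r", encoding="utf-8") as f:
        calib = json.load(f)
    for name, matrix in calib.items():
        if len(matrix) != 3 or any(len(row) != 3 for row in matrix):
            raise ValueError(f"{name}: expected a 3x3 matrix")
    return calib
```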

Please change the dataset path according to the real data directory in the [nuScenes, KITTI Raw, KITTI Odometry, and KITTI 3D Object dataset configurations](https://github.com/JiayuZou2020/DiffBEV/tree/main/configs/base/datasets). Modify the path of the pretrained model in the model configurations.

Installation

DiffBEV is tested on:

* Python 3.7/3.8
* CUDA 11.1
* Torch 1.9.1

Please check install for installation.

* Create a conda environment for the project.

    conda create -n diffbev python=3.7
    conda activate diffbev

* Install PyTorch following the instruction.

    conda install pytorch torchvision -c pytorch

* Install mmcv.

    pip install -U openmim
    mim install mmcv-full

* Git clone this repository.

    git clone https://github.com/JiayuZou2020/DiffBEV.git

* Install and compile the required packages.

    cd DiffBEV
    pip install -v -e .
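After installation, a quick check can confirm that the environment matches the tested versions listed above and the mmcv-full range given in requirements/mminstall.txt (>=1.3.1,<=1.4.0). The helper below is a sketch, not part of the repository; `parse_version` and `check_environment` are hypothetical names. It takes version strings as arguments so that it runs without PyTorch installed.

```python
import re

TESTED_PYTHON = {(3, 7), (3, 8)}
TESTED_TORCH = (1, 9, 1)
TESTED_CUDA = (11, 1)
MMCV_MIN, MMCV_MAX = (1, 3, 1), (1, 4, 0)


def parse_version(text):
    """Turn a string such as '1.9.1+cu111' into a tuple such as (1, 9, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(python_version, torch_version, cuda_version, mmcv_version):
    """Return a list of human-readable mismatches (empty if all match)."""
    problems = []
    if tuple(python_version[:2]) not in TESTED_PYTHON:
        problems.append("Python is not 3.7 or 3.8")
    if parse_version(torch_version) != TESTED_TORCH:
        problems.append("Torch is not 1.9.1")
    # cuda_version may be None for CPU-only builds of PyTorch.
    if parse_version(cuda_version or "") != TESTED_CUDA:
        problems.append("CUDA is not 11.1")
    if not MMCV_MIN <= parse_version(mmcv_version) <= MMCV_MAX:
        problems.append("mmcv-full is outside >=1.3.1,<=1.4.0")
    return problems
```

In an installed environment the arguments would typically be `sys.version_info`, `torch.__version__`, `torch.version.cuda`, and `mmcv.__version__`. A mismatch does not necessarily mean the code will fail, only that the combination differs from the one listed as tested.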

Visualization

[Figure: visualization results]

Citation

If you find our work helpful for your research, please consider citing it as follows.

@article{zou2023diffbev,
  title={DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception},
  author={Jiayu, Zou and Zheng, Zhu and Yun, Ye and Xingang, Wang},
  journal={arXiv preprint arXiv:2303.08333},
  year={2023}
}

Acknowledgement

Our work is partially based on the following open-sourced projects: mmsegmentation, VPN, PYVA, PON, LSS. Thanks for their contribution to the research community of BEV perception.

Owner

Login: JiayuZou2020
Kind: user

Repositories: 1
Profile: https://github.com/JiayuZou2020

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMSegmentation Contributors"
title: "OpenMMLab Semantic Segmentation Toolbox and Benchmark"
date-released: 2020-07-10
url: "https://github.com/open-mmlab/mmsegmentation"
license: Apache-2.0

GitHub Events

Total

Issues event: 2
Watch event: 23
Fork event: 2

Last Year

Issues event: 2
Watch event: 23
Fork event: 2

Dependencies

docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

mmsegmentation.egg-info/requires.txt pypi

cityscapesscripts *
codecov *
flake8 *
interrogate *
isort ==4.3.21
matplotlib *
numpy *
packaging *
prettytable *
pytest *
xdoctest >=0.10.0
yapf *

requirements/docs.txt pypi

docutils ==0.16.0
myst-parser *
sphinx ==4.0.2
sphinx_copybutton *
sphinx_markdown_tables *

requirements/mminstall.txt pypi

mmcv-full >=1.3.1,<=1.4.0

requirements/optional.txt pypi

cityscapesscripts *

requirements/readthedocs.txt pypi

mmcv *
prettytable *
torch *
torchvision *

requirements/runtime.txt pypi

matplotlib *
numpy *
packaging *
prettytable *

requirements/tests.txt pypi

codecov * test
flake8 * test
interrogate * test
isort ==4.3.21 test
pytest * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

Babel ==2.7.0
Bottleneck ==1.2.1
Cython ==0.29.13
Flask ==1.1.1
HeapDict ==1.0.1
Jinja2 ==2.10.3
KNN-CUDA ==0.2
Markdown ==3.4.1
MarkupSafe ==2.1.1
PIMS ==0.6.0
Pillow ==8.4.0
PyOpenGL ==3.1.0
PySocks ==1.7.1
PyTurboJPEG ==1.6.6
PyWavelets ==1.0.3
PyYAML ==5.1.2
Pygments ==2.4.2
QtAwesome ==0.6.0
QtPy ==1.9.0
SQLAlchemy ==1.3.9
SecretStorage ==3.1.1
Send2Trash ==1.5.0
Shapely ==1.8.0
SoundFile ==0.10.3.post1
Sphinx ==5.0.2
Werkzeug ==2.2.0
XlsxWriter ==1.2.1
absl-py ==1.2.0
addict ==2.4.0
alabaster ==0.7.12
anaconda-client ==1.7.2
anaconda-navigator ==1.9.7
anaconda-project ==0.8.3
antlr4-python3-runtime ==4.8
appdirs ==1.4.4
asn1crypto ==1.0.1
astroid ==2.3.1
astropy ==3.2.2
atomicwrites ==1.3.0
attrs ==19.2.0
audioread ==2.1.9
autopep8 ==1.6.0
backcall ==0.1.0
backports.functools-lru-cache ==1.5
backports.os ==0.1.1
backports.shutil-get-terminal-size ==1.0.0
backports.tempfile ==1.0
backports.weakref ==1.0.post1
beautifulsoup4 ==4.8.0
bitarray ==1.0.1
bkcharts ==0.2
black ==22.3.0
bleach ==3.1.0
blobfile ==2.0.0
bokeh ==1.3.4
boto ==2.49.0
cPython ==0.0.6
cachetools ==4.2.4
certifi ==2022.9.24
cffi ==1.12.3
chamfer ==2.0.0
chardet ==3.0.4
chumpy ==0.70
click ==8.1.1
cloudpickle ==1.2.2
clyent ==1.2.2
colorama ==0.4.1
colour ==0.1.5
conda ==22.11.0
conda-build ==3.23.2
conda-package-handling ==1.6.0
conda-verify ==3.4.2
contextlib2 ==0.6.0
coverage ==6.3.2
cryptography ==2.7
cycler ==0.10.0
cytoolz ==0.10.0
dask ==2.5.2
decorator ==4.4.0
decord ==0.6.0
defusedxml ==0.6.0
descartes ==1.1.0
dgl-cu111 ==0.6.1
dglgo ==0.0.1
distributed ==2.5.2
docutils ==0.15.2
easydict ==1.9
efficientnet-pytorch ==0.7.1
einops ==0.3.2
emd-ext ==0.0.0
entrypoints ==0.3
et-xmlfile ==1.0.1
fastcache ==1.1.0
filelock ==3.0.12
fire ==0.4.0
freetype-py ==2.2.0
fsspec ==0.5.2
future ==0.17.1
gevent ==1.4.0
glob2 ==0.7
gmpy2 ==2.0.8
google-auth ==2.9.1
google-auth-oauthlib ==0.4.6
greenlet ==0.4.15
grpcio ==1.47.0
h5py ==2.9.0
html5lib ==1.0.1
hydra-core ==1.1.0
idna ==2.8
imageio ==2.6.0
imageio-ffmpeg ==0.4.7
imagesize ==1.1.0
importlib-metadata ==4.12.0
importlib-resources ==5.4.0
interrogate ==1.5.0
ipykernel ==5.1.2
ipython ==7.8.0
ipython-genutils ==0.2.0
ipywidgets ==7.5.1
isort ==5.10.1
itsdangerous ==1.1.0
jdcal ==1.4.1
jedi ==0.15.1
jeepney ==0.4.1
joblib ==1.1.1
json-tricks ==3.15.5
json5 ==0.8.5
jsonschema ==3.0.2
jupyter ==1.0.0
jupyter-client ==5.3.3
jupyter-console ==6.0.0
jupyter-core ==4.5.0
jupyterlab ==1.1.4
jupyterlab-server ==1.0.6
keyring ==18.0.0
kiwisolver ==1.1.0
lap ==0.4.0
lazy-object-proxy ==1.4.2
libarchive-c ==2.8
librosa ==0.9.1
lief ==0.9.0
llvmlite ==0.29.0
locket ==0.2.0
lxml ==4.9.1
matplotlib ==3.1.1
mccabe ==0.6.1
mistune ==0.8.4
mkl-fft ==1.0.14
mkl-random ==1.1.0
mkl-service ==2.3.0
mmcv-full ==1.3.15
mock ==3.0.5
more-itertools ==7.2.0
motmetrics ==1.1.3
moviepy ==1.0.3
mpi4py ==3.0.3
mpmath ==1.1.0
msgpack ==0.6.1
multipledispatch ==0.6.0
munkres ==1.1.4
mypy-extensions ==0.4.3
navigator-updater ==0.2.1
nbconvert ==5.6.0
nbformat ==4.4.0
networkx ==2.3
nltk ==3.4.5
nose ==1.3.7
notebook ==6.0.1
numba ==0.45.1
numexpr ==2.7.0
numpy ==1.21.6
numpydoc ==1.5.0
nuscenes-devkit ==1.1.9
oauthlib ==3.2.0
olefile ==0.46
omegaconf ==2.1.0
open3d ==0.9.0.0
opencv-contrib-python ==4.0.0.21
opencv-python ==4.1.0.25
openpyxl ==3.0.0
packaging ==21.3
pandas ==0.25.1
pandocfilters ==1.4.2
parso ==0.5.1
partd ==1.0.0
path.py ==12.0.1
pathlib2 ==2.3.5
pathspec ==0.9.0
patsy ==0.5.1
pep8 ==1.7.1
pexpect ==4.7.0
pickleshare ==0.7.5
pkginfo ==1.5.0.1
platformdirs ==2.5.2
pluggy ==1.0.0
ply ==3.11
polars ==0.11.0
pooch ==1.6.0
prettytable ==2.2.1
proglog ==0.1.9
progressbar ==2.5
prometheus-client ==0.7.1
prompt-toolkit ==2.0.10
protobuf ==3.19.1
psutil ==5.6.3
ptyprocess ==0.6.0
py ==1.8.0
pyOpenSSL ==19.0.0
pyasn1 ==0.4.8
pyasn1-modules ==0.2.8
pycocotools ==2.0.4
pycodestyle ==2.8.0
pycosat ==0.6.3
pycparser ==2.19
pycrypto ==2.6.1
pycryptodomex ==3.16.0
pycurl ==7.43.0.3
pydantic ==1.9.1
pyflakes ==2.1.1
pyglet ==1.5.23
pylint ==2.4.2
pymongo ==4.1.1
pyntcloud ==0.1.5
pyodbc ==4.0.27
pyparsing ==2.4.2
pyquaternion ==0.9.9
pyrender ==0.1.45
pyrsistent ==0.15.4
pytest ==4.4.2
pytest-arraydiff ==0.3
pytest-astropy ==0.5.0
pytest-doctestplus ==0.4.0
pytest-openfiles ==0.4.0
pytest-remotedata ==0.3.2
python-dateutil ==2.8.0
pytz ==2019.3
pyzmq ==18.1.0
qtconsole ==4.5.5
regex ==2022.4.24
requests ==2.22.0
requests-oauthlib ==1.3.1
resampy ==0.2.2
rope ==0.14.0
rsa ==4.9
ruamel-yaml ==0.15.46
ruamel.yaml ==0.17.21
ruamel.yaml.clib ==0.2.6
scikit-image ==0.15.0
scikit-learn ==0.21.3
scipy ==1.7.3
seaborn ==0.9.0
shutup ==0.2.0
simplegeneric ==0.8.1
singledispatch ==3.4.0.3
six ==1.12.0
sklearn ==0.0
slicerator ==1.1.0
smplx ==0.1.28
snowballstemmer ==2.0.0
some-package ==0.1
sortedcollections ==1.1.2
sortedcontainers ==2.1.0
soupsieve ==1.9.3
sphinxcontrib-applehelp ==1.0.1
sphinxcontrib-devhelp ==1.0.1
sphinxcontrib-htmlhelp ==2.0.0
sphinxcontrib-jsmath ==1.0.1
sphinxcontrib-qthelp ==1.0.2
sphinxcontrib-serializinghtml ==1.1.5
sphinxcontrib-websupport ==1.1.2
spyder ==3.3.6
spyder-kernels ==0.5.2
statsmodels ==0.10.1
sympy ==1.4
tables ==3.5.2
tabulate ==0.8.9
tblib ==1.4.0
tensorboard ==2.9.1
tensorboard-data-server ==0.6.1
tensorboard-plugin-wit ==1.8.1
tensorboardX ==2.1
termcolor ==1.1.0
terminado ==0.8.2
terminaltables ==3.1.0
testpath ==0.4.2
timm ==0.3.2
toml ==0.10.2
tomli ==2.0.1
toolz ==0.10.0
torch ==1.9.1
torchvision ==0.10.1
tornado ==6.0.3
tqdm ==4.36.1
traitlets ==4.3.3
transforms3d ==0.4.1
trimesh ==3.10.8
ttach ==0.0.3
typed-ast ==1.5.3
typer ==0.4.1
typing-extensions ==3.10.0.2
unicodecsv ==0.14.1
urllib3 ==1.26.13
wcwidth ==0.1.7
webcolors ==1.11.1
webencodings ==0.5.1
widgetsnbextension ==3.5.1
wrapt ==1.11.2
wurlitzer ==1.0.3
xdoctest ==1.0.0
xlrd ==1.2.0
xlwt ==1.3.0
xtcocotools ==1.11.5
yapf ==0.31.0
zict ==1.0.0
zipp ==3.8.0
