emiff

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

https://github.com/bosszhe/emiff

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.1%) to scientific vocabulary

Last synced: 6 months ago

Repository

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

Basic Info

Host: GitHub
Owner: Bosszhe
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 8.34 MB

Statistics

Stars: 77
Watchers: 2
Forks: 10
Open Issues: 3
Releases: 0

Created almost 3 years ago · Last pushed almost 2 years ago

Metadata Files

Readme License Citation

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

Project page | Paper | VIMI

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection. Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang. ICRA 2024.

This repository contains the official PyTorch implementation of EMIFF/VIMI: the training and evaluation code and the pretrained models.

Abstract

In autonomous driving, cooperative perception makes use of multi-view cameras from both vehicles and infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint. Currently, two major challenges persist in vehicle-infrastructure cooperative 3D (VIC3D) object detection: (1) inherent pose errors when fusing multi-view images, caused by time asynchrony across cameras; and (2) information loss during transmission, resulting from limited communication bandwidth. To address these issues, we propose a novel camera-based 3D detection framework for the VIC3D task, Enhanced Multi-scale Image Feature Fusion (EMIFF). To fully exploit holistic perspectives from both vehicles and infrastructure, we propose Multi-scale Cross Attention (MCA) and Camera-aware Channel Masking (CCM) modules to enhance infrastructure and vehicle features at scale, spatial, and channel levels to correct the pose error introduced by camera asynchrony. We also introduce a Feature Compression (FC) module with channel and spatial compression blocks for transmission efficiency. Experiments show that EMIFF achieves SOTA on the DAIR-V2X-C dataset, significantly outperforming previous early-fusion and late-fusion methods with comparable transmission costs.
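The abstract motivates Feature Compression by transmission cost but gives no concrete sizes. As a rough illustration only, the sketch below estimates how channel and spatial compression shrink a dense feature map before it is sent. The feature shape, the float32 element size, the compression ratios, and the function names `feature_bytes` and `compressed_bytes` are all assumptions made for this example; they are not taken from the paper or the repository, and this is not the FC module itself.

```python
def feature_bytes(channels: int, height: int, width: int,
                  bytes_per_value: int = 4) -> int:
    """Size in bytes of a dense C x H x W feature map."""
    return channels * height * width * bytes_per_value


def compressed_bytes(channels: int, height: int, width: int,
                     channel_ratio: int, spatial_ratio: int,
                     bytes_per_value: int = 4) -> int:
    """Size after dividing the channel count by channel_ratio and each
    spatial side by spatial_ratio (hypothetical compression settings)."""
    if channels % channel_ratio or height % spatial_ratio or width % spatial_ratio:
        raise ValueError("dimensions must be divisible by the compression ratios")
    return feature_bytes(channels // channel_ratio,
                         height // spatial_ratio,
                         width // spatial_ratio,
                         bytes_per_value)


# Hypothetical infrastructure feature map: 64 channels, 96 x 160, float32.
raw = feature_bytes(64, 96, 160)
sent = compressed_bytes(64, 96, 160, channel_ratio=8, spatial_ratio=2)
print(f"raw={raw} B, sent={sent} B, reduction={raw / sent:.0f}x")
```

Under these assumed settings the reduction factor is the channel ratio times the square of the spatial ratio, since spatial compression applies to both height and width.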

Methods

Architecture

Get Started

Benchmark and Model Zoo

Modality:Image

| Fusion | Method | Dataset | AP-3D (IoU=0.5) | AP-BEV (IoU=0.5) | Config | Download |
| :-----: | :--------: | :-------: | :----: | :----: | :----: | :-----: |
| Only-Veh | ImVoxelNet | VIC-Sync | 7.29 | 8.85 | config | \ |
| Only-Inf | ImVoxelNet | VIC-Sync | 8.66 | 14.41 | config | \ |
| Late-Fusion | ImVoxelNet | VIC-Sync | 11.08 | 14.76 | \ | \ |
| Early-Fusion | BEVFormer_S | VIC-Sync | 8.80 | 13.45 | config | model/log |
| Early-Fusion | ImVoxelNet | VIC-Sync | 12.72 | 18.17 | config | model/log |
| Intermediate-Fusion | EMIFF | VIC-Sync | 15.61 | 21.44 | config | model/log |

We evaluate the Only-Veh, Only-Inf, and Late-Fusion models following OpenDAIRV2X.
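To make the comparison in the benchmark table easier to read, the following sketch computes the absolute AP gain of EMIFF over each baseline. The numbers are copied from the table; the dictionary layout and the helper name `gains` are choices made for this example only.

```python
# (AP-3D, AP-BEV) at IoU=0.5 on VIC-Sync, copied from the benchmark table.
baselines = {
    "Only-Veh / ImVoxelNet": (7.29, 8.85),
    "Only-Inf / ImVoxelNet": (8.66, 14.41),
    "Late-Fusion / ImVoxelNet": (11.08, 14.76),
    "Early-Fusion / BEVFormer_S": (8.80, 13.45),
    "Early-Fusion / ImVoxelNet": (12.72, 18.17),
}
emiff = (15.61, 21.44)


def gains(baseline, ours):
    """Absolute AP difference (ours - baseline) for each metric, in points."""
    return tuple(round(o - b, 2) for b, o in zip(baseline, ours))


for name, scores in baselines.items():
    d3d, dbev = gains(scores, emiff)
    print(f"{name}: AP-3D +{d3d:.2f}, AP-BEV +{dbev:.2f}")
```

Against the strongest listed baseline (Early-Fusion with ImVoxelNet), the gain works out to 2.89 points in AP-3D and 3.27 points in AP-BEV.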

Acknowledgement

This project would not be possible without the following codebases:

* OpenDAIRV2X
* MMDetection3D

Citation

If you find our work useful in your research, please consider citing:

```
@misc{wang2023vimi,
  title={VIMI: Vehicle-Infrastructure Multi-view Intermediate Fusion for Camera-based 3D Object Detection},
  author={Zhe Wang and Siqi Fan and Xiaoliang Huo and Tongda Xu and Yan Wang and Jingjing Liu and Yilun Chen and Ya-Qin Zhang},
  year={2023},
  eprint={2303.10975},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@inproceedings{wang2024emiff,
  title={EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection},
  author={Zhe Wang and Siqi Fan and Xiaoliang Huo and Tongda Xu and Yan Wang and Jingjing Liu and Yilun Chen and Ya-Qin Zhang},
  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2024}
}
```

Owner

Login: Bosszhe
Kind: user

Repositories: 1
Profile: https://github.com/Bosszhe

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection3D Contributors"
title: "OpenMMLab's Next-generation Platform for General 3D Object Detection"
date-released: 2020-07-23
url: "https://github.com/open-mmlab/mmdetection3d"
license: Apache-2.0

GitHub Events

Total

Issues event: 2
Watch event: 12
Issue comment event: 6
Fork event: 3

Last Year

Issues event: 2
Watch event: 12
Issue comment event: 6
Fork event: 3

Dependencies

environment.yml pypi

absl-py ==1.4.0
addict ==2.4.0
ansi2html ==1.9.1
anyio ==3.6.2
appdirs ==1.4.4
argon2-cffi ==21.3.0
argon2-cffi-bindings ==21.2.0
attrs ==23.1.0
backcall ==0.2.0
beautifulsoup4 ==4.12.2
black ==23.3.0
bleach ==6.0.0
cachetools ==5.3.0
ccimport ==0.4.2
cffi ==1.15.1
charset-normalizer ==3.1.0
click ==8.1.3
colorama ==0.4.6
configargparse ==1.7
cumm ==0.4.11
cycler ==0.11.0
dash ==2.14.2
dash-core-components ==2.0.0
dash-html-components ==2.0.0
dash-table ==5.0.0
debugpy ==1.6.7
decorator ==5.1.1
defusedxml ==0.7.1
descartes ==1.1.0
docker-pycreds ==0.4.0
entrypoints ==0.4
exceptiongroup ==1.1.1
fastjsonschema ==2.16.3
fire ==0.5.0
flake8 ==3.9.2
flask ==2.2.5
fonttools ==4.38.0
fvcore ==0.1.5.post20221221
gitdb ==4.0.10
gitpython ==3.1.31
google-auth ==2.17.3
google-auth-oauthlib ==0.4.6
grpcio ==1.54.0
idna ==3.4
imageio ==2.28.1
importlib-metadata ==6.6.0
importlib-resources ==5.12.0
iniconfig ==2.0.0
iopath ==0.1.10
ipykernel ==6.16.2
ipython ==7.34.0
ipython-genutils ==0.2.0
ipywidgets ==8.0.6
itsdangerous ==2.1.2
jedi ==0.18.2
jinja2 ==3.1.2
joblib ==1.2.0
jsonschema ==4.17.3
jupyter ==1.0.0
jupyter-client ==7.4.9
jupyter-console ==6.6.3
jupyter-core ==4.12.0
jupyter-server ==1.24.0
jupyterlab-pygments ==0.2.2
jupyterlab-widgets ==3.0.7
kiwisolver ==1.4.4
lark ==1.1.8
llvmlite ==0.36.0
lyft-dataset-sdk ==0.0.8
markdown ==3.4.3
markdown-it-py ==2.2.0
markupsafe ==2.1.2
matplotlib ==3.5.2
matplotlib-inline ==0.1.6
mccabe ==0.6.1
mdurl ==0.1.2
mistune ==2.0.5
mmcls ==0.25.0
mmcv-full ==1.6.2
mmdet ==2.25.2
mmengine ==0.7.3
mmsegmentation ==0.29.0
model-index ==0.1.11
mypy-extensions ==1.0.0
nbclassic ==1.0.0
nbclient ==0.7.4
nbconvert ==7.3.1
nbformat ==5.7.0
nest-asyncio ==1.5.6
networkx ==2.2
ninja ==1.11.1.1
notebook ==6.5.4
notebook-shim ==0.2.3
numba ==0.53.0
numpy ==1.21.6
nuscenes-devkit ==1.1.10
oauthlib ==3.2.2
open3d ==0.17.0
opencv-python ==4.7.0.72
openmim ==0.3.7
ordered-set ==4.1.0
packaging ==23.1
pandas ==1.3.5
pandocfilters ==1.5.0
parso ==0.8.3
pathspec ==0.11.1
pathtools ==0.1.2
pccm ==0.4.11
pexpect ==4.8.0
pickleshare ==0.7.5
pillow ==9.5.0
pkgutil-resolve-name ==1.3.10
platformdirs ==3.5.0
plotly ==5.14.1
pluggy ==1.0.0
plyfile ==0.9
portalocker ==2.7.0
prettytable ==3.7.0
prometheus-client ==0.16.0
prompt-toolkit ==3.0.38
protobuf ==3.9.2
psutil ==5.9.5
ptyprocess ==0.7.0
pyasn1 ==0.5.0
pyasn1-modules ==0.3.0
pybind11 ==2.11.1
pycocotools ==2.0.6
pycodestyle ==2.7.0
pycparser ==2.21
pyflakes ==2.3.1
pygments ==2.15.1
pyparsing ==3.0.9
pyquaternion ==0.9.9
pyrsistent ==0.19.3
pytest ==7.3.1
python-dateutil ==2.8.2
pytz ==2023.3
pywavelets ==1.3.0
pyyaml ==6.0
pyzmq ==25.0.2
qtconsole ==5.4.3
qtpy ==2.3.1
requests ==2.30.0
requests-oauthlib ==1.3.1
retrying ==1.3.4
rich ==13.3.5
rsa ==4.9
scikit-image ==0.19.3
scikit-learn ==1.0.2
scipy ==1.7.3
send2trash ==1.8.2
sentry-sdk ==1.22.2
setproctitle ==1.3.2
setuptools ==59.5.0
shapely ==1.8.5
six ==1.16.0
smmap ==5.0.0
sniffio ==1.3.0
soupsieve ==2.4.1
spconv ==2.3.6
tabulate ==0.9.0
tenacity ==8.2.2
tensorboard ==2.11.2
tensorboard-data-server ==0.6.1
tensorboard-plugin-wit ==1.8.1
termcolor ==2.3.0
terminado ==0.17.1
terminaltables ==3.1.10
threadpoolctl ==3.1.0
tifffile ==2021.11.2
tinycss2 ==1.2.1
tomli ==2.0.1
torch ==1.9.1
torch-efficient-distloss ==0.1.3
torch-scatter ==2.1.1
torchaudio ==0.9.1
torchvision ==0.10.1
tornado ==6.2
tqdm ==4.65.0
traitlets ==5.9.0
trimesh ==2.35.39
typed-ast ==1.5.4
typing-extensions ==4.5.0
urllib3 ==1.26.15
wandb ==0.15.2
wcwidth ==0.2.6
webencodings ==0.5.1
websocket-client ==1.5.1
werkzeug ==2.2.3
widgetsnbextension ==4.0.7
yacs ==0.1.8
yapf ==0.33.0
zipp ==3.15.0

requirements/build.txt pypi

requirements/docs.txt pypi

docutils ==0.16.0
m2r *
mistune ==0.8.4
myst-parser *
sphinx ==4.0.2
sphinx-copybutton *
sphinx_markdown_tables *

requirements/mminstall.txt pypi

mmcv-full >=1.4.8,<=1.6.0
mmdet >=2.24.0,<=3.0.0
mmsegmentation >=0.20.0,<=1.0.0
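The pins in environment.yml can be compared against the ranges declared in requirements/mminstall.txt. The sketch below does this with plain tuple comparison; it handles only dotted integer versions and inclusive bounds, so it is not a full implementation of Python's version-specifier rules. The helper names `parse_version` and `satisfies` are hypothetical and exist only for this example.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.6.2' into (1, 6, 2); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(pinned: str, lower: str, upper: str) -> bool:
    """True when lower <= pinned <= upper (both bounds inclusive)."""
    return parse_version(lower) <= parse_version(pinned) <= parse_version(upper)


# package: (pin from environment.yml, lower and upper bound from mminstall.txt)
checks = {
    "mmcv-full": ("1.6.2", "1.4.8", "1.6.0"),
    "mmdet": ("2.25.2", "2.24.0", "3.0.0"),
    "mmsegmentation": ("0.29.0", "0.20.0", "1.0.0"),
}

for name, (pinned, lower, upper) in checks.items():
    status = "ok" if satisfies(pinned, lower, upper) else "outside range"
    print(f"{name} {pinned} vs [{lower}, {upper}]: {status}")
```

With the values listed on this page, the mmcv-full pin (1.6.2) falls above the declared upper bound (1.6.0), while the mmdet and mmsegmentation pins fall inside their ranges. The page does not say which of the two files reflects the tested setup.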

requirements/optional.txt pypi

open3d *
spconv *
waymo-open-dataset-tf-2-1-0 ==1.2.0

requirements/readthedocs.txt pypi

mmcv >=1.4.8
mmdet >=2.24.0
mmsegmentation >=0.20.1
torch *
torchvision *

requirements/runtime.txt pypi

lyft_dataset_sdk *
networkx >=2.2,<2.3
numba ==0.53.0
numpy *
nuscenes-devkit *
plyfile *
scikit-image *
tensorboard *
trimesh >=2.35.39,<2.35.40

requirements/tests.txt pypi

asynctest * test
codecov * test
flake8 * test
interrogate * test
isort * test
kwarray * test
pytest * test
pytest-cov * test
pytest-runner * test
ubelt * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

setup.py pypi
