drivelm

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

https://github.com/opendrivelab/drivelm

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary

Keywords

autonomous-driving chain-of-thought graph-of-thoughts large-language-models llm prompt-engineering prompting tree-of-thoughts vision-language
Last synced: 8 months ago

Repository

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

Basic Info
Statistics
  • Stars: 1,139
  • Watchers: 23
  • Forks: 73
  • Open Issues: 30
  • Releases: 0
Topics
autonomous-driving chain-of-thought graph-of-thoughts large-language-models llm prompt-engineering prompting tree-of-thoughts vision-language
Created almost 3 years ago · Last pushed 10 months ago
Metadata Files
Readme · Funding · License · Code of conduct · Citation

README.md

[!IMPORTANT] 🌟 Stay up to date at opendrivelab.com!

**DriveLM:** *Driving with **G**raph **V**isual **Q**uestion **A**nswering* `Autonomous Driving Challenge 2024` **Driving-with-Language** [Leaderboard](https://opendrivelab.com/challenge2024/#driving_with_language).
[![](https://img.shields.io/badge/Project%20Page-8A2BE2)](https://opendrivelab.com/DriveLM/) [![License: Apache2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](#licenseandcitation) [![arXiv](https://img.shields.io/badge/arXiv-2312.14150-b31b1b.svg)](https://arxiv.org/abs/2312.14150) [![](https://img.shields.io/badge/Latest%20release-v1.1-yellow)](#gettingstarted) [![Hugging Face](https://img.shields.io/badge/Test%20Server-%F0%9F%A4%97-ffc107?color=ffc107&logoColor=white)](https://huggingface.co/spaces/AGC2024/driving-with-language-official)

https://github.com/OpenDriveLab/DriveLM/assets/54334254/cddea8d6-9f6e-4e7e-b926-5afb59f8dce2

Highlights

🔥 We instantiate datasets (DriveLM-Data) built upon nuScenes and CARLA, and propose a VLM-based baseline approach (DriveLM-Agent) for jointly performing Graph VQA and end-to-end driving.

🏁 DriveLM serves as a main track in the CVPR 2024 Autonomous Driving Challenge. Everything you need for the challenge is HERE, including the baseline, test data, submission format, and evaluation pipeline!

News

  • [2025/01/08] Drive-Bench released! An in-depth analysis of what DriveLM is really benchmarking. Take a look at arXiv.
  • [2024/07/16] DriveLM official leaderboard reopened!
  • [2024/07/01] DriveLM got accepted to ECCV 2024! Congrats to the team!
  • [2024/06/01] Challenge ended! See the final leaderboard.
  • [2024/03/25] Challenge test server is online and the test questions are released. Check it out!
  • [2024/02/29] Challenge repo released: baseline, data and submission format, and evaluation pipeline. Have a look!
  • [2023/12/22] DriveLM-nuScenes full v1.0 and paper released.
  • [2023/08/25] DriveLM-nuScenes demo released.

Table of Contents

  1. Highlights
  2. Getting Started
  3. Current Endeavors and Future Horizons
  4. TODO List
  5. DriveLM-Data
  6. License and Citation
  7. Other Resources

Getting Started

To get started with DriveLM:

  • Prepare DriveLM-nuScenes
  • Challenge devkit
  • More content coming soon
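As a rough illustration of how prepared QA annotations could be traversed once loaded, here is a minimal sketch. The scene/frame/field names (`key_frames`, `QA`, `Q`, `A`) are assumptions for illustration only; follow the official "Prepare DriveLM-nuScenes" instructions for the real schema.

```python
# Hypothetical annotation layout -- NOT the official DriveLM schema.
# The real dataset ships as JSON; we inline a tiny sample here.
sample = {
    "scene_0001": {
        "key_frames": {
            "frame_a": {
                "QA": {
                    "perception": [
                        {"Q": "What objects are ahead of the ego vehicle?",
                         "A": "A pedestrian crossing the road."},
                    ],
                    "planning": [
                        {"Q": "What should the ego vehicle do?",
                         "A": "Slow down and yield."},
                    ],
                },
            },
        },
    },
}

def iter_qa_pairs(annotations):
    """Yield (scene_id, frame_id, task, question, answer) tuples."""
    for scene_id, scene in annotations.items():
        for frame_id, frame in scene["key_frames"].items():
            for task, pairs in frame["QA"].items():
                for pair in pairs:
                    yield scene_id, frame_id, task, pair["Q"], pair["A"]

pairs = list(iter_qa_pairs(sample))
print(len(pairs))  # → 2
```

A flat iterator like this is convenient for feeding QA pairs into a training or evaluation loop regardless of how deeply the scene hierarchy nests.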

(back to top)

Current Endeavors and Future Horizons

  • The advent of GPT-style multimodal models in real-world applications motivates the study of the role of language in driving.
  • Dates below reflect the arXiv submission dates.
  • If there is any missing work, please reach out to us!

DriveLM attempts to address some of the challenges faced by the community.

  • Lack of data: DriveLM-Data serves as a comprehensive benchmark for driving with language.
  • Embodiment: GVQA provides a potential direction for embodied applications of LLMs / VLMs.
  • Closed-loop: DriveLM-CARLA attempts to explore closed-loop planning with language.

(back to top)

TODO List

  • [x] DriveLM-Data
    • [x] DriveLM-nuScenes
    • [x] DriveLM-CARLA
  • [x] DriveLM-Metrics
    • [x] GPT-score
  • [ ] DriveLM-Agent
    • [x] Inference code on DriveLM-nuScenes
    • [ ] Inference code on DriveLM-CARLA
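The GPT-score metric in DriveLM-Metrics uses an LLM judge, but a simple lexical stand-in conveys the general shape of answer evaluation. The token-overlap F1 below is a sketch of our own for illustration, not the DriveLM-Metrics implementation:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-words F1 between a predicted and a reference answer.

    A crude lexical proxy: real language metrics (GPT-score, ROUGE,
    BERTScore) account for semantics, not just token overlap.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

score = token_f1("slow down and yield",
                 "the ego vehicle should slow down and yield")
print(round(score, 3))  # → 0.667
```

All four predicted tokens appear in the reference (precision 1.0), but the reference has eight tokens (recall 0.5), giving F1 ≈ 0.667.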

(back to top)

DriveLM-Data

We connect the Perception, Prediction, Planning, Behavior, and Motion tasks with human-written reasoning logic, and propose the task of GVQA (Graph Visual Question Answering) on DriveLM-Data.
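The graph structure behind GVQA can be sketched as QA pairs forming nodes, with directed edges encoding logical dependencies (a parent answer provides context for its child question). The node names, field names, and traversal below are illustrative assumptions, not the official DriveLM schema:

```python
# Illustrative GVQA-style graph: QA nodes linked by reasoning dependencies.
qa_nodes = {
    "q_perception": {"Q": "What is in front of the ego vehicle?",
                     "A": "A cyclist in the same lane."},
    "q_prediction": {"Q": "What will the cyclist do next?",
                     "A": "Continue straight at low speed."},
    "q_planning":   {"Q": "What action should the ego vehicle take?",
                     "A": "Decelerate and keep a safe distance."},
    "q_behavior":   {"Q": "Summarize the ego behavior.",
                     "A": "Slow down and follow the cyclist."},
}
# (parent, child): the parent's answer informs the child's question.
edges = [("q_perception", "q_prediction"),
         ("q_prediction", "q_planning"),
         ("q_planning", "q_behavior")]

def reasoning_context(node_id, edges, nodes):
    """Collect ancestor QA pairs, in order, that a model conditions on."""
    parents = [src for src, dst in edges if dst == node_id]
    context = []
    for p in parents:
        context.extend(reasoning_context(p, edges, nodes))
        context.append((nodes[p]["Q"], nodes[p]["A"]))
    return context

ctx = reasoning_context("q_planning", edges, qa_nodes)
print(len(ctx))  # → 2 (perception and prediction ancestors)
```

Walking the graph this way is what distinguishes GVQA from flat VQA: each answer is grounded in the chain of upstream perception and prediction answers rather than the image alone.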

📊 Comparison and Stats

DriveLM-Data is the first language-driving dataset facilitating the full stack of driving tasks with graph-structured logical dependencies. <!--

| Language Dataset | Base Dataset | Language Form | Perspectives | Scale | Release? |
|:---:|:---:|:---:|:---:|:---:|:---:|
| BDD-X 2018 | BDD | Description | Perception & Reasoning | 8M frames, 20k text strings | :heavy_check_mark: |
| HAD 2019 | HDD | Advice | Goal-oriented & stimulus-driven advice | 5,675 video clips, 45k text strings | :heavy_check_mark: |
| DRAMA 2022 | - | Description | Perception & Planning results | 18k frames, 100k text strings | :heavy_check_mark: |
| Rank2Tell 2023 | - | QA + Captions | Perception & Planning results | 5k frames | :x: |
| nuScenes-QA 2023 | nuScenes | QA | Perception Result | 30k frames, 460k generated QA pairs | :heavy_check_mark: |
| nuPrompt 2023 | nuScenes | Object Description | Perception Result | 30k frames, 35k semi-generated QA pairs | :x: |
| DriveLM 2023 | nuScenes | :boom: QA + Scene Description | :boom: Perception, Prediction and Planning with Logic | 30k frames, 360k annotated QA pairs | :heavy_check_mark: |

-->

Links to details about the GVQA task, Dataset Features, and Annotation.

(back to top)

License and Citation

All assets and code in this repository are under the Apache 2.0 license unless specified otherwise. The language data is under CC BY-NC-SA 4.0. Other datasets (including nuScenes) inherit their own distribution licenses. Please consider citing our paper and project if they help your research.

BibTeX:

    @article{sima2023drivelm,
      title={DriveLM: Driving with Graph Visual Question Answering},
      author={Sima, Chonghao and Renz, Katrin and Chitta, Kashyap and Chen, Li and Zhang, Hanxue and Xie, Chengen and Luo, Ping and Geiger, Andreas and Li, Hongyang},
      journal={arXiv preprint arXiv:2312.14150},
      year={2023}
    }

BibTeX:

    @misc{contributors2023drivelmrepo,
      title={DriveLM: Driving with Graph Visual Question Answering},
      author={DriveLM contributors},
      howpublished={\url{https://github.com/OpenDriveLab/DriveLM}},
      year={2023}
    }

(back to top)

Other Resources


OpenDriveLab - DriveAGI | UniAD | OpenLane-V2 | Survey on E2EAD - Survey on BEV Perception | BEVFormer | OccNet


Autonomous Vision Group - tuPlan garage | CARLA garage | Survey on E2EAD - PlanT | KING | TransFuser | NEAT

(back to top)

Owner

  • Name: OpenDriveLab
  • Login: OpenDriveLab
  • Kind: organization
  • Email: contact@opendrivelab.com
  • Location: Hong Kong

AI for Robotics and Autonomous Driving, affiliated at The University of Hong Kong (HKU).

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "DriveLM Contributors"
title: "Drive on Language"
date-released: 2023-08-25
url: "https://github.com/OpenDriveLab/DriveLM/"
license: Apache-2.0

GitHub Events

Total
  • Issues event: 55
  • Watch event: 292
  • Issue comment event: 70
  • Push event: 6
  • Fork event: 21
Last Year
  • Issues event: 55
  • Watch event: 292
  • Issue comment event: 70
  • Push event: 6
  • Fork event: 21

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 122
  • Total pull requests: 37
  • Average time to close issues: 16 days
  • Average time to close pull requests: about 6 hours
  • Total issue authors: 85
  • Total pull request authors: 5
  • Average comments per issue: 2.73
  • Average comments per pull request: 0.03
  • Merged pull requests: 36
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 48
  • Pull requests: 0
  • Average time to close issues: 16 days
  • Average time to close pull requests: N/A
  • Issue authors: 38
  • Pull request authors: 0
  • Average comments per issue: 1.44
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • piqiuni (6)
  • yabuke (5)
  • dszpr (4)
  • ymxyll (4)
  • junh0cho (3)
  • Wuzp0508 (3)
  • ZiangWu-77 (2)
  • fcjian (2)
  • LCTPalmer (2)
  • dongyang2011 (2)
  • ymlab (2)
  • Camellia-hz (2)
  • NaNaoiSong (2)
  • Sprinter1999 (2)
  • ayesha-ishaq (2)
Pull Request Authors
  • DevLinyan (48)
  • jeremyxu1998 (10)
  • BaranEkin (2)
  • linyanAI (2)
  • ChonghaoSima (1)
  • eltociear (1)
Top Labels
Issue Labels
documentation (2) feature (2) good first issue (2) bug (2) TODO (1)
Pull Request Labels

Dependencies

challenge/llama_adapter_v2_multimodal7b/requirements.txt pypi
  • Pillow *
  • fairscale *
  • gradio *
  • opencv-python *
  • sentencepiece *
  • torch ==2.0.0
  • torchvision ==0.15.1
  • tqdm *
environment.yml pypi
  • absl-py ==1.4.0
  • accelerate ==0.21.0
  • addict ==2.4.0
  • aiohttp ==3.8.5
  • aiosignal ==1.3.1
  • aliyun-python-sdk-core ==2.13.36
  • aliyun-python-sdk-kms ==2.16.1
  • ansi2html ==1.8.0
  • antlr4-python3-runtime ==4.9.3
  • anyio ==3.7.1
  • argon2-cffi ==23.1.0
  • argon2-cffi-bindings ==21.2.0
  • arrow ==1.2.3
  • asttokens ==2.2.1
  • async-lru ==2.0.4
  • async-timeout ==4.0.2
  • attrs ==23.1.0
  • babel ==2.12.1
  • backcall ==0.2.0
  • beautifulsoup4 ==4.12.2
  • bert-score *
  • bitsandbytes ==0.41.1
  • black ==23.7.0
  • bleach ==6.0.0
  • cachetools ==5.3.1
  • cchardet ==2.1.7
  • chardet ==5.2.0
  • charset-normalizer ==3.2.0
  • click ==8.1.6
  • cmake ==3.27.0
  • colorama ==0.4.6
  • colorlog ==6.7.0
  • comm ==0.1.4
  • configargparse ==1.7
  • contourpy ==1.1.0
  • crcmod ==1.7
  • cycler ==0.11.0
  • dash ==2.13.0
  • dash-core-components ==2.0.0
  • dash-html-components ==2.0.0
  • dash-table ==5.0.0
  • datasets ==2.14.3
  • debugpy ==1.6.7.post1
  • decorator ==5.1.1
  • defusedxml ==0.7.1
  • descartes ==1.1.0
  • dill ==0.3.7
  • docker-pycreds ==0.4.0
  • evaluate ==0.4.0
  • exceptiongroup ==1.1.3
  • executing ==1.2.0
  • fastjsonschema ==2.18.0
  • filelock ==3.12.2
  • fire ==0.5.0
  • flake8 ==6.1.0
  • flask ==2.2.5
  • fonttools ==4.42.0
  • fqdn ==1.5.1
  • frozenlist ==1.4.0
  • fsspec ==2023.6.0
  • gitdb ==4.0.10
  • gitpython ==3.1.32
  • google-auth *
  • google-auth-oauthlib *
  • grpcio ==1.56.2
  • huggingface-hub ==0.16.4
  • hydra-core ==1.3.2
  • imageio ==2.31.1
  • importlib-metadata ==6.8.0
  • importlib-resources ==6.0.0
  • iniconfig ==2.0.0
  • inquirerpy ==0.3.4
  • ipykernel ==6.25.1
  • ipython ==8.12.2
  • ipython-genutils ==0.2.0
  • ipywidgets ==8.1.0
  • isoduration ==20.11.0
  • itsdangerous ==2.1.2
  • jedi ==0.19.0
  • jinja2 ==3.1.2
  • jmespath ==0.10.0
  • joblib ==1.3.1
  • json5 ==0.9.14
  • jsonpointer ==2.4
  • jsonschema ==4.19.0
  • jsonschema-specifications ==2023.7.1
  • jupyter ==1.0.0
  • jupyter-client ==8.3.1
  • jupyter-console ==6.6.3
  • jupyter-core ==5.3.1
  • jupyter-events ==0.7.0
  • jupyter-lsp ==2.2.0
  • jupyter-server ==2.7.2
  • jupyter-server-terminals ==0.4.4
  • jupyterlab ==4.0.5
  • jupyterlab-pygments ==0.2.2
  • jupyterlab-server ==2.24.0
  • jupyterlab-widgets ==3.0.8
  • kiwisolver ==1.4.4
  • lazy-loader ==0.3
  • lightning-utilities ==0.9.0
  • line-profiler ==4.0.3
  • lit ==16.0.6
  • llvmlite ==0.31.0
  • lyft-dataset-sdk ==0.0.8
  • markdown ==3.4.4
  • markdown-it-py ==3.0.0
  • markupsafe ==2.1.3
  • matplotlib ==3.5.2
  • matplotlib-inline ==0.1.6
  • mccabe ==0.7.0
  • mdurl ==0.1.2
  • mistune ==2.0.5
  • model-index ==0.1.11
  • more-itertools ==10.1.0
  • mpmath ==1.3.0
  • multidict ==6.0.4
  • multiprocess ==0.70.15
  • mypy-extensions ==1.0.0
  • nbclient ==0.8.0
  • nbconvert ==7.4.0
  • nbformat ==5.5.0
  • nest-asyncio ==1.5.7
  • networkx ==2.2
  • nltk ==3.8.1
  • notebook ==7.0.2
  • notebook-shim ==0.2.3
  • numba ==0.48.0
  • numpy *
  • nuscenes-devkit ==1.1.10
  • nvidia-cublas-cu11 ==11.10.3.66
  • nvidia-cuda-cupti-cu11 ==11.7.101
  • nvidia-cuda-nvrtc-cu11 ==11.7.99
  • nvidia-cuda-runtime-cu11 ==11.7.99
  • nvidia-cudnn-cu11 ==8.5.0.96
  • nvidia-cufft-cu11 ==10.9.0.58
  • nvidia-curand-cu11 ==10.2.10.91
  • nvidia-cusolver-cu11 ==11.4.0.1
  • nvidia-cusparse-cu11 ==11.7.4.91
  • nvidia-nccl-cu11 ==2.14.3
  • nvidia-nvtx-cu11 ==11.7.91
  • oauthlib ==3.2.2
  • omegaconf ==2.3.0
  • opencv-python ==4.8.0.74
  • ordered-set ==4.1.0
  • oss2 ==2.17.0
  • overrides ==7.4.0
  • packaging ==23.1
  • pandas ==1.4.4
  • pandocfilters ==1.5.0
  • parso ==0.8.3
  • pathspec ==0.11.2
  • pathtools ==0.1.2
  • peft ==0.4.0
  • pexpect ==4.8.0
  • pfzy ==0.3.4
  • pickleshare ==0.7.5
  • pillow ==10.0.0
  • pkgutil-resolve-name ==1.3.10
  • platformdirs ==3.10.0
  • plotly ==5.16.1
  • pluggy ==1.3.0
  • plyfile ==1.0.1
  • prettytable ==3.8.0
  • prometheus-client ==0.17.1
  • prompt-toolkit ==3.0.39
  • protobuf ==4.23.4
  • psutil ==5.9.5
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pyarrow ==12.0.1
  • pyasn1 ==0.5.0
  • pyasn1-modules ==0.3.0
  • pycocotools ==2.0.7
  • pycodestyle ==2.11.0
  • pycryptodome ==3.18.0
  • pydeprecate ==0.3.2
  • pyflakes ==3.1.0
  • pygments ==2.16.1
  • pyparsing ==3.0.9
  • pyquaternion ==0.9.9
  • pytest ==7.4.0
  • python-dateutil ==2.8.2
  • python-json-logger ==2.0.7
  • pytorch-lightning ==1.7.0
  • pytz ==2023.3
  • pywavelets ==1.4.1
  • pyyaml ==6.0.1
  • pyzmq ==25.1.1
  • qtconsole ==5.4.3
  • qtpy ==2.4.0
  • referencing ==0.30.2
  • regex ==2023.6.3
  • requests *
  • requests-oauthlib *
  • responses ==0.18.0
  • retrying ==1.3.4
  • rfc3339-validator ==0.1.4
  • rfc3986-validator ==0.1.1
  • rich ==13.4.2
  • rouge-score ==0.1.2
  • rpds-py ==0.10.0
  • rsa ==4.9
  • safetensors ==0.3.1
  • scikit-image ==0.19.3
  • scikit-learn ==1.3.0
  • scipy ==1.7.3
  • send2trash ==1.8.2
  • sentencepiece ==0.1.99
  • sentry-sdk ==1.29.2
  • setproctitle ==1.3.2
  • setuptools ==60.2.0
  • shapely ==1.8.5
  • six ==1.16.0
  • smmap ==5.0.0
  • sniffio ==1.3.0
  • soupsieve ==2.4.1
  • stack-data ==0.6.2
  • sympy ==1.12
  • tabulate ==0.9.0
  • tenacity ==8.2.3
  • tensorboard ==2.13.0
  • tensorboard-data-server ==0.7.1
  • termcolor ==2.3.0
  • terminado ==0.17.1
  • terminaltables ==3.1.10
  • threadpoolctl ==3.2.0
  • tifffile ==2023.7.10
  • tinycss2 ==1.2.1
  • tokenizers ==0.13.3
  • tomli ==2.0.1
  • torch ==2.0.1
  • torchaudio ==2.0.2
  • torchmetrics ==0.11.1
  • torchvision ==0.15.2
  • tornado ==6.3.3
  • tqdm ==4.65.0
  • traitlets ==5.9.0
  • transformers ==4.31.0
  • trimesh ==2.35.39
  • triton ==2.0.0
  • typing-extensions ==4.7.1
  • tzdata ==2023.3
  • uri-template ==1.3.0
  • urllib3 ==2.0.4
  • wandb ==0.15.8
  • wcwidth ==0.2.6
  • webcolors ==1.13
  • webencodings ==0.5.1
  • websocket-client ==1.6.2
  • werkzeug ==2.2.3
  • widgetsnbextension ==4.0.8
  • xxhash ==3.3.0
  • yapf ==0.40.1
  • yarl ==1.9.2
  • zipp ==3.16.2