drivelm
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.5%) to scientific vocabulary
Repository
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
Basic Info
- Host: GitHub
- Owner: OpenDriveLab
- License: apache-2.0
- Language: HTML
- Default Branch: main
- Homepage: https://opendrivelab.com/DriveLM/
- Size: 274 MB
Statistics
- Stars: 1,139
- Watchers: 23
- Forks: 73
- Open Issues: 30
- Releases: 0
Metadata Files
README.md
[!IMPORTANT] 🌟 Stay up to date at opendrivelab.com!
https://github.com/OpenDriveLab/DriveLM/assets/54334254/cddea8d6-9f6e-4e7e-b926-5afb59f8dce2
Highlights
🔥 We instantiate datasets (DriveLM-Data) built upon nuScenes and CARLA, and propose a VLM-based baseline approach (DriveLM-Agent) for jointly performing Graph VQA and end-to-end driving.
🏁 DriveLM serves as a main track in the CVPR 2024 Autonomous Driving Challenge. Everything you need for the challenge is HERE, including the baseline, test data, submission format, and evaluation pipeline!
News
- [2025/01/08] Drive-Bench released! An in-depth analysis of what DriveLM is really benchmarking. Take a look at arXiv.
- [2024/07/16] The official DriveLM leaderboard has reopened!
- [2024/07/01] DriveLM was accepted to ECCV 2024! Congrats to the team!
- [2024/06/01] The challenge has ended! See the final leaderboard.
- [2024/03/25] The challenge test server is online and the test questions are released. Check it out!
- [2024/02/29] Challenge repo released: baseline, data and submission format, and evaluation pipeline. Have a look!
- [2023/08/25] DriveLM-nuScenes demo released.
- [2023/12/22] DriveLM-nuScenes full v1.0 and paper released.
Table of Contents
- Highlights
- Getting Started
- Current Endeavors and Future Directions
- TODO List
- DriveLM-Data
- License and Citation
- Other Resources
Getting Started
To get started with DriveLM:
- Prepare DriveLM-nuScenes
- Challenge devkit
- More content coming soon
Current Endeavors and Future Directions
- The advent of GPT-style multimodal models in real-world applications motivates the study of the role of language in driving.
- Date below reflects the arXiv submission date.
- If there is any missing work, please reach out to us!
DriveLM attempts to address some of the challenges faced by the community.
- Lack of data: DriveLM-Data serves as a comprehensive benchmark for driving with language.
- Embodiment: GVQA provides a potential direction for embodied applications of LLMs / VLMs.
- Closed-loop: DriveLM-CARLA attempts to explore closed-loop planning with language.
TODO List
- [x] DriveLM-Data
- [x] DriveLM-nuScenes
- [x] DriveLM-CARLA
- [x] DriveLM-Metrics
- [x] GPT-score
- [ ] DriveLM-Agent
- [x] Inference code on DriveLM-nuScenes
- [ ] Inference code on DriveLM-CARLA
DriveLM-Data
DriveLM-Data covers the Perception, Prediction, Planning, Behavior, and Motion tasks, with human-written reasoning logic connecting them. On top of DriveLM-Data, we propose the GVQA task.
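The idea of QA pairs linked by reasoning logic can be pictured as a small dependency graph. The sketch below is illustrative only: `QANode`, `reasoning_order`, `build_context`, the stage names, and the sample questions are hypothetical and do not reflect the actual DriveLM-Data schema or the DriveLM-Agent implementation. It assumes Python 3.9+ for `graphlib`.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class QANode:
    """One question-answer pair in a hypothetical GVQA-style graph."""
    qid: str
    stage: str      # e.g. "perception", "prediction", "planning"
    question: str
    answer: str
    parents: list[str] = field(default_factory=list)  # ids of QA pairs this one depends on


def reasoning_order(nodes: dict[str, QANode]) -> list[str]:
    """Return QA ids so that every node comes after all of its parents."""
    sorter = TopologicalSorter({qid: node.parents for qid, node in nodes.items()})
    return list(sorter.static_order())


def build_context(nodes: dict[str, QANode], qid: str) -> str:
    """Join the parent QA pairs of `qid` into a textual context string."""
    return " ".join(
        f"Q: {nodes[p].question} A: {nodes[p].answer}" for p in nodes[qid].parents
    )


# Made-up example: a three-step chain from perception to planning.
nodes = {
    "q1": QANode("q1", "perception", "What objects are ahead?", "A pedestrian."),
    "q2": QANode("q2", "prediction", "What will the pedestrian do?",
                 "Cross the road.", parents=["q1"]),
    "q3": QANode("q3", "planning", "What should the ego vehicle do?",
                 "Slow down and yield.", parents=["q2"]),
}

order = reasoning_order(nodes)
context_for_plan = build_context(nodes, "q3")
```

Answering in topological order means each question can be conditioned on the answers it logically depends on, which is one plausible reading of "graph-structured logical dependencies".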
📊 Comparison and Stats
DriveLM-Data is the first language-driving dataset facilitating the full stack of driving tasks with graph-structured logical dependencies.

| Language Dataset | Base Dataset | Language Form | Perspectives | Scale | Release? |
|:---:|:---:|:---:|:---:|:---:|:---:|
| BDD-X 2018 | BDD | Description | Perception & Reasoning | 8M frames, 20k text strings | ✔️ |
| HAD 2019 | HDD | Advice | Goal-oriented & stimulus-driven advice | 5,675 video clips, 45k text strings | ✔️ |
| DRAMA 2022 | - | Description | Perception & Planning results | 18k frames, 100k text strings | ✔️ |
| Rank2Tell 2023 | - | QA + Captions | Perception & Planning results | 5k frames | ❌ |
| nuScenes-QA 2023 | nuScenes | QA | Perception Result | 30k frames, 460k generated QA pairs | ✔️ |
| nuPrompt 2023 | nuScenes | Object Description | Perception Result | 30k frames, 35k semi-generated QA pairs | ❌ |
| DriveLM 2023 | nuScenes | 💥 QA + Scene Description | 💥 Perception, Prediction and Planning with Logic | 30k frames, 360k annotated QA pairs | ✔️ |
Links to details about GVQA task, Dataset Features, and Annotation.
License and Citation
All assets and code in this repository are under the Apache 2.0 license unless specified otherwise. The language data is under CC BY-NC-SA 4.0. Other datasets (including nuScenes) inherit their own distribution licenses. Please consider citing our paper and project if they help your research.
BibTeX
@article{sima2023drivelm,
title={DriveLM: Driving with Graph Visual Question Answering},
author={Sima, Chonghao and Renz, Katrin and Chitta, Kashyap and Chen, Li and Zhang, Hanxue and Xie, Chengen and Luo, Ping and Geiger, Andreas and Li, Hongyang},
journal={arXiv preprint arXiv:2312.14150},
year={2023}
}
BibTeX
@misc{contributors2023drivelmrepo,
title={DriveLM: Driving with Graph Visual Question Answering},
author={DriveLM contributors},
howpublished={\url{https://github.com/OpenDriveLab/DriveLM}},
year={2023}
}
Other Resources
OpenDriveLab
- DriveAGI | UniAD | OpenLane-V2 | Survey on E2EAD
- Survey on BEV Perception | BEVFormer | OccNet
Autonomous Vision Group
- tuPlan garage | CARLA garage | Survey on E2EAD
- PlanT | KING | TransFuser | NEAT
Owner
- Name: OpenDriveLab
- Login: OpenDriveLab
- Kind: organization
- Email: contact@opendrivelab.com
- Location: Hong Kong
- Website: https://opendrivelab.com
- Twitter: OpenDriveLab
- Repositories: 2
- Profile: https://github.com/OpenDriveLab
AI for Robotics and Autonomous Driving, affiliated with The University of Hong Kong (HKU).
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "DriveLM Contributors"
title: "Drive on Language"
date-released: 2023-08-25
url: "https://github.com/OpenDriveLab/DriveLM/"
license: Apache-2.0
GitHub Events
Total
- Issues event: 55
- Watch event: 292
- Issue comment event: 70
- Push event: 6
- Fork event: 21
Last Year
- Issues event: 55
- Watch event: 292
- Issue comment event: 70
- Push event: 6
- Fork event: 21
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 122
- Total pull requests: 37
- Average time to close issues: 16 days
- Average time to close pull requests: about 6 hours
- Total issue authors: 85
- Total pull request authors: 5
- Average comments per issue: 2.73
- Average comments per pull request: 0.03
- Merged pull requests: 36
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 48
- Pull requests: 0
- Average time to close issues: 16 days
- Average time to close pull requests: N/A
- Issue authors: 38
- Pull request authors: 0
- Average comments per issue: 1.44
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- piqiuni (6)
- yabuke (5)
- dszpr (4)
- ymxyll (4)
- junh0cho (3)
- Wuzp0508 (3)
- ZiangWu-77 (2)
- fcjian (2)
- LCTPalmer (2)
- dongyang2011 (2)
- ymlab (2)
- Camellia-hz (2)
- NaNaoiSong (2)
- Sprinter1999 (2)
- ayesha-ishaq (2)
Pull Request Authors
- DevLinyan (48)
- jeremyxu1998 (10)
- BaranEkin (2)
- linyanAI (2)
- ChonghaoSima (1)
- eltociear (1)
Dependencies
- Pillow *
- fairscale *
- gradio *
- opencv-python *
- sentencepiece *
- torch ==2.0.0
- torchvision ==0.15.1
- tqdm *
- absl-py ==1.4.0
- accelerate ==0.21.0
- addict ==2.4.0
- aiohttp ==3.8.5
- aiosignal ==1.3.1
- aliyun-python-sdk-core ==2.13.36
- aliyun-python-sdk-kms ==2.16.1
- ansi2html ==1.8.0
- antlr4-python3-runtime ==4.9.3
- anyio ==3.7.1
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.2.3
- asttokens ==2.2.1
- async-lru ==2.0.4
- async-timeout ==4.0.2
- attrs ==23.1.0
- babel ==2.12.1
- backcall ==0.2.0
- beautifulsoup4 ==4.12.2
- bert-score *
- bitsandbytes ==0.41.1
- black ==23.7.0
- bleach ==6.0.0
- cachetools ==5.3.1
- cchardet ==2.1.7
- chardet ==5.2.0
- charset-normalizer ==3.2.0
- click ==8.1.6
- cmake ==3.27.0
- colorama ==0.4.6
- colorlog ==6.7.0
- comm ==0.1.4
- configargparse ==1.7
- contourpy ==1.1.0
- crcmod ==1.7
- cycler ==0.11.0
- dash ==2.13.0
- dash-core-components ==2.0.0
- dash-html-components ==2.0.0
- dash-table ==5.0.0
- datasets ==2.14.3
- debugpy ==1.6.7.post1
- decorator ==5.1.1
- defusedxml ==0.7.1
- descartes ==1.1.0
- dill ==0.3.7
- docker-pycreds ==0.4.0
- evaluate ==0.4.0
- exceptiongroup ==1.1.3
- executing ==1.2.0
- fastjsonschema ==2.18.0
- filelock ==3.12.2
- fire ==0.5.0
- flake8 ==6.1.0
- flask ==2.2.5
- fonttools ==4.42.0
- fqdn ==1.5.1
- frozenlist ==1.4.0
- fsspec ==2023.6.0
- gitdb ==4.0.10
- gitpython ==3.1.32
- google-auth *
- google-auth-oauthlib *
- grpcio ==1.56.2
- huggingface-hub ==0.16.4
- hydra-core ==1.3.2
- imageio ==2.31.1
- importlib-metadata ==6.8.0
- importlib-resources ==6.0.0
- iniconfig ==2.0.0
- inquirerpy ==0.3.4
- ipykernel ==6.25.1
- ipython ==8.12.2
- ipython-genutils ==0.2.0
- ipywidgets ==8.1.0
- isoduration ==20.11.0
- itsdangerous ==2.1.2
- jedi ==0.19.0
- jinja2 ==3.1.2
- jmespath ==0.10.0
- joblib ==1.3.1
- json5 ==0.9.14
- jsonpointer ==2.4
- jsonschema ==4.19.0
- jsonschema-specifications ==2023.7.1
- jupyter ==1.0.0
- jupyter-client ==8.3.1
- jupyter-console ==6.6.3
- jupyter-core ==5.3.1
- jupyter-events ==0.7.0
- jupyter-lsp ==2.2.0
- jupyter-server ==2.7.2
- jupyter-server-terminals ==0.4.4
- jupyterlab ==4.0.5
- jupyterlab-pygments ==0.2.2
- jupyterlab-server ==2.24.0
- jupyterlab-widgets ==3.0.8
- kiwisolver ==1.4.4
- lazy-loader ==0.3
- lightning-utilities ==0.9.0
- line-profiler ==4.0.3
- lit ==16.0.6
- llvmlite ==0.31.0
- lyft-dataset-sdk ==0.0.8
- markdown ==3.4.4
- markdown-it-py ==3.0.0
- markupsafe ==2.1.3
- matplotlib ==3.5.2
- matplotlib-inline ==0.1.6
- mccabe ==0.7.0
- mdurl ==0.1.2
- mistune ==2.0.5
- model-index ==0.1.11
- more-itertools ==10.1.0
- mpmath ==1.3.0
- multidict ==6.0.4
- multiprocess ==0.70.15
- mypy-extensions ==1.0.0
- nbclient ==0.8.0
- nbconvert ==7.4.0
- nbformat ==5.5.0
- nest-asyncio ==1.5.7
- networkx ==2.2
- nltk ==3.8.1
- notebook ==7.0.2
- notebook-shim ==0.2.3
- numba ==0.48.0
- numpy *
- nuscenes-devkit ==1.1.10
- nvidia-cublas-cu11 ==11.10.3.66
- nvidia-cuda-cupti-cu11 ==11.7.101
- nvidia-cuda-nvrtc-cu11 ==11.7.99
- nvidia-cuda-runtime-cu11 ==11.7.99
- nvidia-cudnn-cu11 ==8.5.0.96
- nvidia-cufft-cu11 ==10.9.0.58
- nvidia-curand-cu11 ==10.2.10.91
- nvidia-cusolver-cu11 ==11.4.0.1
- nvidia-cusparse-cu11 ==11.7.4.91
- nvidia-nccl-cu11 ==2.14.3
- nvidia-nvtx-cu11 ==11.7.91
- oauthlib ==3.2.2
- omegaconf ==2.3.0
- opencv-python ==4.8.0.74
- ordered-set ==4.1.0
- oss2 ==2.17.0
- overrides ==7.4.0
- packaging ==23.1
- pandas ==1.4.4
- pandocfilters ==1.5.0
- parso ==0.8.3
- pathspec ==0.11.2
- pathtools ==0.1.2
- peft ==0.4.0
- pexpect ==4.8.0
- pfzy ==0.3.4
- pickleshare ==0.7.5
- pillow ==10.0.0
- pkgutil-resolve-name ==1.3.10
- platformdirs ==3.10.0
- plotly ==5.16.1
- pluggy ==1.3.0
- plyfile ==1.0.1
- prettytable ==3.8.0
- prometheus-client ==0.17.1
- prompt-toolkit ==3.0.39
- protobuf ==4.23.4
- psutil ==5.9.5
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- pyarrow ==12.0.1
- pyasn1 ==0.5.0
- pyasn1-modules ==0.3.0
- pycocotools ==2.0.7
- pycodestyle ==2.11.0
- pycryptodome ==3.18.0
- pydeprecate ==0.3.2
- pyflakes ==3.1.0
- pygments ==2.16.1
- pyparsing ==3.0.9
- pyquaternion ==0.9.9
- pytest ==7.4.0
- python-dateutil ==2.8.2
- python-json-logger ==2.0.7
- pytorch-lightning ==1.7.0
- pytz ==2023.3
- pywavelets ==1.4.1
- pyyaml ==6.0.1
- pyzmq ==25.1.1
- qtconsole ==5.4.3
- qtpy ==2.4.0
- referencing ==0.30.2
- regex ==2023.6.3
- requests *
- requests-oauthlib *
- responses ==0.18.0
- retrying ==1.3.4
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rich ==13.4.2
- rouge-score ==0.1.2
- rpds-py ==0.10.0
- rsa ==4.9
- safetensors ==0.3.1
- scikit-image ==0.19.3
- scikit-learn ==1.3.0
- scipy ==1.7.3
- send2trash ==1.8.2
- sentencepiece ==0.1.99
- sentry-sdk ==1.29.2
- setproctitle ==1.3.2
- setuptools ==60.2.0
- shapely ==1.8.5
- six ==1.16.0
- smmap ==5.0.0
- sniffio ==1.3.0
- soupsieve ==2.4.1
- stack-data ==0.6.2
- sympy ==1.12
- tabulate ==0.9.0
- tenacity ==8.2.3
- tensorboard ==2.13.0
- tensorboard-data-server ==0.7.1
- termcolor ==2.3.0
- terminado ==0.17.1
- terminaltables ==3.1.10
- threadpoolctl ==3.2.0
- tifffile ==2023.7.10
- tinycss2 ==1.2.1
- tokenizers ==0.13.3
- tomli ==2.0.1
- torch ==2.0.1
- torchaudio ==2.0.2
- torchmetrics ==0.11.1
- torchvision ==0.15.2
- tornado ==6.3.3
- tqdm ==4.65.0
- traitlets ==5.9.0
- transformers ==4.31.0
- trimesh ==2.35.39
- triton ==2.0.0
- typing-extensions ==4.7.1
- tzdata ==2023.3
- uri-template ==1.3.0
- urllib3 ==2.0.4
- wandb ==0.15.8
- wcwidth ==0.2.6
- webcolors ==1.13
- webencodings ==0.5.1
- websocket-client ==1.6.2
- werkzeug ==2.2.3
- widgetsnbextension ==4.0.8
- xxhash ==3.3.0
- yapf ==0.40.1
- yarl ==1.9.2
- zipp ==3.16.2
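
Since most of the dependencies above are pinned to exact versions, it can help to check a local environment against those pins before running anything. The following is a minimal sketch, not part of the DriveLM repository: `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical helper names, and the input format assumed here is the "- name ==version" / "- name *" style of this listing rather than a real requirements file.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse lines such as '- torch ==2.0.0' or '- tqdm *' into {name: version or None}."""
    pins = {}
    for line in lines:
        text = line.strip().lstrip("-").strip()
        if not text:
            continue
        name, _, spec = text.partition(" ")
        spec = spec.strip()
        # Only exact '==' pins carry a version; '*' means any version is accepted.
        pins[name] = spec[2:] if spec.startswith("==") else None
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each missing or wrongly versioned package to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None or (wanted is not None and found != wanted):
            mismatches[name] = (wanted, found)
    return mismatches


# Example with an injected lookup, so the result does not depend on the local environment.
sample_pins = parse_pins(["- torch ==2.0.0", "- tqdm *"])
fake_env = {"torch": "2.0.1", "tqdm": "4.65.0"}
sample_report = find_mismatches(sample_pins, lookup=fake_env.get)
```

Calling `find_mismatches(pins)` without a `lookup` argument queries the current interpreter's installed distributions through `importlib.metadata`.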