occnet

[ICCV 2023] OccNet: Scene as Occupancy

https://github.com/opendrivelab/occnet

Science Score: 64.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 10 committers (10.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary

Keywords

3d-object-detection 3d-occupancy autonomous-driving

Keywords from Contributors

autonomous-driving-framework bev-segmentation end-to-end-autonomous-driving motion-planning motion-prediction multi-object-tracking occupancy-prediction perception-prediction-planning foundation-model
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 636
  • Watchers: 17
  • Forks: 54
  • Open Issues: 26
  • Releases: 1
Created almost 3 years ago · Last pushed 8 months ago

README.md

[!IMPORTANT] 🌟 Stay up to date at opendrivelab.com!

# Occupancy and Flow Challenge

**The tutorial of the `Occupancy and Flow` track for the [CVPR 2024 Autonomous Grand Challenge](https://opendrivelab.com/challenge2024).**

Introduction

Understanding the 3D surroundings, including background stuff and foreground objects, is important for autonomous driving. In the traditional 3D object detection task, a foreground object is represented by a 3D bounding box. However, the geometric shape of an object is complex and cannot be captured by a simple 3D box, and perception of background stuff is absent. The goal of this task is to predict the 3D occupancy of the scene. We provide a large-scale occupancy benchmark based on the nuScenes dataset. The benchmark is a voxelized representation of 3D space, in which the occupancy state and semantics of each voxel are jointly estimated. The complexity of the task lies in densely predicting 3D space from surround-view images.

News

[!TIP] 🧊 We release a synthetic 3D occupancy dataset, LightwheelOcc, with dense occupancy and depth labels and a realistic sensor configuration that simulates the nuScenes dataset. Check it out!

  • 2024/07/12 The test server reopens.
  • 2024/06/01 The challenge wraps up.
  • 2024/04/09 We release the technical report of the new RayIoU metric, as well as a new occupancy method: SparseOcc.
  • 2024/03/14 We release a new version (openocc_v2.1) of the occupancy ground-truth, including some bug fixes regarding the occupancy flow. Delete the old version and download the new one! Please refer to getting_started for details.
  • 2024/03/01 The challenge begins.

Task Definition

Given images from multiple cameras, the goal is to predict the semantics and flow of each voxel in the scene. Participants are required to submit their predictions on the nuScenes OpenOcc test set.

Rules for Occupancy and Flow Challenge

  • We allow using annotations provided in the nuScenes dataset. During inference, the input modality of the model should be camera only.
  • No future frame is allowed during inference.
  • To check compliance, we will ask participants to provide technical reports to the challenge committee, and award winners will be asked to give a public talk about their method.
  • Every submission must include method information. We encourage publishing code but do not require it.
  • Each team can have at most one account on the evaluation server. Users that create multiple accounts to circumvent the rules will be excluded from the challenge.
  • Each team can submit at most three results per day during the challenge.
  • Any attempt to circumvent these rules will result in a permanent ban of the team or company from the challenge.

Evaluation Metrics

Leaderboard ranking for this challenge is by the Occupancy Score. It consists of two parts: Ray-based mIoU, and absolute velocity error for occupancy flow.

The implementation is here: projects/mmdet3d_plugin/datasets/ray_metrics.py

Ray-based mIoU

We use the well-known mean intersection-over-union (mIoU) metric. However, the elements of the set are now query rays, not voxels.

Specifically, we emulate LiDAR by projecting query rays into the predicted 3D occupancy volume. For each query ray, we compute the distance it travels before it intersects any surface. We then retrieve the corresponding class label and flow prediction.

We apply the same procedure to the ground-truth occupancy to obtain the ground-truth depth, class label and flow.
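
As a rough illustration of the query-ray idea, the sketch below marches a single ray through a voxel grid in fixed steps and returns the travelled distance and class label of the first non-free voxel it meets. It is a simplification under stated assumptions, not the repository's ray-casting code: the function name `march_ray`, the `[x, y, z]` axis order, the `free_id` argument, the step size and the maximum distance are all hypothetical choices made for this example. Only the voxel size (0.4 m) and the lower range bound come from the dataset description.

```python
import numpy as np


def march_ray(semantics, origin, direction, free_id,
              voxel_size=0.4, grid_min=(-40.0, -40.0, -1.0),
              step=0.1, max_dist=60.0):
    """Fixed-step ray march (a simplification, not an exact voxel traversal).

    semantics: integer array assumed to be indexed as [x, y, z].
    free_id:   label that marks empty space (assumed, passed in by the caller).
    Returns (distance, label) for the first non-free voxel, or None.
    """
    origin = np.asarray(origin, dtype=np.float64)
    direction = np.asarray(direction, dtype=np.float64)
    direction = direction / np.linalg.norm(direction)
    grid_min = np.asarray(grid_min, dtype=np.float64)
    shape = np.asarray(semantics.shape)

    for dist in np.arange(step, max_dist, step):
        point = origin + dist * direction
        idx = np.floor((point - grid_min) / voxel_size).astype(int)
        if np.any(idx < 0) or np.any(idx >= shape):
            continue  # sample lies outside the volume
        label = semantics[tuple(idx)]
        if label != free_id:
            return float(dist), int(label)
    return None
```

Running the same function on the predicted volume and on the ground-truth volume yields the two depth and label values that are compared per ray. The precision of the returned distance is limited by `step`.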

A query ray is classified as a true positive (TP) if the class labels coincide and the L1 error between the ground-truth depth and the predicted depth is less than a certain threshold (e.g. 2m).

Let $C$ be the number of classes.

$$ mIoU=\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c+FP_c+FN_c}, $$

where $TP_c$, $FP_c$, and $FN_c$ correspond to the number of true positive, false positive, and false negative predictions for class $c$.

We finally average over distance thresholds of {1, 2, 4} meters and compute the mean across classes.

For more details about this metric, please refer to the technical report.
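
The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `ray_miou` and its arguments are hypothetical, the inputs are assumed to be flat per-ray arrays, and skipping classes that have no rays at all is an assumption made here to avoid dividing by zero.

```python
import numpy as np


def ray_miou(pred_cls, pred_dist, gt_cls, gt_dist, num_classes,
             thresholds=(1.0, 2.0, 4.0)):
    """Ray-based mIoU averaged over the given depth thresholds (in meters)."""
    per_threshold = []
    for thr in thresholds:
        # A ray can only be a TP if its predicted depth is close enough.
        close = np.abs(pred_dist - gt_dist) < thr
        ious = []
        for c in range(num_classes):
            gt_c = gt_cls == c
            pred_c = pred_cls == c
            tp = np.sum(gt_c & pred_c & close)
            fn = np.sum(gt_c) - tp    # ground-truth rays of class c that were missed
            fp = np.sum(pred_c) - tp  # rays predicted as class c that are not TPs
            denom = tp + fp + fn
            if denom > 0:             # assumption: skip classes with no rays
                ious.append(tp / denom)
        per_threshold.append(np.mean(ious))
    return float(np.mean(per_threshold))
```

A prediction that matches the ground truth on every ray scores 1.0. Depth errors lower the score first at the 1 m threshold and only later at 2 m and 4 m.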

AVE for Occupancy Flow

Here we measure velocity errors over the set of true positives (TP), using a distance threshold of 2m.

The absolute velocity error (AVE) is defined for 8 classes ('car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'motorcycle', 'pedestrian') in m/s.
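
A possible reading of this metric is sketched below. The text does not spell out how the error is aggregated, so this example assumes the Euclidean norm of the difference between predicted and ground-truth 2D flow, averaged per class over true-positive rays and then across classes. The function name `mean_ave` and the `flow_class_ids` argument (the integer IDs of the 8 classes) are hypothetical.

```python
import numpy as np


def mean_ave(pred_cls, pred_dist, pred_flow, gt_cls, gt_dist, gt_flow,
             flow_class_ids, thr=2.0):
    """Mean absolute velocity error over true-positive rays (assumed definition)."""
    # True positives: matching class and depth error below the threshold.
    tp = (pred_cls == gt_cls) & (np.abs(pred_dist - gt_dist) < thr)
    # Per-ray flow error, assumed to be the Euclidean norm of the difference.
    err = np.linalg.norm(pred_flow - gt_flow, axis=1)
    per_class = []
    for c in flow_class_ids:
        mask = tp & (gt_cls == c)
        if mask.any():
            per_class.append(err[mask].mean())
    return float(np.mean(per_class)) if per_class else float("nan")
```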

Occupancy Score

The final occupancy score is defined to be a weighted sum of mIoU and mAVE. Note that the velocity errors are converted to velocity scores as max(1 - mAVE, 0.0). That is,

OccScore = mIoU * 0.9 + max(1 - mAVE, 0.0) * 0.1
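
The formula translates directly into code. The helper name `occ_score` is chosen for this example only.

```python
def occ_score(miou, mave):
    """Weighted sum of ray-based mIoU and the velocity score max(1 - mAVE, 0)."""
    velocity_score = max(1.0 - mave, 0.0)
    return miou * 0.9 + velocity_score * 0.1
```

For instance, an mIoU of 0.4 with an mAVE of 0.5 m/s gives 0.4 * 0.9 + 0.5 * 0.1 = 0.41, while any mAVE of 1 m/s or more contributes nothing to the score.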

OpenOcc Dataset

Basic Information

  • The nuScenes OpenOcc dataset contains 17 classes. Voxel semantics for each sample frame are given as [semantics] in labels.npz. Occupancy flow is given as [flow] in labels.npz.

| Type | Info |
| :----: | :----: |
| train | 28,130 |
| val | 6,019 |
| test | 6,008 |
| cameras | 6 |
| voxel size | 0.4m |
| range | [-40m, -40m, -1m, 40m, 40m, 5.4m] |
| volume size | [200, 200, 16] |
| #classes | 0 - 16 |
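
The volume size follows from the range and the voxel size: (40 - (-40)) / 0.4 = 200 along x and y, and (5.4 - (-1)) / 0.4 = 16 along z. The short sketch below reproduces that arithmetic and maps a metric point to a voxel index. The helper names and the `[x, y, z]` index order are assumptions made for illustration.

```python
import numpy as np

VOXEL_SIZE = 0.4                            # meters, from the table above
RANGE_MIN = np.array([-40.0, -40.0, -1.0])  # lower bound of the range
RANGE_MAX = np.array([40.0, 40.0, 5.4])     # upper bound of the range


def grid_shape():
    """Number of voxels along x, y and z."""
    counts = np.round((RANGE_MAX - RANGE_MIN) / VOXEL_SIZE)
    return tuple(int(n) for n in counts)


def point_to_voxel(point):
    """Map a metric (x, y, z) point to a voxel index, or None if out of range."""
    offset = np.asarray(point, dtype=np.float64) - RANGE_MIN
    idx = np.floor(offset / VOXEL_SIZE).astype(int)
    if np.any(idx < 0) or np.any(idx >= np.array(grid_shape())):
        return None
    return tuple(int(i) for i in idx)
```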

Download

  1. Download the nuScenes dataset and put it into data/nuscenes

  2. Download our openocc_v2.1.zip and infos.zip from OpenDataLab or Google Drive

  3. Unzip them in data/nuscenes

Hierarchy

The hierarchy of folder data/nuscenes is described below:

    nuscenes
    ├── maps
    ├── nuscenes_infos_train_occ.pkl
    ├── nuscenes_infos_val_occ.pkl
    ├── nuscenes_infos_test_occ.pkl
    ├── openocc_v2
    ├── samples
    ├── v1.0-test
    └── v1.0-trainval

  • openocc_v2 is the occupancy GT.
  • nuscenes_infos_{train/val/test}_occ.pkl contains meta infos of the dataset.
  • Other folders are borrowed from the official nuScenes dataset.

Known Issues

  • nuScenes (issue #721) lacks translation in the z-axis, which makes it hard to recover accurate 6D localization and leads to misalignment of point clouds when they are accumulated over whole scenes. As a result, ground stratification occurs in some of the data.

Baseline

We provide a baseline model based on BEVFormer.

Please refer to getting_started for details.

Submission

Submission format

The submission must be a single dict with the following structure:

    submission = {
        'method': '',                  <str> -- name of the method
        'team': '',                    <str> -- name of the team, identical to the Google Form
        'authors': [''],               <list> -- list of str, authors
        'e-mail': '',                  <str> -- e-mail address
        'institution / company': '',   <str> -- institution or company
        'country / region': '',        <str> -- country or region, checked by iso3166*
        'results': {
            [token]: {                 <str> -- frame (sample) token
                'pcd_cls'              <np.ndarray> [N] -- predicted class ID, np.uint8,
                'pcd_dist'             <np.ndarray> [N] -- predicted depth, np.float16,
                'pcd_flow'             <np.ndarray> [N, 2] -- predicted flow, np.float16,
            },
            ...
        }
    }

Below is an example of how to save the submission:

```python
import gzip
import pickle

# Write the submission dict as a gzip-compressed pickle.
with gzip.open('submission.gz', 'wb', compresslevel=9) as f:
    pickle.dump(submission, f, protocol=pickle.HIGHEST_PROTOCOL)
```

We provide example scripts based on mmdetection3d to generate the submission file. Please refer to baseline for details.
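
Before uploading, it can help to check a submission against the layout described above. The sketch below is a hypothetical local helper, not part of the official tooling: `check_submission` and `load_submission` are names invented for this example, and the checks cover only the keys, dtypes and shapes listed in the format description.

```python
import gzip
import pickle

import numpy as np

REQUIRED_KEYS = ('method', 'team', 'authors', 'e-mail',
                 'institution / company', 'country / region', 'results')


def check_submission(submission):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = [f'missing key: {k}' for k in REQUIRED_KEYS if k not in submission]
    for token, frame in submission.get('results', {}).items():
        cls = frame.get('pcd_cls')
        dist = frame.get('pcd_dist')
        flow = frame.get('pcd_flow')
        if cls is None or dist is None or flow is None:
            problems.append(f'{token}: missing pcd_cls, pcd_dist or pcd_flow')
            continue
        if cls.dtype != np.uint8 or cls.ndim != 1:
            problems.append(f'{token}: pcd_cls must be np.uint8 with shape [N]')
            continue
        n = cls.shape[0]
        if dist.dtype != np.float16 or dist.shape != (n,):
            problems.append(f'{token}: pcd_dist must be np.float16 with shape [N]')
        if flow.dtype != np.float16 or flow.shape != (n, 2):
            problems.append(f'{token}: pcd_flow must be np.float16 with shape [N, 2]')
    return problems


def load_submission(path):
    """Read back a file written with gzip + pickle as shown above."""
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)
```

Calling `check_submission(load_submission('submission.gz'))` on a saved file should return an empty list if the structure matches.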

Working with your own codebase

We understand that many participants may use their own codebases. Here, we provide a simple standalone package that converts your occupancy predictions to the submission format. Please follow the steps below:

  1. Save the prediction results on nuScenes OpenOcc val locally, in the same format as the occupancy ground truth.
  2. Perform ray projection locally and save the projection results: `cd tools/ray_iou`, then `python ray_casting.py --pred-root your_prediction`
  3. Test locally whether the evaluation on nuScenes OpenOcc val meets expectations: `python metric.py --pred output/my_pred_pcd.gz --gt output/nuscenes_infos_val_occ_pcd.gz`
  4. Save and project the prediction results of nuScenes OpenOcc test according to steps 1 and 2, and upload them to the competition server.

License and Citation

If you use the challenge dataset in your paper, please consider citing OccNet with the following BibTeX:

    @article{sima2023_occnet,
        title={Scene as Occupancy},
        author={Chonghao Sima and Wenwen Tong and Tai Wang and Li Chen and Silei Wu and Hanming Deng and Yi Gu and Lewei Lu and Ping Luo and Dahua Lin and Hongyang Li},
        year={2023},
        eprint={2306.02851},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
    }

If you use RayIoU as the evaluation metric, please consider citing the following BibTeX:

    @misc{liu2024fully,
        title={Fully Sparse 3D Occupancy Prediction},
        author={Haisong Liu and Yang Chen and Haiguang Wang and Zetong Yang and Tianyu Li and Jia Zeng and Li Chen and Hongyang Li and Limin Wang},
        year={2024},
        eprint={2312.17118},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
    }

This dataset is released under the CC BY-NC-SA 4.0 license. Before using the dataset, you should register on the nuScenes website and agree to its terms of use. All code within this repository is under the Apache 2.0 License.

Owner

  • Name: OpenDriveLab
  • Login: OpenDriveLab
  • Kind: organization
  • Email: contact@opendrivelab.com
  • Location: Hong Kong

AI for Robotics and Autonomous Driving, affiliated with The University of Hong Kong (HKU).

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "OpenOccupancy Benchmark Contributors"
title: "OpenOccupancy: 3D Occupancy Benchmark for Scene Perception in Autonomous Driving"
date-released: 2023-02-10
url: "https://github.com/CVPR2023-Occupancy-Prediction-Challenge/CVPR2023-Occupancy-Prediction-Challenge"
license: Apache-2.0

GitHub Events

Total
  • Issues event: 9
  • Watch event: 73
  • Issue comment event: 12
  • Push event: 1
  • Fork event: 6
Last Year
  • Issues event: 9
  • Watch event: 73
  • Issue comment event: 12
  • Push event: 1
  • Fork event: 6

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 119
  • Total Committers: 10
  • Avg Commits per committer: 11.9
  • Development Distribution Score (DDS): 0.756
Past Year
  • Commits: 2
  • Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Chonghao Sima s****c@p****u 29
1349949 1****9@q****m 29
Alfred Liu a****7@g****m 22
Tianyu Li l****u@p****n 11
wave.leaf27 w****7@g****m 11
faikit 2****t 8
Hang Zhao z****4@g****m 5
Tianyu Li l****u@1****m 2
WenwenTong t****5@1****m 1
Hongyang Li h****0@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 50
  • Total pull requests: 1
  • Average time to close issues: 16 days
  • Average time to close pull requests: 8 minutes
  • Total issue authors: 41
  • Total pull request authors: 1
  • Average comments per issue: 1.6
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 0
  • Average time to close issues: 20 minutes
  • Average time to close pull requests: N/A
  • Issue authors: 5
  • Pull request authors: 0
  • Average comments per issue: 1.2
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • cbiras (2)
  • xuyoji (2)
  • YuanxianH (2)
  • haiphamcse (2)
  • haoran18 (2)
  • Icecream-blue-sky (2)
  • lubinBoooos (2)
  • yanglong2000 (2)
  • aoyanl (2)
  • bbzh (1)
  • Prashantkramadhari (1)
  • russellyq (1)
  • RuanBispo (1)
  • howthep (1)
  • hanzifan (1)
Pull Request Authors
  • eltociear (1)