imvoxelnet-unofficial
Repository
Basic Info
- Host: GitHub
- Owner: henu77
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 3.88 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
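ImVoxelNet Reproduction
Disclaimer
- This project is not original work. It is a modification of ImVoxelNet, made mainly to support newer versions of PyTorch and mmcv.
- It is implemented with OpenMMLab's mmdetection3d and supports inference with pretrained models only.
Environment Setup
Hardware Information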
```shell
nvcc -V
```
```bash
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_03:03:05_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
```
```shell
nvidia-smi
```
```bash
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 572.16 Driver Version: 572.16 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4060 WDDM | 00000000:01:00.0 On | N/A |
| 0% 46C P3 N/A / 120W | 3805MiB / 8188MiB | 11% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
```
Create the Environment
```shell
# Create a virtual environment
conda create -n imvoxelnet python=3.8.20
# Activate the virtual environment
conda activate imvoxelnet
git clone https://github.com/henu77/ImVoxelNet-Unofficial.git
cd ImVoxelNet-Unofficial
# Install the CUDA 11.8 builds of torch and torchvision
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 -f https://mirrors.aliyun.com/pytorch-wheels/cu118
# Install mmcv, mmengine, and mmdet
pip install -U openmim
mim install mmcv==2.0.0
mim install mmengine==0.10.6
mim install mmdet==3.3.0
# Install mmdetection3d locally (editable mode)
pip install -e .
# Install PyQt5
pip install PyQt5 pygrabber==0.1
# Install Gradio
pip install gradio==4.44.1
```
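After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch using only the Python standard library; the `EXPECTED` table simply mirrors the versions in the commands above, and the helper names `installed_version` and `check_environment` are hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions pinned by the installation commands in this README.
EXPECTED = {
    "torch": "2.0.0",
    "torchvision": "0.15.1",
    "mmcv": "2.0.0",
    "mmengine": "0.10.6",
    "mmdet": "3.3.0",
    "gradio": "4.44.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Local build tags such as "+cu118" are ignored in the comparison.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (found, matches)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        print(f"{name:12s} installed={found} expected={EXPECTED[name]} ok={ok}")
```

A package reported as `installed=None` was not found in the active environment, which usually means the wrong conda environment is active.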
Download Pretrained Weights
Download the corresponding pretrained weights from here, place them in the ./checkpoints folder, and rename the file to imvoxelnet_total_sunrgbd_fast.pth.
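Because the file must sit at a specific path under a specific name, a quick check before launching either interface can save a confusing failure later. This is a small sketch using `pathlib`; the function name `find_checkpoint` is hypothetical, and it only checks that the file exists, not that its contents are valid.

```python
from pathlib import Path

# File name required by this README.
CHECKPOINT_NAME = "imvoxelnet_total_sunrgbd_fast.pth"


def find_checkpoint(project_root="."):
    """Return the expected checkpoint path, raising if the file is missing."""
    path = Path(project_root) / "checkpoints" / CHECKPOINT_NAME
    if not path.is_file():
        raise FileNotFoundError(f"Expected pretrained weights at {path}")
    return path
```

Calling `find_checkpoint()` from the repository root returns the path when the weights are in place and raises `FileNotFoundError` otherwise.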
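Key Contributions
- End-to-end multi-view optimization: ImVoxelNet is the first to formulate 3D object detection from multi-view RGB images as an end-to-end optimization problem. It accepts an arbitrary number of inputs (monocular or multi-view) and handles a varying number of views during both training and inference.
- General fully convolutional architecture: ImVoxelNet is a fully convolutional 3D detection framework that projects 2D image features into a 3D voxel space, extracts features with a 3D convolutional network, and reuses the head structures of point-cloud detectors without further modification.
- Cross-scene generality: Domain-specific detection heads (indoor/outdoor) share one unified architecture, which reports state-of-the-art results on both indoor and outdoor benchmarks (such as KITTI and ScanNet), making it the first general-purpose RGB-based 3D detection method.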
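The ability to accept any number of views follows from how per-view voxel features are combined: each voxel is averaged over only the views that actually see it. The NumPy sketch below illustrates that idea under simplifying assumptions (flattened voxel grid, boolean visibility masks); the function name `aggregate_views` and the array layout are illustrative choices, not the repository's API.

```python
import numpy as np


def aggregate_views(volumes, masks):
    """Average per-view voxel features over the views that see each voxel.

    volumes: (n_views, channels, n_voxels) features projected from each image.
    masks:   (n_views, n_voxels) booleans, True where the voxel projects
             inside that view's image.
    Returns a (channels, n_voxels) array; voxels seen by no view stay zero.
    """
    volumes = np.asarray(volumes, dtype=np.float64)
    masks = np.asarray(masks, dtype=bool)
    weights = masks[:, None, :].astype(np.float64)  # (n_views, 1, n_voxels)
    summed = (volumes * weights).sum(axis=0)        # (channels, n_voxels)
    counts = weights.sum(axis=0)                    # (1, n_voxels)
    # Clamp the count to 1 so unseen voxels divide 0 by 1 instead of by 0.
    return summed / np.maximum(counts, 1.0)
```

Because the view axis is reduced away, the same function works for a single monocular image (`n_views == 1`) and for any larger number of views, and the output shape never depends on the view count.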
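Method
- Data preprocessing:
  - Feature extraction: A pretrained 2D convolutional network (such as ResNet-50) extracts multi-scale features, which are fused with an FPN.
  - Voxel projection: The 2D features are projected into a 3D voxel space according to the camera poses, and features from multiple views are aggregated by averaging to build the 3D voxel representation.
- 3D feature extraction:
  - Encoder-decoder structure: For indoor scenes, a lightweight 3D convolutional network keeps the computational cost low. For outdoor scenes, the voxels are compressed onto the BEV plane and processed with 2D convolutions.
- Detection head design:
  - Outdoor head: Operates on the BEV plane and uses 2D anchors to regress 3D bounding boxes (position, size, angle).
  - Indoor head: Extends FCOS to 3D, predicts bounding boxes with multi-scale 3D convolutions, and introduces a rotated 3D IoU loss.
  - Extra task head: Jointly estimates the camera pose and room layout (used only for some indoor datasets).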
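The voxel projection step can be sketched for a single view as follows: each voxel center is transformed into the camera frame, projected with a pinhole model, and assigned the feature at the pixel it lands on. This is a simplified NumPy illustration that assumes nearest-pixel sampling and an intrinsic matrix already scaled to the feature-map resolution; the function name `project_voxels` and its argument layout are hypothetical and do not come from this repository or from mmdetection3d.

```python
import numpy as np


def project_voxels(features, intrinsic, extrinsic, points):
    """Sample 2D features at the pixels that 3D voxel centers project to.

    features:  (C, H, W) feature map of one image.
    intrinsic: (3, 3) pinhole matrix, scaled to the feature-map resolution.
    extrinsic: (4, 4) world-to-camera transform.
    points:    (N, 3) voxel centers in world coordinates.
    Returns (C, N) sampled features and an (N,) boolean validity mask.
    """
    features = np.asarray(features, dtype=np.float64)
    points = np.asarray(points, dtype=np.float64)
    C, H, W = features.shape
    n = len(points)

    # World -> camera coordinates using homogeneous points.
    homo = np.concatenate([points, np.ones((n, 1))], axis=1)  # (N, 4)
    cam = (np.asarray(extrinsic) @ homo.T)[:3]                # (3, N)
    depth = cam[2]
    in_front = depth > 1e-6

    # Camera -> pixel coordinates; avoid dividing by non-positive depth.
    pix = np.asarray(intrinsic) @ cam
    safe_depth = np.where(in_front, depth, 1.0)
    u = np.round(pix[0] / safe_depth).astype(int)  # column index
    v = np.round(pix[1] / safe_depth).astype(int)  # row index

    # A voxel is valid if it lies in front of the camera and inside the image.
    valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    sampled = np.zeros((C, n))
    sampled[:, valid] = features[:, v[valid], u[valid]]
    return sampled, valid
```

Running this once per view yields the per-view feature volumes and validity masks that the averaging step described above combines into one voxel representation.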

Experimental Results
Metrics
- Results on the KITTI dataset

- Results on the SUN RGB-D dataset

- Results on the ScanNet dataset

Visualization


Interface Demo
```shell
# Launch the PyQt5 interface
python ui.py
```



```shell
# Launch the Gradio interface
python gradio_ui.py
```


Owner
- Name: malong
- Login: henu77
- Kind: user
- Location: Kaifeng, Henan Province, China
- Repositories: 2
- Profile: https://github.com/henu77
My name is Malong. I am a student at Henan University majoring in computer science and technology. I like basketball and programming.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection3D Contributors"
title: "OpenMMLab's Next-generation Platform for General 3D Object Detection"
date-released: 2020-07-23
url: "https://github.com/open-mmlab/mmdetection3d"
license: Apache-2.0
GitHub Events
Total
- Watch event: 2
- Push event: 2
- Create event: 2
Last Year
- Watch event: 2
- Push event: 2
- Create event: 2
Dependencies
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
- docutils ==0.16.0
- markdown >=3.4.0
- myst-parser *
- sphinx ==4.0.2
- sphinx-tabs *
- sphinx_copybutton *
- sphinx_markdown_tables >=0.0.16
- tabulate *
- urllib3 <2.0.0
- mmcv >=2.0.0rc4,<2.2.0
- mmdet >=3.0.0,<3.3.0
- mmengine >=0.7.1,<1.0.0
- black ==20.8b1
- typing-extensions *
- waymo-open-dataset-tf-2-6-0 *
- mmcv >=2.0.0rc4
- mmdet >=3.0.0
- mmengine >=0.7.1
- torch *
- torchvision *
- lyft_dataset_sdk *
- networkx >=2.5
- numba *
- numpy *
- nuscenes-devkit *
- open3d *
- plyfile *
- scikit-image *
- tensorboard *
- trimesh *
- codecov * test
- flake8 * test
- interrogate * test
- isort * test
- kwarray * test
- parameterized * test
- pytest * test
- pytest-cov * test
- pytest-runner * test
- ubelt * test
- xdoctest >=0.10.0 test
- yapf * test