Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: henu77
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 3.88 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 12 months ago · Last pushed 12 months ago
Metadata Files
Readme License Citation

README.md

ImVoxelNet Reproduction

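Disclaimer

  • This project is not original work. It is a modification of ImVoxelNet, made mainly to adapt it to newer versions of PyTorch and mmcv.
  • It is implemented with OpenMMLab's mmdetection3d and can only be used for inference with pretrained models.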

Environment Setup

Hardware Information

```shell
nvcc -V
```

```bash
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_03:03:05_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
```

```shell
nvidia-smi
```

```bash
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 572.16                 Driver Version: 572.16         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060      WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   46C    P3             N/A /  120W |    3805MiB /   8188MiB |     11%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
```

Create the Environment

```shell
# Create the virtual environment
conda create -n imvoxelnet python=3.8.20

# Activate the virtual environment
conda activate imvoxelnet

git clone https://github.com/henu77/ImVoxelNet-Unofficial.git

cd ImVoxelNet-Unofficial

# Install the CUDA 11.8 builds of torch and torchvision
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 -f https://mirrors.aliyun.com/pytorch-wheels/cu118

# Install mmcv, mmengine, and mmdet
pip install -U openmim
mim install mmcv==2.0.0
mim install mmengine==0.10.6
mim install mmdet==3.3.0

# Install mmdetection3d locally
pip install -e .

# Install PyQt5
pip install PyQt5 pygrabber==0.1

# Install Gradio
pip install gradio==4.44.1
```
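After installation it can be useful to confirm that the pinned versions actually landed in the environment. The following is a minimal sketch using only the standard library; the `EXPECTED` table and the `check_versions` helper are illustrative names that are not part of this repository, and the pins are copied from the install commands above.

```python
from importlib import metadata

# Pins copied from the install commands above. The local build tag (+cu118)
# is ignored during comparison.
EXPECTED = {
    "torch": "2.0.0",
    "torchvision": "0.15.1",
    "mmcv": "2.0.0",
    "mmengine": "0.10.6",
    "mmdet": "3.3.0",
    "gradio": "4.44.1",
}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            report[name] = (None, False)
            continue
        # Strip a local version suffix such as "+cu118" before comparing.
        report[name] = (installed, installed.split("+")[0] == pin)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} expected {EXPECTED[name]:8s} found {installed} [{status}]")
```

Run it inside the activated `imvoxelnet` environment; any line marked `MISMATCH` points at a package worth reinstalling.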

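Download the Pretrained Weights

Download the corresponding pretrained weights from here, place them in the ./checkpoints folder, and rename the file to imvoxelnet_total_sunrgbd_fast.pth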
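The copy-and-rename step can also be scripted. This is a small sketch under stated assumptions: `place_checkpoint` is a hypothetical helper, not something shipped with the repository, and the path of the downloaded file is whatever your browser or download tool produced.

```python
import shutil
from pathlib import Path

# File name the README asks for.
CHECKPOINT_NAME = "imvoxelnet_total_sunrgbd_fast.pth"


def place_checkpoint(downloaded, repo_root="."):
    """Copy a downloaded weight file to <repo_root>/checkpoints/<CHECKPOINT_NAME>."""
    src = Path(downloaded)
    if not src.is_file():
        raise FileNotFoundError(f"No such weight file: {src}")
    target_dir = Path(repo_root) / "checkpoints"
    # Create ./checkpoints if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / CHECKPOINT_NAME
    shutil.copy2(src, target)
    return target
```

Calling `place_checkpoint("path/to/downloaded_file.pth")` from the repository root leaves the original download untouched and returns the path of the renamed copy.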

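Key Contributions

  1. End-to-end multi-view optimization: for the first time, 3D object detection from multi-view RGB images is formulated as an end-to-end optimization problem. The method accepts an arbitrary number of inputs (monocular or multi-view) and handles a varying number of views during both training and inference.
  2. General fully convolutional architecture: a fully convolutional 3D detection framework (ImVoxelNet) projects 2D image features into a 3D voxel space, extracts features with a 3D convolutional network, and reuses the heads of point cloud detectors without further modification.
  3. Cross-scene generality: a unified architecture with domain-specific detection heads (indoor/outdoor) achieves state-of-the-art performance in both indoor and outdoor scenes (e.g., KITTI, ScanNet), making it the first general-purpose RGB-based 3D detection method.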

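Method

  1. Data preprocessing
    • Feature extraction: a pretrained 2D convolutional network (e.g., ResNet-50) extracts multi-scale features, which are then fused by an FPN.
    • Voxel projection: the 2D features are projected into the 3D voxel space according to the camera poses, and features from multiple views are aggregated by averaging to build the 3D voxel representation.
  2. 3D feature extraction
    • Encoder-decoder structure: for indoor scenes, a lightweight 3D convolutional network keeps the computational cost low; for outdoor scenes, the voxels are compressed onto the bird's-eye-view (BEV) plane and processed with 2D convolutions.
  3. Detection head design
    • Outdoor head: operates on the BEV plane and uses 2D anchors to regress 3D bounding boxes (location, size, angle).
    • Indoor head: extends FCOS to 3D, predicts bounding boxes with multi-scale 3D convolutions, and introduces a rotated 3D IoU loss.
    • Extra task head: jointly estimates the camera pose and room layout (only for some indoor datasets).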
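The voxel projection step above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the code used in this repository or in mmdetection3d: the function name `project_features_to_voxels` is hypothetical, each projection matrix is assumed to map world coordinates directly to feature-map pixel coordinates, and a nearest-pixel lookup stands in for whatever sampling the real implementation uses.

```python
import numpy as np


def project_features_to_voxels(features, projections, voxel_centers):
    """Average 2D features from several views into a set of voxels.

    features:      (V, C, H, W) per-view 2D feature maps
    projections:   (V, 3, 4) per-view matrices K @ [R | t], assumed to map
                   world coordinates to feature-map pixel coordinates
    voxel_centers: (N, 3) voxel centres in world coordinates
    Returns (volume, valid): (N, C) averaged features and an (N,) bool mask
    marking voxels seen by at least one view.
    """
    n_views, channels, height, width = features.shape
    n_voxels = voxel_centers.shape[0]
    ones = np.ones((n_voxels, 1))
    homogeneous = np.concatenate([voxel_centers, ones], axis=1)  # (N, 4)

    summed = np.zeros((n_voxels, channels))
    counts = np.zeros(n_voxels)
    for v in range(n_views):
        uvw = homogeneous @ projections[v].T  # (N, 3) homogeneous pixels
        depth = uvw[:, 2]
        in_front = depth > 1e-6  # discard voxels behind the camera
        safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero
        cols = np.round(uvw[:, 0] / safe_depth).astype(int)
        rows = np.round(uvw[:, 1] / safe_depth).astype(int)
        inside = (
            in_front
            & (cols >= 0) & (cols < width)
            & (rows >= 0) & (rows < height)
        )
        # Nearest-pixel lookup: (C, M) gathered features, transposed to (M, C).
        summed[inside] += features[v][:, rows[inside], cols[inside]].T
        counts[inside] += 1

    valid = counts > 0
    volume = np.zeros_like(summed)
    # Average only over the views that actually see each voxel.
    volume[valid] = summed[valid] / counts[valid, None]
    return volume, valid
```

Because the average is taken only over the views in which a voxel is visible, the same function works for one view or many, which is the property the method relies on to accept an arbitrary number of inputs. Voxels seen by no view keep a zero feature and are flagged in `valid`.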

image-20250312195144750

Experimental Results

Metrics

  • Results on the KITTI dataset

image-20250312195349843

  • Results on the SUN RGB-D dataset

image-20250312195450152

  • Results on the ScanNet dataset

image-20250312195512125

Visualization

image-20250312195612339

image-20250312195240782

UI Demo

```shell
# Launch the PyQt5 interface
python ui.py
```

image-20250312205924123

image-20250312205806735

image-20250312205906440

```shell
# Launch the Gradio interface
python gradio_ui.py
```

image-20250312210045668

image-20250312210114295

Owner

  • Name: malong
  • Login: henu77
  • Kind: user
  • Location: Kaifeng, Henan, China

My name is Malong. I am a student at Henan University majoring in computer science and technology. I like basketball and programming.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection3D Contributors"
title: "OpenMMLab's Next-generation Platform for General 3D Object Detection"
date-released: 2020-07-23
url: "https://github.com/open-mmlab/mmdetection3d"
license: Apache-2.0

GitHub Events

Total
  • Watch event: 2
  • Push event: 2
  • Create event: 2
Last Year
  • Watch event: 2
  • Push event: 2
  • Create event: 2

Dependencies

docker/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
docker/serve/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
projects/BEVFusion/setup.py pypi
projects/DSVT/setup.py pypi
requirements/build.txt pypi
requirements/docs.txt pypi
  • docutils ==0.16.0
  • markdown >=3.4.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx-tabs *
  • sphinx_copybutton *
  • sphinx_markdown_tables >=0.0.16
  • tabulate *
  • urllib3 <2.0.0
requirements/mminstall.txt pypi
  • mmcv >=2.0.0rc4,<2.2.0
  • mmdet >=3.0.0,<3.3.0
  • mmengine >=0.7.1,<1.0.0
requirements/optional.txt pypi
  • black ==20.8b1
  • typing-extensions *
  • waymo-open-dataset-tf-2-6-0 *
requirements/readthedocs.txt pypi
  • mmcv >=2.0.0rc4
  • mmdet >=3.0.0
  • mmengine >=0.7.1
  • torch *
  • torchvision *
requirements/runtime.txt pypi
  • lyft_dataset_sdk *
  • networkx >=2.5
  • numba *
  • numpy *
  • nuscenes-devkit *
  • open3d *
  • plyfile *
  • scikit-image *
  • tensorboard *
  • trimesh *
requirements/tests.txt pypi
  • codecov * test
  • flake8 * test
  • interrogate * test
  • isort * test
  • kwarray * test
  • parameterized * test
  • pytest * test
  • pytest-cov * test
  • pytest-runner * test
  • ubelt * test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements.txt pypi
setup.py pypi