vision6d

6D Pose Annotation Tool and Real-time Visualization - Vision6D helps users annotate the 6D pose of a given 3D object for any given 2D image and for depth estimation. The tool supports annotating both videos and sets of 2D images.

https://github.com/interactivegl/vision6d

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.2%) to scientific vocabulary

Keywords

annotation annotation-tool depth depth-estimation pose pose-annotation python3 tool visualization
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 81
  • Watchers: 1
  • Forks: 9
  • Open Issues: 10
  • Releases: 9
Topics
annotation annotation-tool depth depth-estimation pose pose-annotation python3 tool visualization
Created over 3 years ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md


Vision6D

3D-to-2D visualization and annotation desktop app for 6D pose estimation related tasks.

screenshot

Introduction

This repository contains the source code for the paper Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation [3]. In a user study, we compared human annotations, which relied only on visual cues, with the ground-truth camera poses provided by the public datasets Linemod [1] and HANDAL [2]. The resulting rotation and translation errors were minimal [3].

The contributions of this work can be summarized as follows:

  1. Vision6D provides an interactive framework that aligns 3D models onto 2D images, enabling precise 6D pose annotation. This bridges the gap between 2D image projection and the spatial complexity of 3D scenes. Users can annotate and refine 6D poses through an interactive user interface, which simplifies the generation of datasets for 6D camera pose tasks.

  2. We validate the effectiveness of Vision6D through a comprehensive user study, demonstrating that it offers an intuitive and accurate solution for 6D pose annotation. The user study used the public 6D pose estimation datasets Linemod [1] and HANDAL [2], and compared user-annotated poses against the ground-truth poses. The results illustrate the tool's accuracy, efficiency, and usability, highlighting its potential as a standardized solution for 6D pose annotation.
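Comparing a user-annotated pose against a ground-truth pose usually comes down to one rotation error and one translation error. The sketch below is a minimal illustration of that comparison, not the evaluation code from the paper: it assumes the rotation error is the geodesic angle between the two rotation matrices and the translation error is the Euclidean distance between the two translation vectors. The function names and the sample poses are hypothetical.

```python
import math


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of the element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error(t_est, t_gt):
    """Euclidean distance between two translation vectors (same units as input)."""
    return math.dist(t_est, t_gt)


def rot_z(deg):
    """Rotation about the z-axis, used here only to build sample poses."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical poses: the annotation is off by 2 degrees and 0.02 units.
R_gt, t_gt = rot_z(30.0), [0.10, 0.00, 0.95]
R_user, t_user = rot_z(32.0), [0.10, 0.02, 0.95]

rot_err = rotation_error_deg(R_user, R_gt)
trans_err = translation_error(t_user, t_gt)
print(f"rotation error: {rot_err:.2f} deg, translation error: {trans_err:.3f}")
```

With NumPy available, the same computation reduces to a few array operations; plain lists are used here only to keep the sketch dependency-free.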

[1] S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski, K. Konolige, and N. Navab, "Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes," in Computer Vision - ACCV 2012, K. M. Lee, Y. Matsushita, J. M. Rehg, and Z. Hu, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 548-562.

[2] A. Guo, B. Wen, J. Yuan, J. Tremblay, S. Tyree, J. Smith, and S. Birchfield, "HANDAL: A dataset of real-world manipulable object categories with pose annotations, affordances, and reconstructions," 2023. [Online]. Available: https://arxiv.org/abs/2308.01477

[3] Y. Zhang, E. Davalos, and J. Noble, "Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation," arXiv [cs.GR], 2025. [Online]. Available: https://arxiv.org/abs/2504.15329

Key Features

  • LivePreview - make changes, see changes
    • Instantly see your pose annotation update in Vision6D as you move the 3D objects.
  • Built-in NOCS color representation for 3D meshes
    • Color the meshes with NOCS (Normalized Object Coordinate Space) colors.
  • Texture loading for 3D meshes
    • Color the meshes with their own textures.
  • Segmentation mask and bounding box drawing
    • Draw a segmentation mask or a bounding box in Vision6D on top of the provided 2D image.
  • Real-time rendering
    • Render the annotated results in real time.
  • Cross-platform
    • Runs on Windows, Linux (tested on Ubuntu), and macOS (Apple Silicon). Using a mouse is highly recommended.
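For readers unfamiliar with NOCS coloring, the idea is to map each mesh vertex position into a unit cube and reuse the normalized coordinates as an RGB color. The sketch below follows one common convention (center the tight bounding box at 0.5 and scale by its diagonal); it is an assumption for illustration and may differ from what Vision6D does internally. The function name `nocs_colors` and the sample box are hypothetical.

```python
import math


def nocs_colors(vertices):
    """Map each (x, y, z) vertex to an RGB triple in [0, 1] using NOCS-style scaling."""
    xs, ys, zs = zip(*vertices)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    # Center of the tight axis-aligned bounding box.
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    # Scaling by the box diagonal keeps every coordinate inside the unit cube.
    diagonal = math.dist(mins, maxs)
    if diagonal == 0.0:
        raise ValueError("mesh has zero extent")
    return [
        tuple((v[k] - center[k]) / diagonal + 0.5 for k in range(3))
        for v in vertices
    ]


# Hypothetical mesh: two opposite corners of a 2 x 2 x 2 box plus its center.
vertices = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (1.0, 1.0, 1.0)]
colors = nocs_colors(vertices)
for vertex, color in zip(vertices, colors):
    print(vertex, "->", tuple(round(c, 3) for c in color))
```

Because the color encodes position on the object, corresponding surface points keep the same color regardless of the pose in which the mesh is rendered.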

How To Use

To run this application, you'll need Git, Python, and (optionally) Miniconda installed on your computer.

Vision6D can be installed directly from PyPI. From your command line:

```bash
$ pip install vision6D
```

Alternatively, clone this repository and install from source:

```bash
# (Optional) Create and activate a conda environment
$ conda create -n vision6D python=3.10
$ conda activate vision6D

# Clone this repository
$ git clone https://github.com/InteractiveGL/vision6D.git

# Go into the repository
$ cd vision6D

# Install dependencies
$ pip install .

# Run the app
$ Vision6D
```

Examples

Note that the application may take some time to load the first time it starts. Once it has loaded, the interactive experience is smooth.

PnP registration of the benchvise

screenshot 1 screenshot 2
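PnP (Perspective-n-Point) registration estimates the pose that best maps chosen 3D model points onto their matching 2D image points, which in practice means minimizing the reprojection error. The sketch below shows only that objective under a simple pinhole camera model, not a solver and not Vision6D's own code; a library routine such as OpenCV's `cv2.solvePnP` would normally do the estimation. The intrinsics, model points, and poses are hypothetical.

```python
import math


def project(point, R, t, fx, fy, cx, cy):
    """Project one 3D model point to pixel coordinates with a pinhole camera."""
    # Object-to-camera transform: X_cam = R @ X + t
    xc, yc, zc = (
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point lies behind the camera")
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def mean_reprojection_error(points_3d, points_2d, R, t, fx, fy, cx, cy):
    """Average pixel distance between projected model points and image points."""
    errors = [
        math.dist(project(p, R, t, fx, fy, cx, cy), uv)
        for p, uv in zip(points_3d, points_2d)
    ]
    return sum(errors) / len(errors)


# Hypothetical camera intrinsics and a pose 2 units in front of the camera.
intrinsics = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = [0.0, 0.0, 2.0]

model_points = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
# Stand-in for the 2D points a user would pick in the image.
image_points = [project(p, R, t_true, **intrinsics) for p in model_points]

err_true = mean_reprojection_error(model_points, image_points, R, t_true, **intrinsics)
err_shifted = mean_reprojection_error(
    model_points, image_points, R, [0.05, 0.0, 2.0], **intrinsics
)
print(f"error at true pose: {err_true:.2f} px, at shifted pose: {err_shifted:.2f} px")
```

The true pose reproduces the picked image points, while the shifted pose leaves a residual of several pixels; a PnP solver searches for the pose that drives this residual down.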

Set a ground-truth pose to visualize the benchvise (the ground-truth pose is taken from the public 6D pose dataset Linemod)

screenshot 1 screenshot 2

Free-hand registration of the benchvise

screenshot

Draw a segmentation mask on the duck in this scene

screenshot 1 screenshot 2

Draw a bounding box around the duck in this scene

screenshot 1 screenshot 2

Render the benchvise mesh

screenshot 1 screenshot 2

Download

You can download the latest installable version of Vision6D for Windows, macOS (both Apple Silicon (ARM-based) and Intel (x86-based) are supported), and Ubuntu Linux.

Credits

This software uses the following open source packages:

Citation

If you find this work helpful, please consider citing the following paper:

@misc{zhang2025vision6d3dto2dinteractivevisualization,
  title={Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation},
  author={Yike Zhang and Eduardo Davalos and Jack Noble},
  year={2025},
  eprint={2504.15329},
  archivePrefix={arXiv},
  primaryClass={cs.GR},
  url={https://arxiv.org/abs/2504.15329},
}

Thank you for your support.

License

GNU General Public License v3.0

Owner

  • Name: InteractiveGL
  • Login: InteractiveGL
  • Kind: organization

GitHub Events

Total
  • Create event: 50
  • Release event: 2
  • Issues event: 19
  • Watch event: 50
  • Delete event: 46
  • Issue comment event: 12
  • Push event: 131
  • Pull request event: 7
  • Fork event: 6
Last Year
  • Create event: 50
  • Release event: 2
  • Issues event: 19
  • Watch event: 50
  • Delete event: 46
  • Issue comment event: 12
  • Push event: 131
  • Pull request event: 7
  • Fork event: 6

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 16
  • Total pull requests: 4
  • Average time to close issues: 6 months
  • Average time to close pull requests: 1 minute
  • Total issue authors: 6
  • Total pull request authors: 1
  • Average comments per issue: 0.63
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 14
  • Pull requests: 4
  • Average time to close issues: 16 days
  • Average time to close pull requests: 1 minute
  • Issue authors: 6
  • Pull request authors: 1
  • Average comments per issue: 0.57
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ykzzyk (10)
  • edavalosanaya (2)
  • missTL (1)
  • cassiecalf (1)
  • The-wind-rises-2023 (1)
  • jackchinor (1)
Pull Request Authors
  • ykzzyk (4)
Top Labels
Issue Labels
enhancement (2) documentation (2) question (1)
Pull Request Labels