https://github.com/ac-rad/mvtrans
Implementation of MVTrans: Multi-view Perception to See Transparent Objects
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
✓DOI references
Found 1 DOI reference(s) in README -
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (14.3%) to scientific vocabulary
Last synced: 10 months ago
·
JSON representation
Repository
Implementation of MVTrans: Multi-view Perception to See Transparent Objects
Basic Info
- Host: GitHub
- Owner: ac-rad
- Language: Python
- Default Branch: main
- Homepage: https://ac-rad.github.io/MVTrans/
- Size: 342 KB
Statistics
- Stars: 6
- Watchers: 2
- Forks: 1
- Open Issues: 1
- Releases: 0
Created almost 4 years ago
· Last pushed almost 3 years ago
https://github.com/ac-rad/MVTrans/blob/main/
## MVTrans: Multi-view Perception to See Transparent Objects (ICRA2023) [**Paper**](https://arxiv.org/abs/2302.11683) | [**Project**](https://ac-rad.github.io/MVTrans/) | [**Video**](https://youtu.be/8Qdc_xWVp-k) This repo contains the official implementation of the paper "MVTrans: Multi-view Perception to See Transparent Objects". ## Introduction Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks including depth and pose estimation. However transparent object perception remains to be an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and extend the stereo based method. Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities, including depth estimation, segmentation, and pose estimation. Additionally, we establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset, Syn-TODD, which is suitable for training networks with all three modalities, RGB-D, stereo and multi-view RGB.## Installation Setup a conda environment, install required packages, and download the repo: ``` conda create -y --prefix ./env python=3.8 ./env/bin/python -m pip install -r requirements.txt git clone https://github.com/ac-rad/MVTrans.git ``` Weights & Biases (wandb) is used to log and visualize training results. Please follow the [instruction](https://docs.wandb.ai/) to setup wandb. To appropriately log results to cloud, insert your wandb login key in `net_train_multiview.py`. Otherwise, to log results locally, run the following command and access results at localhost: ``` wandb offline ``` ## Dataset Our synthetic transparent object detection dataset (Syn-TODD) can be downloaded at [here](https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP3/LQKTXE). ## Pre-trained Model We provide pre-trained model weight for MVTrans trained on Syn-TODD dataset. | Model views | Link | |-------------|------| | 2 views | [here](https://borealisdata.ca/api/access/datafile/632196) | | 3 views | [here](https://borealisdata.ca/api/access/datafile/632197) | | 5 views | [here](https://borealisdata.ca/api/access/datafile/632195) | ## Training To train MVTrans from scratch, modify the data path and output directory in configuration files under `config/`, and then run: ``` ./runner.sh net_train_multiview.py @config/net_config_blender_multiview_{NUM_OF_VIEW}_train.txt ``` ## Evaluation To run the evaluation, need to change modify the data path and output directory in configuration files under `config/`, and then run: ``` ./runner.sh net_train_multiview.py @config/net_config_blender_multiview_{NUM_OF_VIEW}_eval.txt ``` ## Inference To run the inference, launch jupyter notebook and run `inference.ipynb`. ## Citation Please cite our paper: ``` @misc{wang2023mvtrans, title={MVTrans: Multi-View Perception of Transparent Objects}, author={Yi Ru Wang and Yuchi Zhao and Haoping Xu and Saggi Eppel and Alan Aspuru-Guzik and Florian Shkurti and Animesh Garg}, year={2023}, eprint={2302.11683}, archivePrefix={arXiv}, primaryClass={cs.RO} } ``` ## Reference Our MVTrans architecture is built based on [SimNet](https://github.com/ToyotaResearchInstitute/simnet) and [ESTDepth](https://github.com/xxlong0/ESTDepth).
Owner
- Name: RA^2D: Robotics-assisted Accelerated Discovery
- Login: ac-rad
- Kind: organization
- Website: https://acceleration.utoronto.ca/
- Repositories: 10
- Profile: https://github.com/ac-rad
We introduce novel robotic & AI solutions to accelerate science discoveries, sponsored by Acceleration Consortium.
GitHub Events
Total
- Issues event: 1
- Watch event: 4
- Issue comment event: 1
Last Year
- Issues event: 1
- Watch event: 4
- Issue comment event: 1
## Installation
Setup a conda environment, install required packages, and download the repo:
```
conda create -y --prefix ./env python=3.8
./env/bin/python -m pip install -r requirements.txt
git clone https://github.com/ac-rad/MVTrans.git
```
Weights & Biases (wandb) is used to log and visualize training results. Please follow the [instruction](https://docs.wandb.ai/) to setup wandb. To appropriately log results to cloud, insert your wandb login key in `net_train_multiview.py`. Otherwise, to log results locally, run the following command and access results at localhost:
```
wandb offline
```
## Dataset
Our synthetic transparent object detection dataset (Syn-TODD) can be downloaded at [here](https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP3/LQKTXE).
## Pre-trained Model
We provide pre-trained model weight for MVTrans trained on Syn-TODD dataset.
| Model views | Link |
|-------------|------|
| 2 views | [here](https://borealisdata.ca/api/access/datafile/632196) |
| 3 views | [here](https://borealisdata.ca/api/access/datafile/632197) |
| 5 views | [here](https://borealisdata.ca/api/access/datafile/632195) |
## Training
To train MVTrans from scratch, modify the data path and output directory in configuration files under `config/`, and then run:
```
./runner.sh net_train_multiview.py @config/net_config_blender_multiview_{NUM_OF_VIEW}_train.txt
```
## Evaluation
To run the evaluation, need to change modify the data path and output directory in configuration files under `config/`, and then run:
```
./runner.sh net_train_multiview.py @config/net_config_blender_multiview_{NUM_OF_VIEW}_eval.txt
```
## Inference
To run the inference, launch jupyter notebook and run `inference.ipynb`.
## Citation
Please cite our paper:
```
@misc{wang2023mvtrans,
title={MVTrans: Multi-View Perception of Transparent Objects},
author={Yi Ru Wang and Yuchi Zhao and Haoping Xu and Saggi Eppel and Alan Aspuru-Guzik and Florian Shkurti and Animesh Garg},
year={2023},
eprint={2302.11683},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
```
## Reference
Our MVTrans architecture is built based on [SimNet](https://github.com/ToyotaResearchInstitute/simnet) and [ESTDepth](https://github.com/xxlong0/ESTDepth).