bboxmaskpose

[ICCV 25] The official repository of paper 'Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle'

https://github.com/mirapurkrabek/bboxmaskpose

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.0%) to scientific vocabulary

Keywords

computer-vision human-pose-estimation iccv iccv2025 keypoint-detection pose-estimation research-paper

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: MiraPurkrabek
License: gpl-3.0
Language: Python
Default Branch: main
Homepage: https://MiraPurkrabek.github.io/BBox-Mask-Pose/
Size: 6.58 MB

Statistics

Stars: 40
Watchers: 7
Forks: 4
Open Issues: 1
Releases: 1

Created over 1 year ago · Last pushed 7 months ago

Metadata Files

Readme License Citation

Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

ICCV 2025

[![Paper](https://img.shields.io/badge/Paper-ICCV%202025-blue)](https://arxiv.org/abs/2412.01562) [![Website](https://img.shields.io/badge/Website-BBoxMaskPose-green)](https://mirapurkrabek.github.io/BBox-Mask-Pose/) [![License](https://img.shields.io/badge/License-GPL%203.0-orange.svg)](LICENSE) [![Video](https://img.shields.io/badge/Video-YouTube-red?logo=youtube)](https://youtu.be/U05yUP4b2LQ) Papers with code: [![2D Pose AP on OCHuman: 42.5](https://img.shields.io/badge/OCHuman-2D_Pose:_49.2_AP-blue)](https://paperswithcode.com/sota/2d-human-pose-estimation-on-ochuman?p=detection-pose-estimation-and-segmentation-1) [![Human Instance Segmentation AP on OCHuman: 34.0](https://img.shields.io/badge/OCHuman-Human_Instance_Segmentation:_34.0_AP-blue)](https://paperswithcode.com/sota/human-instance-segmentation-on-ochuman?p=detection-pose-estimation-and-segmentation-1)

📋 Overview

The BBox-Mask-Pose (BMP) method integrates detection, pose estimation, and segmentation into a self-improving loop by conditioning these tasks on each other. This approach enhances all three tasks simultaneously. Using segmentation masks instead of bounding boxes improves performance in crowded scenarios, making top-down methods competitive with bottom-up approaches.

Key contributions:

1. MaskPose: a pose estimation model conditioned on segmentation masks instead of bounding boxes, boosting performance in dense scenes without adding parameters. Download the pre-trained weights below.
2. BBox-Mask-Pose (BMP): a method linking bounding boxes, segmentation masks, and poses to simultaneously address multi-body detection, segmentation, and pose estimation. Try the demo!
3. Fine-tuned RTMDet adapted for iterative detection (ignoring 'holes'). Download the pre-trained weights below.
4. Support for multi-dataset training of ViTPose, previously implemented in the official ViTPose repository but absent in MMPose.
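To make the idea of a detection, pose, and segmentation loop concrete, here is a minimal toy sketch in plain Python. It is not the repository's implementation: `bmp_loop`, `toy_detect`, `toy_pose`, and `toy_mask` are hypothetical names, the image is a tiny list of lists, and hiding already-segmented pixels before re-running the detector is only one plausible reading of "iterative detection (ignoring 'holes')".

```python
from typing import Callable, Dict, List, Set, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive
Pixel = Tuple[int, int]          # (row, col)
Image = List[List[int]]          # toy grayscale image


def bmp_loop(image: Image, detect: Callable, estimate_pose: Callable,
             refine_mask: Callable, max_iters: int = 3) -> List[Dict]:
    """Toy detect -> pose -> mask loop; the three callables are stand-ins."""
    covered: Set[Pixel] = set()  # pixels already explained by found instances
    instances: List[Dict] = []
    for _ in range(max_iters):
        # Assumption: blank out pixels of known instances before detecting again.
        masked = [[0 if (r, c) in covered else v for c, v in enumerate(row)]
                  for r, row in enumerate(image)]
        boxes = detect(masked)
        if not boxes:
            break  # nothing new was found, so the loop has converged
        for box in boxes:
            pose = estimate_pose(image, box)
            mask = refine_mask(image, box, pose)
            instances.append({"box": box, "pose": pose, "mask": mask})
            covered |= mask
    return instances


def toy_detect(img: Image) -> List[Box]:
    """Hypothetical detector: box around the brightest remaining pixels."""
    peak = max(max(row) for row in img)
    if peak == 0:
        return []
    pts = [(r, c) for r, row in enumerate(img)
           for c, v in enumerate(row) if v == peak]
    rows = [r for r, _ in pts]
    cols = [c for _, c in pts]
    return [(min(cols), min(rows), max(cols) + 1, max(rows) + 1)]


def toy_pose(img: Image, box: Box) -> Tuple[float, float]:
    """Hypothetical pose estimator: a single keypoint at the box centre."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def toy_mask(img: Image, box: Box, pose: Tuple[float, float]) -> Set[Pixel]:
    """Hypothetical mask refinement: all non-zero pixels inside the box."""
    x0, y0, x1, y1 = box
    return {(r, c) for r in range(y0, y1) for c in range(x0, x1) if img[r][c]}


if __name__ == "__main__":
    # Two "people": a bright one (2) and a dimmer one (1) found one pass later.
    image = [[2, 2, 0, 1, 1],
             [2, 2, 0, 1, 1]]
    for inst in bmp_loop(image, toy_detect, toy_pose, toy_mask):
        print(inst["box"], inst["pose"])
```

In this toy setup the dimmer instance is only detected in the second pass, once the brighter one has been masked out, which mirrors the motivation for iterating rather than detecting once.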

For more details, please visit our project website.

📢 News

Aug 2025: HuggingFace Image Demo is out! 🎮
Jul 2025: Version 1.1 with easy-to-run image demo released
Jun 2025: Paper accepted to ICCV 2025! 🎉
Dec 2024: The code is available
Nov 2024: The project website is on

🚀 Installation

This project is built on top of MMPose and SAM 2.1. Please refer to the MMPose installation guide or SAM installation guide for detailed setup instructions.

Basic installation steps:

```bash
# Clone the repository
git clone https://github.com/mirapurkrabek/BBoxMaskPose.git BBoxMaskPose/
cd BBoxMaskPose

# Install your version of torch, torchvision, OpenCV and NumPy
pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --extra-index-url https://download.pytorch.org/whl/cu121
pip install numpy==1.25.1 opencv-python==4.9.0.80

# Install MMLibrary
pip install -U openmim
mim install mmengine "mmcv==2.1.0" "mmdet==3.3.0" "mmpretrain==1.2.0"

# Install dependencies
pip install -r requirements.txt
pip install -e .
```
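After installing, it can help to confirm that the environment matches the pinned versions. The following is a small sketch using only the standard library; `PINNED`, `installed_version`, and `check_pins` are hypothetical helper names, not part of the repository, and the pins are simply copied from the installation commands (local build tags such as `+cu121` are ignored when comparing).

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions copied from the installation steps; adjust them if you install others.
PINNED: Dict[str, str] = {
    "torch": "2.1.2",
    "torchvision": "0.16.2",
    "numpy": "1.25.1",
    "opencv-python": "4.9.0.80",
    "mmcv": "2.1.0",
    "mmdet": "3.3.0",
    "mmpretrain": "1.2.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: Dict[str, str],
               lookup: Callable[[str], Optional[str]] = installed_version
               ) -> Dict[str, str]:
    """Compare installed versions with the pins and describe each result."""
    report: Dict[str, str] = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found.split("+")[0] == wanted:  # drop local tags like "+cu121"
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found}, expected {wanted})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print(f"{name}: {status}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without touching the real environment.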

🎮 Demo

Step 1: Download SAM2 weights using the enclosed script.

Step 2: Run the full BBox-Mask-Pose pipeline on an input image:

    python demo/bmp_demo.py configs/bmp_D3.yaml data/004806.jpg

It takes the image 004806.jpg from OCHuman and runs (1) the detector, (2) the pose estimator, and (3) SAM2 refinement. Details are in the configuration file bmp_D3.yaml.

Options:

- configs/bmp_D3.yaml: BMP configuration file
- data/004806.jpg: Input image
- --device: (Optional) Inference device (default: cuda:0)
- --output-root: (Optional) Directory to save outputs (default: demo/outputs)
- --create-gif: (Optional) Generate an animated GIF of all iterations (default: False)
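When running the demo over several images, the command line can be assembled from Python. This is a sketch under assumptions: `build_demo_command` and `run_demo` are hypothetical helpers, and `--create-gif` is treated as a boolean switch that takes no value, which the option list suggests but does not confirm.

```python
import subprocess
import sys
from typing import List, Optional


def build_demo_command(config: str, image: str,
                       device: Optional[str] = None,
                       output_root: Optional[str] = None,
                       create_gif: bool = False) -> List[str]:
    """Assemble the argument list for demo/bmp_demo.py."""
    cmd = [sys.executable, "demo/bmp_demo.py", config, image]
    if device is not None:
        cmd += ["--device", device]
    if output_root is not None:
        cmd += ["--output-root", output_root]
    if create_gif:
        cmd.append("--create-gif")  # assumed to be a flag without a value
    return cmd


def run_demo(*args, **kwargs) -> None:
    """Run the demo; must be called from the repository root."""
    subprocess.run(build_demo_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_demo(...) to actually execute it.
    print(" ".join(build_demo_command("configs/bmp_D3.yaml",
                                      "data/004806.jpg",
                                      device="cpu",
                                      create_gif=True)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.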

After running, the outputs are saved in outputs/004806/.

📦 Pre-trained Models

Pre-trained models are available on VRG Hugging Face 🤗. To run the demo, you only need to download the SAM weights with the enclosed script. Our detector and pose estimator are downloaded automatically at runtime.

If you want to download our weights yourself, the following models are available on our Hugging Face page:

- ViTPose-b trained on COCO+MPII+AIC
- MaskPose-b
- Fine-tuned RTMDet-L

🙏 Acknowledgments

The code combines MMDetection, MMPose 2.0, ViTPose and SAM 2.1.

📝 Citation

The code was implemented by Miroslav Purkrábek. If you use this work, kindly cite it using the reference provided below.

For questions, please use the Issues or Discussions.

@InProceedings{Purkrabek2025ICCV,
  author    = {Purkrabek, Miroslav and Matas, Jiri},
  title     = {Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2025},
  month     = {October},
}

Owner

Name: Miroslav Purkrábek
Login: MiraPurkrabek
Kind: user
Location: Prague, Czech Republic

Repositories: 1
Profile: https://github.com/MiraPurkrabek

AI Researcher @ Visual Recognition Group, FEE CTU in Prague

Citation (CITATION.cff)

# CITATION.cff file for Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle
# This file provides metadata for the software and its preferred citation format.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Purkrabek
  given-names: Miroslav
- family-names: Matas
  given-names: Jiri
title: "Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle"
version: 1.0.0
date-released: 2025-06-20
preferred-citation:
  type: conference-paper
  authors:
  - family-names: Purkrabek
    given-names: Miroslav
  - family-names: Matas
    given-names: Jiri
  collection-title: "Proceedings of the IEEE/CVF International Conference on Computer Vision"
  month: 10
  start: 1 # First page number
  end: 8 # Last page number
  title: "Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle"
  year: 2025

GitHub Events

Total

Issues event: 8
Watch event: 34
Delete event: 3
Issue comment event: 8
Push event: 19
Pull request event: 2
Pull request review event: 6
Pull request review comment event: 4
Fork event: 3
Create event: 5

Last Year

Issues event: 8
Watch event: 34
Delete event: 3
Issue comment event: 8
Push event: 19
Pull request event: 2
Pull request review event: 6
Pull request review comment event: 4
Fork event: 3
Create event: 5

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 4
Total pull requests: 2
Average time to close issues: about 1 month
Average time to close pull requests: 2 minutes
Total issue authors: 4
Total pull request authors: 1
Average comments per issue: 2.25
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 4
Pull requests: 2
Average time to close issues: about 1 month
Average time to close pull requests: 2 minutes
Issue authors: 4
Pull request authors: 1
Average comments per issue: 2.25
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

NielsRogge (1)
JunghoYoo (1)
ousandada (1)
neptune4year (1)

Pull Request Authors

MiraPurkrabek (2)

Dependencies

requirements/albu.txt pypi

albumentations >=0.3.2

requirements/build.txt pypi

numpy *
torch >=1.8

requirements/docs.txt pypi

docutils ==0.16.0
markdown *
myst-parser *
sphinx ==4.5.0
sphinx_copybutton *
sphinx_markdown_tables *
urllib3 <2.0.0

requirements/mminstall.txt pypi

mmcv >=2.0.0,<2.2.0
mmdet >=3.0.0,<3.3.0
mmengine >=0.4.0,<1.0.0

requirements/optional.txt pypi

requests *

requirements/poseval.txt pypi

shapely ==1.8.4

requirements/readthedocs.txt pypi

mmcv >=2.0.0rc4
mmengine >=0.6.0,<1.0.0
munkres *
regex *
scipy *
titlecase *
torch >1.6
torchvision *
xtcocotools >=1.13

requirements/runtime.txt pypi

chumpy *
json_tricks *
matplotlib *
munkres *
numpy *
opencv-python *
pillow *
scipy *
torchvision *
xtcocotools >=1.12

requirements/tests.txt pypi

coverage * test
flake8 * test
interrogate * test
isort ==4.3.21 test
parameterized * test
pytest * test
pytest-runner * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

setup.py pypi