bboxmaskpose

[ICCV 25] The official repository of paper 'Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle'

https://github.com/mirapurkrabek/bboxmaskpose

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.0%) to scientific vocabulary

Keywords

computer-vision human-pose-estimation iccv iccv2025 keypoint-detection pose-estimation research-paper
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 40
  • Watchers: 7
  • Forks: 4
  • Open Issues: 1
  • Releases: 1
Created about 1 year ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

    Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

    ICCV 2025

BBox-Mask-Pose loop

[![Paper](https://img.shields.io/badge/Paper-ICCV%202025-blue)](https://arxiv.org/abs/2412.01562) [![Website](https://img.shields.io/badge/Website-BBoxMaskPose-green)](https://mirapurkrabek.github.io/BBox-Mask-Pose/) [![License](https://img.shields.io/badge/License-GPL%203.0-orange.svg)](LICENSE) [![Video](https://img.shields.io/badge/Video-YouTube-red?logo=youtube)](https://youtu.be/U05yUP4b2LQ)

Papers with code: [![2D Pose AP on OCHuman: 42.5](https://img.shields.io/badge/OCHuman-2D_Pose:_49.2_AP-blue)](https://paperswithcode.com/sota/2d-human-pose-estimation-on-ochuman?p=detection-pose-estimation-and-segmentation-1) [![Human Instance Segmentation AP on OCHuman: 34.0](https://img.shields.io/badge/OCHuman-Human_Instance_Segmentation:_34.0_AP-blue)](https://paperswithcode.com/sota/human-instance-segmentation-on-ochuman?p=detection-pose-estimation-and-segmentation-1)

📋 Overview

The BBox-Mask-Pose (BMP) method integrates detection, pose estimation, and segmentation into a self-improving loop by conditioning these tasks on each other. This approach enhances all three tasks simultaneously. Using segmentation masks instead of bounding boxes improves performance in crowded scenarios, making top-down methods competitive with bottom-up approaches.
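To illustrate why a mask can help more than a box in a crowd, here is a toy sketch. It is not the MaskPose implementation: zeroing out pixels outside the mask is only one hypothetical way to condition on a mask, and the helper names `crop_by_box` and `crop_by_mask` are made up for this example.

```python
from typing import List, Tuple

Image = List[List[int]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # x0, y0, x1, y1 (inclusive)


def crop_by_box(image: Image, box: Box) -> Image:
    """Plain bounding-box crop: keeps every pixel inside the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


def crop_by_mask(image: Image, mask: Image, box: Box) -> Image:
    """Crop to the box, then zero pixels that do not belong to the instance.

    Assumption: suppressing background pixels stands in for "conditioning
    on a mask"; the actual model may condition differently.
    """
    x0, y0, x1, y1 = box
    return [
        [image[y][x] if mask[y][x] else 0 for x in range(x0, x1 + 1)]
        for y in range(y0, y1 + 1)
    ]


# Two overlapping people: values 5 belong to person A, values 9 to person B.
image = [[5, 5, 9],
         [5, 9, 9]]
mask_a = [[1, 1, 0],
          [1, 0, 0]]
box_a = (0, 0, 2, 1)  # the box of person A also covers pixels of person B

box_crop = crop_by_box(image, box_a)
mask_crop = crop_by_mask(image, mask_a, box_a)
```

In this toy scene the box crop still contains pixels of the neighbouring person, while the masked crop contains only person A.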

Key contributions:
1. MaskPose: a pose estimation model conditioned on segmentation masks instead of bounding boxes, boosting performance in dense scenes without adding parameters
   - Download pre-trained weights below
2. BBox-MaskPose (BMP): a method linking bounding boxes, segmentation masks, and poses to simultaneously address multi-body detection, segmentation and pose estimation
   - Try the demo!
3. Fine-tuned RTMDet adapted for iterative detection (ignoring 'holes')
   - Download pre-trained weights below
4. Support for multi-dataset training of ViTPose, previously implemented in the official ViTPose repository but absent in MMPose.
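The detect, pose, segment cycle can be pictured with a small control-flow sketch. This is a toy under stated assumptions, not the repository's code: `bmp_loop`, `toy_detect`, `toy_pose` and `toy_mask` are hypothetical stand-ins, and the "image" is just a list of pixel sets so that the example runs without any model.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]
Box = Tuple[int, int, int, int]  # x0, y0, x1, y1 (inclusive)


def bmp_loop(image, detect: Callable, estimate_pose: Callable,
             refine_mask: Callable, max_iters: int = 3) -> List[tuple]:
    """Repeat detect -> pose -> mask, hiding already-explained pixels."""
    processed: Set[Pixel] = set()   # pixels covered by earlier instances
    results = []
    for _ in range(max_iters):
        boxes = detect(image, processed)   # detector ignores processed "holes"
        if not boxes:
            break                          # nothing new found: stop early
        for box in boxes:
            pose = estimate_pose(image, box)
            mask = refine_mask(image, box, pose)
            results.append((box, pose, mask))
            processed |= mask
    return results


# --- toy stand-ins (hypothetical, for illustration only) ---
def toy_detect(image, ignore):
    """Finds at most one not-yet-covered instance per pass."""
    for instance in image:
        if not instance <= ignore:
            xs = [x for x, _ in instance]
            ys = [y for _, y in instance]
            return [(min(xs), min(ys), max(xs), max(ys))]
    return []


def toy_pose(image, box):
    """Uses the box centre as a one-keypoint 'pose'."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def toy_mask(image, box, pose):
    """Returns the pixels of every instance lying fully inside the box."""
    x0, y0, x1, y1 = box
    mask = set()
    for instance in image:
        if all(x0 <= x <= x1 and y0 <= y <= y1 for x, y in instance):
            mask |= instance
    return mask


scene = [{(0, 0), (1, 0)}, {(1, 1), (2, 1)}]   # two toy "people"
results = bmp_loop(scene, toy_detect, toy_pose, toy_mask)
```

The toy detector misses the second instance on the first pass and only finds it once the first instance's pixels are marked as processed, which is the behaviour the iterative detection step is meant to enable.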

For more details, please visit our project website.

📢 News

  • Aug 2025: HuggingFace Image Demo is out! 🎮
  • Jul 2025: Version 1.1 with easy-to-run image demo released
  • Jun 2025: Paper accepted to ICCV 2025! 🎉
  • Dec 2024: The code is available
  • Nov 2024: The project website is on

🚀 Installation

This project is built on top of MMPose and SAM 2.1. Please refer to the MMPose installation guide or SAM installation guide for detailed setup instructions.

Basic installation steps:

```bash
# Clone the repository
git clone https://github.com/mirapurkrabek/BBoxMaskPose.git BBoxMaskPose/
cd BBoxMaskPose

# Install your version of torch, torchvision, OpenCV and NumPy
pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --extra-index-url https://download.pytorch.org/whl/cu121
pip install numpy==1.25.1 opencv-python==4.9.0.80

# Install MMLibrary
pip install -U openmim
mim install mmengine "mmcv==2.1.0" "mmdet==3.3.0" "mmpretrain==1.2.0"

# Install dependencies
pip install -r requirements.txt
pip install -e .
```
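After installing, it can help to confirm that the environment matches the versions pinned in the steps above. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are made up for this example, and the exact version strings (including the `+cu121` suffix) are assumed to match what your wheels report.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Versions copied from the installation commands.
PINS: Dict[str, str] = {
    "torch": "2.1.2+cu121",
    "torchvision": "0.16.2+cu121",
    "numpy": "1.25.1",
    "opencv-python": "4.9.0.80",
    "mmcv": "2.1.0",
    "mmdet": "3.3.0",
    "mmpretrain": "1.2.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {wanted}, found {found}")
```

A different torch or CUDA build is not necessarily a problem, since the steps say to install "your version"; the check only highlights where the environment deviates from the tested pins.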

🎮 Demo

Step 1: Download SAM2 weights using the enclosed script.

Step 2: Run the full BBox-Mask-Pose pipeline on an input image:

```bash
python demo/bmp_demo.py configs/bmp_D3.yaml data/004806.jpg
```

This takes the image 004806.jpg from OCHuman and runs (1) the detector, (2) the pose estimator and (3) SAM2 refinement. Details are in the configuration file bmp_D3.yaml.

Options:
- configs/bmp_D3.yaml: BMP configuration file
- data/004806.jpg: Input image
- --device: (Optional) Inference device (default: cuda:0)
- --output-root: (Optional) Directory to save outputs (default: demo/outputs)
- --create-gif: (Optional) Generate an animated GIF of all iterations (default: False)
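To run the demo over several images from a script, the command line can be assembled programmatically. This is a sketch, assuming `--create-gif` is a boolean switch that takes no value (the option list gives only its default); `build_demo_cmd` and `run_demo` are hypothetical helper names, not part of the repository.

```python
import subprocess
from typing import List, Optional


def build_demo_cmd(
    config: str,
    image: str,
    device: Optional[str] = None,
    output_root: Optional[str] = None,
    create_gif: bool = False,
) -> List[str]:
    """Assemble the argument list for demo/bmp_demo.py."""
    cmd = ["python", "demo/bmp_demo.py", config, image]
    if device is not None:
        cmd += ["--device", device]
    if output_root is not None:
        cmd += ["--output-root", output_root]
    if create_gif:
        cmd.append("--create-gif")  # assumed to be a flag without a value
    return cmd


def run_demo(images: List[str], config: str = "configs/bmp_D3.yaml") -> None:
    """Run the demo once per image; raises if any run fails."""
    for image in images:
        subprocess.run(build_demo_cmd(config, image), check=True)


example_cmd = build_demo_cmd(
    "configs/bmp_D3.yaml", "data/004806.jpg", device="cpu", create_gif=True
)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.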

After running, outputs are in outputs/004806/. The expected output should look like this:

Figure: detection results and pose results.

📦 Pre-trained Models

Pre-trained models are available on VRG Hugging Face 🤗. To run the demo, you only need to download the SAM weights with the enclosed script. Our detector and pose estimator are downloaded automatically at runtime.

If you want to download our weights yourself, here are the links to our HuggingFace:
- ViTPose-b trained on COCO+MPII+AIC -- download weights
- MaskPose-b -- download weights
- Fine-tuned RTMDet-L -- download weights

🙏 Acknowledgments

The code combines MMDetection, MMPose 2.0, ViTPose and SAM 2.1.

📝 Citation

The code was implemented by Miroslav Purkrábek. If you use this work, kindly cite it using the reference provided below.

For questions, please use the Issues or Discussions.

@InProceedings{Purkrabek2025ICCV,
  author={Purkrabek, Miroslav and Matas, Jiri},
  title={Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025},
  month={October},
}

Owner

  • Name: Miroslav Purkrábek
  • Login: MiraPurkrabek
  • Kind: user
  • Location: Prague, Czech Republic

AI Researcher @ Visual Recognition Group, FEE CTU in Prague

Citation (CITATION.cff)

# CITATION.cff file for Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle
# This file provides metadata for the software and its preferred citation format.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Purkrabek
  given-names: Miroslav
- family-names: Matas
  given-names: Jiri
title: "Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle"
version: 1.0.0
date-released: 2025-06-20
preferred-citation:
  type: conference-paper
  authors:
  - family-names: Purkrabek
    given-names: Miroslav
  - family-names: Matas
    given-names: Jiri
  collection-title: "Proceedings of the IEEE/CVF International Conference on Computer Vision"
  month: 10
  start: 1 # First page number
  end: 8 # Last page number
  title: "Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle"
  year: 2025

GitHub Events

Total
  • Issues event: 8
  • Watch event: 34
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 19
  • Pull request event: 2
  • Pull request review event: 6
  • Pull request review comment event: 4
  • Fork event: 3
  • Create event: 5
Last Year
  • Issues event: 8
  • Watch event: 34
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 19
  • Pull request event: 2
  • Pull request review event: 6
  • Pull request review comment event: 4
  • Fork event: 3
  • Create event: 5

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 4
  • Total pull requests: 2
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 2 minutes
  • Total issue authors: 4
  • Total pull request authors: 1
  • Average comments per issue: 2.25
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 2
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 2 minutes
  • Issue authors: 4
  • Pull request authors: 1
  • Average comments per issue: 2.25
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • NielsRogge (1)
  • JunghoYoo (1)
  • ousandada (1)
  • neptune4year (1)
Pull Request Authors
  • MiraPurkrabek (2)

Dependencies

requirements/albu.txt pypi
  • albumentations >=0.3.2
requirements/build.txt pypi
  • numpy *
  • torch >=1.8
requirements/docs.txt pypi
  • docutils ==0.16.0
  • markdown *
  • myst-parser *
  • sphinx ==4.5.0
  • sphinx_copybutton *
  • sphinx_markdown_tables *
  • urllib3 <2.0.0
requirements/mminstall.txt pypi
  • mmcv >=2.0.0,<2.2.0
  • mmdet >=3.0.0,<3.3.0
  • mmengine >=0.4.0,<1.0.0
requirements/optional.txt pypi
  • requests *
requirements/poseval.txt pypi
  • shapely ==1.8.4
requirements/readthedocs.txt pypi
  • mmcv >=2.0.0rc4
  • mmengine >=0.6.0,<1.0.0
  • munkres *
  • regex *
  • scipy *
  • titlecase *
  • torch >1.6
  • torchvision *
  • xtcocotools >=1.13
requirements/runtime.txt pypi
  • chumpy *
  • json_tricks *
  • matplotlib *
  • munkres *
  • numpy *
  • opencv-python *
  • pillow *
  • scipy *
  • torchvision *
  • xtcocotools >=1.12
requirements/tests.txt pypi
  • coverage * test
  • flake8 * test
  • interrogate * test
  • isort ==4.3.21 test
  • parameterized * test
  • pytest * test
  • pytest-runner * test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements.txt pypi
setup.py pypi
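Version ranges like the ones listed above can be checked against a concrete pin. For instance, the installation steps pin mmdet==3.3.0 while requirements/mminstall.txt lists mmdet >=3.0.0,<3.3.0, so this listing may be out of date with respect to the README. The sketch below is a simplified checker for purely numeric versions; `parse_version` and `satisfies` are made-up helpers, and it does not handle `*`, pre-releases such as `2.0.0rc4`, or local suffixes.

```python
import operator
from typing import Tuple

# Two-character operators come first so ">=" is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.3.0' into (3, 3, 0); numeric components only."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec like '>=3.0.0,<3.3.0'."""
    parsed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                if not compare(parsed, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


mmcv_ok = satisfies("2.1.0", ">=2.0.0,<2.2.0")
mmdet_ok = satisfies("3.3.0", ">=3.0.0,<3.3.0")
```

For real projects, the `packaging` library's `SpecifierSet` implements the full version-specifier rules and is a more robust choice than this sketch.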