
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

https://github.com/uyoung-jeong/posebh

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.8%) to scientific vocabulary

Keywords

computer-vision deep-learning human-pose-estimation pose-estimation pytorch
Last synced: 4 months ago

Repository

PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

Basic Info
  • Host: GitHub
  • Owner: uyoung-jeong
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 10.5 MB
Statistics
  • Stars: 7
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
computer-vision deep-learning human-pose-estimation pose-estimation pytorch
Created 10 months ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Uyoung Jeong, Jonathan Freer, Seungryul Baek, Hyung Jin Chang, Kwang In Kim
CVPR 2025

PoseBH is a new multi-dataset training framework that tackles keypoint heterogeneity and limited supervision through two key techniques: (1) keypoint prototypes that learn arbitrary keypoints from multiple datasets while ensuring high transferability, and (2) a cross-type self-supervision mechanism that aligns keypoint regression outputs with keypoint embeddings, enriching supervision for unlabeled keypoints.
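
As a rough illustration of these two ideas, the sketch below matches dense keypoint embeddings against a shared set of learnable prototypes and aligns the resulting scores with a regression heatmap. It is a minimal PyTorch sketch under assumed tensor shapes and names (`prototypes`, `pixel_embed`), not the repository's actual implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes only (not the repository's actual values).
B, K, D, H, W = 2, 17, 32, 64, 48   # batch, keypoint types, embed dim, heatmap size

# (1) Keypoint prototypes: one learnable D-dim vector per keypoint type, shared
# across datasets so heterogeneous skeletons are embedded in a common space.
prototypes = torch.nn.Parameter(F.normalize(torch.randn(K, D), dim=-1))

# Dense per-pixel keypoint embeddings from an (assumed) embedding head.
pixel_embed = F.normalize(torch.randn(B, D, H, W), dim=1)

# Similarity of every pixel embedding to every prototype -> heatmap-like scores.
proto_scores = torch.einsum('bdhw,kd->bkhw', pixel_embed, prototypes)

# Output of the ordinary keypoint regression (heatmap) head for the same image.
heatmaps = torch.randn(B, K, H, W)

# (2) Cross-type self-supervision: align the two spatial distributions where no
# ground-truth label exists, so each output can supervise the other.
proto_dist = proto_scores.flatten(2).softmax(dim=-1)   # (B, K, H*W)
heat_dist = heatmaps.flatten(2).softmax(dim=-1)        # (B, K, H*W)
consistency_loss = F.mse_loss(proto_dist, heat_dist)
```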

Model Zoo

Model weights trained with our method are provided on Google Drive.

Place the trained weights under weights/posebh, e.g., weights/posebh/base.pth.

The baseline multi-head weights can be downloaded from the ViTPose repository.
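
If it helps to verify a download, the snippet below simply loads a checkpoint with PyTorch and lists a few parameter names. The weights/posebh/base.pth path follows the placement described above; the assumption that weights sit under a `state_dict` key follows the usual MMPose convention rather than anything verified here.

```python
import torch

# Load the downloaded checkpoint on CPU just to confirm it is readable.
ckpt = torch.load('weights/posebh/base.pth', map_location='cpu')

# MMPose-style checkpoints usually keep the weights under 'state_dict' (assumption).
state_dict = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt

print(len(state_dict), 'tensors')
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))
```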

Evaluation Results

COCO val set

Using detection results from a detector that obtains 56 mAP on person. Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 77.0 | 82.6 | config |
| ViTPose++-H | 256x192 | 79.4 | 84.8 | config |
| PoseBH-B | 256x192 | 77.3 | 82.4 | config |
| PoseBH-H | 256x192 | 79.5 | 84.5 | config |

COCO test-dev set

Using detection results from a detector that obtains 60.9 mAP on person. Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 76.4 | 81.5 | config |
| ViTPose++-H | 256x192 | 78.5 | 83.4 | config |
| PoseBH-B | 256x192 | 76.6 | 81.7 | config |
| PoseBH-H | 256x192 | 78.6 | 83.5 | config |

OCHuman test set

Using ground-truth bounding boxes. Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 82.6 | 84.8 | config |
| ViTPose++-H | 256x192 | 85.7 | 87.4 | config |
| PoseBH-B | 256x192 | 83.1 | 85.1 | config |
| PoseBH-H | 256x192 | 87.0 | 88.4 | config |

MPII val set

Using ground-truth bounding boxes. Configs in the table are only for evaluation.

| Model | Resolution | PCKh | PCKh@0.1 | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 92.8 | 39.1 | config |
| ViTPose++-H | 256x192 | 94.2 | 41.6 | config |
| PoseBH-B | 256x192 | 93.2 | 39.3 | config |
| PoseBH-H | 256x192 | 94.2 | 42.3 | config |

AI Challenger val set

Using ground-truth bounding boxes. Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 31.8 | 36.3 | config |
| ViTPose++-H | 256x192 | 34.8 | 39.1 | config |
| PoseBH-B | 256x192 | 32.1 | 36.7 | config |
| PoseBH-H | 256x192 | 35.1 | 39.5 | config |

AP-10K test set

Using ground-truth bounding boxes. Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 74.5 | - | config |
| ViTPose++-H | 256x192 | 82.4 | - | config |
| PoseBH-B | 256x192 | 75.0 | 78.3 | config |
| PoseBH-H | 256x192 | 82.6 | 85.4 | config |

APT-36K test set

Using ground-truth bounding boxes. Configs in the table are only for evaluation. Note that we could not acquire the preprocessed APT-36K annotations used in ViTPose training, so we preprocessed the APT-36K annotations ourselves. For a fair comparison, please refer to Table 2 of our main paper. The APT-36K preprocessing script is at tools/dataset/prepro_apt36k.py.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 75.9 | - | config |
| ViTPose++-H | 256x192 | 82.3 | - | config |
| PoseBH-B | 256x192 | 87.2 | 89.6 | config |
| PoseBH-H | 256x192 | 90.6 | 92.6 | config |
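
Since the released preprocessing lives in tools/dataset/prepro_apt36k.py, the sketch below only illustrates the general kind of conversion involved: merging raw per-clip annotation files into a single COCO-style JSON. The data/ap36k/raw layout, field names, and category list are assumptions, not the script's actual behavior.

```python
import glob
import json
import os

# Hypothetical layout: one raw APT-36K annotation JSON per clip under data/ap36k/raw/.
images, annotations, ann_id = [], [], 0
for path in sorted(glob.glob('data/ap36k/raw/*.json')):
    with open(path) as f:
        clip = json.load(f)
    images.extend(clip.get('images', []))
    for ann in clip.get('annotations', []):
        ann['id'] = ann_id          # re-number ids so they stay unique after merging
        ann_id += 1
        annotations.append(ann)

coco = {'images': images,
        'annotations': annotations,
        'categories': [{'id': 1, 'name': 'animal'}]}  # placeholder category list
os.makedirs('data/ap36k/annotations', exist_ok=True)
with open('data/ap36k/annotations/ap36k_train.json', 'w') as f:
    json.dump(coco, f)
```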

WholeBody dataset

Configs in the table are only for evaluation.

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 57.4 | - | config |
| ViTPose++-H | 256x192 | 60.6 | - | config |
| PoseBH-B | 256x192 | 57.9 | 69.5 | config |
| PoseBH-H | 256x192 | 62.0 | 72.9 | config |

Transfer on InterHand2.6M

| Model | Resolution | PCK | AUC | EPE | config |
| :----: | :----: | :----: | :----: | :----: | :---: |
| ViTPose++-B | 256x192 | 98.3 | 86.2 | 4.02 | config |
| PoseBH-B (paper) | 256x192 | 98.6 | 87.1 | 3.70 | config |
| PoseBH-B | 256x192 | 98.7 | 86.6 | 3.61 | config |

Transfer on 3DPW

| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 81.7 | 85.2 | config |
| PoseBH-B (paper) | 256x192 | 83.6 | 87.1 | config |
| PoseBH-B | 256x192 | 83.8 | 87.1 | config |

Usage

Setup

We use Ubuntu 20, Python 3.8, PyTorch 1.11.0, CUDA 11.3, and mmcv 1.4.8 for the experiments.

```bash
# conda environment setup
conda create -n posebh python=3.8 -y
conda activate posebh

# PyTorch 1.11.0 (CUDA 11.3 build)
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113

# mmcv install
pip install "git+https://github.com/open-mmlab/mmcv.git@v1.4.8"

# install this repository
cd /path/to/PoseBH
pip install -v -e .

# install other packages
pip install -r requirements.txt
```
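
As an optional sanity check after installation, the snippet below confirms that the installed versions and CUDA visibility match the environment described above.

```python
import torch
import torchvision
import mmcv

# Expected by this README: torch 1.11.0+cu113, torchvision 0.12.0+cu113, mmcv 1.4.8
print('torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('torchvision:', torchvision.__version__)
print('mmcv:', mmcv.__version__)
```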

Directory Tree

```
data
├─ aic
├─ ap10k
├─ ap36k
│  └─ annotations
│     └─ ap36k_train.json
├─ coco
└─ mpii

weights
├─ vitpose+base.pth
└─ vitpose+huge.pth

splitweights
└─ basecoco.pth
```

Download the vitpose+base.pth and vitpose+huge.pth files from [ViTPose](https://github.com/ViTAE-Transformer/ViTPose). To get the basecoco.pth file, run tools/model_split.py on the vitpose+base.pth file.
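
The actual splitting logic is in tools/model_split.py; as a rough idea of what such a split does, the sketch below keeps only the backbone and a single (hypothetical) COCO head from the multi-head checkpoint. The 'backbone' and 'keypoint_head' key prefixes are assumptions, not the script's real filter.

```python
import torch

ckpt = torch.load('weights/vitpose+base.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt)

kept = {}
for name, tensor in state.items():
    # Keep the shared backbone plus one dataset-specific head (assumed prefixes).
    if name.startswith('backbone') or name.startswith('keypoint_head'):
        kept[name] = tensor

torch.save({'state_dict': kept}, 'splitweights/basecoco.pth')
```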

Download annotation files for APT-36K and 3DPW here.

Training

After downloading the pretrained models, please conduct the experiments by running

```bash
# for a single machine
bash tools/dist_train.sh <Config PATH> <NUM GPUs> --cfg-options model.pretrained=<Pretrained PATH> --seed 0

# for multiple machines
python -m torch.distributed.launch --nnodes <Num Machines> --node_rank <Rank of Machine> --nproc_per_node <GPUs per Machine> --master_addr <Master Addr> --master_port <Master Port> tools/train.py <Config PATH> --cfg-options model.pretrained=<Pretrained PATH> --launcher pytorch --seed 0
```

We provide multi-dataset-training config files as below:

- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mdt/vitb_posebh.py
- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mdt/vith_posebh.py

We also provide a bash script for training ViTPose++-Base with our method in scripts/train_vitb_posebh.sh.

Evaluation

To test the pretrained models' performance, please run

```bash
bash tools/dist_test.sh <Config PATH> <Checkpoint PATH> <NUM GPUs>
```

For our multi-dataset-trained models, please first re-organize the weights using

```bash
python tools/model_split.py --source <Pretrained PATH>
```

An exemplar evaluation bash script is provided in scripts/eval_vitb_mdt.sh.

Transfer

We provide transfer config files as below:

- configs/hand/2d_kpt_sview_rgb_img/topdown_heatmap/interhand2d/vitb_posebh_interhand2d_256x192.py
- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/pw3d/vitb_posebh_pw3d_2d_256x192.py

In order to perform transfer learning, you may need to remove the prototypes from the pretrained checkpoint file. Use the tools/remove_proto.py script, as shown below:

```bash
python tools/remove_proto.py --source work_dirs/vitb_posebh/epoch_100.pth
```
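
As a rough picture of what this step does, the sketch below drops every tensor whose name contains 'proto' and saves the result with the _no_proto suffix used in the next command. The 'proto' substring is an assumption; the authoritative logic is in tools/remove_proto.py.

```python
import torch

src = 'work_dirs/vitb_posebh/epoch_100.pth'
ckpt = torch.load(src, map_location='cpu')
state = ckpt.get('state_dict', ckpt)

# Drop prototype parameters (identified here by an assumed 'proto' substring).
filtered = {k: v for k, v in state.items() if 'proto' not in k}
print(f'removed {len(state) - len(filtered)} prototype tensors')

torch.save({'state_dict': filtered}, src.replace('.pth', '_no_proto.pth'))
```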

After the checkpoint is prepared, run the experiment as follows:

```bash
python -m torch.distributed.launch --nnodes 1 --node_rank 0 --nproc_per_node 4 --master_addr 127.0.0.1 --master_port 23459 tools/train.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/pw3d/vitb_posebh_pw3d_2d_256x192.py --launcher pytorch --seed 0 --cfg-options model.multihead_pretrained=work_dirs/vitb_posebh/epoch_100_no_proto.pth
```

Citation

@InProceedings{Jeong_2025_CVPR,
    author    = {Jeong, Uyoung and Freer, Jonathan and Baek, Seungryul and Chang, Hyung Jin and Kim, Kwang In},
    title     = {PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {12278-12288}
}

Acknowledgements

Owner

  • Name: Uyoung Jeong
  • Login: uyoung-jeong
  • Kind: user
  • Company: UNIST

PhD course student at UNIST, South Korea

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMPose Contributors"
title: "OpenMMLab Pose Estimation Toolbox and Benchmark"
date-released: 2020-08-31
url: "https://github.com/open-mmlab/mmpose"
license: Apache-2.0

GitHub Events

Total
  • Watch event: 9
  • Issue comment event: 3
  • Push event: 1
  • Create event: 2
Last Year
  • Watch event: 9
  • Issue comment event: 3
  • Push event: 1
  • Create event: 2

Dependencies

docker/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
docker/serve/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
requirements/build.txt pypi
  • numpy *
  • torch >=1.3
requirements/docs.txt pypi
  • docutils ==0.16.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx_copybutton *
  • sphinx_markdown_tables *
requirements/mminstall.txt pypi
  • mmcv-full >=1.3.8
  • mmdet >=2.14.0
  • mmtrack >=0.6.0
requirements/optional.txt pypi
  • albumentations >=0.3.2
  • onnx *
  • onnxruntime *
  • pyrender *
  • requests *
  • smplx >=0.1.28
  • trimesh *
requirements/readthedocs.txt pypi
  • mmcv-full *
  • munkres *
  • regex *
  • scipy *
  • titlecase *
  • torch *
  • torchvision *
  • xtcocotools >=1.8
requirements/runtime.txt pypi
  • chumpy *
  • dataclasses *
  • json_tricks *
  • matplotlib *
  • munkres *
  • numpy *
  • opencv-python *
  • pillow *
  • scipy *
  • torchvision *
  • xtcocotools >=1.8
requirements/tests.txt pypi
  • coverage * test
  • flake8 * test
  • interrogate * test
  • isort ==4.3.21 test
  • pytest * test
  • pytest-runner * test
  • smplx >=0.1.28 test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements.txt pypi
  • einops *
  • psutil *
  • tensorboard *
  • timm ==0.4.9
  • yapf ==0.40.1
setup.py pypi