PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.8%) to scientific vocabulary
Keywords
Repository
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Basic Info
Statistics
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Uyoung Jeong, Jonathan Freer, Seungryul Baek, Hyung Jin Chang, Kwang In Kim
CVPR 2025

PoseBH is a new multi-dataset training framework that tackles keypoint heterogeneity and limited supervision through two key techniques: (1) keypoint prototypes that learn arbitrary keypoints from multiple datasets while remaining highly transferable, and (2) a cross-type self-supervision mechanism that aligns keypoint regression outputs with keypoint embeddings, enriching supervision for unlabeled keypoints.
Model Zoo
Model weights trained with our method are provided on Google Drive.
Place the trained weights in weights/posebh, e.g., weights/posebh/base.pth.
The baseline multi-head weights can be downloaded from the ViTPose repository.
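A minimal layout sketch, assuming the checkpoint has already been downloaded from the Google Drive link (the downloaded filename and location are assumptions; only the target path weights/posebh/base.pth is given above):

```bash
# create the expected directory and move the downloaded checkpoint into place
mkdir -p weights/posebh
mv /path/to/downloaded/base.pth weights/posebh/base.pth
ls weights/posebh  # should list base.pth
```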
Evaluation Results
COCO val set
Using detection results from a detector that obtains 56 mAP on person. Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 77.0 | 82.6 | config |
| ViTPose++-H | 256x192 | 79.4 | 84.8 | config |
| PoseBH-B | 256x192 | 77.3 | 82.4 | config |
| PoseBH-H | 256x192 | 79.5 | 84.5 | config |
COCO test-dev set
Using detection results from a detector that obtains 60.9 mAP on person. Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 76.4 | 81.5 | config |
| ViTPose++-H | 256x192 | 78.5 | 83.4 | config |
| PoseBH-B | 256x192 | 76.6 | 81.7 | config |
| PoseBH-H | 256x192 | 78.6 | 83.5 | config |
OCHuman test set
Using ground-truth bounding boxes. Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 82.6 | 84.8 | config |
| ViTPose++-H | 256x192 | 85.7 | 87.4 | config |
| PoseBH-B | 256x192 | 83.1 | 85.1 | config |
| PoseBH-H | 256x192 | 87.0 | 88.4 | config |
MPII val set
Using groundtruth bounding boxes. Configs in the table are only for evaluation.
| Model | Resolution | PCKh | PCKh@0.1 | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 92.8 | 39.1 | config |
| ViTPose++-H | 256x192 | 94.2 | 41.6 | config |
| PoseBH-B | 256x192 | 93.2 | 39.3 | config |
| PoseBH-H | 256x192 | 94.2 | 42.3 | config |
AI Challenger val set
Using groundtruth bounding boxes. Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 31.8 | 36.3 | config |
| ViTPose++-H | 256x192 | 34.8 | 39.1 | config |
| PoseBH-B | 256x192 | 32.1 | 36.7 | config |
| PoseBH-H | 256x192 | 35.1 | 39.5 | config |
AP-10K test set
Using groundtruth bounding boxes. Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 74.5 | - | config |
| ViTPose++-H | 256x192 | 82.4 | - | config |
| PoseBH-B | 256x192 | 75.0 | 78.3 | config |
| PoseBH-H | 256x192 | 82.6 | 85.4 | config |
APT-36K test set
Using groundtruth bounding boxes. Configs in the table are only for evaluation.
Note that we could not obtain the preprocessed APT-36K annotations used in ViTPose training, so we preprocessed the APT-36K annotations ourselves. For a fair comparison, please refer to Table 2 of our main paper.
The APT-36K preprocessing script is at tools/dataset/prepro_apt36k.py.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 75.9 | - | config |
| ViTPose++-H | 256x192 | 82.3 | - | config |
| PoseBH-B | 256x192 | 87.2 | 89.6 | config |
| PoseBH-H | 256x192 | 90.6 | 92.6 | config |
WholeBody dataset
Configs in the table are only for evaluation.
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 57.4 | - | config |
| ViTPose++-H | 256x192 | 60.6 | - | config |
| PoseBH-B | 256x192 | 57.9 | 69.5 | config |
| PoseBH-H | 256x192 | 62.0 | 72.9 | config |
Transfer on InterHand2.6M
| Model | Resolution | PCK | AUC | EPE | config |
| :----: | :----: | :----: | :----: | :----: | :---: |
| ViTPose++-B | 256x192 | 98.3 | 86.2 | 4.02 | config |
| PoseBH-B (paper) | 256x192 | 98.6 | 87.1 | 3.70 | config |
| PoseBH-B | 256x192 | 98.7 | 86.6 | 3.61 | config |
Transfer on 3DPW
| Model | Resolution | AP | AR | config |
| :----: | :----: | :----: | :----: | :----: |
| ViTPose++-B | 256x192 | 81.7 | 85.2 | config |
| PoseBH-B (paper) | 256x192 | 83.6 | 87.1 | config |
| PoseBH-B | 256x192 | 83.8 | 87.1 | config |
Usage
Setup
We use Ubuntu 20, Python 3.8, PyTorch 1.11.0, CUDA 11.3, and mmcv 1.4.8 for the experiments.
```bash
# conda environment setup
conda create -n posebh python=3.8 -y
conda activate posebh

# pytorch 1.11.0 (CUDA 11.3) compatible version
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
```

mmcv install:

```bash
pip install "git+https://github.com/open-mmlab/mmcv.git@v1.4.8"
cd /path/to/PoseBH
pip install -v -e .
```

Install other packages:

```bash
pip install -r requirements.txt
```
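A quick optional sanity check of the environment (a sketch; it only assumes the packages pinned above were installed):

```bash
# confirm the pinned PyTorch and mmcv versions are importable
python -c "import torch, mmcv; print(torch.__version__, mmcv.__version__)"
# expected output: 1.11.0+cu113 1.4.8
```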
Directory Tree
```
data
├─ aic
├─ ap10k
├─ ap36k
│  └─ annotations
│     └─ ap36k_train.json
├─ coco
└─ mpii

weights
├─ vitpose+_base.pth
└─ vitpose+_huge.pth

splitweights
└─ basecoco.pth
```

Download the `vitpose+_base.pth` and `vitpose+_huge.pth` files from [ViTPose](https://github.com/ViTAE-Transformer/ViTPose).
To get the `basecoco.pth` file, run `tools/model_split.py` on the `vitpose+_base.pth` file.
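A hedged example of that split step, assuming `vitpose+_base.pth` has been placed under `weights/` as in the tree above (the `--source` flag comes from the Evaluation section below; where the script writes the split weights is not documented here, so check `tools/model_split.py` before running):

```bash
# split the multi-head ViTPose+ checkpoint into the per-dataset weights this repo expects
python tools/model_split.py --source weights/vitpose+_base.pth
```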
Download annotation files for APT-36K and 3DPW here.
Training
After downloading the pretrained models, please run the experiments with:
```bash
# for single machine
bash tools/dist_train.sh <Config PATH> <NUM GPUs>

# for multiple machines
python -m torch.distributed.launch --nnodes <Num Machines> --node_rank <Rank of Machine> --nproc_per_node <GPUs Per Machine> --master_addr <Master Addr> --master_port <Master Port> tools/train.py <Config PATH> --launcher pytorch
```
We provide multi-dataset-training config files as below:
- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mdt/vitb_posebh.py
- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mdt/vith_posebh.py
We also provide a bash script for training ViTPose++-Base with our method in scripts/train_vitb_posebh.sh.
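For example, a hedged single-machine run with the Base config listed above (the argument order mirrors `tools/dist_test.sh` in the next section, and the 4-GPU count is an assumption):

```bash
# multi-dataset training of PoseBH on top of ViTPose+-Base with 4 GPUs
bash tools/dist_train.sh configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mdt/vitb_posebh.py 4
```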
Evaluation
To test the performance of the pretrained models, please run:
```bash
bash tools/dist_test.sh <Config PATH> <Checkpoint PATH> <NUM GPUs>
```
For our multi-dataset-trained models, please first re-organize the weights using
```bash
python tools/model_split.py --source <Pretrained PATH>
```
An exemplar evaluation bash script is provided in scripts/eval_vitb_mdt.sh.
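Putting the two steps together, a hedged end-to-end sketch using the Model Zoo weights (the config/checkpoint placeholders and the 4-GPU count are assumptions; use the per-dataset config linked in the tables above):

```bash
# 1) re-organize the multi-dataset-trained checkpoint into per-dataset heads
python tools/model_split.py --source weights/posebh/base.pth
# 2) evaluate the split checkpoint on your target benchmark
bash tools/dist_test.sh <Config PATH> <Checkpoint PATH> 4
```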
Transfer
We provide transfer config files as below:
- configs/hand/2d_kpt_sview_rgb_img/topdown_heatmap/interhand2d/vitb_posebh_interhand2d_256x192.py
- configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/pw3d/vitb_posebh_pw3d_2d_256x192.py
To perform transfer learning, you may need to remove the prototypes from the pretrained checkpoint file. Use the tools/remove_proto.py script as below:
```bash
python tools/remove_proto.py --source work_dirs/vitb_posebh/epoch_100.pth
```
After the checkpoint is prepared, run the experiment as below:
```bash
python -m torch.distributed.launch --nnodes 1 --node_rank 0 --nproc_per_node 4 --master_addr 127.0.0.1 --master_port 23459 tools/train.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/pw3d/vitb_posebh_pw3d_2d_256x192.py --launcher pytorch --seed 0 --cfg-options model.multihead_pretrained=work_dirs/vitb_posebh/epoch_100_no_proto.pth
```
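The same recipe applies to the InterHand2.6M transfer; a sketch assuming the same prototype-stripped checkpoint and GPU setup as above:

```bash
python -m torch.distributed.launch --nnodes 1 --node_rank 0 --nproc_per_node 4 --master_addr 127.0.0.1 --master_port 23459 tools/train.py configs/hand/2d_kpt_sview_rgb_img/topdown_heatmap/interhand2d/vitb_posebh_interhand2d_256x192.py --launcher pytorch --seed 0 --cfg-options model.multihead_pretrained=work_dirs/vitb_posebh/epoch_100_no_proto.pth
```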
Citation
```bibtex
@InProceedings{Jeong_2025_CVPR,
    author    = {Jeong, Uyoung and Freer, Jonathan and Baek, Seungryul and Chang, Hyung Jin and Kim, Kwang In},
    title     = {PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {12278-12288}
}
```
Acknowledgements
Owner
- Name: Uyoung Jeong
- Login: uyoung-jeong
- Kind: user
- Company: UNIST
- Website: https://uyoung-jeong.github.io/
- Repositories: 2
- Profile: https://github.com/uyoung-jeong
PhD student at UNIST, South Korea
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMPose Contributors"
title: "OpenMMLab Pose Estimation Toolbox and Benchmark"
date-released: 2020-08-31
url: "https://github.com/open-mmlab/mmpose"
license: Apache-2.0
```
GitHub Events
Total
- Watch event: 9
- Issue comment event: 3
- Push event: 1
- Create event: 2
Last Year
- Watch event: 9
- Issue comment event: 3
- Push event: 1
- Create event: 2
Dependencies
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
- numpy *
- torch >=1.3
- docutils ==0.16.0
- myst-parser *
- sphinx ==4.0.2
- sphinx_copybutton *
- sphinx_markdown_tables *
- mmcv-full >=1.3.8
- mmdet >=2.14.0
- mmtrack >=0.6.0
- albumentations >=0.3.2
- onnx *
- onnxruntime *
- pyrender *
- requests *
- smplx >=0.1.28
- trimesh *
- mmcv-full *
- munkres *
- regex *
- scipy *
- titlecase *
- torch *
- torchvision *
- xtcocotools >=1.8
- chumpy *
- dataclasses *
- json_tricks *
- matplotlib *
- munkres *
- numpy *
- opencv-python *
- pillow *
- scipy *
- torchvision *
- xtcocotools >=1.8
- coverage * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- pytest * test
- pytest-runner * test
- smplx >=0.1.28 test
- xdoctest >=0.10.0 test
- yapf * test
- einops *
- psutil *
- tensorboard *
- timm ==0.4.9
- yapf ==0.40.1