https://github.com/cvi-szu/facebench

[CVPR 2025] FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.9%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: CVI-SZU
License: MIT
Default Branch: main
Homepage:
Size: 3.45 MB

Statistics

Stars: 34
Watchers: 6
Forks: 0
Open Issues: 2
Releases: 0

Created over 1 year ago · Last pushed 10 months ago

Metadata Files

Readme License

FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs [CVPR 2025]

Xiaoqin Wang, Xusen Ma, Xianxu Hou, Meidan Ding, Yudong Li, Junliang Chen, Wenting Chen, Xiaoyang Peng, Linlin Shen*

[![ArXiv](https://img.shields.io/badge/ArXiv-2503.21457-B31B1B.svg)](https://arxiv.org/pdf/2503.21457)
[![Webpage](https://img.shields.io/badge/Webpage-FaceBench-.svg)](https://github.com/CVI-SZU/FaceBench/tree/main)
[![Dataset](https://img.shields.io/badge/HuggingFace🤗-Dataset-blue)](https://github.com/CVI-SZU/FaceBench/tree/main)
[![Models](https://img.shields.io/badge/HuggingFace🤗-Models-blue)](https://huggingface.co/wxqlab/face-llava-v1.5-13b)

Overview

We introduce FaceBench, a dataset of hierarchical multi-view, multi-level facial attributes designed to assess the face perception abilities of MLLMs comprehensively. We first construct a hierarchical facial attribute structure that spans five views with up to three levels of attributes, totaling over 210 attributes and 700 attribute values. Based on this structure, FaceBench provides 49,919 visual question-answering (VQA) pairs for evaluation and 23,841 pairs for fine-tuning. We also develop a robust face perception MLLM baseline, Face-LLaVA, by training on the proposed face VQA data.
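The hierarchy described above (views that contain up to three levels of attributes, with each leaf attribute carrying a set of values) can be pictured as a nested mapping. The following is a minimal sketch under stated assumptions: every view, attribute, and value name is a hypothetical placeholder rather than FaceBench's actual taxonomy, and `iter_leaves` and `summarize` are illustrative helpers, not part of the released code.

```python
# Hypothetical toy hierarchy: view -> level-1 -> level-2 -> level-3 -> values.
# All names are illustrative placeholders, not FaceBench's real taxonomy.
TOY_HIERARCHY = {
    "view_a": {
        "attr_1": ["value_x", "value_y"],                  # level-1 leaf
        "attr_2": {                                        # level-1 with children
            "attr_2a": ["value_p", "value_q", "value_r"],  # level-2 leaf
            "attr_2b": {
                "attr_2b_i": ["value_m", "value_n"],       # level-3 leaf
            },
        },
    },
    "view_b": {
        "attr_3": ["value_s"],                             # level-1 leaf
    },
}


def iter_leaves(node, path=()):
    """Yield (path, values) for every leaf attribute in the nested mapping."""
    for name, child in node.items():
        if isinstance(child, dict):
            yield from iter_leaves(child, path + (name,))
        else:
            yield path + (name,), child


def summarize(hierarchy):
    """Count views, leaf attributes, attribute values, and the deepest level."""
    leaves = list(iter_leaves(hierarchy))
    return {
        "views": len(hierarchy),
        "leaf_attributes": len(leaves),
        "values": sum(len(values) for _, values in leaves),
        # A path is (view, level-1, ...), so the attribute level excludes the view.
        "max_level": max(len(path) - 1 for path, _ in leaves),
    }


if __name__ == "__main__":
    print(summarize(TOY_HIERARCHY))
```

A nested mapping keeps the path from view to leaf explicit, which is convenient when results are later aggregated per view or per level.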

Distribution of visual question-answer pairs

Some samples from our dataset

News

[2024-08-20] The Face-LLaVA model is released on HuggingFace🤗.
[2025-03-27] The paper is released on ArXiv🔥.

TODO

[X] Release the Face-LLaVA model.
[X] Release the evaluation code.
[ ] Release the dataset.

Evaluation

Model inference

OMP_NUM_THREADS=8 CUDA_VISIBLE_DEVICES=0 python evaluation/inference.py \
    --data-dir ./datasets/example/test.jsonl \
    --images-dir ./datasets/example/images/ \
    --model-name face_llava_1_5_13b \
    --question-type "TFQ, SCQ, MCQ, OEQ" \
    --save-dir "./responses-and-results/"
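The `--question-type` value in the command above is a single quoted string containing commas and spaces, so it must be quoted in the shell and split on the receiving side. The sketch below shows one way such a value could be parsed with `argparse`. It is an assumption for illustration only: `parse_question_types`, `build_parser`, the defaults, and the validation rule are hypothetical and do not describe how `evaluation/inference.py` actually handles its flags.

```python
import argparse

# Question-type codes as they appear in the example command (assumed complete).
KNOWN_TYPES = ("TFQ", "SCQ", "MCQ", "OEQ")


def parse_question_types(raw):
    """Split a comma-separated string such as "TFQ, SCQ" into clean codes."""
    codes = [part.strip().upper() for part in raw.split(",") if part.strip()]
    unknown = [code for code in codes if code not in KNOWN_TYPES]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown question type(s): {unknown}")
    return codes


def build_parser():
    """Hypothetical parser that mirrors only the flags shown in the command."""
    parser = argparse.ArgumentParser(description="Illustrative flag parsing")
    parser.add_argument("--data-dir", required=True)
    parser.add_argument("--images-dir", required=True)
    parser.add_argument("--model-name", required=True)
    parser.add_argument("--question-type", type=parse_question_types,
                        default=list(KNOWN_TYPES))
    parser.add_argument("--save-dir", default="./responses-and-results/")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args([
        "--data-dir", "./datasets/example/test.jsonl",
        "--images-dir", "./datasets/example/images/",
        "--model-name", "face_llava_1_5_13b",
        "--question-type", "TFQ, SCQ, MCQ, OEQ",
    ])
    print(args.question_type)  # ['TFQ', 'SCQ', 'MCQ', 'OEQ']
```

Stripping whitespace around each code means `"TFQ, SCQ"` and `"TFQ,SCQ"` are treated the same, which avoids silent mismatches when a subset of question types is selected.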

Calculate metrics

OMP_NUM_THREADS=8 CUDA_VISIBLE_DEVICES=5 python evaluation/evaluation.py \
    --data-path ./responses-and-results/face_llava_1_5_13b_test_responses.jsonl
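To illustrate what a metrics step over a JSONL responses file might involve, here is a minimal sketch that computes exact-match accuracy per question type. The record fields `question_type`, `answer`, and `response` are hypothetical, as is the `accuracy_by_type` helper; the actual schema and metrics used by `evaluation/evaluation.py` may differ, and exact matching is unlikely to suit open-ended answers.

```python
import json
from collections import defaultdict


def accuracy_by_type(lines):
    """Return exact-match accuracy per question type from JSONL lines.

    Assumes each record has the hypothetical fields "question_type",
    "answer" (ground truth), and "response" (model output).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        qtype = record["question_type"]
        total[qtype] += 1
        # Compare after trimming whitespace and lowercasing both sides.
        if record["response"].strip().lower() == record["answer"].strip().lower():
            correct[qtype] += 1
    return {qtype: correct[qtype] / total[qtype] for qtype in total}


if __name__ == "__main__":
    sample = [
        '{"question_type": "TFQ", "answer": "Yes", "response": "yes"}',
        '{"question_type": "TFQ", "answer": "No", "response": "Yes"}',
        '{"question_type": "SCQ", "answer": "B", "response": "B"}',
    ]
    print(accuracy_by_type(sample))  # {'TFQ': 0.5, 'SCQ': 1.0}
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, for example `accuracy_by_type(open(path, encoding="utf-8"))`.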

Results

Experimental results of various MLLMs and our Face-LLaVA across five facial attribute views.

Experimental results of various MLLMs and our Face-LLaVA across Level 1 facial attributes.

Citation

If you find this work useful for your research, please consider citing our paper:

```
@inproceedings{wang2025facebench,
  title={FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs},
  author={Wang, Xiaoqin and Ma, Xusen and Hou, Xianxu and Ding, Meidan and Li, Yudong and Chen, Junliang and Chen, Wenting and Peng, Xiaoyang and Shen, Linlin},
  booktitle={Proceedings-2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025},
  year={2025}
}

@article{wang2025facebench,
  title={FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs},
  author={Wang, Xiaoqin and Ma, Xusen and Hou, Xianxu and Ding, Meidan and Li, Yudong and Chen, Junliang and Chen, Wenting and Peng, Xiaoyang and Shen, Linlin},
  journal={arXiv preprint arXiv:2503.21457},
  year={2025}
}
```

If you have any questions, please open an issue or email me at wangxiaoqin2022@email.szu.edu.cn.

Acknowledgments

This work is heavily based on LLaVA. Thanks to the authors for their great work.

Owner

Name: Computer Vision Institute, SZU
Login: CVI-SZU
Kind: organization
Location: Shenzhen University, Shenzhen, China

Website: http://cv.szu.edu.cn/
Repositories: 13
Profile: https://github.com/CVI-SZU

Computer Vision Institute, Shenzhen University

GitHub Events

Total

Issues event: 3
Watch event: 25
Issue comment event: 2
Member event: 1
Push event: 68
Create event: 2

Last Year

Issues event: 3
Watch event: 25
Issue comment event: 2
Member event: 1
Push event: 68
Create event: 2
