Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: lichongod
- Language: Python
- Default Branch: main
- Size: 22.5 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
K12-Vista
Quickstart | Datasets | Leaderboard | Report | Citation
This repository is the official implementation of K12-Vista.
K12Vista: Exploring the Boundaries of MLLMs in K-12 Education
Chong Li, Chenglin Zhu, Tao Zhang*, Mingan Lin†, Zenan Zhou†, Jian Xie†
* Equal Contribution
† Corresponding Author
News
- [2025-06-01] The technical report of K12-Vista is released!
Introduction
Multimodal large language models (MLLMs) have demonstrated remarkable reasoning capabilities in various visual tasks. However, their abilities in K12 (Grades1–12) scenarios are still systematically underexplored. Previous studies suffer from various limitations including narrow subject coverage, insufficient data scale,lack of diversity in question types, and naive answer-centric evaluation method, resulting in insufficient exploration of model capabilities. To address these gaps, we propose K12Vista, the most comprehensive multimodal benchmark for Chinese K12 subject knowledge understanding and reasoning to date, featuring 33,000 questions across five core subjects from primary to high school and three question types. Moreover, beyond the final outcome, we are also concerned with he correctness of MLLMs’ reasoning processes. For this purpose, we meticulously compiles errors from MLLMs’ reasoning processes and leverage an automated data pipeline to construct K12-PEM-800K, the largest process evaluation dataset offering detailed step-by-step judgement annotations for MLLMs’ reasoning. Subsequently, we developed K12-PEM, an advanced process evaluation model that integrates an overall assessment of both the reasoning process and answer correctness. Moreover, we also introduce K12-PEBench, the first highquality, human-annotated benchmark specifically designed for evaluating abilities of reasoning process evaluation. Extensive experiments reveal that current MLLMs exhibit significant flaws when reasoning within K12Vista, providing critical insights for the development of more capable MLLMs. We open our resources at https://github.com/lichongod/K12Vista.
Quick Start
Please refer to K12-Vista for your quick start.
K12-Vista
First, you need to start the vllm service using K12_Vista/script/vllm_infer_model_setup.sh, then register the model name in K12_Vista/code/model_dict.py. After that, refer to K12_Vista/script/infer_eval.sh for inference. For evaluation, you need to first start the vllm service for the judgemodel by K12_Vista/script/vllm_qwen25_vl_72b_instruct_judgemodel_setup.sh and K12_Vista/script/vllm_K12_PEM_judgemodel_setup.sh, and then refer eval.py function in the infer_eval.sh to eval.
K12-PEMBench
First, you need to start the vllm service using K12_PEMBench/script/vllm_K12_PEM_judgemodel_setup.sh, then register the model name in K12_PEMBench/code/model_dict.py. After that, refer to K12_PEMBench/script/qwen25_vl_72b.sh for inference and evaluation.
Datasets and K12-PEM
K12-Vista, K12-PEMBench, and K12_PEM800K are in https://huggingface.co/datasets/lipku1999/K12-Vista K12-PEM is in https://huggingface.co/lipku1999/K12-PEM
Leaderboard

🖊️ Citation
If you feel K12Vista in your project or research, please kindly use the following BibTeX entry to cite our paper. Thanks! ```bibtex @misc{li2025k12vistaexploringboundariesmllms, title={K12Vista: Exploring the Boundaries of MLLMs in K-12 Education}, author={Chong Li and Chenglin Zhu and Tao Zhang and Mingan Lin and Zenan Zhou and Jian Xie}, year={2025}, eprint={2506.01676}, archivePrefix={arXiv}, primaryClass={cs.AI}, url={https://arxiv.org/abs/2506.01676}, }
Owner
- Login: lichongod
- Kind: user
- Repositories: 1
- Profile: https://github.com/lichongod
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Chong
given-names: Li
- family-names: Chenglin
given-names: Zhu
- family-names: Tao
given-names: Zhang
- family-names: Mingan
given-names: Lin
- family-names: Zena
given-names: Zhou
- family-names: Jian
given-names: Xie
title: "K12Vista:Exploring the Boundaries of MLLMs in K-12 Education"
version: "1.0.0"
url: "GitHub - lichongod/K12Vista-Exploring-the-Boundaries-of-MLLMs-in-K-12-Education"
year: 2025
preferred-citation:
type: article
authors:
- family-names: Li
given-names: Chong
- family-names: Zhu
given-names: Chenglin
- family-names: Zhang
given-names: Tao
- family-names: Lin
given-names: Mingan
- family-names: Zhou
given-names: Zenan
- family-names: Xie
given-names: Jian
title: "K12Vista:Exploring the Boundaries of MLLMs in K-12 Education"
version: "1.0.0"
url: "GitHub - lichongod/K12Vista-Exploring-the-Boundaries-of-MLLMs-in-K-12-Education"
year: 2025
GitHub Events
Total
- Issues event: 1
- Push event: 9
- Create event: 2
Last Year
- Issues event: 1
- Push event: 9
- Create event: 2