Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.0%) to scientific vocabulary
Last synced: 6 months ago · JSON representation ·

Repository

Basic Info
  • Host: GitHub
  • Owner: lichongod
  • Language: Python
  • Default Branch: main
  • Size: 22.5 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created 10 months ago · Last pushed 9 months ago
Metadata Files
Readme Citation

README.md

K12-Vista

Quickstart | Datasets | Leaderboard | Report | Citation

This repository is the official implementation of K12-Vista.

K12Vista: Exploring the Boundaries of MLLMs in K-12 Education
Chong Li, Chenglin Zhu, Tao Zhang*, Mingan Lin†, Zenan Zhou†, Jian Xie†
* Equal Contribution
Corresponding Author

News

  • [2025-06-01] The technical report of K12-Vista is released!

Introduction

Multimodal large language models (MLLMs) have demonstrated remarkable reasoning capabilities in various visual tasks. However, their abilities in K12 (Grades1–12) scenarios are still systematically underexplored. Previous studies suffer from various limitations including narrow subject coverage, insufficient data scale,lack of diversity in question types, and naive answer-centric evaluation method, resulting in insufficient exploration of model capabilities. To address these gaps, we propose K12Vista, the most comprehensive multimodal benchmark for Chinese K12 subject knowledge understanding and reasoning to date, featuring 33,000 questions across five core subjects from primary to high school and three question types. Moreover, beyond the final outcome, we are also concerned with he correctness of MLLMs’ reasoning processes. For this purpose, we meticulously compiles errors from MLLMs’ reasoning processes and leverage an automated data pipeline to construct K12-PEM-800K, the largest process evaluation dataset offering detailed step-by-step judgement annotations for MLLMs’ reasoning. Subsequently, we developed K12-PEM, an advanced process evaluation model that integrates an overall assessment of both the reasoning process and answer correctness. Moreover, we also introduce K12-PEBench, the first highquality, human-annotated benchmark specifically designed for evaluating abilities of reasoning process evaluation. Extensive experiments reveal that current MLLMs exhibit significant flaws when reasoning within K12Vista, providing critical insights for the development of more capable MLLMs. We open our resources at https://github.com/lichongod/K12Vista.

Quick Start

Please refer to K12-Vista for your quick start.

K12-Vista

First, you need to start the vllm service using K12_Vista/script/vllm_infer_model_setup.sh, then register the model name in K12_Vista/code/model_dict.py. After that, refer to K12_Vista/script/infer_eval.sh for inference. For evaluation, you need to first start the vllm service for the judgemodel by K12_Vista/script/vllm_qwen25_vl_72b_instruct_judgemodel_setup.sh and K12_Vista/script/vllm_K12_PEM_judgemodel_setup.sh, and then refer eval.py function in the infer_eval.sh to eval.

K12-PEMBench

First, you need to start the vllm service using K12_PEMBench/script/vllm_K12_PEM_judgemodel_setup.sh, then register the model name in K12_PEMBench/code/model_dict.py. After that, refer to K12_PEMBench/script/qwen25_vl_72b.sh for inference and evaluation.

Datasets and K12-PEM

K12-Vista, K12-PEMBench, and K12_PEM800K are in https://huggingface.co/datasets/lipku1999/K12-Vista K12-PEM is in https://huggingface.co/lipku1999/K12-PEM

Leaderboard

Leaderboard

🖊️ Citation

If you feel K12Vista in your project or research, please kindly use the following BibTeX entry to cite our paper. Thanks! ```bibtex @misc{li2025k12vistaexploringboundariesmllms, title={K12Vista: Exploring the Boundaries of MLLMs in K-12 Education}, author={Chong Li and Chenglin Zhu and Tao Zhang and Mingan Lin and Zenan Zhou and Jian Xie}, year={2025}, eprint={2506.01676}, archivePrefix={arXiv}, primaryClass={cs.AI}, url={https://arxiv.org/abs/2506.01676}, }

Owner

  • Login: lichongod
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Chong
    given-names: Li
  - family-names: Chenglin
    given-names: Zhu
  - family-names: Tao
    given-names: Zhang
  - family-names: Mingan
    given-names: Lin
  - family-names: Zena
    given-names: Zhou 
  - family-names: Jian
    given-names: Xie
title: "K12Vista:Exploring the Boundaries of MLLMs in K-12 Education"
version: "1.0.0"
url: "GitHub - lichongod/K12Vista-Exploring-the-Boundaries-of-MLLMs-in-K-12-Education"
year: 2025
preferred-citation:
  type: article
  authors:
  - family-names: Li
    given-names: Chong
  - family-names: Zhu
    given-names: Chenglin
  - family-names: Zhang
    given-names: Tao
  - family-names: Lin
    given-names: Mingan
  - family-names: Zhou
    given-names: Zenan
  - family-names: Xie
    given-names: Jian
  title: "K12Vista:Exploring the Boundaries of MLLMs in K-12 Education"
  version: "1.0.0"
  url: "GitHub - lichongod/K12Vista-Exploring-the-Boundaries-of-MLLMs-in-K-12-Education"
  year: 2025

GitHub Events

Total
  • Issues event: 1
  • Push event: 9
  • Create event: 2
Last Year
  • Issues event: 1
  • Push event: 9
  • Create event: 2