https://github.com/ai4bharat/fermat
A vLLM-based Pipeline for benchmarking various VLMs on HMER Dataset of AI4Bharat
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: AI4Bharat
- License: mit
- Language: Python
- Default Branch: main
- Size: 902 KB
Statistics
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
FERMAT: Can Vision-Language Models Evaluate Handwritten Math?
We present FERMAT, a benchmark designed to assess VLMs’ ability to detect, localize and correct errors in handwritten mathematical content. Please refer to our paper for more details.
Loading Data
Steps to download the data and store the images in benchmarkimages and the CSV in benchmarkcsv. Steps to download data for the oikantik format
Setup
To evaluate VLMs on the FERMAT dataset, first install the required packages by running the following command:
bash
pip install -r requirements.txt
We self-hosted Pixtral-12B-2409 (https://huggingface.co/mistralai/Pixtral-12B-2409), Pixtral-Large-Instruct-2411, LLaMa-3.2-11B-Vision-Instruct, LLaMa-3.2-90B-Vision-Instruct, and Phi-3.5-Vision-Instruct using vLLM (https://github.com/vllm-project/vllm).
For the GPT and Gemini model families, we used hosted services.
For self-hosted models:
- Set up environment variables:
bash
export OPENAI_API_BASE=[ADD_THE_ENDPOINT_URL_OF_HOSTED_MODEL]
Example: "http://localhost:8004/v1"
- Start Evaluations:
bash
python main.py --model [MODEL_NAME] --dir_name [DATA_DIR]
- MODEL_NAME: Name of the model to be evaluated. Choices: `['pixtral', 'pixtrallarge', 'phi', 'llama_large', 'llama']`
- DATA_DIR: Path to the directory where the Benchmark Images are stored
- Fill-in CSV
Once the evaluation is done, the results are stored in a JSON file named state_<MODEL_NAME>.json. You can convert this JSON file to a CSV file using the following command:
bash
python fill_in_csv.py --model [MODEL_NAME] --csv-file [CSV_FILE] --json-file [JSON_FILE]
- MODEL_NAME: Name of the model to be evaluated. Choices: `['pixtral', 'pixtrallarge', 'phi', 'llama_large', 'llama']`
- CSV_FILE: Path to the CSV file where the results need to be filled in.
- JSON_FILE: Path to the JSON file where the results are stored.
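The steps above (export the endpoint, run the evaluation, fill in the CSV) can be chained for one model with a small helper. The following is a minimal sketch, not part of the repository: `build_commands` and `require_api_base` are hypothetical names, the paths in the example are placeholders, and it assumes that `state_<MODEL_NAME>.json` is written to the current working directory. The script names and flags are the ones shown in the commands above.

```python
import os
import shlex

# Model keys listed as choices for --model in this README.
MODEL_CHOICES = ["pixtral", "pixtrallarge", "phi", "llama_large", "llama"]


def require_api_base(env=os.environ):
    """Return OPENAI_API_BASE, or raise if the endpoint has not been exported."""
    base = env.get("OPENAI_API_BASE")
    if not base:
        raise RuntimeError("OPENAI_API_BASE is not set; export the endpoint URL first")
    return base


def build_commands(model, data_dir, csv_file):
    """Return the two command lines (evaluation, then CSV fill-in) for one model."""
    if model not in MODEL_CHOICES:
        raise ValueError(f"unknown model {model!r}; expected one of {MODEL_CHOICES}")
    # The README names the results file state_<MODEL_NAME>.json.
    # Assumption: it is created in the current working directory.
    json_file = f"state_{model}.json"
    evaluate = ["python", "main.py", "--model", model, "--dir_name", data_dir]
    fill_in = [
        "python", "fill_in_csv.py",
        "--model", model,
        "--csv-file", csv_file,
        "--json-file", json_file,
    ]
    return evaluate, fill_in


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    evaluate, fill_in = build_commands("pixtral", "path/to/images", "path/to/results.csv")
    print(shlex.join(evaluate))
    print(shlex.join(fill_in))
```

To actually run the two steps, call `require_api_base()` first and then pass each returned list to `subprocess.run(cmd, check=True)`, so that a failed evaluation stops the script before the CSV step.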
Citation
If you used this repository or our models, please cite our work:
bibtex
@article{nath2025vision1language,
title = {Can Vision-Language Models Evaluate Handwritten Math?},
author = {Oikantik Nath and Hanani Bathina and Mohammed Safi Ur Rahman Khan and Mitesh M. Khapra},
year = {2025},
journal = {arXiv preprint arXiv: 2501.07244}
}
Owner
- Name: AI4Bhārat
- Login: AI4Bharat
- Kind: organization
- Email: opensource@ai4bharat.org
- Location: India
- Website: https://ai4bharat.org
- Twitter: AI4Bharat
- Repositories: 37
- Profile: https://github.com/AI4Bharat
Artificial-Intelligence-For-Bhārat : Building open-source AI solutions for India!
GitHub Events
Total
- Watch event: 1
- Push event: 2
- Fork event: 1
Last Year
- Watch event: 1
- Push event: 2
- Fork event: 1
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- 9115jin (1)