flame
[CVPR 2025] PyTorch implementation of paper "FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training"
Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.2%) to scientific vocabulary
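The score above appears to be derived from binary indicators like those listed. The actual weighting scheme behind the 54.0% figure is not published here, but a minimal sketch of such an indicator-based score, with purely hypothetical weights, might look like:

```python
# Hypothetical sketch of an indicator-weighted "science score".
# The real indicator set and weights that produce 54.0% are not
# published in this page; everything below is illustrative only.

INDICATOR_WEIGHTS = {
    "citation_cff": 0.15,
    "codemeta_json": 0.15,
    "zenodo_json": 0.15,
    "doi_references": 0.10,
    "publication_links": 0.15,
    "academic_emails": 0.10,
    "institutional_owner": 0.10,
    "joss_metadata": 0.05,
    "vocabulary_similarity": 0.05,
}

def science_score(found: dict) -> float:
    """Return the weighted fraction (as a percentage) of detected indicators."""
    total = sum(INDICATOR_WEIGHTS.values())
    hit = sum(w for name, w in INDICATOR_WEIGHTS.items() if found.get(name))
    return 100.0 * hit / total

# Indicators detected for this repository (per the checklist above).
detected = {
    "citation_cff": True,
    "codemeta_json": True,
    "zenodo_json": True,
    "publication_links": True,
}
print(f"{science_score(detected):.1f}%")  # prints "60.0%" with these toy weights
```

With these made-up weights the four detected indicators yield 60.0%, not 54.0%; the real tool presumably uses different weights or partial credit (e.g. for vocabulary similarity).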
Repository
Basic Info
Statistics
- Stars: 27
- Watchers: 4
- Forks: 1
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
CVPR 2025 | FLAME
FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training
Anjia Cao, Xing Wei, Zhiheng Ma
📰 News
- [2025/03/09] Released training code and scripts.
- [2025/03/08] Released recaptioned CC3M and recaptioned YFCC15M on Hugging Face.
- [2025/02/27] Accepted by CVPR 2025.
- [2024/11/28] Released model on Hugging Face.
- [2024/11/28] Released evaluation code and scripts.
- [2024/11/18] Paper released on arXiv.
💡 Highlights
- 🔥 Leveraging frozen LLMs to naturally process long text inputs.
- 🔥 Generalizing from monolingual training to multilingual evaluation.
- 🔥 Strong improvements on long- and short-context image-text retrieval, image classification, and multilingual scenarios.

📅 TODO Roadmap
- [x] Release training code and data.
- [x] Release evaluation code.
- [x] Release pre-trained checkpoints.
🛠️ Get Started
Setup
```bash
git clone https://github.com/MIV-XJTU/FLAME.git
cd FLAME
conda create -n flame python=3.10 -y
conda activate flame
make install
make install-training
make install-test
```
Training
See Training.md.
Evaluation
See Evaluation.md.
📁 Datasets
| Dataset | Link |
|---|---|
| CC3M-ReCap | Hugging Face |
| YFCC15M-ReCap | Hugging Face |
🔐 Pre-trained Checkpoints
| Dataset | Model | Link |
|---|---|---|
| CC3M | Mistral-Nemo-ViT-B/16 | Hugging Face |
🛂 License
The project is released under the Creative Commons CC-BY-4.0 license.
📖 Citation
If you find our work helpful for your research, please consider giving us a star and citing our paper:
```bibtex
@inproceedings{cao2025flame,
  title={FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training},
  author={Cao, Anjia and Wei, Xing and Ma, Zhiheng},
  booktitle={CVPR},
  year={2025}
}
```
🫡 Acknowledgements
This project is built on open_clip; thanks for the nice work! We also thank CLIP_benchmark, DreamLIP, Long-CLIP, PromptEOL, and MiniCPM-V for their code.
Owner
- Login: MIV-XJTU
- Kind: user
- Repositories: 1
- Profile: https://github.com/MIV-XJTU
Citation (CITATION.cff)
cff-version: 1.1.0
message: If you use this software, please cite it as below.
authors:
- family-names: Ilharco
given-names: Gabriel
- family-names: Wortsman
given-names: Mitchell
- family-names: Wightman
given-names: Ross
- family-names: Gordon
given-names: Cade
- family-names: Carlini
given-names: Nicholas
- family-names: Taori
given-names: Rohan
- family-names: Dave
given-names: Achal
- family-names: Shankar
given-names: Vaishaal
- family-names: Namkoong
given-names: Hongseok
- family-names: Miller
given-names: John
- family-names: Hajishirzi
given-names: Hannaneh
- family-names: Farhadi
given-names: Ali
- family-names: Schmidt
given-names: Ludwig
title: OpenCLIP
version: v0.1
doi: 10.5281/zenodo.5143773
date-released: 2021-07-28
GitHub Events
Total
- Issues event: 9
- Watch event: 29
- Issue comment event: 7
- Member event: 1
- Push event: 22
- Fork event: 1
- Create event: 2
Last Year
- Issues event: 9
- Watch event: 29
- Issue comment event: 7
- Member event: 1
- Push event: 22
- Fork event: 1
- Create event: 2