TinyZero_LLM_curriculum_training
https://github.com/davidliu2024/TinyZero_LLM_curriculum_training
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: davidliu2024
- License: MIT
- Language: Python
- Default Branch: main
- Size: 1.43 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Applying Supervised Fine-Tuning on TinyZero for Reasoning Tasks Using Curriculum Learning Methods
Abstract
Current LLMs are trained on curated datasets targeting specific problems. For example, TinyZero is trained on multiplication, division, and countdown problems to improve reasoning performance, and Verigen is trained on Verilog HDL to better generate Verilog code. In developmental psychology, however, children are found to learn through a progression of curriculum levels, mastering basic tasks before moving on to more difficult ones (e.g., learning algebra before calculus). In this project, we apply a curriculum learning approach: we perform supervised fine-tuning on increasingly challenging problem datasets with TinyZero, a reproduction of Deepseek Zero that uses reinforcement learning for self-verification and more accurate searching abilities, and we evaluate its performance first with curriculum learning and then without.
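The core idea of the curriculum described above, ordering training stages from easiest to hardest and fine-tuning sequentially, can be sketched as follows. This is a minimal illustration, not the project's actual training code; the function names and the `(name, difficulty, examples)` triple format are assumptions for illustration.

```python
# Hypothetical sketch of curriculum-ordered supervised fine-tuning.
# A dataset is represented as a (name, difficulty, examples) triple;
# lower difficulty values are trained on first.

def curriculum_order(datasets):
    """Return datasets sorted from easiest to hardest."""
    return sorted(datasets, key=lambda d: d[1])

def run_curriculum(model, datasets, fine_tune):
    """Fine-tune the model on each dataset stage in curriculum order.

    `fine_tune` stands in for one round of supervised fine-tuning
    (e.g., a call into the actual training loop) and returns the
    updated model.
    """
    for name, _difficulty, examples in curriculum_order(datasets):
        model = fine_tune(model, examples)
    return model
```

Evaluating "without curriculum learning" then amounts to running the same loop over the datasets in an arbitrary or reversed order and comparing loss curves.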
About
Authors
- David Liu (david_liu@tamu.edu)
- Amarachukwu Nzedibe (amaranzedibe1@tamu.edu)
- Nicole LoGiudice (nicolelogiudice30@tamu.edu)
Organization
- Texas A&M University - College Station
Purpose
Project Structure
Main script:
TinyZeroTry2.py
Executes training and collects loss.
Results are under:
./results
Includes testing loss datapoints and plots.
Requirements
OS and Software requirements
- Ubuntu > 20.0 (or any Debian-based environment)
- Python > 3.9
- MiniConda > 25.0
Hardware requirements
- CPU memory > 8 GB
- GPU memory > 32 GB
Initialization and Set Up
To set up the MiniConda Environment:
conda create -n tinyzero-env python=3.9
conda activate tinyzero-env
To install all packages:
pip install -r freeze.txt
Execution
To run a single round of training:
python TinyZeroTry2.py -m <model> -d <dataset> -p <problems> -o <output directory>
Available models include:
- tinyzero
- tinyzero-1.5
- Any local model saved to the machine
Available datasets include:
- gsm8k
- prm800k
To run training and collect results for problem-set sizes of 50, 100, 250, 500, 750, and 1000:
chmod +x ./curriculum_learning.sh
./curriculum_learning.sh
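A sketch of what a script like curriculum_learning.sh might do, assuming it simply invokes the training script once per problem-set size (the model/dataset flag values shown are examples taken from the lists above). The commands are echoed here rather than executed so the sketch is self-contained.

```shell
#!/bin/sh
# Hypothetical sketch: run training at each problem-set size in
# increasing order, writing all results into ./results.
for n in 50 100 250 500 750 1000; do
  # Replace 'echo' with direct execution to actually train.
  echo "python TinyZeroTry2.py -m tinyzero -d gsm8k -p $n -o ./results"
done
```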
Owner
- Login: davidliu2024
- Kind: user
- Repositories: 2
- Profile: https://github.com/davidliu2024
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "L."
    given-names: "David"
  - family-names: "N."
    given-names: "Amara"
  - family-names: "L."
    given-names: "Nicole"
title: "ECEN743-TinyZero-SFT"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2025-05-01
url: "https://github.com/davidliu2024/ECEN743-TinyZero-SFT"