TinyZero_LLM_curriculum_training
https://github.com/davidliu2024/TinyZero_LLM_curriculum_training
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: davidliu2024
- License: MIT
- Language: Python
- Default Branch: main
- Size: 1.43 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Applying Supervised Fine-Tuning on TinyZero for Reasoning Tasks Using Curriculum Learning Methods
Abstract
Current LLMs are trained on curated datasets targeting specific problems. For example, TinyZero is trained on multiplication, division, and countdown problems to improve reasoning performance, and Verigen is trained on Verilog HDL to better generate Verilog code. In developmental psychology, however, children are found to learn through a progression of curriculum levels, mastering basic tasks before moving on to more difficult ones (e.g., learning algebra before calculus). In this project, we apply a curriculum learning approach: we perform supervised fine-tuning on increasingly challenging problem datasets with TinyZero, a reproduction of Deepseek Zero that uses reinforcement learning for self-verification and more accurate searching abilities, and we evaluate its performance first with curriculum learning and then without.
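The core idea of the curriculum described above, ordering training stages from easiest to hardest and fine-tuning sequentially, can be sketched as follows. This is a minimal illustration, not the project's actual training code; the function names and the `(name, difficulty, examples)` triple format are assumptions for illustration.

```python
# Hypothetical sketch of curriculum-ordered supervised fine-tuning.
# A dataset is represented as a (name, difficulty, examples) triple;
# lower difficulty values are trained on first.

def curriculum_order(datasets):
    """Return datasets sorted from easiest to hardest."""
    return sorted(datasets, key=lambda d: d[1])

def run_curriculum(model, datasets, fine_tune):
    """Fine-tune the model on each dataset stage in curriculum order.

    `fine_tune` stands in for one round of supervised fine-tuning
    (e.g., a call into the actual training loop) and returns the
    updated model.
    """
    for name, _difficulty, examples in curriculum_order(datasets):
        model = fine_tune(model, examples)
    return model
```

Evaluating "without curriculum learning" then amounts to running the same loop over the datasets in an arbitrary or reversed order and comparing loss curves.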
About
Authors
- David Liu (david_liu@tamu.edu)
- Amarachukwu Nzedibe (amaranzedibe1@tamu.edu)
- Nicole LoGiudice (nicolelogiudice30@tamu.edu)
Organization
- Texas A&M University - College Station
Purpose
Project Structure
Main script:
TinyZeroTry2.py
Executes training and collects loss.
Results are under:
./results
Includes testing loss datapoints and plots.
Requirements
OS and Software requirements
- Ubuntu > 20.0 (or any Debian-based environment)
- Python > 3.9
- MiniConda > 25.0
Hardware requirements
- CPU memory > 8 GB
- GPU memory > 32 GB
Initialization and Set Up
To set up the MiniConda Environment:
conda create -n tinyzero-env python=3.9
conda activate tinyzero-env
To install all packages:
pip install -r freeze.txt
Execution
To run a single round of training:
python TinyZeroTry2.py -m <model> -d <dataset> -p <problems> -o <output directory>
Available models include:
- tinyzero
- tinyzero-1.5
- Any local model saved to the machine
Available datasets include:
- gsm8k
- prm800k
To run training and collect results for problem-set sizes of 50, 100, 250, 500, 750, and 1000:
chmod +x ./curriculum_learning.sh
./curriculum_learning.sh
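A sketch of what a script like curriculum_learning.sh might do, assuming it simply invokes the training script once per problem-set size (the model/dataset flag values shown are examples taken from the lists above). The commands are echoed here rather than executed so the sketch is self-contained.

```shell
#!/bin/sh
# Hypothetical sketch: run training at each problem-set size in
# increasing order, writing all results into ./results.
for n in 50 100 250 500 750 1000; do
  # Replace 'echo' with direct execution to actually train.
  echo "python TinyZeroTry2.py -m tinyzero -d gsm8k -p $n -o ./results"
done
```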
Owner
- Login: davidliu2024
- Kind: user
- Repositories: 2
- Profile: https://github.com/davidliu2024
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "L."
    given-names: "David"
  - family-names: "N."
    given-names: "Amara"
  - family-names: "L."
    given-names: "Nicole"
title: "ECEN743-TinyZero-SFT"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2025-05-01
url: "https://github.com/davidliu2024/ECEN743-TinyZero-SFT"