https://github.com/albertzhaoca/e1

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
○
DOI references
○
Academic publication links
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (5.7%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: AlbertZhaoCA
License: apache-2.0
Language: Python
Default Branch: main
Size: 5.22 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed 12 months ago

Metadata Files

Readme Contributing License Code of conduct

README.md

AlgoForge: Specializing Code Generation Agents through Collaborative Reinforcement Learning

Requirements

Software Requirements

Python 3.9+
transformers>=4.51.0
flash-attn>=2.4.3
vllm>=0.8.3

1. Train the Planner Agent

Start the vLLM engine Update the model name, URL and port in examples/reward_function/higher-order.py as needed, and verify the settings in examples/higher-order.yaml.

bash python -m vllm.entrypoints.openai.api_server \ --model path-to-your-model \ --host 0.0.0.0 \ --port 8000 \ --tensor-parallel-size 4 \ --trust-remote-code

Launch the training script

bash bash examples/higher-order.sh

2. Train the Coder Agent

Generate the dataset Run agent2_dataset.py, then update the dataset path in either:

examples/codegenwc.yaml
examples/codegenwoc.yaml

Launch the training script

bash bash examples/codewoc.sh

3. Merge Checkpoint to Hugging Face Format

Once training is complete, convert your checkpoint:

bash python3 scripts/model_merger.py \ --local_dir to-the-actor-path-of-your-model-chekcpoint

Now you’re all set to run both agents end‑to‑end!

Owner

Name: Albert Zhao
Login: AlbertZhaoCA
Kind: user

Repositories: 1
Profile: https://github.com/AlbertZhaoCA

A CS sophomore at Kean University Interest in HCI, NLP, and software architecture

GitHub Events

Total

Push event: 6
Create event: 2

Last Year

Push event: 6
Create event: 2

Dependencies

.github/workflows/tests.yml actions

actions/checkout v4 composite
actions/setup-python v5 composite

Dockerfile docker

nvcr.io/nvidia/pytorch 24.08-py3 build

.github/requirements-test.txt pypi

codetiming * test
datasets * test
pillow * test
pytest * test
ray * test
ruff * test
tensordict * test
torch * test
torchvision * test
transformers * test

pyproject.toml pypi

requirements.txt pypi

accelerate *
codetiming *
datasets *
flash-attn >=2.4.3
liger-kernel *
mathruler *
numpy *
omegaconf *
pandas *
peft *
pillow *
pyarrow >=15.0.0
pylatexenc *
qwen-vl-utils *
ray *
tensordict *
torchdata *
transformers >=4.51.0,<4.53.0
vllm >=0.8.0
wandb *

setup.py pypi

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

https://github.com/albertzhaoca/e1

Science Score: 26.0%

Repository

Basic Info

Statistics

Metadata Files

README.md

AlgoForge: Specializing Code Generation Agents through Collaborative Reinforcement Learning

Requirements

Software Requirements

1. Train the Planner Agent

2. Train the Coder Agent

3. Merge Checkpoint to Hugging Face Format

Owner

GitHub Events

Total

Last Year

Dependencies

https://github.com/albertzhaoca/e1

Science Score: 26.0%

Repository

Basic Info

Statistics

Metadata Files

README.md

AlgoForge: Specializing Code Generation Agents through Collaborative Reinforcement Learning

Requirements

Software Requirements

1. Train the Planner Agent

2. Train the Coder Agent

3. Merge Checkpoint to Hugging Face Format

Owner

GitHub Events

Total

Last Year

Dependencies

3. Merge Checkpoint to Hugging Face Format