https://github.com/albertzhaoca/e1
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (5.7%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: AlbertZhaoCA
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 5.22 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Created about 1 year ago · Last pushed 12 months ago
Metadata Files
Readme
Contributing
License
Code of conduct
README.md
AlgoForge: Specializing Code Generation Agents through Collaborative Reinforcement Learning
Requirements
Software Requirements
- Python 3.9+
- transformers>=4.51.0
- flash-attn>=2.4.3
- vllm>=0.8.3
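As a quick sanity check before training, the minimum versions listed above can be compared against what is installed. The following is a minimal sketch, not part of the repository: the helper names `version_tuple` and `check_requirements` are hypothetical, and the comparison only looks at the leading numeric components, so pre-release and local version suffixes are ignored.

```python
import sys
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUMS = {
    "transformers": "4.51.0",
    "flash-attn": "2.4.3",
    "vllm": "0.8.3",
}


def version_tuple(version: str) -> tuple:
    """Turn '4.51.0' or '0.8.3.post1' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["All listed requirements look satisfied."]:
        print(line)
```

Running the script prints one line per missing or outdated package, or a single confirmation line if nothing is flagged.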
1. Train the Planner Agent
- Start the vLLM engine
Update the model name, URL, and port in examples/reward_function/higher-order.py as needed, and verify the settings in examples/higher-order.yaml.
bash
python -m vllm.entrypoints.openai.api_server \
--model path-to-your-model \
--host 0.0.0.0 \
--port 8000 \
--tensor-parallel-size 4 \
--trust-remote-code
- Launch the training script
bash
bash examples/higher-order.sh
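Because the reward function talks to the vLLM server, it may be worth confirming that the server is reachable before launching the training script above. The sketch below queries the `/v1/models` route of vLLM's OpenAI-compatible server using only the standard library. The host and port defaults mirror the launch command and are assumptions about your setup; the helper names are hypothetical and not part of the repository.

```python
import json
import urllib.request


def models_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the URL of the OpenAI-compatible model listing endpoint."""
    return f"http://{host}:{port}/v1/models"


def extract_model_ids(payload: dict) -> list:
    """Pull the model identifiers out of a /v1/models JSON response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(host: str = "localhost", port: int = 8000, timeout: float = 5.0) -> list:
    """Ask the running server which models it serves."""
    with urllib.request.urlopen(models_url(host, port), timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))


if __name__ == "__main__":
    try:
        print("Served models:", list_served_models())
    except OSError as exc:  # covers connection errors and timeouts
        print("Could not reach the vLLM server:", exc)
```

The model name reported here should match the one configured in the reward function; if the request fails, check the host, port, and whether the server has finished loading.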
2. Train the Coder Agent
- Generate the dataset
Run agent2_dataset.py, then update the dataset path in either examples/codegenwc.yaml or examples/codegenwoc.yaml.
- Launch the training script
bash
bash examples/codewoc.sh
3. Merge Checkpoint to Hugging Face Format
Once training is complete, convert your checkpoint:
bash
python3 scripts/model_merger.py \
--local_dir to-the-actor-path-of-your-model-checkpoint
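After merging, a rough way to confirm the result looks like a Hugging Face checkpoint is to check for a `config.json` and at least one weight file. This is only a sketch under assumptions: the README does not say where `model_merger.py` writes its output, so the directory is passed in by the caller, and `looks_like_hf_checkpoint` is a hypothetical helper rather than part of the repository.

```python
from pathlib import Path


def looks_like_hf_checkpoint(directory) -> bool:
    """Heuristic check: a config.json plus at least one weight file."""
    path = Path(directory)
    if not (path / "config.json").is_file():
        return False
    # Hugging Face checkpoints usually store weights as *.safetensors or *.bin.
    return any(path.glob("*.safetensors")) or any(path.glob("*.bin"))


if __name__ == "__main__":
    import sys

    # The target directory is an assumption; pass wherever the merger wrote its output.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{target}: {'looks OK' if looks_like_hf_checkpoint(target) else 'incomplete'}")
```

The check is deliberately shallow: it does not load the weights or validate the config, so a passing result only means the expected files are present.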
Now you’re all set to run both agents end‑to‑end!
Owner
- Name: Albert Zhao
- Login: AlbertZhaoCA
- Kind: user
- Repositories: 1
- Profile: https://github.com/AlbertZhaoCA
A CS sophomore at Kean University. Interested in HCI, NLP, and software architecture.
GitHub Events
Total
- Push event: 6
- Create event: 2
Last Year
- Push event: 6
- Create event: 2
Dependencies
.github/workflows/tests.yml
actions
- actions/checkout v4 composite
- actions/setup-python v5 composite
Dockerfile
docker
- nvcr.io/nvidia/pytorch 24.08-py3 build
.github/requirements-test.txt
pypi
- codetiming * test
- datasets * test
- pillow * test
- pytest * test
- ray * test
- ruff * test
- tensordict * test
- torch * test
- torchvision * test
- transformers * test
pyproject.toml
pypi
requirements.txt
pypi
- accelerate *
- codetiming *
- datasets *
- flash-attn >=2.4.3
- liger-kernel *
- mathruler *
- numpy *
- omegaconf *
- pandas *
- peft *
- pillow *
- pyarrow >=15.0.0
- pylatexenc *
- qwen-vl-utils *
- ray *
- tensordict *
- torchdata *
- transformers >=4.51.0,<4.53.0
- vllm >=0.8.0
- wandb *
setup.py
pypi