https://github.com/bigcode-project/astraios

Astraios: Parameter-Efficient Instruction Tuning Code Language Models

Basic Info
Statistics
  • Stars: 49
  • Watchers: 4
  • Forks: 2
  • Open Issues: 1
  • Releases: 0
Created over 2 years ago · Last pushed about 2 years ago

This repository provides an overview of all components from the paper Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models.

Overview

| Category | Name | Description |
|---|---|---|
| Data | CommitPackFT+OASST | Filtered version of CommitPack and OASST for high-quality commit messages that resemble instructions |
| Model | Astraios-1B | Collection of StarCoderBase-1B models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-3B | Collection of StarCoderBase-3B (3B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-7B | Collection of StarCoderBase-7B (7B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-16B | Collection of StarCoderBase-16B (16B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| Evaluation | BigCloneBench | Dataset for clone detection; we use 2,000 samples for evaluation |
| | Devign | Dataset for defect detection; we use 2,000 samples for evaluation |
| | HumanEvalPack | Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages |
| | ReCode | Dataset for the robustness of code generation, covering 4 variants |
| | Asleep At The Keyboard | Datasets for security of code generation; we use DoW for evaluation |

PEFT

Setup: Run the bash code below to set up the PEFT methods used in our work. We additionally implement the AdapterH, AdapterP, and Parallel methods on top of peft==0.6.0.dev0. For more information, please refer to the peft folder.

    pip install git+https://github.com/bigcode-project/astraios#subdirectory=peft

Notes:

  • As Prefix Tuning does not work for StarCoder training, we do not evaluate this method.
  • For any configuration issues, please refer to the original PEFT library.
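
Once the library is installed, attaching one of the released adapters to its base model might look like the sketch below. It is an illustration rather than the repository's own loading code: the function name `load_astraios_lora` is hypothetical, the model IDs are the ones used in the evaluation commands of this README, and it assumes that the LoRA checkpoint loads through the standard `PeftModel.from_pretrained` call and that you have access to the StarCoderBase weights on the Hugging Face Hub.

```python
def load_astraios_lora(base_id="bigcode/starcoderbase-1b",
                       adapter_id="bigcode/astraios-1b-lora"):
    """Load a base model and attach a LoRA adapter (illustrative sketch)."""
    # Imported lazily so that defining this helper does not require the
    # libraries to be installed or any weights to be downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights (assumed LoRA checkpoint).
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_astraios_lora()` downloads both checkpoints, so it is best run on a machine that already has Hub credentials configured.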

Evaluation

  1. Setup: Run the bash code below to set up the evaluation repository.

    git clone -b astraios https://github.com/bigcode-project/bigcode-evaluation-harness
    cd bigcode-evaluation-harness
    pip install -q -r requirements.txt
    accelerate config

  2. Run: All evaluation scripts are in the evaluation folder. Run each script via bash.

Using astraios-1b-lora as an example, the following bash commands run each task:

  • Clone Detection

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks clone_detection \
          --do_sample False \
          --batch_size 1 \
          --save_generations \
          --trust_remote_code \
          --save_generations_path generations_clone_detection_astraios-1b-lora.json \
          --max_length_generation 512

  • Defect Detection

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks defect_detection \
          --do_sample False \
          --batch_size 1 \
          --save_generations \
          --trust_remote_code \
          --save_generations_path generations_defect_detection_astraios-1b-lora.json \
          --max_length_generation 512

  • HumanEvalSynthesize-Python

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks humanevalsynthesize-python \
          --do_sample True \
          --temperature 0.2 \
          --n_samples 20 \
          --batch_size 5 \
          --allow_code_execution \
          --save_generations \
          --trust_remote_code \
          --prompt octocoder \
          --save_generations_path generations_humanevalsynthesizepython_astraios-1b-lora.json \
          --metric_output_path evaluation_humanevalsynthesizepython_astraios-1b-lora.json \
          --max_length_generation 2048

  • HumanEvalFix-Python

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks humanevalfixtests-python \
          --do_sample True \
          --temperature 0.2 \
          --n_samples 20 \
          --batch_size 1 \
          --allow_code_execution \
          --save_generations \
          --trust_remote_code \
          --prompt octocoder \
          --save_generations_path generations_humanevalfixpython_astraios-1b-lora.json \
          --metric_output_path evaluation_humanevalfixpython_astraios-1b-lora.json \
          --max_length_generation 2048

  • HumanEvalExplain-Python

```bash
# Step 1: generate natural-language descriptions only (no scoring).
accelerate launch main.py \
    --model bigcode/starcoderbase-1b \
    --peft_model bigcode/astraios-1b-lora \
    --tasks humanevalexplaindescribe-python \
    --generation_only \
    --do_sample True \
    --temperature 0.2 \
    --n_samples 20 \
    --batch_size 5 \
    --allow_code_execution \
    --save_generations \
    --trust_remote_code \
    --prompt octocoder \
    --save_generations_path generations_humanevalexplaindescribe-python_astraios-1b-lora.json \
    --max_length_generation 2048

# Step 2: synthesize code from the descriptions saved in step 1 and score it.
accelerate launch main.py \
    --model bigcode/starcoderbase-1b \
    --peft_model bigcode/astraios-1b-lora \
    --tasks humanevalexplainsynthesize-python \
    --do_sample True \
    --temperature 0.2 \
    --n_samples 1 \
    --batch_size 1 \
    --allow_code_execution \
    --save_generations \
    --trust_remote_code \
    --prompt octocoder \
    --load_data_path generations_humanevalexplaindescribe-python_astraios-1b-lora.json \
    --save_generations_path generations_humanevalexplainsynthesize-python_astraios-1b-lora.json \
    --metric_output_path evaluation_humanevalexplainpython_astraios-1b-lora.json \
    --max_length_generation 2048
```

  • ReCode-Format

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks perturbed-humaneval-format-num_seeds_5 \
          --do_sample False \
          --n_samples 1 \
          --batch_size 1 \
          --allow_code_execution \
          --save_generations \
          --trust_remote_code \
          --prompt octocoder \
          --save_generations_path generations_perturbed-humaneval-format-num_seeds_5_astraios-1b-lora.json \
          --metric_output_path evaluation_perturbed-humaneval-format-num_seeds_5_astraios-1b-lora.json \
          --max_length_generation 1024

  • AATK-DoW

      accelerate launch main.py \
          --model bigcode/starcoderbase-1b \
          --peft_model bigcode/astraios-1b-lora \
          --tasks asleep_completion \
          --do_sample True \
          --temperature 0.2 \
          --n_samples 20 \
          --batch_size 1 \
          --generation_only \
          --save_generations \
          --trust_remote_code \
          --prompt octocoder \
          --save_generations_path generations_asleep_completion_astraios-1b-lora.json \
          --metric_output_path evaluation_asleep_completion_astraios-1b-lora.json \
          --max_length_generation 1024
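
After a run with --save_generations, it can help to sanity-check the saved file before scoring. The sketch below assumes the generations file is a JSON list with one entry per problem, each entry being a list of sampled completions; that layout is an assumption about the harness output, and `summarize_generations` is a hypothetical helper, not part of the evaluation harness.

```python
import json
import tempfile
from pathlib import Path


def summarize_generations(path):
    """Count problems and samples per problem in a saved generations file.

    Assumes the file holds a JSON list of lists of strings.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    counts = [len(samples) for samples in data]
    return {
        "num_problems": len(data),
        "min_samples": min(counts, default=0),
        "max_samples": max(counts, default=0),
    }


# Demo on a tiny hand-made file so the sketch runs without any model output.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "generations_demo.json"
    demo_path.write_text(json.dumps([["a", "b"], ["c", "d"]]), encoding="utf-8")
    demo_summary = summarize_generations(demo_path)

print(demo_summary)
```

For a run with --n_samples 20, both sample counts should come out as 20 if every problem was generated in full.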

Note:

  • When evaluating FFT models, remove --peft_model and pass the FFT model name via --model, e.g.:

      accelerate launch main.py \
          --model bigcode/astraios-1b-fft \
          --tasks clone_detection \
          --do_sample False \
          --batch_size 1 \
          --save_generations \
          --trust_remote_code \
          --save_generations_path generations_clone_detection_astraios-1b-fft.json \
          --max_length_generation 512

  • The evaluation notebook for Clone Detection and Defect Detection is stored in evaluation/evalcodecomprehension.ipynb.
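
The PEFT and FFT invocations differ only in how the model is passed, which a small wrapper can capture. This is a sketch under the assumption that you script the runs from Python; `build_eval_command` is a hypothetical helper, and the flags it emits are copied from the clone-detection commands in this README.

```python
import shlex


def build_eval_command(task, model_name, base_model=None):
    """Build the argument list for one harness run.

    If base_model is given, model_name is treated as a PEFT adapter and is
    passed via --peft_model; otherwise model_name is passed via --model (FFT).
    """
    cmd = ["accelerate", "launch", "main.py"]
    if base_model is not None:
        cmd += ["--model", base_model, "--peft_model", model_name]
    else:
        cmd += ["--model", model_name]
    short_name = model_name.split("/")[-1]
    cmd += [
        "--tasks", task,
        "--do_sample", "False",
        "--batch_size", "1",
        "--save_generations",
        "--trust_remote_code",
        "--save_generations_path", f"generations_{task}_{short_name}.json",
        "--max_length_generation", "512",
    ]
    return cmd


peft_cmd = build_eval_command(
    "clone_detection", "bigcode/astraios-1b-lora",
    base_model="bigcode/starcoderbase-1b",
)
fft_cmd = build_eval_command("clone_detection", "bigcode/astraios-1b-fft")

# Print shell-ready command lines for inspection.
print(shlex.join(peft_cmd))
print(shlex.join(fft_cmd))
```

The returned lists can be handed to `subprocess.run` from inside the bigcode-evaluation-harness checkout.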

Training

PEFT

The fine-tuning Python script is finetune.py. The corresponding PEFT configurations are stored in peft_config.py. To train all models with PEFT, run:

    sh run_peft.sh

Note: --gradient_accumulation_steps 32 is for a single GPU. If the model is trained with 8 GPUs, set --gradient_accumulation_steps to 4 so that the effective batch size stays the same.

FFT

To train all models with FFT, run:

    sh run_fft.sh

Note: --gradient_accumulation_steps 32 is for a single GPU. If the model is trained with 8 GPUs, set --gradient_accumulation_steps to 4 so that the effective batch size stays the same.
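
The adjustment in the notes above keeps the product of GPU count and accumulation steps constant (1 × 32 = 8 × 4). A minimal sketch of that arithmetic follows; the function names are hypothetical, and it assumes the per-device batch size is left unchanged when GPUs are added.

```python
def effective_batch_size(per_device_batch, num_gpus, accumulation_steps):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch * num_gpus * accumulation_steps


def scaled_accumulation_steps(single_gpu_steps, num_gpus):
    """Accumulation steps that preserve the single-GPU effective batch size."""
    if single_gpu_steps % num_gpus != 0:
        raise ValueError("single_gpu_steps must be divisible by num_gpus")
    return single_gpu_steps // num_gpus


steps_for_8_gpus = scaled_accumulation_steps(32, 8)
print(steps_for_8_gpus)

# The effective batch size matches for any fixed per-device batch size
# (1 is used here only as a placeholder value).
same = effective_batch_size(1, 1, 32) == effective_batch_size(1, 8, steps_for_8_gpus)
print(same)
```

For GPU counts that do not divide 32 evenly, the helper raises an error rather than silently changing the effective batch size.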

Outputs

Outputs are stored under their corresponding subfolders in the outputs folder.

Note: Some outputs may be missing because they were not saved initially.

Visuals

Figures:

  • All figures are created via this colab notebook.

  • The figure of Astraios was generated via DALL-E with the prompt: "A fantasy-inspired, adorable illustration of Astraios, the Greek god of dusk. The setting is a serene evening landscape with a gradient sky transition."

Licenses

Everything is licensed as permissively as we are able to.

The evaluation repository Code Generation LM Evaluation Harness is licensed under the Apache-2.0 license.

The PEFT library is licensed under the Apache-2.0 license.

All Astraios models are licensed under the same license as StarCoder (commercial use is permitted, except for use cases deemed harmful).

The remaining code originally created in this repository is licensed under the MIT License.

Todo List

  • [ ] Organize the file names in outputs/ folder.
  • [ ] Organize the bash script in evaluation/ folder.
  • [ ] Merge the PR to bigcode-evaluation-harness.

Citation

    @article{zhuo2024astraios,
      title={Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models},
      author={Terry Yue Zhuo and Armel Zebaze and Nitchakarn Suppattarachai and Leandro von Werra and Harm de Vries and Qian Liu and Niklas Muennighoff},
      journal={https://arxiv.org/abs/2401.00788},
      year={2024}
    }

Owner

  • Name: BigCode Project
  • Login: bigcode-project
  • Kind: organization
  • Email: contact@bigcode-project.org

BigCode Project is an open scientific collaboration run by Hugging Face and ServiceNow Research, focused on open and responsible development of LLMs for code.
