canitedit
Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions
Science Score: 62.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ✓ Institutional organization owner: organization nuprl has institutional domain (www.ccs.neu.edu)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 47
- Watchers: 5
- Forks: 7
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions
CanItEdit is a benchmark for evaluating LLMs on instructional code editing, the task of
updating a program given a natural language instruction. The benchmark contains 105
hand-crafted Python programs with before and after code blocks,
two types of natural language instructions (descriptive and lazy), and a hidden test suite.
See our paper for more.
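To make the shape of a benchmark item concrete, the sketch below models one task as a small dataclass. This is an illustration only, not the dataset's actual schema — the field names here (`before`, `after`, `instruction_descriptive`, `instruction_lazy`, `tests`) are assumptions; consult the nuprl/CanItEdit dataset card for the real column names.

```python
from dataclasses import dataclass

# Hypothetical layout for one CanItEdit item; actual field names are
# defined by the nuprl/CanItEdit dataset card, not by this sketch.
@dataclass
class EditTask:
    before: str                  # original Python program
    after: str                   # reference edited program
    instruction_descriptive: str # detailed natural-language instruction
    instruction_lazy: str        # terse, informal instruction
    tests: str                   # hidden test suite used for grading

# A toy item in the same spirit as the benchmark's before/after pairs.
task = EditTask(
    before="def mean(xs):\n    return sum(xs)\n",
    after="def mean(xs):\n    return sum(xs) / len(xs)\n",
    instruction_descriptive="Change mean to divide the sum by the length of xs.",
    instruction_lazy="make it an actual mean",
    tests="assert mean([2, 4]) == 3\n",
)
```

The two instruction styles probe the same edit at different levels of specificity: a model should succeed from either the descriptive or the lazy phrasing.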
This repository provides code for evaluating models on the benchmark, as well as the code to reproduce EditPackFT and EditCoder, a dataset and an LLM built for instructional code editing.
The CanItEdit benchmark dataset, EditCoder model, and EditPackFT dataset can be found on HuggingFace:
- CanItEdit: https://huggingface.co/datasets/nuprl/CanItEdit
- EditCoder: https://huggingface.co/nuprl/EditCoder-6.7b-v1
- EditPackFT: https://huggingface.co/datasets/nuprl/EditPackFT
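Grading against a hidden test suite amounts to running the model's edited program together with the suite and checking for failures. The function below is a rough illustration of that idea, not the repository's actual harness (see `./benchmark` for the real evaluation code); the name `passes_tests` is hypothetical.

```python
# Minimal sketch of test-suite-based grading (hypothetical; the real
# evaluation harness lives in ./benchmark of this repository).

def passes_tests(edited_program: str, test_suite: str) -> bool:
    """Return True if the edited program satisfies the test suite."""
    namespace: dict = {}
    try:
        exec(edited_program, namespace)  # define the edited code
        exec(test_suite, namespace)      # run the hidden assertions
        return True
    except Exception:
        return False

# Toy example: the instruction asked for add() to return the sum.
before = "def add(a, b):\n    return a - b\n"
after = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

print(passes_tests(before, tests))  # False: the unedited version fails
print(passes_tests(after, tests))   # True: the edited version passes
```

In practice a sandbox rather than bare `exec` would be used to run untrusted model output, but the pass/fail contract is the same.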
Cloning the repository
It is very important to clone this repository and initialize all submodules recursively. This can be done with the following command:

```bash
git clone --recurse-submodules https://github.com/nuprl/CanItEdit
```
Structure
- `./benchmark` contains the CanItEdit benchmark dataset and code for generating and evaluating completions
- `./editcoder` contains code to train an EditCoder model
- `./editpackft` contains code to reproduce the EditPackFT dataset
- `./requirements.txt` contains the requirements for running the code in this repository
Citation
If you use this code or the CanItEdit benchmark, please cite our paper:
```bibtex
@inproceedings{cassano:canitedit,
  title={Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions},
  author={Federico Cassano and Luisa Li and Akul Sethi and Noah Shinn and Abby Brennan-Jones and Anton Lozhkov and Carolyn Jane Anderson and Arjun Guha},
  booktitle={Conference on Language Modeling (COLM)},
  year={2024},
}
```
Owner
- Name: Northeastern University Programming Research Lab
- Login: nuprl
- Kind: organization
- Location: Boston
- Website: http://www.ccs.neu.edu/research/prl/
- Repositories: 39
- Profile: https://github.com/nuprl
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Cassano"
    given-names: "Federico"
  - family-names: "Li"
    given-names: "Luisa"
  - family-names: "Sethi"
    given-names: "Akul"
  - family-names: "Shinn"
    given-names: "Noah"
  - family-names: "Brennan-Jones"
    given-names: "Abby"
  - family-names: "Lozhkov"
    given-names: "Anton"
  - family-names: "Anderson"
    given-names: "Carolyn Jane"
  - family-names: "Guha"
    given-names: "Arjun"
title: "Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions"
version: 1.0.0
date-released: 2024
url: "https://github.com/nuprl/CanItEdit"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Cassano"
      given-names: "Federico"
    - family-names: "Li"
      given-names: "Luisa"
    - family-names: "Sethi"
      given-names: "Akul"
    - family-names: "Shinn"
      given-names: "Noah"
    - family-names: "Brennan-Jones"
      given-names: "Abby"
    - family-names: "Lozhkov"
      given-names: "Anton"
    - family-names: "Anderson"
      given-names: "Carolyn Jane"
    - family-names: "Guha"
      given-names: "Arjun"
  title: "Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions"
  year: 2024
  conference:
    name: "Conference on Language Modeling (COLM)"
```
GitHub Events
Total
- Commit comment event: 1
- Issues event: 4
- Watch event: 7
- Issue comment event: 1
- Push event: 5
- Fork event: 4
Last Year
- Commit comment event: 1
- Issues event: 4
- Watch event: 7
- Issue comment event: 1
- Push event: 5
- Fork event: 4
Dependencies
- ubuntu 22.04 build
- coverage ==7.3.2
- pandas ==2.0.2
- torch ==2.1.0
- z3-solver ==4.12.2.0
- accelerate ==0.24.1
- bitsandbytes ==0.41.0
- datasets ==2.15.0
- deepspeed ==0.12.3
- editdistance ==0.6.2
- huggingface-hub ==0.19.4
- openai ==1.2.0
- peft ==0.4.0
- ray ==2.8.0
- rouge-rs ==0.1.0
- scikit-learn ==1.3.0
- tokenizers ==0.15.0
- tqdm ==4.65.0
- transformers ==4.35.2
- vllm ==0.2.2
- wandb ==0.15.4
- wordcloud ==1.9.2