
Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions

https://github.com/nuprl/canitedit

Science Score: 62.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
    Organization nuprl has institutional domain (www.ccs.neu.edu)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.1%) to scientific vocabulary

Repository

Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions

Basic Info
  • Host: GitHub
  • Owner: nuprl
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 235 KB
Statistics
  • Stars: 47
  • Watchers: 5
  • Forks: 7
  • Open Issues: 3
  • Releases: 0
Created about 2 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions

CanItEdit is a benchmark for evaluating LLMs on instructional code editing, the task of updating a program given a natural language instruction. The benchmark contains 105 hand-crafted Python programs with before and after code blocks, two types of natural language instructions (descriptive and lazy), and a hidden test suite.
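To illustrate the task format, here is a minimal, self-contained sketch of how one benchmark item might be checked against its hidden tests. The field names (`before`, `after`, `instruction_descriptive`, `instruction_lazy`, `tests`) and the sample program are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical sketch of a CanItEdit-style item; field names and contents
# are illustrative assumptions, not the real dataset schema.
item = {
    "before": "def add(a, b):\n    return a - b\n",
    "instruction_descriptive": (
        "The add function subtracts instead of adding; "
        "change the operator so it returns a + b."
    ),
    "instruction_lazy": "fix add",
    "after": "def add(a, b):\n    return a + b\n",
    # Hidden test suite: raises AssertionError if the edit is wrong.
    "tests": "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n",
}

def passes_hidden_tests(program: str, tests: str) -> bool:
    """Run the candidate program, then the hidden tests, in one namespace."""
    namespace = {}
    try:
        exec(program, namespace)   # defines add()
        exec(tests, namespace)     # assertions over add()
        return True
    except Exception:
        return False

# The unedited program fails the hidden tests; the reference edit passes.
print(passes_hidden_tests(item["before"], item["tests"]))  # False
print(passes_hidden_tests(item["after"], item["tests"]))   # True
```

A real harness would substitute a model-generated edit for `item["after"]` and aggregate pass rates over all 105 problems.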

See our paper for more details.

This repository provides code for evaluating models on the benchmark, as well as code to reproduce EditPackFT and EditCoder, a dataset and an LLM built for instructional code editing.

The CanItEdit benchmark dataset, EditCoder model, and EditPackFT dataset can be found on HuggingFace:

  • CanItEdit: https://huggingface.co/datasets/nuprl/CanItEdit
  • EditCoder: https://huggingface.co/nuprl/EditCoder-6.7b-v1
  • EditPackFT: https://huggingface.co/datasets/nuprl/EditPackFT

Cloning the repository

It is very important to clone this repository and initialize all submodules recursively. This can be done with the following command:

```bash
git clone --recurse-submodules https://github.com/nuprl/CanItEdit
```
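If the repository was already cloned without `--recurse-submodules`, the submodules can still be fetched afterwards with git's standard fallback (not specific to this project):

```shell
git submodule update --init --recursive
```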

Structure

  • ./benchmark contains the CanItEdit benchmark dataset and code for generating and evaluating completions
  • ./editcoder contains code to train an EditCoder model
  • ./editpackft contains code to reproduce the EditPackFT dataset
  • ./requirements.txt contains the requirements for running the code in this repository

Citation

If you use this code or the CanItEdit benchmark, please cite our paper:

@inproceedings{cassano:canitedit,
  title={Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions},
  author={Federico Cassano and Luisa Li and Akul Sethi and Noah Shinn and Abby Brennan-Jones and Anton Lozhkov and Carolyn Jane Anderson and Arjun Guha},
  booktitle={Conference on Language Modeling (COLM)},
  year={2024},
}

Owner

  • Name: Northeastern University Programming Research Lab
  • Login: nuprl
  • Kind: organization
  • Location: Boston

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Cassano"
    given-names: "Federico"
  - family-names: "Li"
    given-names: "Luisa"
  - family-names: "Sethi"
    given-names: "Akul"
  - family-names: "Shinn"
    given-names: "Noah"
  - family-names: "Brennan-Jones"
    given-names: "Abby"
  - family-names: "Lozhkov"
    given-names: "Anton"
  - family-names: "Anderson"
    given-names: "Carolyn Jane"
  - family-names: "Guha"
    given-names: "Arjun"
title: "Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions"
version: 1.0.0
date-released: 2024
url: "https://github.com/nuprl/CanItEdit"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Cassano"
      given-names: "Federico"
    - family-names: "Li"
      given-names: "Luisa"
    - family-names: "Sethi"
      given-names: "Akul"
    - family-names: "Shinn"
      given-names: "Noah"
    - family-names: "Brennan-Jones"
      given-names: "Abby"
    - family-names: "Lozhkov"
      given-names: "Anton"
    - family-names: "Anderson"
      given-names: "Carolyn Jane"
    - family-names: "Guha"
      given-names: "Arjun"
  title: "Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions"
  year: 2024
  conference:
    name: "Conference on Language Modeling (COLM)"

GitHub Events

Total
  • Commit comment event: 1
  • Issues event: 4
  • Watch event: 7
  • Issue comment event: 1
  • Push event: 5
  • Fork event: 4
Last Year
  • Commit comment event: 1
  • Issues event: 4
  • Watch event: 7
  • Issue comment event: 1
  • Push event: 5
  • Fork event: 4

Dependencies

benchmark/Dockerfile docker
  • ubuntu 22.04 build
benchmark/requirements.txt pypi
  • coverage ==7.3.2
  • pandas ==2.0.2
  • torch ==2.1.0
  • z3-solver ==4.12.2.0
requirements.txt pypi
  • accelerate ==0.24.1
  • bitsandbytes ==0.41.0
  • datasets ==2.15.0
  • deepspeed ==0.12.3
  • editdistance ==0.6.2
  • huggingface-hub ==0.19.4
  • openai ==1.2.0
  • peft ==0.4.0
  • ray ==2.8.0
  • rouge-rs ==0.1.0
  • scikit-learn ==1.3.0
  • tokenizers ==0.15.0
  • torch ==2.1.0
  • tqdm ==4.65.0
  • transformers ==4.35.2
  • vllm ==0.2.2
  • wandb ==0.15.4
  • wordcloud ==1.9.2