pie-perf

Training language models to make programs faster

https://github.com/madaan/pie-perf

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

code-generation code-optimization llms optimization software-engineering
Last synced: 6 months ago

Repository

Training language models to make programs faster

Basic Info
  • Host: GitHub
  • Owner: madaan
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: https://pie4perf.com
  • Size: 479 MB
Statistics
  • Stars: 87
  • Watchers: 6
  • Forks: 14
  • Open Issues: 6
  • Releases: 0
Topics
code-generation code-optimization llms optimization software-engineering
Created about 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme Citation

README.md

Learning Performance-Improving Code Edits

  • Repository for Learning Performance-Improving Code Edits (paper, website).


Updates

[May 2023] A large number of problem statements in CodeNet were in Japanese. We have translated them to English using ChatGPT/GPT-4. The files are located here.

Dataset

  • PIE is based on IBM CodeNet. Huge thanks to the authors of CodeNet for making their curated dataset available!

  • All trajectories (tsv) are located here. Column descriptions:

  • user_id: user id

  • problem_id: problem id. Details about the problems can be found in data/problem_list.csv

  • language: programming language

  • submission_id_v0: submission id of the first version of the code

  • submission_id_v1: submission id of the improved version of the code

  • cpu_time_v0: cpu time of the first version of the code

  • cpu_time_v1: cpu time of the second version of the code. cpu_time_v0 > cpu_time_v1 by at least 1% for all pairs in the dataset. For pairs where the first version was TLE (Time Limit Exceeded), cpu_time_v0 is set to a high sentinel value (e.g. 1000000).

  • memory_v{0,1}: memory used by the code in the two versions. We can also use memory_v0 > memory_v1 to filter out pairs.

  • status_v{0,1}: status of the code in the two versions. status_v0 can be Accepted or Time Limit Exceeded, but status_v1 is always Accepted.

  • improvement_frac: percentage of improvement of the second version of the code with respect to the first version. improvement_frac is always > 0.

  • Python splits

  • C++ splits

Each split file is a JSONL file, with one record per line:

{
  "user_id": "u187233527",
  "problem_id": "p03317",
  "language": "python",
  "submission_id_v0": "s743350482",
  "submission_id_v1": "s961810347",
  "cpu_time_v0": 28.0,
  "cpu_time_v1": 17.0,
  "memory_v0": 3060.0,
  "memory_v1": 3060.0,
  "status_v0": "Accepted",
  "status_v1": "Accepted",
  "improvement_frac": 39.29,
  "input": "N, K = list(map(int, input().split()))\n\nN -= K\n\nans = 1\n\nwhile N > 0:\n\n N -= K - 1\n\n ans += 1\n\nprint(ans)",
  "target": "import math\n\n\n\nn, k = list(map(int, input().split()))\n\nprint((math.ceil((n - 1) / (k - 1))))",
  "code_v0_loc": 7.0,
  "code_v1_loc": 4.0,
  "code_v0_num_chars": 101,
  "code_v1_num_chars": 84,
  "code_v0_no_empty_lines": "N, K = list(map(int, input().split()))\nN -= K\nans = 1\nwhile N > 0:\n N -= K - 1\n ans += 1\nprint(ans)\n",
  "code_v1_no_empty_lines": "import math\n\nn, k = list(map(int, input().split()))\nprint((math.ceil((n - 1) / (k - 1))))\n",
  "code_same": false,
  "relative_loc_diff_percent": 42.8571428571,
  "diff": ["-N, K = list(map(int, input().split()))", "-N -= K", "-ans = 1", "-while N > 0:", "- N -= K - 1", "- ans += 1", "-print(ans)", "+import math", "+", "+n, k = list(map(int, input().split()))", "+print((math.ceil((n - 1) / (k - 1))))"],
  "diff_only_import_comment": false,
  "measured_runtime_v0": 0.045435272,
  "measured_runtime_v1": 0.0459265449,
  "runtime_lift": 0.9893030722
}
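As a sanity check on the record format, here is a minimal sketch (standard library only, with a record trimmed to a few fields for brevity) that parses a JSONL line and recomputes improvement_frac from the two CPU times:

```python
import io
import json

# One record in the format shown above, trimmed to a few fields.
sample = {
    "user_id": "u187233527", "problem_id": "p03317", "language": "python",
    "cpu_time_v0": 28.0, "cpu_time_v1": 17.0,
    "status_v0": "Accepted", "status_v1": "Accepted",
    "improvement_frac": 39.29,
}

# A split file is one JSON object per line; simulate one with StringIO.
f = io.StringIO(json.dumps(sample) + "\n")
records = [json.loads(line) for line in f]

# improvement_frac is the relative CPU-time reduction of v1 over v0.
r = records[0]
frac = 100.0 * (r["cpu_time_v0"] - r["cpu_time_v1"]) / r["cpu_time_v0"]
print(round(frac, 2))  # 39.29, matching the stored improvement_frac
```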

We use src/make_splits.py to create these splits. The exact configuration for creating each split is specified in the folder.

Evaluating Your Method

  • Suppose you have a new method for code optimization, say awesome_optimization. We provide a sandbox for evaluating the generated code: it runs both the input code and the generated code over a set of test cases and reports the performance of each. The workflow is as follows:
  1. Save the generations in a jsonl file with the following fields:

{
  "slow_code_col": "the column name for the input code",
  "model_generated_potentially_faster_code_col": "slow_code_col after applying awesome_optimization. This is the code that will be evaluated. You can also provide a list of different candidates here, and the evaluation will be done for each candidate"
}
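For illustration, a minimal sketch of writing such a generations file. The column names "input" and "generated_answers" match those used in the sample config in this README; the code strings themselves are made up:

```python
import json

# Hypothetical generations: each record pairs the original slow code
# with a list of candidate rewrites produced by awesome_optimization.
generations = [
    {
        "input": "print(sum(range(10**6)))",       # slow_code_col
        "generated_answers": [                      # candidate column
            "print((10**6 - 1) * 10**6 // 2)",
        ],
    },
]

# One JSON object per line, as the evaluation script expects.
with open("awesome_optimization_outputs.jsonl", "w") as f:
    for record in generations:
        f.write(json.dumps(record) + "\n")
```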

  2. Next, we need to provide the path to a file with some metadata. We call it the reference_file, but providing references is optional. The main purpose of this file is to supply information such as the language of the code, the problem id, etc. The file should have slow_code_col (the same column as in the generations file) and problem_id. We join the generations file and the reference file on slow_code_col to get the problem id.

  3. Next, we need to provide the path to the directory with the actual test cases. We call it the inputs_outputs_basepath. It has the following structure:

inputs_outputs_basepath/{problem_id}/{inputs, outputs}.txt

where {inputs, outputs}.txt are the input and output files for the problem with id problem_id. The input and output are plain text files. Each program is fed inputs.txt and the output is compared with outputs.txt.
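To make this contract concrete, here is a hedged sketch (not the actual harness code) of one sandbox run: feed inputs.txt to a Python program on stdin, time it, and compare its stdout against outputs.txt. The function name and the 10-second timeout are illustrative:

```python
import subprocess
import sys
import time

def run_once(program_path, inputs_path, outputs_path, timeout=10):
    """Run one candidate program on inputs.txt; check stdout vs outputs.txt."""
    with open(inputs_path) as fin:
        start = time.perf_counter()
        result = subprocess.run(
            [sys.executable, program_path],
            stdin=fin, capture_output=True, text=True, timeout=timeout,
        )
        elapsed = time.perf_counter() - start
    with open(outputs_path) as fout:
        correct = result.stdout.strip() == fout.read().strip()
    return correct, elapsed
```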

  4. So far, we have discussed the generations file, the reference file, and the inputs/outputs directory. In addition to these, we need to provide some information about the run: the number of times each program should be run, the number of programs to evaluate, the timeout, and so on.

All of this information is wrapped in a yaml file. Here is an example:

model_generated_outputs_path: "data/sample/codex_greedy_outputs.jsonl"
inputs_outputs_basepath: "data/codenet/public_test_cases/"
reference_file_path: "data/sample/py_reference.jsonl"
output_report_file_path: "data/sample/codex_greedy_outputs.jsonl.report"
num_problems_to_evaluate: -1
num_trials: 25
ignore_first_k: 1
max_time_per_run: 10
temp_dir: null
model_generated_potentially_faster_code_col: "generated_answers"
slow_code_col: "input"
reference_code_col: "target"
is_prompt_based: true
cpu_number: 0

Please see src/codenet_eval/evalconfig.py for the full list of parameters and their descriptions.

  5. Finally, we can run the evaluation with the provided script, src/codenet_eval/run_eval.py, which takes the yaml file as input. Here is an example:

python src/codenet_eval/run_eval.py --eval_config data/sample/sample_eval_config.yaml


Citation

@article{madaan2023learning,
  title={Learning Performance-Improving Code Edits},
  author={Madaan, Aman and Shypula, Alexander and Alon, Uri and Hashemi, Milad and Ranganathan, Parthasarathy and Yang, Yiming and Neubig, Graham and Yazdanbakhsh, Amir},
  journal={arXiv preprint arXiv:2302.07867},
  year={2023}
}

Owner

  • Name: Aman Madaan
  • Login: madaan
  • Kind: user
  • Location: Pittsburgh, PA

PhD student at CMU

GitHub Events

Total
  • Watch event: 12
  • Fork event: 1
Last Year
  • Watch event: 12
  • Fork event: 1

Dependencies

requirements.txt pypi
  • black ==22.6.0
  • clang-format *
  • cppclean ==0.13.0
  • joblib ==1.1.0
  • multiprocess *
  • numpy ==1.23.1
  • pandarallel ==1.6.3
  • pandas ==1.4.4
  • psutil ==5.9.2
  • pyyaml *
  • tqdm ==4.49.0
docs/Gemfile rubygems
  • github-pages >= 0 development
  • jekyll-feed ~> 0.6 development
  • eventmachine ~> 1.2
  • kramdown-parser-gfm >= 0
  • minima ~> 2.0
  • tzinfo ~> 1.2
  • tzinfo-data >= 0
docs/Gemfile.lock rubygems
  • activesupport 5.2.8.1
  • addressable 2.8.1
  • bundler 2.2.32
  • coffee-script 2.4.1
  • coffee-script-source 1.11.1
  • colorator 1.1.0
  • commonmarker 0.17.13
  • concurrent-ruby 1.1.10
  • dnsruby 1.61.9
  • em-websocket 0.5.3
  • ethon 0.15.0
  • eventmachine 1.2.7
  • execjs 2.8.1
  • faraday 1.0.1
  • ffi 1.15.5
  • forwardable-extended 2.6.0
  • gemoji 3.0.1
  • github-pages 207
  • github-pages-health-check 1.16.1
  • html-pipeline 2.14.3
  • http_parser.rb 0.8.0
  • i18n 0.9.5
  • jekyll 3.9.0
  • jekyll-avatar 0.7.0
  • jekyll-coffeescript 1.1.1
  • jekyll-commonmark 1.3.1
  • jekyll-commonmark-ghpages 0.1.6
  • jekyll-default-layout 0.1.4
  • jekyll-feed 0.13.0
  • jekyll-gist 1.5.0
  • jekyll-github-metadata 2.13.0
  • jekyll-mentions 1.5.1
  • jekyll-optional-front-matter 0.3.2
  • jekyll-paginate 1.1.0
  • jekyll-readme-index 0.3.0
  • jekyll-redirect-from 0.15.0
  • jekyll-relative-links 0.6.1
  • jekyll-remote-theme 0.4.1
  • jekyll-sass-converter 1.5.2
  • jekyll-seo-tag 2.6.1
  • jekyll-sitemap 1.4.0
  • jekyll-swiss 1.0.0
  • jekyll-theme-architect 0.1.1
  • jekyll-theme-cayman 0.1.1
  • jekyll-theme-dinky 0.1.1
  • jekyll-theme-hacker 0.1.1
  • jekyll-theme-leap-day 0.1.1
  • jekyll-theme-merlot 0.1.1
  • jekyll-theme-midnight 0.1.1
  • jekyll-theme-minimal 0.1.1
  • jekyll-theme-modernist 0.1.1
  • jekyll-theme-primer 0.5.4
  • jekyll-theme-slate 0.1.1
  • jekyll-theme-tactile 0.1.1
  • jekyll-theme-time-machine 0.1.1
  • jekyll-titles-from-headings 0.5.3
  • jekyll-watch 2.2.1
  • jemoji 0.11.1
  • kramdown 2.3.0
  • kramdown-parser-gfm 1.1.0
  • liquid 4.0.3
  • listen 3.5.0
  • mercenary 0.3.6
  • mini_portile2 2.4.0
  • minima 2.5.1
  • minitest 5.15.0
  • multipart-post 2.2.3
  • nokogiri 1.10.10
  • octokit 4.25.1
  • pathutil 0.16.2
  • public_suffix 3.1.1
  • rb-fsevent 0.11.2
  • rb-inotify 0.10.1
  • rexml 3.2.5
  • rouge 3.19.0
  • ruby-enum 0.9.0
  • rubyzip 1.3.0
  • safe_yaml 1.0.5
  • sass 3.7.4
  • sass-listen 4.0.0
  • sawyer 0.9.2
  • simpleidn 0.2.1
  • terminal-table 1.8.0
  • thread_safe 0.3.6
  • typhoeus 1.4.0
  • tzinfo 1.2.10
  • tzinfo-data 1.2022.5
  • unf 0.1.4
  • unf_ext 0.0.8.2
  • unicode-display_width 1.8.0
  • wdm 0.1.1