https://github.com/assert-kth/ghrb

A Repository of Real, Recent Java Bugs

Repository

Basic Info
  • Host: GitHub
  • Owner: ASSERT-KTH
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 702 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of coinse/GHRB
Created over 2 years ago · Last pushed over 2 years ago

https://github.com/ASSERT-KTH/GHRB/blob/main/

# GHRB: GitHub Recent Bugs

GitHub Recent Bugs (GHRB) is a collection of real-world bugs merged _after_
the OpenAI LLM training cutoff point (Sep. 2021), along with a command-line
interface to easily execute code. It was constructed to facilitate
evaluation of LLM-based automated debugging techniques, free from concern
of training data contamination.

## Setting up GHRB

First, build the docker image:
```bash
cd Docker
docker build -t ghrb_framework .
```

Next, create a Docker container:
```bash
sh run_docker_container.sh
```

Inside the container, run the following commands to complete the setup:

```bash
cd /root/framework
chmod +x cli.py

cd debug
python collector.py
cd ..
```

Finally, check that `cli.py` runs correctly:
```bash
./cli.py -h
```

## Using GHRB

(Note that all directory paths given to the tool need to be absolute paths 
within the container.)

 1. `info` - View information about a project or a particular bug.
    *  Example: `./cli.py info -p gson`
 2. `checkout` - Check out the buggy or fixed version of a bug.
    *  Example: `./cli.py checkout -p gson -v 1b -w /root/framework/testing`
 3. `compile` - Compile the code in a directory.
    *  Example: `./cli.py compile -w /root/framework/testing`
 4. `test` - Run the tests for a project.
    *  Example: `./cli.py test -w /root/framework/testing`
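
The subcommands above chain naturally into a checkout, compile, test cycle. The following is a minimal sketch of such a driver, assuming it is run from `/root/framework` inside the container. The helper names `build_commands` and `run_workflow` are hypothetical and not part of GHRB; only the flags shown in the examples above are used.

```python
import subprocess
from typing import List

CLI = "./cli.py"  # assumption: the current directory is /root/framework


def build_commands(project: str, version: str, workdir: str) -> List[List[str]]:
    """Assemble the checkout, compile, and test invocations in order.

    Hypothetical helper: it only builds the argument lists that appear
    in the usage examples and does not call GHRB itself.
    """
    if not workdir.startswith("/"):
        # The tool expects absolute paths within the container.
        raise ValueError(f"workdir must be an absolute path: {workdir!r}")
    return [
        [CLI, "checkout", "-p", project, "-v", version, "-w", workdir],
        [CLI, "compile", "-w", workdir],
        [CLI, "test", "-w", workdir],
    ]


def run_workflow(project: str, version: str, workdir: str) -> None:
    """Run each step in turn; raises if a command exits with a non-zero status."""
    for cmd in build_commands(project, version, workdir):
        subprocess.run(cmd, check=True)
```

Calling `run_workflow("gson", "1b", "/root/framework/testing")` would issue the same three commands as the examples, one after another.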

## Using Fetch/Filter Scripts

Given a file that lists repository URLs, one per line, use
```
python filter_repo.py --repository_list 
```
to collect repository metadata before gathering pull request information. The file passed to `--repository_list` should look like:

```
https://github.com/coinse/GHRB
https://github.com/coinse/libro
...
```

Note that the script automatically filters out non-English repositories and repositories where Java makes up less than 90% of the code according to GitHub language statistics.
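
The Java-share criterion can be illustrated with a short sketch. It assumes language statistics arrive as a mapping from language name to byte count, which is the shape returned by GitHub's repository languages endpoint. The function names and the way the threshold is applied are illustrative assumptions, not the code in `filter_repo.py`.

```python
from typing import Dict

# From the note above: repositories with less than 90% Java are dropped.
JAVA_THRESHOLD = 0.90


def java_share(language_bytes: Dict[str, int]) -> float:
    """Return the fraction of bytes attributed to Java (0.0 if no data)."""
    total = sum(language_bytes.values())
    if total == 0:
        return 0.0
    return language_bytes.get("Java", 0) / total


def passes_java_filter(language_bytes: Dict[str, int],
                       threshold: float = JAVA_THRESHOLD) -> bool:
    """Hypothetical check: keep a repository only if Java meets the threshold."""
    return java_share(language_bytes) >= threshold
```

Under these assumptions, a repository reporting `{"Java": 95, "Shell": 5}` would be kept, while one reporting `{"Java": 80, "Kotlin": 20}` would be filtered out.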

With the pull request information, gather the actual pull request data with:
```bash
python collect_raw_data.py --api_token --repository_file 
```
where the file passed to `--repository_file` should have a format like:
```jsonc
[
  {
    "name": // name of the repo,
    "owner": {
      "login": // owner of the repo
    },
    "url": // full url for git clone
  },
]
```
This file should be included in the repository manually. Note that each metadata item should be inside a list. An example of such a file can be found under the `example/` directory.

## Publication

The companion preprint for our work is here: [http://arxiv.org/abs/2310.13229](http://arxiv.org/abs/2310.13229).

Owner

  • Name: ASSERT
  • Login: ASSERT-KTH
  • Kind: organization
  • Location: Sweden

assertEquals("Research group at KTH Royal Institute of Technology, Stockholm, Sweden", description);
