https://github.com/chains-project/bump

A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)


Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Committers with academic emails
    1 of 9 committers (11.1%) from academic institutions
  • Institutional organization owner
    Organization chains-project has institutional domain (chains.proj.kth.se)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary

Keywords

dependency-analysis java

Last synced: 5 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: chains-project
  • License: mit
  • Language: Java
  • Default Branch: main
  • Homepage:
  • Size: 237 MB
Statistics
  • Stars: 20
  • Watchers: 5
  • Forks: 8
  • Open Issues: 14
  • Releases: 0
Topics
dependency-analysis java
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

BUMP Breaking Updates

Overview

Bump is a benchmark of breaking dependency updates. It can be downloaded from Zenodo. A breaking update is defined as a pair of commits for a Java project, which we designate as the pre-commit and the breaking-commit. Such updates are typically performed by bots such as Dependabot and Renovate. When we build the project with the pre-commit, compilation and test execution are successful, while the build of the breaking-commit fails. Each breaking-commit is a one-line change in the Maven pom file.

If you use Bump, please cite:

    @inproceedings{bump2024,
      title = {BUMP: A Benchmark of Reproducible Breaking Dependency Updates},
      booktitle = {Proceedings of SANER},
      year = {2024},
      doi = {10.1109/SANER60148.2024.00024},
      author = {Frank Reyes and Yogya Gamage and Gabriel Skoglund and Benoit Baudry and Martin Monperrus},
      url = {http://arxiv.org/pdf/2401.09906},
    }

Download BUMP

All breaking updates in Bump are stored within Docker images. They can be downloaded from Zenodo.
To easily download the Zenodo tar file and load the associated Docker images use the following commands:
Warning: You need a minimum of 250 GB of free disk space to load the images.

```bash
$ wget https://zenodo.org/records/10041883/files/bump.tar.gz
$ docker load -i bump.tar.gz
# this loads 1142 images
$ docker images | wc -l
1142

# running a breaking commit
# docker run ghcr.io/chains-project/breaking-updates:<sha>{-pre,-breaking}
$ docker run ghcr.io/chains-project/breaking-updates:5769bdad76925da568294cb8a40e7d4469699ac3-breaking
```
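
The image tag in the example above consists of the breaking commit SHA followed by a `-pre` or `-breaking` suffix. The following is a minimal Python sketch that builds both image references for a given SHA. The helper name `image_refs` is hypothetical, and the tag pattern is inferred from the example command rather than taken from a documented API.

```python
REGISTRY_REPO = "ghcr.io/chains-project/breaking-updates"


def image_refs(breaking_commit_sha: str) -> dict:
    """Return the pre-commit and breaking-commit image references for a SHA.

    Assumption: tags are '<sha>-pre' and '<sha>-breaking', as suggested by
    the example command in this README.
    """
    sha = breaking_commit_sha.strip()
    # Accept only a full, lowercase, 40-character hexadecimal commit SHA.
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"expected a full 40-character hex SHA, got {sha!r}")
    return {
        "pre": f"{REGISTRY_REPO}:{sha}-pre",
        "breaking": f"{REGISTRY_REPO}:{sha}-breaking",
    }


if __name__ == "__main__":
    refs = image_refs("5769bdad76925da568294cb8a40e7d4469699ac3")
    # Print the two commands; running them requires the images to be loaded.
    print("docker run", refs["pre"])
    print("docker run", refs["breaking"])
```

Running the `pre` image is expected to show a passing build and the `breaking` image a failing one, provided the images have been loaded as described above.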

Data format

Gathered data can be found as JSON files in the data folder. There are 4 sub-folders inside the data folder.
* benchmark : contains the successfully reproduced breaking dependency updates.
* in-progress-reproductions : contains the potential breaking updates which have not yet been reproduced.
* sanity-check-failures : contains the data that were removed after the sanity-check procedure.
* unsuccessful-reproductions : contains the data regarding unsuccessful reproduction attempts.

Each file inside these folders is named according to the SHA of the (potential) breaking commit.

The JSON files in our benchmark of breaking dependency updates have the following format.

    {
        "url": "<github pr url>",
        "project": "<github_project>",
        "projectOrganisation": "<github_project_organisation>",
        "breakingCommit": "<sha>",
        "prAuthor": "{human|bot}",
        "preCommitAuthor": "{human|bot}",
        "breakingCommitAuthor": "{human|bot}",
        "updatedDependency": {
            "dependencyGroupID": "<group id>",
            "dependencyArtifactID": "<artifact id>",
            "previousVersion": "<label indicating the previous version of the dependency>",
            "newVersion": "<label indicating the new version of the dependency>",
            "dependencyScope": "{compile|provided|runtime|system|import}",
            "versionUpdateType": "{major|minor|patch|other}",
            "githubCompareLink": "<the github comparison link for the previous and breaking tag releases of the updated dependency if it exists>",
            "mavenSourceLinkPre": "<maven source jar link for the previous release of the updated dependency if it exists>",
            "mavenSourceLinkBreaking": "<maven source jar link for the breaking release of the updated dependency if it exists>",
            "updatedFileType": "{pom|jar}",
            "dependencySection": "{dependencies|dependencyManagement|buildPlugins|buildPluginManagement|profileBuildPlugins}"
        },
        "preCommitReproductionCommand": "<the command to compile and run tests without the breaking update commit>",
        "breakingUpdateReproductionCommand": "<the command to compile and run tests with the breaking update commit>",
        "javaVersionUsedForReproduction": "<the java version used for reproduction>",
        "failureCategory": "<the category of the root cause of the reproduction failure>"
    }
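
As an illustration of how these records might be consumed, the sketch below loads every JSON file from a folder and counts records per failure category and version update type. It only relies on the field names shown in the format above. The function names and the `"unknown"` fallback value are hypothetical, and the `data/benchmark` path assumes the script runs from the repository root.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(folder):
    """Load every *.json file in a folder into a list of dicts."""
    paths = sorted(Path(folder).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]


def summarize(records):
    """Count records per (failureCategory, versionUpdateType) pair."""
    counts = Counter()
    for rec in records:
        dep = rec.get("updatedDependency", {})
        # Missing fields are grouped under "unknown" instead of raising.
        key = (
            rec.get("failureCategory", "unknown"),
            dep.get("versionUpdateType", "unknown"),
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    folder = Path("data/benchmark")  # assumed location, relative to the repo root
    if folder.is_dir():
        summary = summarize(load_records(folder))
        for (category, update_type), n in sorted(summary.items()):
            print(f"{category:32} {update_type:8} {n}")
```

Keeping the loading and the counting separate makes it easy to run the same summary over the other sub-folders of the data folder.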

Workflow

The data gathering workflow is as follows:
* Stage 1 : Collect Java projects which meet the following criteria.
    * builds with Maven,
    * has at least 100 commits on the default branch,
    * created in the last 10 years,
    * has at least 3 contributors,
    * has at least 10 stars.
* Stage 2 : Identify the breaking updates.
* Stage 3 : Reproduce the failure locally under the assumptions documented below.
    * Assumptions:
        * We run Linux (kernel version and distribution to be documented)
        * We use Maven version 3.8.6
        * We run OpenJDK
        * As a starting point, we use Java 11
    * The reproduction can result in different successful outcomes based on the Maven goal where the failure happens. For example,
        * The compilation step fails after the dependency is updated, but not before. This is a successful reproduction corresponding to the label "COMPILATION_FAILURE".
        * The test step fails after the dependency is updated, but not before. This is a successful reproduction corresponding to the label "TEST_FAILURE".
        * The project build fails after the dependency is updated due to unresolved dependencies, but not before. This is a successful reproduction corresponding to the label "DEPENDENCY_RESOLUTION_FAILURE".
        * The project build fails after the dependency is updated due to enforcer rules violations, but not before. This is a successful reproduction corresponding to the label "ENFORCER_FAILURE".
        * The project build fails after the dependency is updated when executing the plugin dependency-lock-maven-plugin, but not before. This is a successful reproduction corresponding to the label "DEPENDENCY_LOCK_FAILURE".
        * The project build fails after the dependency is updated due to the activation of the failOnWarning option in the configuration file. This is a successful reproduction corresponding to the label "WERROR_FAILURE".
* Stage 4 : Build two Docker images for each successfully reproduced breaking update, and isolate all environment / network requests by downloading them.

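
The Stage 1 selection criteria can be expressed as a single predicate. The sketch below is illustrative only: the `Project` fields and function names are hypothetical and do not reflect the BreakingUpdateMiner's actual data model, and "created in the last 10 years" is interpreted as a calendar-year offset from a given reference date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Project:
    # Hypothetical fields; the miner's real data model may differ.
    builds_with_maven: bool
    default_branch_commits: int
    created: date
    contributors: int
    stars: int


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def meets_stage1_criteria(p: Project, today: date) -> bool:
    """Apply the five Stage 1 criteria listed in the workflow."""
    return (
        p.builds_with_maven
        and p.default_branch_commits >= 100
        and p.created >= years_before(today, 10)
        and p.contributors >= 3
        and p.stars >= 10
    )


if __name__ == "__main__":
    candidate = Project(True, 150, date(2018, 5, 1), 4, 25)
    print(meets_stage1_criteria(candidate, date(2024, 1, 1)))
```

Passing the reference date explicitly keeps the check deterministic, which matters when a selection has to be repeated later.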
After stage 4, by running the preCommitReproductionCommand, and the breakingUpdateReproductionCommand, the successful build of the previous commit and the failing build of the breaking commit can be reproduced.
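
The reproduction rule described in the workflow (the pre-commit build passes and the breaking-commit build fails with one of the listed labels) can be sketched as a small check. The function name and its inputs are hypothetical. The label strings are written with underscores, following the `WERROR_FAILURE` spelling; treat the exact strings as an assumption and verify them against the `failureCategory` values in the data files.

```python
from typing import Optional

# Assumed label spellings; verify against the failureCategory values in the data.
FAILURE_LABELS = frozenset({
    "COMPILATION_FAILURE",
    "TEST_FAILURE",
    "DEPENDENCY_RESOLUTION_FAILURE",
    "ENFORCER_FAILURE",
    "DEPENDENCY_LOCK_FAILURE",
    "WERROR_FAILURE",
})


def is_successful_reproduction(pre_commit_passed: bool,
                               breaking_failure: Optional[str]) -> bool:
    """Return True when the pre-commit build passed and the breaking-commit
    build failed with one of the known failure labels.

    `breaking_failure` is None when the breaking-commit build did not fail.
    """
    return pre_commit_passed and breaking_failure in FAILURE_LABELS


if __name__ == "__main__":
    print(is_successful_reproduction(True, "COMPILATION_FAILURE"))
    print(is_successful_reproduction(True, None))
    print(is_successful_reproduction(False, "TEST_FAILURE"))
```

A build that already fails at the pre-commit is rejected, since the failure could not then be attributed to the dependency update.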

Tools

The BreakingUpdateMiner

In order to gather breaking dependency updates from GitHub, a tool called the BreakingUpdateMiner is available.
You can build this tool locally using mvn package with Java 17. You can then run the tool and print usage information with the command:

    java -jar target/BreakingUpdateMiner.jar --help

The BreakingUpdateReproducer

In order to perform local reproduction once potential breaking updates have been found by the miner, a tool called the BreakingUpdateReproducer is available. You can build this tool locally using mvn package with Java 17. You can then run the tool and print usage information with the command:

    java -jar target/BreakingUpdateReproducer.jar --help

Owner

  • Name: CHAINS research project at KTH Royal Institute of Technology
  • Login: chains-project
  • Kind: organization

"Consistent Hardening and Analysis of Software Supply Chains" at KTH, funded by SSF

GitHub Events

Total
  • Watch event: 4
  • Delete event: 20
  • Push event: 85
  • Pull request event: 58
  • Fork event: 2
  • Create event: 34
Last Year
  • Watch event: 4
  • Delete event: 20
  • Push event: 85
  • Pull request event: 58
  • Fork event: 2
  • Create event: 34

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 417
  • Total Committers: 9
  • Avg Commits per committer: 46.333
  • Development Distribution Score (DDS): 0.657
Past Year
  • Commits: 71
  • Committers: 4
  • Avg Commits per committer: 17.75
  • Development Distribution Score (DDS): 0.07
Top Committers
Name Email Commits
renovate[bot] 2****] 143
Gabriel Skoglund g****d@g****m 77
Frank Reyes f****g@k****e 67
Yogya Tulip Gamage 4****e 57
github-actions[bot] 4****] 49
YogyaGamage 4****e 9
Martin Monperrus m****s@g****g 9
frankreyesSC f****s@n****m 3
Lukas l****s@f****h 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 32
  • Total pull requests: 107
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 23 days
  • Total issue authors: 6
  • Total pull request authors: 5
  • Average comments per issue: 1.22
  • Average comments per pull request: 0.24
  • Merged pull requests: 91
  • Bot issues: 1
  • Bot pull requests: 67
Past Year
  • Issues: 0
  • Pull requests: 39
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 months
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 28
  • Bot issues: 0
  • Bot pull requests: 39
Top Authors
Issue Authors
  • monperrus (14)
  • gabrielskoglund (11)
  • yogyagamage (3)
  • renovate[bot] (3)
  • frankreyesgarcia (3)
  • snadi (1)
  • MartinWitt (1)
  • LukvonStrom (1)
Pull Request Authors
  • renovate[bot] (139)
  • yogyagamage (21)
  • gabrielskoglund (7)
  • frankreyesgarcia (5)
  • monperrus (4)
  • LukvonStrom (3)
Top Labels
Issue Labels
enhancement (8) bug (2)

Dependencies

pom.xml maven
  • com.google.code.gson:gson 2.10
  • com.squareup.okhttp3:okhttp 4.10.0
  • info.picocli:picocli-codegen 4.7.0
  • org.kohsuke:github-api 1.313
  • junit:junit 4.13.2 test
  • org.junit.jupiter:junit-jupiter 5.9.1 test