ReproZip
ReproZip: The Reproducibility Packer - Published in JOSS (2016)
Science Score: 100.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file
  Found CITATION.cff file
- ✓ codemeta.json file
  Found codemeta.json file
- ✓ .zenodo.json file
  Found .zenodo.json file
- ✓ DOI references
  Found 3 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links
  Links to: joss.theoj.org
- ✓ Committers with academic emails
  3 of 16 committers (18.8%) from academic institutions
- ✓ Institutional organization owner
  Organization vida-nyu has institutional domain (vida.engineering.nyu.edu)
- ✓ JOSS paper metadata
  Published in Journal of Open Source Software
Repository
ReproZip is a tool that simplifies the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science.
Basic Info
- Host: GitHub
- Owner: VIDA-NYU
- License: bsd-3-clause
- Language: Python
- Default Branch: 1.x
- Homepage: https://www.reprozip.org/
- Size: 19.8 MB
Statistics
- Stars: 344
- Watchers: 17
- Forks: 36
- Open Issues: 77
- Releases: 37
Metadata Files
README.md
ReproZip
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science.
It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in their own environment to reproduce the results (unpacking step).
Quickstart
We have an example repository with a variety of different software. Don't hesitate to check it out, and contribute your own example if you use ReproZip for something new!
Packing
Packing experiments is only available for Linux distributions. In the environment where the experiment is originally executed, first install reprozip:
$ pip install reprozip
Then, run your experiment with reprozip. Suppose you originally run your experiment with the following command:
$ ./myexperiment -my --options inputs/somefile.csv other_file_here.bin
To run it with reprozip, you just need to use the prefix reprozip trace:
$ reprozip trace ./myexperiment -my --options inputs/somefile.csv other_file_here.bin
This command creates a .reprozip-trace directory, in which you'll find the configuration file, named config.yml. You can edit the command line and environment variables, and choose which files to pack.
If you are using Debian or Ubuntu, most of these files (library dependencies) are organized by package. You can add or remove files, or exclude a whole package by changing its packfiles option from true to false. This lets reprozip create smaller packs when space is an issue, since reprounzip can download the excluded files from the package manager. Note that this is currently only available for Debian and Ubuntu, and that the package versions installed at unpacking time might differ. Choosing which files to pack also matters for removing sensitive information and third-party software that is not open source and should not be distributed.
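As a rough illustration of the packfiles switch, the sketch below flips it for one package in a small, hypothetical config fragment. The fragment layout (a packages list whose entries carry name and packfiles keys) and the package names are assumptions for the example; a real config.yml is generated by reprozip trace and contains more fields, so editing it by hand in a text editor remains the safest route.

```shell
# Hypothetical fragment; a real config.yml is written by `reprozip trace`.
config='packages:
  - name: "coreutils"
    packfiles: true
  - name: "python3"
    packfiles: true'

# Package whose files should NOT be packed (illustrative choice).
pkg="coreutils"

# Flip packfiles to false only inside the entry whose name matches $pkg.
edited=$(printf '%s\n' "$config" | awk -v pkg="$pkg" '
  /- name:/ { in_pkg = (index($0, "\"" pkg "\"") > 0) }
  in_pkg && /packfiles: true/ { sub(/packfiles: true/, "packfiles: false") }
  { print }
')

printf '%s\n' "$edited"
```

Only the entry for the chosen package changes; the other package keeps packfiles set to true and is still included in the pack.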
Once done editing the configuration file (or even if you did not change anything), run the following command to create a ReproZip package named my_experiment:
$ reprozip pack my_experiment.rpz
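The two packing steps can be combined in a small wrapper. This is only a sketch: pack_experiment is a hypothetical helper name, not part of ReproZip, and it skips the optional step of editing config.yml between tracing and packing.

```shell
# Sketch: trace a command, then pack it under the given name.
# pack_experiment is a hypothetical helper, not a ReproZip command.
pack_experiment() {
    name=$1
    shift
    # Fail early with a hint if reprozip is not installed.
    if ! command -v reprozip >/dev/null 2>&1; then
        echo "reprozip not found; install it with: pip install reprozip" >&2
        return 127
    fi
    # Trace the experiment, then pack only if tracing succeeded.
    reprozip trace "$@" && reprozip pack "$name.rpz"
}

# Usage (on Linux, with reprozip installed):
#   pack_experiment my_experiment ./myexperiment -my --options inputs/somefile.csv other_file_here.bin
```

If you need to review or trim the configuration file, run the two commands separately instead so you can edit config.yml in between.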
Voilà! Now your experiment has been packed, and you can send it to your collaborators, reviewers, and researchers around the world!
Note that you can open the help message for any reprozip command by using the flag -h.
Unpacking
Do you need to unpack an experiment on a Linux machine? Easy! First, install reprounzip:
$ pip install reprounzip
Then, if you want to unpack everything in a single directory named mydirectory and execute the experiment from there, use the prefix reprounzip directory:
$ reprounzip directory setup my_experiment.rpz mydirectory
$ reprounzip directory run mydirectory
In case you prefer to build a chroot environment under mychroot, use the prefix reprounzip chroot:
$ reprounzip chroot setup my_experiment.rpz mychroot
$ reprounzip chroot run mychroot
Note that the previous options do not interfere with the original configuration of the environment, so don't worry! If you are using Debian or Ubuntu, reprounzip also has an option to install all the library dependencies directly on the machine using package managers (rather than just copying the files from the .rpz package). Be aware that this will interfere with your environment and may update your library packages, so use it at your own risk! For this option, just use the prefix reprounzip installpkgs:
$ reprounzip installpkgs my_experiment.rpz
What if you want to reproduce the experiment in Windows or Mac OS X? You can build a virtual machine with the experiment! Easy as well! First, install the plugin reprounzip-vagrant:
$ pip install reprounzip-vagrant
Note that (i) you must install reprounzip first, and (ii) the plugin requires having Vagrant installed. Then, use the prefix reprounzip vagrant to create and start a virtual machine under directory mytemplate:
$ reprounzip vagrant setup my_experiment.rpz mytemplate
To execute the experiment, simply run:
$ reprounzip vagrant run mytemplate
Alternatively, you may use Docker containers to reproduce the experiment, which also works under Linux, Mac OS X, and Windows! First, install the plugin reprounzip-docker:
$ pip install reprounzip-docker
Then, assuming that you want to create the container under directory mytemplate, simply use the prefix reprounzip docker:
$ reprounzip docker setup my_experiment.rpz mytemplate
$ reprounzip docker run mytemplate
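The choice between unpackers described above can be sketched as a small helper that picks one from the operating system name. The function name choose_unpacker and the rule itself (directory on Linux, docker elsewhere) are assumptions for illustration; chroot and vagrant are equally valid choices, and the script only prints the commands rather than running them.

```shell
# Sketch: pick an unpacker from the OS name reported by `uname -s`.
# choose_unpacker is a hypothetical helper, not part of reprounzip.
choose_unpacker() {
    case "$1" in
        Linux) echo directory ;;  # unpack straight into a directory
        *)     echo docker ;;     # other systems: use a container
    esac
}

os=$(uname -s)
unpacker=$(choose_unpacker "$os")

# Print the commands that would be run for this platform.
echo "reprounzip $unpacker setup my_experiment.rpz mytemplate"
echo "reprounzip $unpacker run mytemplate"
```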
Remember that you can open the help message and learn more about other available flags and options by using the flag -h for any reprounzip command.
Citing ReproZip
Please use the following when citing ReproZip (BibTeX):
ReproZip: Computational Reproducibility With Ease
F. Chirigati, R. Rampin, D. Shasha, and J. Freire.
In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 2085-2088, 2016
Contribute
Please subscribe to and contact the reprozip@nyu.edu mailing list for questions, suggestions and discussions about using reprozip.
Bugs and planned features are tracked in the GitHub issues. Feel free to add an issue!
To suggest changes to this source code, feel free to raise a GitHub pull request. Any contributions received are assumed to be covered by the BSD 3-Clause license. We might ask you to sign a Contributor License Agreement before accepting a larger contribution.
License
- Copyright (C) 2014, New York University
Licensed under a BSD 3-Clause license. See the file LICENSE.txt for details.
Links and References
For more detailed information, please refer to our website, as well as to our documentation.
ReproZip is currently being developed at NYU.
Owner
- Name: VIDA-NYU
- Login: VIDA-NYU
- Kind: organization
- Location: New York, NY
- Website: https://vida.engineering.nyu.edu/
- Twitter: nyuvida
- Repositories: 92
- Profile: https://github.com/VIDA-NYU
Visualization, Imaging, and Data Analysis Center at New York University
JOSS Publication
ReproZip: The Reproducibility Packer
Tags
reproducibility reproducible research provenance archive sharing
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
  - family-names: Rampin
    given-names: Remi
    affiliation: New York University
    orcid: https://orcid.org/0000-0002-0524-2282
    website: https://remi.rampin.org/
  - family-names: Freire
    given-names: Juliana
    affiliation: New York University
    orcid: https://orcid.org/0000-0003-3915-7075
    website: https://vgc.poly.edu/~juliana/
  - family-names: Chirigati
    given-names: Fernando
    affiliation: New York University
    orcid: https://orcid.org/0000-0002-9566-5835
    website: http://fchirigati.com/
  - family-names: Shasha
    given-names: Dennis
    affiliation: New York University
    orcid: https://orcid.org/0000-0002-7036-3312
    website: http://cs.nyu.edu/shasha/
  - family-names: Rampin
    given-names: Vicky
    affiliation: New York University
    orcid: https://orcid.org/0000-0003-4298-168X
    website: https://vicky.rampin.org/
license: BSD-3-Clause
url: https://www.reprozip.org/
repository-code: https://github.com/VIDA-NYU/reprozip
title: ReproZip
abstract: |
  ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step).
keywords: [python, linux, docker, reproducibility, provenance, reproducible-research, reproducible-science]
references:
  - type: proceedings
    doi: 10.1145/2882903.2899401
    conference:
      name: "SIGMOD '16"
      website: https://www.sigmod2016.org/
      city: San Francisco
      country: US
    authors:
      - family-names: Rampin
        given-names: Remi
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-0524-2282
        website: https://remi.rampin.org/
      - family-names: Freire
        given-names: Juliana
        affiliation: New York University
        orcid: https://orcid.org/0000-0003-3915-7075
        website: https://vgc.poly.edu/~juliana/
      - family-names: Chirigati
        given-names: Fernando
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-9566-5835
        website: http://fchirigati.com/
      - family-names: Shasha
        given-names: Dennis
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-7036-3312
        website: http://cs.nyu.edu/shasha/
    date-published: 2016-06-26
    year: 2016
    month: 6
    title: "ReproZip: Computational Reproducibility With Ease"
    abstract: |
      We present ReproZip, the recommended packaging tool for the SIGMOD Reproducibility Review. ReproZip was designed to simplify the process of making an existing computational experiment reproducible across platforms, even when the experiment was put together without reproducibility in mind. The tool creates a self-contained package for an experiment by automatically tracking and identifying all its required dependencies. The researcher can share the package with others, who can then use ReproZip to unpack the experiment, reproduce the findings on their favorite operating system, as well as modify the original experiment for reuse in new research, all with little effort. The demo will consist of examples of non-trivial experiments, showing how these can be packed in a Linux machine and reproduced on different machines and operating systems. Demo visitors will also be able to pack and reproduce their own experiments.
  - type: proceedings
    doi: 10.21105/joss.00107
    journal: Journal of Open Source Software
    authors:
      - family-names: Rampin
        given-names: Remi
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-0524-2282
        website: https://remi.rampin.org/
      - family-names: Freire
        given-names: Juliana
        affiliation: New York University
        orcid: https://orcid.org/0000-0003-3915-7075
        website: https://vgc.poly.edu/~juliana/
      - family-names: Chirigati
        given-names: Fernando
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-9566-5835
        website: http://fchirigati.com/
      - family-names: Shasha
        given-names: Dennis
        affiliation: New York University
        orcid: https://orcid.org/0000-0002-7036-3312
        website: http://cs.nyu.edu/shasha/
      - family-names: Rampin
        given-names: Vicky
        affiliation: New York University
        orcid: https://orcid.org/0000-0003-4298-168X
        website: https://vicky.rampin.org/
    date-published: 2016-12-01
    year: 2016
    month: 12
    title: "ReproZip: The Reproducibility Packer"
preferred-citation:
  type: proceedings
  doi: 10.1145/2882903.2899401
  conference:
    name: "SIGMOD '16"
    website: https://www.sigmod2016.org/
    city: San Francisco
    country: US
  authors:
    - family-names: Rampin
      given-names: Remi
      affiliation: New York University
      orcid: https://orcid.org/0000-0002-0524-2282
      website: https://remi.rampin.org/
    - family-names: Freire
      given-names: Juliana
      affiliation: New York University
      orcid: https://orcid.org/0000-0003-3915-7075
      website: https://vgc.poly.edu/~juliana/
    - family-names: Chirigati
      given-names: Fernando
      affiliation: New York University
      orcid: https://orcid.org/0000-0002-9566-5835
      website: http://fchirigati.com/
    - family-names: Shasha
      given-names: Dennis
      affiliation: New York University
      orcid: https://orcid.org/0000-0002-7036-3312
      website: http://cs.nyu.edu/shasha/
  date-published: 2016-06-26
  year: 2016
  month: 6
  title: "ReproZip: Computational Reproducibility With Ease"
  abstract: |
    We present ReproZip, the recommended packaging tool for the SIGMOD Reproducibility Review. ReproZip was designed to simplify the process of making an existing computational experiment reproducible across platforms, even when the experiment was put together without reproducibility in mind. The tool creates a self-contained package for an experiment by automatically tracking and identifying all its required dependencies. The researcher can share the package with others, who can then use ReproZip to unpack the experiment, reproduce the findings on their favorite operating system, as well as modify the original experiment for reuse in new research, all with little effort. The demo will consist of examples of non-trivial experiments, showing how these can be packed in a Linux machine and reproduced on different machines and operating systems. Demo visitors will also be able to pack and reproduce their own experiments.
version: "1.1"
date-released: 2021-07-06
doi: 10.5281/zenodo.5081097
Papers & Mentions
Total mentions: 2
LittleBrain: A gradient-based tool for the topographical interpretation of cerebellar neuroimaging findings
- DOI: 10.1371/journal.pone.0210028
- OpenAlex ID: https://openalex.org/W2951379231
- Published: January 2019
- Total mentions: 2
Everything Matters: The ReproNim Perspective on Reproducible Neuroimaging
- DOI: 10.3389/fninf.2019.00001
- OpenAlex ID: https://openalex.org/W2912497992
- Published: February 2019
GitHub Events
Total
- Watch event: 35
- Delete event: 3
- Issue comment event: 11
- Push event: 11
- Pull request review event: 2
- Pull request review comment event: 2
- Pull request event: 5
- Fork event: 4
- Create event: 2
Last Year
- Watch event: 35
- Delete event: 3
- Issue comment event: 11
- Push event: 11
- Pull request review event: 2
- Pull request review comment event: 2
- Pull request event: 5
- Fork event: 4
- Create event: 2
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Remi Rampin | r****n@g****m | 2,010 |
| Manfred Touron | m@4****m | 84 |
| Fernando Chirigati | f****i@n****u | 69 |
| Vicky Steeves | v****s@g****m | 4 |
| Brian Hoffman | b****n@g****m | 2 |
| Vicky Steeves | v****s@n****u | 2 |
| Ilya Beda | i****x@g****m | 1 |
| James Clarke | j****7@d****g | 1 |
| Josua Krause | j****e@g****m | 1 |
| Marius Gedminas | m****s@g****s | 1 |
| Martin von Gagern | M****n@g****t | 1 |
| Peder Landsverk | p****k@g****m | 1 |
| Qiwen Wang | q****0@i****u | 1 |
| Stian Soiland-Reyes | s****n@a****g | 1 |
| William Yeh | w****h@g****m | 1 |
| Yo Yehudi | y****h@g****m | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 57
- Total pull requests: 47
- Average time to close issues: over 1 year
- Average time to close pull requests: 11 months
- Total issue authors: 13
- Total pull request authors: 8
- Average comments per issue: 1.93
- Average comments per pull request: 1.15
- Merged pull requests: 29
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 6 days
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 4.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- remram44 (39)
- yarikoptic (3)
- appukuttan-shailesh (2)
- milech (2)
- VickyRampin (2)
- effigies (1)
- sainyam (1)
- jwscook (1)
- stoianmihail (1)
- sachiniyer (1)
- nuest (1)
- chaoyue729 (1)
- bmcfee (1)
Pull Request Authors
- remram44 (35)
- quoideneuf (5)
- VickyRampin (3)
- yuzibo (2)
- PhilippWendler (2)
- Peder2911 (1)
- xcorail (1)
- yochannah (1)
Packages
- Total packages: 13
- Total downloads:
  - pypi 1,340 last-month
- Total docker downloads: 15
- Total dependent packages: 12 (may contain duplicates)
- Total dependent repositories: 34 (may contain duplicates)
- Total versions: 182
- Total maintainers: 2
pypi.org: reprounzip
Linux tool enabling reproducible experiments (unpacker)
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.2.1 (published almost 3 years ago)
Maintainers (2)
pypi.org: reprozip
Linux tool enabling reproducible experiments (packer)
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
Maintainers (2)
pypi.org: reprounzip-docker
Allows the ReproZip unpacker to create Docker containers
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
Maintainers (2)
pypi.org: reprounzip-vagrant
Allows the ReproZip unpacker to create virtual machines
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.0.13 (published over 7 years ago)
Maintainers (2)
pypi.org: reprounzip-vistrails
Integrates the ReproZip unpacker with the VisTrails workflow management system
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
Maintainers (2)
pypi.org: reprounzip-qt
Graphical user interface for reprounzip, using Qt
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD-3-Clause
- Latest release: 1.2.1 (published almost 3 years ago)
Maintainers (2)
pypi.org: reprozip-jupyter
Jupyter Notebook tracing/reproduction using ReproZip
- Homepage: https://www.reprozip.org/
- Documentation: https://docs.reprozip.org/
- License: BSD
- Latest release: 1.0.14 (published over 7 years ago)
Maintainers (2)
conda-forge.org: reprounzip
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
conda-forge.org: reprounzip-vagrant
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.13 (published over 7 years ago)
conda-forge.org: reprozip-jupyter
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.14 (published over 7 years ago)
conda-forge.org: reprounzip-docker
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
conda-forge.org: reprounzip-qt
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
conda-forge.org: reprozip
ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
- Homepage: http://github.com/VIDA-NYU/reprozip
- License: BSD-3-Clause
- Latest release: 1.0.16 (published almost 7 years ago)
Dependencies
- reprounzip >=1.1
- rpaths >=0.8
- PyYAML *
- qtpy *
- reprounzip >=1.0
- paramiko *
- reprounzip >=1.1
- rpaths >=0.8
- jupyter_client *
- nbconvert *
- nbformat *
- notebook *
- reprounzip >=1.0
- rpaths *
- actions/checkout v3 composite
- actions/setup-python v4 composite
