https://github.com/annakrystalli/piggyback
:package: for using large(r) data files on GitHub
Repository
Basic Info
- Host: GitHub
- Owner: annakrystalli
- License: gpl-3.0
- Language: R
- Default Branch: master
- Homepage: https://ropensci.github.io/piggyback
- Size: 285 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of ropensci/piggyback
Created over 7 years ago
· Last pushed over 7 years ago
https://github.com/annakrystalli/piggyback/blob/master/
[lifecycle: stable](https://www.tidyverse.org/lifecycle/#stable)
[Travis CI build status](https://travis-ci.org/ropensci/piggyback)
[Codecov coverage](https://codecov.io/github/ropensci/piggyback?branch=master)
[AppVeyor build status](https://ci.appveyor.com/project/cboettig/piggyback)
[CRAN package](https://cran.r-project.org/package=piggyback)
[rOpenSci onboarding review](https://github.com/ropensci/onboarding/issues/220)
[Zenodo DOI](https://zenodo.org/badge/latestdoi/132979724)
[JOSS DOI](https://doi.org/10.21105/joss.00971)
# piggyback
`piggyback` is basically a poor soul's [Git
LFS](https://git-lfs.github.com/). GitHub warns about files larger than
50 MB and rejects commits containing files larger than 100 MB. Git LFS
is not only expensive, it also [breaks GitHub's collaborative
model](https://medium.com/@megastep/github-s-large-file-storage-is-no-panacea-for-open-source-quite-the-opposite-12c0e16a9a91).
(If someone wants to submit a PR with a simple edit to your docs, they
cannot fork the repository.) Unlike Git LFS, `piggyback` doesn't take
over your standard `git` client; it just perches comfortably on the
shoulders of your existing GitHub API. Data can be versioned by
`piggyback`, but versioning is less strict than in Git LFS: uploads can
be set as a new version or allowed to overwrite previously uploaded
data. `piggyback` works with both public and private repositories.
## Installation
You can install the development version from
[GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("ropensci/piggyback")
```
## Quickstart
See the [piggyback
vignette](https://ropensci.github.io/piggyback/articles/intro.html) for
details on authentication and additional package functionality.
Piggyback can download data attached to a release on any repository:
``` r
library(piggyback)
pb_download("data/mtcars.tsv.gz", repo = "cboettig/piggyback-tests", dest = tempdir())
```
Downloading from private repos or uploading to any repo requires
authentication, so be sure to set a `GITHUB_TOKEN` (or `GITHUB_PAT`)
environment variable, or include the `.token` argument. Omit the file
name to download all attached objects. Omit the repository name to
default to the current repository. See the
[vignette](https://ropensci.github.io/piggyback/articles/intro.html) or
the function documentation for details.
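As a minimal sketch of the authentication step described above, the snippet below looks up a token in the two environment variables and passes it on through `.token`. The helpers `find_github_token()` and `download_all()` and the repository name `user/private-repo` are hypothetical and exist only for illustration. Checking `GITHUB_TOKEN` before `GITHUB_PAT` is an assumption of this sketch, not documented piggyback behaviour.

``` r
# Hypothetical helper (not part of piggyback): return the first non-empty
# token found in the environment variables named above, or "" if none is set.
# Checking GITHUB_TOKEN before GITHUB_PAT is an assumption of this sketch.
find_github_token <- function() {
  for (var in c("GITHUB_TOKEN", "GITHUB_PAT")) {
    token <- Sys.getenv(var, unset = "")
    if (nzchar(token)) return(token)
  }
  ""
}

# Hypothetical wrapper: download every file attached to a release, passing
# the token explicitly through the `.token` argument.
download_all <- function(repo, dest = tempdir()) {
  token <- find_github_token()
  if (!nzchar(token)) {
    stop("No token found: set GITHUB_TOKEN or GITHUB_PAT first.")
  }
  # Omitting the file name downloads all attached objects.
  piggyback::pb_download(repo = repo, dest = dest, .token = token)
}

# Not run (needs network access and a valid token):
# download_all("user/private-repo")
```

Setting the variable in `~/.Renviron` rather than calling `Sys.setenv()` in a script keeps the token out of version control.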
We can also upload data to any existing release (defaults to `latest`):
``` r
## We'll need some example data first.
## Pro tip: compress your tabular data to save space & speed upload/downloads
readr::write_tsv(mtcars, "mtcars.tsv.gz")
pb_upload("mtcars.tsv.gz", repo = "cboettig/piggyback-tests")
```
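Uploads can either overwrite the copy attached to an existing release or be published as a new version. The sketch below shows the second option: it creates a release with `pb_new_release()` and attaches the file to that tag. The wrapper `upload_as_new_version()`, the repository `user/repo`, and the tag `v0.0.2` are placeholders chosen for illustration, and the calls assume a token with write access to the repository.

``` r
# Sketch: publish a data file under a new release tag instead of overwriting
# the copy attached to "latest". `upload_as_new_version` is a hypothetical
# wrapper, not a piggyback function.
upload_as_new_version <- function(file, repo, tag) {
  # Fail early, before any network call, if the file is missing.
  stopifnot(file.exists(file))
  piggyback::pb_new_release(repo = repo, tag = tag)   # create the release
  piggyback::pb_upload(file, repo = repo, tag = tag)  # attach the file to it
}

# Not run (needs network access and write permission on the repository):
# upload_as_new_version("mtcars.tsv.gz", repo = "user/repo", tag = "v0.0.2")
```

Calling `pb_upload()` again with the same `tag` would instead replace the file already attached to that release, subject to the function's overwrite settings.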
### Tracking data files
For a [Git LFS](https://git-lfs.github.com/) style workflow, just
specify the type of files you wish to track using `pb_track()`.
Piggyback will retain a record of these files in a hidden
`.pbattributes` file in your repository, and add these to `.gitignore`
so you don't accidentally commit them to GitHub. `pb_track()` will also
return a list of such files that you can easily pass to `pb_upload()`:
``` r
library(magrittr)  # provides the %>% pipe
# track csv files, compressed data, and geotiff files:
pb_track(c("*.csv", "*.gz", "*.tif")) %>%
  pb_upload()
```
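To preview which files a set of glob patterns would pick up before tracking them, a dry run along the following lines may help. It uses only base R and matches patterns against file names; `match_globs()` is a hypothetical helper and the file paths are made up, so this illustrates glob matching in general, not how `pb_track()` is implemented.

``` r
# Illustration only: preview which file names a set of glob patterns matches.
# `match_globs` is a hypothetical helper, not a piggyback function.
match_globs <- function(files, patterns) {
  # utils::glob2rx() converts a wildcard pattern such as "*.csv" to a regex.
  regexes <- vapply(patterns, utils::glob2rx, character(1))
  # A file is kept if its base name matches at least one pattern.
  hits <- vapply(
    files,
    function(f) any(vapply(regexes, grepl, logical(1), x = basename(f))),
    logical(1)
  )
  unname(files[hits])
}

# Made-up paths for the example:
files <- c("data/iris.csv", "data/mtcars.tsv.gz", "R/analysis.R", "maps/elev.tif")
match_globs(files, c("*.csv", "*.gz", "*.tif"))
```

With these example paths, everything except `R/analysis.R` matches one of the three patterns.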
You can easily download the latest version of all data attached to a
given release by calling `pb_download()` with no file argument
(analogous to a `git pull` for data):
``` r
pb_download()
```
-----
Please note that this project is released with a [Contributor Code of
Conduct](CODE_OF_CONDUCT.md). By participating in this project you agree
to abide by its terms.
[rOpenSci](https://ropensci.org)
Owner
- Name: Anna Krystalli
- Login: annakrystalli
- Kind: user
- Location: Syros, Greece
- Company: @r-rse
- Website: https://www.r-rse.eu
- Twitter: annakrystalli
- Repositories: 240
- Profile: https://github.com/annakrystalli
Research Software Engineering Service in #rstats at @r-rse. Ex @RSE-Sheffield. Editor @ropensci. Core team member @reprohack. Available for hire! 🚀😎