ediutils
An API Client for the Environmental Data Initiative Repository
Science Score: 20.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: zenodo.org -
✓Committers with academic emails
2 of 2 committers (100.0%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (18.5%) to scientific vocabulary
Keywords
ecology
eml-metadata
open-access
open-data
r
r-package
research-data-management
research-data-repository
rstats
Last synced: 6 months ago
·
JSON representation
Repository
An API Client for the Environmental Data Initiative Repository
Basic Info
- Host: GitHub
- Owner: ropensci
- License: other
- Language: R
- Default Branch: main
- Homepage: https://docs.ropensci.org/EDIutils/
- Size: 1.03 MB
Statistics
- Stars: 9
- Watchers: 9
- Forks: 2
- Open Issues: 8
- Releases: 4
Topics
ecology
eml-metadata
open-access
open-data
r
r-package
research-data-management
research-data-repository
rstats
Created about 7 years ago
· Last pushed over 2 years ago
Metadata Files
Readme
Contributing
License
Code of conduct
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
# fig.path = "man/figures/README-",
fig.path = "README-",
out.width = "100%"
)
```
# EDIutils
[](https://www.repostatus.org/#active)
[](https://github.com/ropensci/EDIutils/actions)
[](https://github.com/ropensci/software-review/issues/498)
[](https://cran.r-project.org/package=EDIutils)
[](https://app.codecov.io/github/ropensci/EDIutils?branch=main)
[](https://zenodo.org/badge/latestdoi/159572464)
A client for the Environmental Data Initiative repository REST API. The [EDI data repository](https://portal.edirepository.org/nis/home.jsp) is for publication and reuse of ecological data with emphasis on metadata accuracy and completeness. It was developed in collaboration with the [US LTER Network](https://lternet.edu/) and is built upon the [PASTA+ software stack](https://pastaplus-core.readthedocs.io/en/latest/index.html#). EDIutils includes functions to search and access existing data, evaluate and upload new data, and assist with related data management tasks.
- [Search and Access Data](https://docs.ropensci.org/EDIutils/articles/search_and_access.html)
- [Evaluate and Upload Data](https://docs.ropensci.org/EDIutils/articles/evaluate_and_upload.html)
- [Retrieve Download Metrics](https://docs.ropensci.org/EDIutils/articles/retrieve_downloads.html)
- [Retrieve Citation Metrics](https://docs.ropensci.org/EDIutils/articles/retrieve_citations.html)
## Installation
Get the latest version:
```{r eval=FALSE}
install.packages("EDIutils")
```
Get the development version:
```{r eval=FALSE}
remotes::install_github("ropensci/EDIutils", ref = "development")
```
## Getting Started
```{r eval=FALSE}
library(EDIutils)
```
The unit of publication is the data package. It contains one or more data entities (i.e. files) described with [EML metadata](https://eml.ecoinformatics.org/), a metadata quality report, and a manifest of package contents. Data packages are immutable for reproducible research, yet versionable to allow updates and improved data quality through time. Each version is assigned a DOI and a unique package ID of the form "scope.identifier.revision". The "scope" is the organizational unit, "identifier" the series, and "revision" the version (e.g. "edi.100.2" is version "2" of data package "edi.100").
### Authentication
Authentication is required by data evaluation and upload functions, and to
access user audit logs and services. Contact EDI for an account
. Authenticate with the `login()`
function.
### Search and Access Data
The repository search service is a standard deployment of Apache Solr and indexes select metadata fields of data package metadata. For a list of searchable fields see `search_data_packages()`. For a browser based search experience, use the [EDI data portal](https://portal.edirepository.org/nis/advancedSearch.jsp).
```{r eval=FALSE}
# List data packages containing the term "water temperature"
res <- search_data_packages(query = 'q="water+temperature"&fl=*')
colnames(res)
#> [1] "abstract" "begindate" "doi"
#> [4] "enddate" "funding" "geographicdescription"
#> [7] "id" "methods" "packageid"
#> [10] "pubdate" "responsibleParties" "scope"
#> [13] "site" "taxonomic" "title"
#> [16] "authors" "spatialCoverage" "sources"
#> [19] "keywords" "organizations" "singledates"
#> [22] "timescales"
nrow(res)
#> [1] 798
```
Data entities are downloaded in raw bytes and parsed by a reader function.
```{r eval=FALSE}
# List data entities of data package edi.1047.1
res <- read_data_entity_names(packageId = "edi.1047.1")
res
#> entityId entityName
#> 1 3abac5f99ecc1585879178a355176f6d Environmentals.csv
#> 2 f6bfa89b48ced8292840e53567cbf0c8 ByCatch.csv
#> 3 c75642ddccb4301327b4b1a86bdee906 Chinook.csv
#> 4 2c9ee86cc3f3ffc729c5f18bfe0a2a1d Steelhead.csv
#> 5 785690848dd20f4910637250cdc96819 TrapEfficiencyRelease.csv
#> 6 58b9000439a5671ea7fe13212e889ba5 TrapEfficiencySummary.csv
#> 7 86e61c1a501b7dcf0040d10e009bfd87 TrapOperations.csv
# Read raw bytes of Steelhead.csv (i.e. the 4th data entity)
raw <- read_data_entity(packageId = "edi.1047.1", entityId = res$entityId[4])
head(raw)
#> [1] ef bb bf 44 61 74
# Parse with a .csv reader
data <- readr::read_csv(file = raw)
data
#> # A tibble: 2,926 x 14
#> Date trapVisitID subSiteName catchRawID releaseID commonName
#>
#> 1 1/12/~ 326 North Chan~ 32123 0 Steelhead ~
#> 2 1/14/~ 336 North Chan~ 33980 0 Steelhead ~
#> 3 1/15/~ 337 North Chan~ 32683 0 Steelhead ~
#> 4 1/16/~ 339 North Chan~ 32971 0 Steelhead ~
#> 5 1/17/~ 341 North Chan~ 33104 0 Steelhead ~
#> 6 1/18/~ 342 North Chan~ 33304 0 Steelhead ~
#> 7 1/19/~ 343 North Chan~ 33432 0 Steelhead ~
#> 8 1/21/~ 349 North Chan~ 34083 0 Steelhead ~
#> 9 1/21/~ 349 North Chan~ 34084 0 Steelhead ~
#> 10 1/23/~ 351 North Chan~ 34384 0 Steelhead ~
#> # ... with 2,916 more rows, and 8 more variables:
#> # lifeStage , forkLength , weight , n ,
#> # mort , fishOrigin , markType ,
#> # CatchRaw.comments
```
### Evaluate and Upload Data
The EDI data repository has a "[staging](https://portal-s.edirepository.org/nis/home.jsp)" environment to test the upload and rendering of new data packages before publishing to "[production](https://portal.edirepository.org/nis/home.jsp)". Authentication is required by functions involving data evaluation and upload. Request an account from support@edirepository.org.
```{r eval=FALSE}
# Authenticate
login()
#> User name: "my_name"
#> User password: "my_secret"
```
Data package reservations prevent conflicting use of the same identifier.
```{r eval=FALSE}
# Reserve a data package identifier
identifier <- create_reservation(scope = "edi", env = "staging")
identifier
#> [1] 595
```
Evaluation checks for metadata accuracy and completeness.
```{r eval=FALSE}
# Evaluate data package
transaction <- evaluate_data_package(
eml = paste0(tempdir(), "/edi.595.1.xml"),
env = "staging")
transaction
#> [1] "evaluate_163966785813042760"
# Check status
status <- check_status_evaluate(transaction, env = "staging")
status
#> [1] TRUE
# Read the evaluation report
report <- read_evaluate_report(transaction, as = "char", env = "staging")
message(report)
#> ===================================================
#> EVALUATION REPORT
#> ===================================================
#>
#> PackageId: edi.595.1
#> Report Date/Time: 2021-12-16T08:17:40
#> Total Quality Checks: 29
#> Valid: 21
#> Info: 8
#> Warn: 0
#> Error: 0
#>
#> ---------------------------------------------------
#> DATASET REPORT
#> ---------------------------------------------------
#>
#> IDENTIFIER: packageIdPattern
#> NAME: packageId pattern matches "scope.identifier.revision"
#> DESCRIPTION: Check against LTER requirements for scope.identifier.revision
#> EXPECTED: 'scope.n.m', where 'n' and 'm' are integers and 'scope' is one ...
#> FOUND: edi.595.1
#> STATUS: valid
#> EXPLANATION:
#> SUGGESTION:
#> REFERENCE:
#>
#> IDENTIFIER: emlVersion
#> NAME: EML version 2.1.0 or beyond
#> DESCRIPTION: Check the EML document declaration for version 2.1.0 or higher
#> EXPECTED: eml://ecoinformatics.org/eml-2.1.0 or higher
#> FOUND: https://eml.ecoinformatics.org/eml-2.2.0
#> STATUS: valid
#> EXPLANATION: Validity of this quality report is dependent on this check ...
#> SUGGESTION:
#> REFERENCE:
#> ...
```
Upload after errors and warnings are fixed.
```{r eval=FALSE}
# Create a new data package
transaction <- create_data_package(
eml = paste0(tempdir(), "/edi.595.1.xml"),
env = "staging")
transaction
#> [1] "create_163966765080210573__edi.595.1"
# Check status
status <- check_status_create(
transaction = transaction,
env = "staging")
status
#> [1] TRUE
```
Once everything looks good in the "staging" environment, then repeat the above reservation and upload steps in the "production" environment where the data package will be assigned a DOI and made discoverable with other published data.
## Getting help
Use [GitHub Issues](https://github.com/ropensci/EDIutils/issues) for bug reporting, feature requests, and general questions/discussions. When filing bug reports, please include a minimal reproducible example.
## Contributing
Community contributions are welcome! Please reference our [contributing guidelines](https://github.com/ropensci/EDIutils/blob/master/CONTRIBUTING.md) for details.
-----
Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms.
Owner
- Name: rOpenSci
- Login: ropensci
- Kind: organization
- Email: info@ropensci.org
- Location: Berkeley, CA
- Website: https://ropensci.org/
- Twitter: rOpenSci
- Repositories: 307
- Profile: https://github.com/ropensci
GitHub Events
Total
- Issues event: 1
- Issue comment event: 1
Last Year
- Issues event: 1
- Issue comment event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Commits | |
|---|---|---|
| Colin Smith | c****h@w****u | 243 |
| Corinna | c****s@w****u | 1 |
Committer Domains (Top 20 + Academic)
wisc.edu: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 31
- Total pull requests: 28
- Average time to close issues: 9 months
- Average time to close pull requests: 8 days
- Total issue authors: 15
- Total pull request authors: 3
- Average comments per issue: 2.42
- Average comments per pull request: 0.14
- Merged pull requests: 25
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- clnsmth (11)
- srearl (3)
- yvanlebras (2)
- earnaud (2)
- kzollove (2)
- atn38 (1)
- cgries (1)
- gkamener (1)
- paschatz (1)
- BrennieDev (1)
- njlyon0 (1)
- mobb (1)
- gremau (1)
- ARC-LTER (1)
- scelmendorf (1)
Pull Request Authors
- clnsmth (21)
- cgries (4)
- kzollove (2)
Top Labels
Issue Labels
enhancement (10)
bug (3)
Pull Request Labels
Packages
- Total packages: 1
-
Total downloads:
- cran 285 last-month
- Total dependent packages: 0
- Total dependent repositories: 3
- Total versions: 4
- Total maintainers: 1
cran.r-project.org: EDIutils
An API Client for the Environmental Data Initiative Repository
- Homepage: https://github.com/ropensci/EDIutils
- Documentation: http://cran.r-project.org/web/packages/EDIutils/EDIutils.pdf
- License: MIT + file LICENSE
-
Latest release: 1.0.3
published over 2 years ago
Rankings
Stargazers count: 16.5%
Dependent repos count: 16.8%
Forks count: 17.2%
Average: 22.2%
Dependent packages count: 27.8%
Downloads: 32.9%
Maintainers (1)
Last synced:
6 months ago
Dependencies
DESCRIPTION
cran
- curl * imports
- httr * imports
- jsonlite * imports
- xml2 * imports
- knitr * suggests
- readr * suggests
- rmarkdown * suggests
- roxygen2 * suggests
- testthat * suggests
- vcr * suggests
.github/workflows/check-standard-real-requests.yaml
actions
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/check-r-package v1 composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v1 composite
- r-lib/actions/setup-r-dependencies v1 composite
.github/workflows/check-standard.yaml
actions
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/check-r-package v1 composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v1 composite
- r-lib/actions/setup-r-dependencies v1 composite