ArctosR

Make requests to the Arctos database in R

https://github.com/hrhwilliams/arctosr

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.4%) to scientific vocabulary

Keywords

biology gsoc-2024 r r-package research specimen-records
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: hrhwilliams
  • License: gpl-3.0
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 384 KB
Statistics
  • Stars: 4
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  dpi = 400
)
```

# ArctosR



[![R-CMD-check](https://github.com/hrhwilliams/ArctosR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/hrhwilliams/ArctosR/actions/workflows/R-CMD-check.yaml)
[![codecov.io](https://codecov.io/gh/hrhwilliams/ArctosR/branch/main/graphs/badge.svg)](https://app.codecov.io/gh/hrhwilliams/ArctosR?branch=main)
[![shields.io](https://img.shields.io/cran/v/ArctosR)](https://CRAN.R-project.org/package=ArctosR)


## GSoC project description

**Student**: Harlan Williams

**GSoC Mentors**: Marlon Cobos, Vijay Barve, Jocelyn Colella, Michelle Koo

**Organization**: R Project For Statistical Computing

### Motivation

Arctos (https://arctosdb.org/) has an extensive database that links more than
100 data fields to physical specimen records using standard Darwin Core
vocabulary. Many of these fields are only accessible through its web interface,
and downloads from that interface are memory intensive, so only a subset of
fields or specimens can be queried at once. The goal of this package is to give
researchers a programmatic way to access these data, in hopes of improving
their workflows and the accessibility of biodiversity data stored on Arctos.

The main difficulties in accessing Arctos via the API are pagination of records,
which requires multiple API queries, and hierarchical data, where specific
columns in Arctos records can themselves be tables or point to other Arctos
records. This package was developed specifically to handle these two
difficulties for the user. Pagination is handled by package internals, so the
user only has to ask for all records pertaining to a query to get all of them.
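
The pagination idea can be sketched in a few lines of base R. This is a minimal illustration, not the package's actual internals: `fake_records`, `fetch_page()`, and `fetch_all()` are hypothetical names, and the "API" is just a local data frame served 100 rows at a time.

```r
# Hypothetical stand-in for a paginated API: a local table of 250 records.
fake_records <- data.frame(id = 1:250, value = sqrt(1:250))

# Return at most `limit` rows, starting after the first `offset` rows.
fetch_page <- function(offset, limit = 100) {
  if (offset >= nrow(fake_records)) {
    return(fake_records[0, , drop = FALSE])
  }
  last <- min(offset + limit, nrow(fake_records))
  fake_records[(offset + 1):last, , drop = FALSE]
}

# Keep requesting pages until an empty or short page signals the end,
# then bind all pages into one data frame.
fetch_all <- function(limit = 100) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch_page(offset, limit)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    offset <- offset + nrow(page)
    if (nrow(page) < limit) break
  }
  if (length(pages) == 0) {
    return(fake_records[0, , drop = FALSE])
  }
  do.call(rbind, pages)
}

all_records <- fetch_all()
nrow(all_records)
```

The caller only sees `fetch_all()`; the page size and the number of requests stay an implementation detail.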

The user can also expand columns representing hierarchical data and explore
that data natively within RStudio, making analysis of that data much more
intuitive.
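
To make "hierarchical data" concrete, here is a small base R sketch of a table whose column is itself a set of tables, and one way to flatten it. The GUIDs and part names are made up for illustration, and this does not use or reproduce ArctosR's own expansion functions.

```r
# Made-up records: each row has a nested table of parts (a list-column).
records <- data.frame(guid = c("EX:Mamm:1", "EX:Mamm:2"),
                      stringsAsFactors = FALSE)
records$parts <- list(
  data.frame(part_name = c("skull", "skin"), count = c(1, 1)),
  data.frame(part_name = "tissue", count = 2)
)

# Inspect the nested table belonging to the first record.
records$parts[[1]]

# Flatten: repeat each record's guid alongside the rows of its nested table.
flat <- do.call(rbind, lapply(seq_len(nrow(records)), function(i) {
  data.frame(guid = records$guid[i], records$parts[[i]],
             stringsAsFactors = FALSE)
}))
flat
```

Keeping the nested tables in a list-column preserves the one-row-per-specimen shape, while the flattened form is easier to filter and summarise.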

### Status of the project

At the time of submission for GSoC 2024, a set of functions for querying Arctos
and for looking up Arctos documentation is available to the user. The user is
also able to explore downloaded records in a hierarchical manner by expanding
columns returned from Arctos which represent tables of tables. These records
can then be saved in CSV format or stored as R objects for further data
analysis tasks by researchers.

With these functions, ArctosR can be integrated into existing data analysis
pipelines to provide updated records. Each query in ArctosR is also accompanied
by metadata, allowing for better data documentation and query reproducibility by
other researchers. At this stage the project fulfills the goals set out in the
proposal; the only remaining step is submission to CRAN.
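
As a rough illustration of why query metadata helps reproducibility, the sketch below bundles a result with the parameters and time of the query. Everything here is an assumption for illustration: `run_query()` and `fake_fetch()` are hypothetical helpers, and the structure is not ArctosR's actual response format.

```r
# Hypothetical fetcher standing in for a real database request.
fake_fetch <- function(params) {
  data.frame(
    guid = c("EX:Mamm:1", "EX:Mamm:2"),
    scientific_name = params$scientific_name,
    stringsAsFactors = FALSE
  )
}

# Run a query and keep the parameters, timestamp, and record count with it.
run_query <- function(params, fetch) {
  data <- fetch(params)
  list(
    data = data,
    metadata = list(
      params = params,
      timestamp = Sys.time(),
      n_records = nrow(data)
    )
  )
}

result <- run_query(list(scientific_name = "Canis lupus"), fake_fetch)
str(result$metadata)
```

Another researcher can rerun the same query from `result$metadata$params` and compare record counts to see whether the database has changed since.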

A complete history of commits is available in the GitHub repository
(https://github.com/hrhwilliams/ArctosR).

## Installation

### CRAN

ArctosR can be installed from [CRAN](https://cran.r-project.org/) by running the
following command in R:

``` r
install.packages("ArctosR")
```

### GitHub

You can install the development version of ArctosR from [GitHub](https://github.com/) with:

``` r
install.packages("remotes")
remotes::install_github("hrhwilliams/ArctosR")
```

## Example

```{r example}
library(ArctosR)

# Request a list of all result parameters. These are the names that can show up
# as columns in a dataframe returned by ArctosR.
result_params <- get_result_parameters()

# Print the first six rows and first three columns to the console.
result_params[1:6, 1:3]

# If using RStudio interactively, view the entire dataframe of result parameters.
if (interactive()) View(result_params)

# Request just the number of records matching a query.
count <- get_record_count(
  scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm"
)

# Request to download data. This is limited to 100 records by default.
response <- get_records(
  scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
  columns = list("guid", "parts", "partdetail")
)

# Request to download all available data.
response <- get_records(
  scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
  columns = list("guid", "parts", "partdetail"),
  all_records = TRUE
)

# Grab the dataframe of records from the response, which can then be saved as a csv.
df <- response_data(response)
```
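
Once the records are in a data frame, saving them needs only base R. The sketch below uses a small made-up data frame (`records_df`) in place of a real response and writes to temporary files; the package may offer its own saving helpers, which are not shown here.

```r
# Made-up stand-in for a dataframe of downloaded records.
records_df <- data.frame(
  guid = c("EX:Mamm:1", "EX:Mamm:2"),
  parts = c("skull; skin", "tissue"),
  stringsAsFactors = FALSE
)

# Save as CSV (portable, plain text) and read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(records_df, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# Save as an R object (keeps column types exactly) and read it back.
rds_path <- tempfile(fileext = ".rds")
saveRDS(records_df, rds_path)
from_rds <- readRDS(rds_path)
```

CSV suits sharing with other tools, while RDS is the safer choice when the data frame contains nested or list columns that a flat text file cannot represent.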

Owner

  • Name: Harlan Williams
  • Login: hrhwilliams
  • Kind: user

i'm me

GitHub Events

Total
  • Issues event: 4
  • Issue comment event: 7
  • Push event: 22
  • Pull request event: 2
Last Year
  • Issues event: 4
  • Issue comment event: 7
  • Push event: 22
  • Pull request event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • cran 226 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
cran.r-project.org: ArctosR

An Interface to the 'Arctos' Database

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 226 Last month
Rankings
Dependent packages count: 26.0%
Dependent repos count: 32.0%
Average: 48.0%
Downloads: 85.9%
Maintainers (1)
Last synced: 10 months ago

Dependencies

DESCRIPTION cran
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • codecov/codecov-action v4 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite