taxizedb

Tools for Working with Taxonomic SQL Databases

https://github.com/ropensci/taxizedb

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, zenodo.org
  • Committers with academic emails
    1 of 10 committers (10.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary

Keywords

itis r r-package rstats taxize taxonomic-databases taxonomy

Keywords from Contributors

genome biodiversity caching mock nomenclature darwincore biology geocode conservation http-mock
Last synced: 6 months ago

Repository

Tools for Working with Taxonomic SQL Databases

Basic Info
  • Host: GitHub
  • Owner: ropensci
  • License: other
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 453 KB
Statistics
  • Stars: 33
  • Watchers: 5
  • Forks: 9
  • Open Issues: 14
  • Releases: 7
Topics
itis r r-package rstats taxize taxonomic-databases taxonomy
Created almost 10 years ago · Last pushed 9 months ago
Metadata Files
Readme Changelog Contributing License Codemeta

README.Rmd

---
output: github_document
editor_options: 
  chunk_output_type: console
---


taxizedb
========

```{r echo=FALSE}
knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  collapse = TRUE,
  comment = "#>"
)
```

[![status](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![cran checks](https://badges.cranchecks.info/worst/taxizedb.svg)](https://badges.cranchecks.info/worst/taxizedb.svg)
[![R-check](https://github.com/ropensci/taxizedb/workflows/R-check/badge.svg)](https://github.com/ropensci/taxizedb/actions)
[![codecov](https://codecov.io/gh/ropensci/taxizedb/branch/master/graph/badge.svg)](https://app.codecov.io/gh/ropensci/taxizedb)
[![rstudio mirror downloads](https://cranlogs.r-pkg.org/badges/taxizedb)](https://github.com/r-hub/cranlogs.app)
[![Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/taxizedb?color=blue)](https://cran.r-project.org/package=taxizedb)
[![cran version](https://www.r-pkg.org/badges/version/taxizedb)](https://cran.r-project.org/package=taxizedb)
[![DOI](https://zenodo.org/badge/53961466.svg)](https://zenodo.org/badge/latestdoi/53961466)

`taxizedb` - Tools for Working with Taxonomic Databases

Docs: <https://docs.ropensci.org/taxizedb/>

`taxizedb` is an R package for interacting with taxonomic databases. Its functionality can be divided into two parts:

1. Download the databases to your platform.
2. Query the downloaded databases to retrieve taxonomic information.

This two-step approach differs from tools that call a web service for each query, and it has a number of advantages:

* Once you download a database, you can work with it offline
* Once you download a database, querying it is super fast because no network requests are involved
* As long as you keep your database files, all the queries in your analysis are fully reproducible
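
The two steps above can be sketched as follows. This is a minimal example, not a definitive workflow: it assumes network access for the one-time NCBI download (a large file), the default cache location, and an illustrative variable name (`ncbi_file`) that is not part of the package.

```r
library(taxizedb)

# Step 1: download NCBI Taxonomy and convert it to SQLite (done once).
# The function returns the path to the local database file.
ncbi_file <- db_download_ncbi()

# Step 2: query the local database; no web service is contacted.
ids <- name2taxid("Homo sapiens", db = "ncbi")
sci_names <- taxid2name(ids, db = "ncbi")

# Inspect where cached database files live, so they can be archived
# alongside an analysis for reproducibility.
tdb_cache$cache_path_get()
tdb_cache$list()
```

Keeping a copy of the file at `ncbi_file` with your analysis is one way to make sure later runs query exactly the same data.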

## Data sources

When you download a database with `taxizedb`, it is automatically converted to SQLite, and all query functions then interact with this SQLite database. However, not all taxonomic databases are publicly available or can be converted to SQLite. The following databases are supported:

- [NCBI Taxonomy](https://www.ncbi.nlm.nih.gov/taxonomy)
- [ITIS](https://itis.gov/)
- [World Flora Online (WFO)](https://www.worldfloraonline.org/)
- [Catalogue of Life (COL)](https://www.catalogueoflife.org/)
- [Global Biodiversity Information Facility (GBIF)](https://www.gbif.org/)
- [Wikidata](https://zenodo.org/records/1213477)

Get in touch [in the issues](https://github.com/ropensci/taxizedb/issues) with
any ideas on new data sources.

## Data sources - legacy support

[The Plant List (TPL)](https://en.wikipedia.org/wiki/The_Plant_List) has been replaced by World Flora Online. The website seems to be down, so `taxizedb` can no longer facilitate new downloads. However, previously downloaded database files can still be queried with `taxizedb` functions, for reproducibility.

## Package API

For each data source, this package performs the following tasks:

* Download taxonomic databases - `db_download_*`
* Create `dplyr` SQL backend via `dbplyr::src_dbi` - `src_*` 
* Query and get data back into a data.frame - `sql_collect`
* Manage cached database files - `tdb_cache`
* Retrieve immediate descendants of a taxon - `children`
* Retrieve the taxonomic hierarchies from local database - `classification`
* Retrieve all taxa descending from a vector of taxa - `downstream`
* Convert species names to taxon IDs - `name2taxid`
* Convert taxon IDs to species names - `taxid2name`
* Convert taxon IDs to ranks - `taxid2rank`
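
As a brief illustration of the query functions listed above, the following sketch assumes the NCBI database has already been downloaded with `db_download_ncbi()`. The taxon IDs are NCBI identifiers chosen for illustration (3701 is the genus *Arabidopsis*, 3702 is *Arabidopsis thaliana*, 9605 is the genus *Homo*, 9606 is *Homo sapiens*), and the variable names are hypothetical.

```r
library(taxizedb)

# Immediate descendants of the genus Arabidopsis
kids <- children(3701, db = "ncbi")

# Full taxonomic hierarchy for one or more taxa
lineages <- classification(c(3702, 9606), db = "ncbi")

# All taxa descending from the genus Homo
below <- downstream(9605, db = "ncbi")

# Ranks for a vector of taxon IDs
ranks <- taxid2rank(c(3701, 9606), db = "ncbi")
```

In this sketch, `children`, `classification`, and `downstream` are expected to return a list with one element per input taxon, while `taxid2rank` returns a vector of ranks.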

You can use the `src` connections with `dplyr` and related tools for downstream operations, or use the underlying database connection to run raw SQL queries.
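
A hedged sketch of both styles of access, assuming the NCBI database is already downloaded. The table name `names` and the columns `tax_id`, `name_txt`, and `name_class` are assumptions based on the layout of the NCBI taxonomy dump; check the tables actually present with `DBI::dbListTables()` before relying on them.

```r
library(taxizedb)
library(dplyr)

# Connect to the local NCBI SQLite database
src <- src_ncbi()

# List the tables actually available in this database
DBI::dbListTables(src$con)

# Raw SQL through the package helper; returns a data frame
first_rows <- sql_collect(src, "SELECT * FROM names LIMIT 10")

# The same table through dplyr; the pipeline is translated to SQL
# and only executed when collect() is called
sci <- tbl(src, "names") %>%
  filter(name_class == "scientific name") %>%
  select(tax_id, name_txt) %>%
  head(10) %>%
  collect()
```

The `dplyr` form keeps the filtering inside SQLite, so only the selected rows are pulled into memory.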

## Installation

CRAN version

```{r eval=FALSE}
install.packages("taxizedb")
```

Development version

```{r eval=FALSE}
remotes::install_github("ropensci/taxizedb")
```

## Citation

To cite taxizedb in publications use:

* Chamberlain S, Arendsee Z, Stirling T (2023). taxizedb: Tools for Working with 'Taxonomic' Databases. R package version 0.3.1. 

## Meta

* Please [report any issues, bugs or feature requests](https://github.com/ropensci/taxizedb/issues).
* License: MIT
* Get citation information for `taxizedb` in R with `citation(package = 'taxizedb')`
* Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct). By contributing to this project, you agree to abide by its terms.

[![ropensci](https://ropensci.org/public_images/github_footer.png)](https://ropensci.org)

Owner

  • Name: rOpenSci
  • Login: ropensci
  • Kind: organization
  • Email: info@ropensci.org
  • Location: Berkeley, CA

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "taxizedb",
  "description": "Tools for working with 'taxonomic' databases, including utilities for downloading databases, loading them into various 'SQL' databases, cleaning up files, and providing a 'SQL' connection that can be used to do 'SQL' queries directly or used in 'dplyr'.",
  "name": "taxizedb: Tools for Working with 'Taxonomic' Databases",
  "relatedLink": "https://docs.ropensci.org/taxizedb/",
  "codeRepository": "https://github.com/ropensci/taxizedb",
  "issueTracker": "https://github.com/ropensci/taxizedb/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.3.1",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.2.2 (2022-10-31)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Scott",
      "familyName": "Chamberlain"
    },
    {
      "@type": "Person",
      "givenName": "Zebulun",
      "familyName": "Arendsee"
    }
  ],
  "contributor": [
    {
      "@type": "Person",
      "givenName": "Tamás",
      "familyName": "Stirling",
      "email": "stirling.tamas@gmail.com"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Tamás",
      "familyName": "Stirling",
      "email": "stirling.tamas@gmail.com"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "taxize",
      "name": "taxize",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=taxize"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "curl",
      "name": "curl",
      "version": ">= 2.4",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=curl"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "DBI",
      "name": "DBI",
      "version": ">= 0.6-1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=DBI"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "RSQLite",
      "name": "RSQLite",
      "version": ">= 1.1.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=RSQLite"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "version": ">= 0.7.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "tibble",
      "name": "tibble",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tibble"
    },
    "6": {
      "@type": "SoftwareApplication",
      "identifier": "rlang",
      "name": "rlang",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rlang"
    },
    "7": {
      "@type": "SoftwareApplication",
      "identifier": "readr",
      "name": "readr",
      "version": ">= 1.1.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=readr"
    },
    "8": {
      "@type": "SoftwareApplication",
      "identifier": "dbplyr",
      "name": "dbplyr",
      "version": ">= 1.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dbplyr"
    },
    "9": {
      "@type": "SoftwareApplication",
      "identifier": "magrittr",
      "name": "magrittr",
      "version": ">= 1.5",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=magrittr"
    },
    "10": {
      "@type": "SoftwareApplication",
      "identifier": "hoardr",
      "name": "hoardr",
      "version": ">= 0.1.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=hoardr"
    },
    "SystemRequirements": null
  },
  "fileSize": "138.754KB",
  "releaseNotes": "https://github.com/ropensci/taxizedb/blob/master/NEWS.md",
  "readme": "https://github.com/ropensci/taxizedb/blob/master/README.md",
  "contIntegration": [
    "https://github.com/ropensci/taxizedb/actions?query=workflow%3AR-check",
    "https://circleci.com/gh/ropensci/taxizedb",
    "https://codecov.io/gh/ropensci/taxizedb"
  ],
  "keywords": [
    "taxonomy",
    "taxonomic-databases",
    "itis",
    "rstats",
    "taxize",
    "r",
    "r-package"
  ]
}

GitHub Events

Total
  • Issues event: 2
  • Watch event: 3
  • Issue comment event: 4
  • Push event: 7
  • Pull request event: 10
  • Fork event: 2
Last Year
  • Issues event: 2
  • Watch event: 3
  • Issue comment event: 4
  • Push event: 7
  • Pull request event: 10
  • Fork event: 2

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 232
  • Total Committers: 10
  • Avg Commits per committer: 23.2
  • Development Distribution Score (DDS): 0.444
Past Year
  • Commits: 25
  • Committers: 3
  • Avg Commits per committer: 8.333
  • Development Distribution Score (DDS): 0.16
Top Committers
Name Email Commits
Scott Chamberlain m****s@g****m 129
Zebulun Arendsee a****e@i****u 46
Tamas Stirling s****s@g****m 21
Carl Boettiger c****g@g****m 16
Scott Chamberlain s****t@f****m 10
T D James t****1@u****m 4
Maëlle Salmon m****n@y****e 3
Gaopeng Li l****c@g****m 1
Rekyt m****e@e****r 1
rOpenSci Bot m****t@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 65
  • Total pull requests: 39
  • Average time to close issues: 11 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 25
  • Total pull request authors: 9
  • Average comments per issue: 3.52
  • Average comments per pull request: 2.23
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 16
  • Average time to close issues: N/A
  • Average time to close pull requests: about 10 hours
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.31
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sckott (28)
  • stitam (9)
  • KaiAragaki (4)
  • arendsee (3)
  • cboettig (2)
  • kwymangrothem (1)
  • sagesteppe (1)
  • dlebauer (1)
  • pederengelstad (1)
  • GossypiumH (1)
  • gpli (1)
  • andzandz11 (1)
  • brunobrr (1)
  • mkhemmani (1)
  • NgAMB (1)
Pull Request Authors
  • stitam (24)
  • arendsee (7)
  • sckott (3)
  • KaiAragaki (2)
  • cboettig (2)
  • gpli (1)
  • tdjames1 (1)
  • kylebuscaglia (1)
  • Rekyt (1)
Top Labels
Issue Labels
bug (3) data-source (2) feature (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • cran 359 last-month
  • Total docker downloads: 88,618
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 10
    (may contain duplicates)
  • Total versions: 9
  • Total maintainers: 1
cran.r-project.org: taxizedb

Offline Access to Taxonomic Databases

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 10
  • Downloads: 359 Last month
  • Docker Downloads: 88,618
Rankings
Docker downloads count: 0.0%
Dependent repos count: 9.2%
Forks count: 9.6%
Stargazers count: 9.7%
Average: 14.9%
Dependent packages count: 28.7%
Downloads: 32.3%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: r-taxizedb
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 34.0%
Stargazers count: 43.4%
Average: 44.1%
Forks count: 47.7%
Dependent packages count: 51.2%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • DBI >= 0.6 imports
  • RSQLite >= 1.1.2 imports
  • curl >= 2.4 imports
  • dbplyr >= 1.0.0 imports
  • dplyr >= 0.7.0 imports
  • hoardr >= 0.1.0 imports
  • magrittr >= 1.5 imports
  • readr >= 1.1.1 imports
  • rlang * imports
  • tibble * imports
  • taxize * suggests
  • testthat * suggests
.github/workflows/R-check.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact v2 composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/build-docs.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc master composite
  • r-lib/actions/setup-r master composite
.circleci/Dockerfile docker
  • rocker/tidyverse latest build
docker-compose.yml docker
  • mariadb latest
  • postgres latest