https://github.com/digital-botanical-gardens-initiative/taxonomical-utils

A set of Python scripts to proceed to taxonomical resolution and retrieval of upper taxonomies.


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.4%) to scientific vocabulary
Last synced: 5 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: digital-botanical-gardens-initiative
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 689 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 3
  • Releases: 4
Created almost 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License

README.md

taxonomical-utils



Description

This repository contains a set of Python scripts for taxonomic name resolution and retrieval of upper taxonomies. For now, it uses the Open Tree of Life as its source of taxonomic data. The scripts are thin wrappers around the Python opentree package and provide functions for resolving taxonomic names, appending upper taxonomic lineage information, and merging data files.

Installation

To install the Taxonomical Utils, follow these steps:

Clone the repository:

```bash
git clone https://github.com/digital-botanical-gardens-initiative/taxonomical-utils.git
```

Navigate to the project directory:

```bash
cd taxonomical-utils
```

Install the required dependencies using Poetry:

```bash
poetry install
```

Usage

CLI Commands

Taxonomical Utils provides several command-line interface (CLI) commands to process taxonomic data. Each command can be run individually or as part of a pipeline.

1. Resolve Taxa

This command resolves taxonomic names from an input file and generates a resolved taxa file.

Command:

```bash
poetry run taxonomical-utils resolve --input-file <input_file> --output-file <resolved_taxa_file> --org-column-header <org_column_header>
```

  • <input_file>: Path to the input CSV/TSV file containing taxonomic names.
  • <resolved_taxa_file>: Path to the output file where resolved taxa will be saved.
  • <org_column_header>: Column header in the input file that contains the taxonomic names.

Example:

```bash
poetry run taxonomical-utils resolve --input-file ./data/in/example.csv --output-file ./data/out/resolved_taxa.csv --org-column-header idTaxon
```
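To make the role of the column header more concrete, here is a minimal sketch of how the names to resolve could be pulled out of the input file. It is not the package's own code: the function name `read_unique_names` is hypothetical, and it assumes that `.tsv` files are tab-separated and everything else is comma-separated.

```python
import csv
from pathlib import Path


def read_unique_names(path: str, column: str) -> list[str]:
    """Return the distinct, non-empty values of `column`, in first-seen order."""
    # Assumption: .tsv files are tab-separated, everything else comma-separated.
    delimiter = "\t" if Path(path).suffix.lower() == ".tsv" else ","
    seen: dict[str, None] = {}  # a dict keeps insertion order and drops duplicates
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {path}")
        for row in reader:
            name = (row[column] or "").strip()
            if name:  # skip rows with an empty taxon name
                seen.setdefault(name, None)
    return list(seen)
```

Deduplicating before resolution keeps the number of lookups proportional to the number of distinct names rather than the number of rows.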

2. Append Upper Taxa Lineage

This command appends upper taxonomic lineage information to the resolved taxa file.

Command:

```bash
poetry run taxonomical-utils append-taxonomy --input-file <resolved_taxa_file> --output-file <upper_taxa_lineage_file>
```

  • <resolved_taxa_file>: Path to the resolved taxa file generated by the resolve command.
  • <upper_taxa_lineage_file>: Path to the output file where the upper taxa lineage information will be saved.

Example:

```bash
poetry run taxonomical-utils append-taxonomy --input-file data/out/resolved_taxa.csv --output-file data/out/upper_taxa_lineage.csv
```
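As an illustration of what appending an upper lineage can involve, the sketch below flattens a lineage into one column per rank. It is an assumption about the general approach, not the package's implementation: `lineage_to_columns` is a hypothetical name, and the input is assumed to be a list of dictionaries with `name` and `rank` keys, ordered from the closest ancestor upwards, similar in shape to the lineage the Open Tree of Life taxonomy service returns.

```python
def lineage_to_columns(lineage: list[dict]) -> dict[str, str]:
    """Map each named rank in a lineage to a column such as 'family'."""
    columns: dict[str, str] = {}
    for node in lineage:
        rank = node.get("rank", "")
        # Assumption: unranked clades carry a rank starting with "no rank".
        # They do not fit a one-column-per-rank layout, so they are skipped.
        if rank and not rank.startswith("no rank"):
            # Keep the first (closest) ancestor if a rank appears twice.
            columns.setdefault(rank, node.get("name", ""))
    return columns
```

For a hand-written lineage such as genus Quercus, family Fagaceae, order Fagales, the result is a dictionary with the keys `genus`, `family`, and `order`, which can then be written out as extra columns next to each resolved taxon.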

3. Merge Data Files

This command merges the original input file with the resolved taxa file and upper taxa lineage file to produce a fully resolved dataset.

Command:

```bash
poetry run taxonomical-utils merge --input-file <input_file> --resolved-taxa-file <resolved_taxa_file> --upper-taxa-lineage-file <upper_taxa_lineage_file> --output-file <final_output_file> --org-column-header <org_column_header>
```

  • <input_file>: Path to the original input CSV/TSV file.
  • <resolved_taxa_file>: Path to the resolved taxa file generated by the resolve command.
  • <upper_taxa_lineage_file>: Path to the upper taxa lineage file generated by the append-taxonomy command.
  • <final_output_file>: Path to the final output file where the merged data will be saved.
  • <org_column_header>: Column header in the input file that contains the taxonomic names.

Example:

```bash
poetry run taxonomical-utils merge --input-file data/example.csv --resolved-taxa-file data/out/resolved_taxa.csv --upper-taxa-lineage-file data/out/upper_taxa_lineage.csv --output-file data/out/final_output.csv --org-column-header idTaxon
```
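Conceptually, the merge step is a join on the taxon name. The sketch below shows a plain left join over rows represented as dictionaries. It is only an illustration under assumptions: `left_join` is a hypothetical helper, and the key column names of the resolved and lineage files are not documented here, so any names used with it are placeholders.

```python
def left_join(left, right, left_key, right_key):
    """Attach the first matching `right` row to every `left` row."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], row)  # keep the first match per key
    merged = []
    for row in left:
        match = index.get(row[left_key], {})
        # Drop the join key from the right side to avoid a duplicate column.
        extra = {k: v for k, v in match.items() if k != right_key}
        merged.append({**row, **extra})
    return merged
```

Rows from the original file that have no match are kept unchanged, so unresolved names remain visible in the final output. Applying the same join a second time would attach the lineage columns.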

Running the Full Pipeline

To run the entire pipeline, you can execute the commands sequentially:

Resolve Taxa:

```bash
poetry run taxonomical-utils resolve --input-file data/example.csv --output-file data/out/resolved_taxa.csv --org-column-header idTaxon
```

Append Upper Taxa Lineage:

```bash
poetry run taxonomical-utils append-taxonomy --input-file data/out/resolved_taxa.csv --output-file data/out/upper_taxa_lineage.csv
```

Merge Data Files:

```bash
poetry run taxonomical-utils merge --input-file data/example.csv --resolved-taxa-file data/out/resolved_taxa.csv --upper-taxa-lineage-file data/out/upper_taxa_lineage.csv --output-file data/out/final_output.csv --org-column-header idTaxon
```

Running the Commands as a Pipeline

You can also chain the commands with && so that each command runs only if the previous one succeeds:

```bash
poetry run taxonomical-utils resolve --input-file data/example.csv --output-file data/out/resolved_taxa.csv --org-column-header idTaxon && \
poetry run taxonomical-utils append-taxonomy --input-file data/out/resolved_taxa.csv --output-file data/out/upper_taxa_lineage.csv && \
poetry run taxonomical-utils merge --input-file data/example.csv --resolved-taxa-file data/out/resolved_taxa.csv --upper-taxa-lineage-file data/out/upper_taxa_lineage.csv --output-file data/out/final_output.csv --org-column-header idTaxon
```
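The same stop-on-failure behaviour can be reproduced from Python when the pipeline has to be embedded in a larger script. The following is a sketch rather than part of the package: `build_pipeline` and `run_pipeline` are hypothetical helpers, and the output file names simply reuse those from the examples above. Only the CLI subcommands and flags documented in this README are used.

```python
import subprocess


def build_pipeline(input_file: str, out_dir: str, column: str) -> list[list[str]]:
    """Return the three CLI invocations as argument lists, in execution order."""
    resolved = f"{out_dir}/resolved_taxa.csv"
    lineage = f"{out_dir}/upper_taxa_lineage.csv"
    final = f"{out_dir}/final_output.csv"
    base = ["poetry", "run", "taxonomical-utils"]
    return [
        base + ["resolve", "--input-file", input_file,
                "--output-file", resolved, "--org-column-header", column],
        base + ["append-taxonomy", "--input-file", resolved,
                "--output-file", lineage],
        base + ["merge", "--input-file", input_file,
                "--resolved-taxa-file", resolved,
                "--upper-taxa-lineage-file", lineage,
                "--output-file", final, "--org-column-header", column],
    ]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps are skipped, mirroring the shell's && chaining.
        subprocess.run(command, check=True)
```

Calling `run_pipeline(build_pipeline("data/example.csv", "data/out", "idTaxon"))` from the project directory would run the three steps shown above. Passing argument lists instead of a single shell string avoids quoting problems with paths that contain spaces.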

Testing

To run the tests, use the following command:

```bash
make test
```

This runs the test suite to check that all functions work as expected.

Contributing

Contributions are welcome! Please submit a pull request or open an issue to discuss any changes.


Repository initiated with fpgmaas/cookiecutter-poetry.

Owner

  • Name: The Digital Botanical Garden Initiative
  • Login: digital-botanical-gardens-initiative
  • Kind: organization

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 2
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • oolonek (2)
Pull Request Authors
  • dependabot[bot] (2)
Top Labels
Issue Labels
Pull Request Labels
  • dependencies (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 10 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 13
  • Total maintainers: 1
pypi.org: taxonomical_utils


  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 10 Last month
Rankings
Dependent packages count: 10.9%
Average: 36.2%
Dependent repos count: 61.5%
Maintainers (1)
Last synced: 5 months ago

Dependencies

.github/actions/setup-poetry-env/action.yml actions
  • actions/cache v3 composite
  • actions/setup-python v4 composite
  • snok/install-poetry v1 composite
.github/workflows/main.yml actions
  • ./.github/actions/setup-poetry-env * composite
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • codecov/codecov-action v3 composite
  • snok/install-poetry v1 composite
.github/workflows/on-release-main.yml actions
  • ./.github/actions/setup-poetry-env * composite
  • actions/checkout v3 composite
.github/workflows/validate-codecov-config.yml actions
  • actions/checkout v3 composite
Dockerfile docker
  • python 3.9-slim-buster build
poetry.lock pypi
  • babel 2.15.0
  • cachetools 5.3.3
  • certifi 2024.2.2
  • cfgv 3.4.0
  • chardet 5.2.0
  • charset-normalizer 3.3.2
  • click 8.1.7
  • colorama 0.4.6
  • coverage 7.5.1
  • deptry 0.12.0
  • distlib 0.3.8
  • exceptiongroup 1.2.1
  • filelock 3.14.0
  • ghp-import 2.1.0
  • griffe 0.45.1
  • identify 2.5.36
  • idna 3.7
  • iniconfig 2.0.0
  • jinja2 3.1.4
  • markdown 3.6
  • markupsafe 2.1.5
  • mergedeep 1.3.4
  • mkdocs 1.6.0
  • mkdocs-autorefs 1.0.1
  • mkdocs-get-deps 0.2.0
  • mkdocs-material 9.5.24
  • mkdocs-material-extensions 1.3.1
  • mkdocstrings 0.23.0
  • mkdocstrings-python 1.8.0
  • mypy 1.10.0
  • mypy-extensions 1.0.0
  • nodeenv 1.8.0
  • packaging 24.0
  • paginate 0.5.6
  • pathspec 0.12.1
  • platformdirs 4.2.2
  • pluggy 1.5.0
  • pre-commit 3.7.1
  • pygments 2.18.0
  • pymdown-extensions 10.8.1
  • pyproject-api 1.6.1
  • pytest 7.4.4
  • pytest-cov 4.1.0
  • python-dateutil 2.9.0.post0
  • pyyaml 6.0.1
  • pyyaml-env-tag 0.1
  • regex 2024.5.15
  • requests 2.32.2
  • setuptools 70.0.0
  • six 1.16.0
  • tomli 2.0.1
  • tox 4.15.0
  • typing-extensions 4.11.0
  • urllib3 2.2.1
  • virtualenv 20.26.2
  • watchdog 4.0.0
pyproject.toml pypi
  • deptry ^0.12.0 develop
  • mypy ^1.5.1 develop
  • pre-commit ^3.4.0 develop
  • pytest ^7.2.0 develop
  • pytest-cov ^4.0.0 develop
  • tox ^4.11.1 develop
  • mkdocs ^1.4.2 docs
  • mkdocs-material ^9.2.7 docs
  • mkdocstrings ^0.23.0 docs
  • python >=3.10,<4.0