MiscMetabar

MiscMetabar: an R package to facilitate visualization and reproducibility in metabarcoding analysis - Published in JOSS (2023)

https://github.com/adrientaudiere/miscmetabar

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

amplicon amplicon-sequencing biodiversity-informatics ecology illumina metabarcoding ngs-analysis package r r-package visualization

Repository

R package MiscMetabar: Miscellaneous functions for metabarcoding analysis

Basic Info
Statistics
  • Stars: 18
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 30
Created over 5 years ago · Last pushed 4 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.Rmd

---
output: github_document
always_allow_html: yes
bibliography: paper/bibliography.bib
---

![R](https://img.shields.io/badge/r-%23276DC3.svg?style=for-the-badge&logo=r&logoColor=white)
[![codecov](https://codecov.io/gh/adrientaudiere/MiscMetabar/graph/badge.svg?token=NXFRSIKYC0)](https://app.codecov.io/gh/adrientaudiere/MiscMetabar)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](https://github.com/adrientaudiere/MiscMetabar/blob/master/CODE_OF_CONDUCT.md) 
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![CodeFactor](https://www.codefactor.io/repository/github/adrientaudiere/miscmetabar/badge/master)](https://www.codefactor.io/repository/github/adrientaudiere/miscmetabar/overview/master)
[![R-CMD-check](https://github.com/adrientaudiere/MiscMetabar/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/adrientaudiere/MiscMetabar/actions/workflows/R-CMD-check.yaml)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.06038/status.svg)](https://doi.org/10.21105/joss.06038)






```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  message = FALSE
)
```

# MiscMetabar

See the pkgdown documentation site [here](https://adrientaudiere.github.io/MiscMetabar/) and the [package paper](https://doi.org/10.21105/joss.06038) in the Journal of Open Source Software.

Biological studies, especially in ecology, health sciences and taxonomy, need to describe the biological composition of samples. Over the last twenty years, (i) the development of DNA sequencing, (ii) reference databases, (iii) high-throughput sequencing (HTS), and (iv) bioinformatics resources have enabled the description of biological communities through metabarcoding. Metabarcoding involves the sequencing of millions (*meta*-) of short regions of specific DNA (*-barcoding*, @valentini2009) often from environmental samples (eDNA, @taberlet2012) such as human stomach contents, lake water, soil, and air.

`MiscMetabar` aims to facilitate the **description**, **transformation**, **exploration** and **reproducibility** of metabarcoding analyses using R. The development of `MiscMetabar` relies heavily on the R packages [`dada2`](https://benjjneb.github.io/dada2/index.html) [@callahan2016], [`phyloseq`](https://joey711.github.io/phyloseq/) [@mcmurdie2013] and [`targets`](https://books.ropensci.org/targets/) [@landau2021]. 

## Installation

A CRAN version of MiscMetabar is available.

```{r, results = 'hide', eval=FALSE}
install.packages("MiscMetabar")
```

You may need to install the required Bioconductor packages (dada2 and phyloseq) first. See their installation pages.
Another solution is to use the package [pak](https://pak.r-lib.org/) to install MiscMetabar. It has the benefit of checking for dependencies (system requirements) that are missing on your computer. Thank you, [pak](https://pak.r-lib.org/)!

```{r, results = 'hide', eval=FALSE}
pak::pkg_install("MiscMetabar")
```
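
For the Bioconductor route mentioned above, a minimal sketch could look like the following. It assumes that the CRAN package `BiocManager` is used to install `dada2` and `phyloseq`; the helper `missing_pkgs()` is hypothetical and is not part of MiscMetabar.

```r
# Hypothetical helper (not part of MiscMetabar): return the packages from
# `pkgs` that cannot be loaded on this machine.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Bioconductor dependencies named in this README
bioc_deps <- missing_pkgs(c("dada2", "phyloseq"))

# Install only in an interactive session, so that sourcing this script
# never triggers a download by surprise.
if (length(bioc_deps) > 0 && interactive()) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(bioc_deps)
}
```

The installation pages of dada2 and phyloseq remain the reference if this sketch does not match your setup.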



You can also install the stable development version from [GitHub](https://github.com/) with:

```{r, results = 'hide', eval=FALSE}
if (!require("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("adrientaudiere/MiscMetabar")
```

You can install the unstable development version from [GitHub](https://github.com/) with:

```{r, results = 'hide', eval=FALSE}
if (!require("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("adrientaudiere/MiscMetabar", ref = "dev")
```


## Some uses of MiscMetabar

See the articles on the [MiscMetabar](https://adrientaudiere.github.io/MiscMetabar/) website for more examples.

For an introduction to metabarcoding in R, see the [state of the field](https://adrientaudiere.github.io/MiscMetabar/articles/states_of_fields_in_R.html) article. The [import, export and tracking](https://adrientaudiere.github.io/MiscMetabar/articles/import_export_track.html) article explains how to import and export `phyloseq` objects. It also shows how to summarize useful information (number of sequences, samples and clusters) across bioinformatic pipelines. The article [explore data](https://adrientaudiere.github.io/MiscMetabar/articles/explore_data.html) takes a closer look at different ways to explore samples and taxonomic data from a `phyloseq` object.

 
If you are interested in ecological metrics, see the articles describing [alpha-diversity](https://adrientaudiere.github.io/MiscMetabar/articles/alpha-div.html) and [beta-diversity](https://adrientaudiere.github.io/MiscMetabar/articles/beta-div.html) analysis.
The article [filter taxa and samples](https://adrientaudiere.github.io/MiscMetabar/articles/filter.html) describes some data filtering processes using MiscMetabar, and the [reclustering](https://adrientaudiere.github.io/MiscMetabar/articles/Reclustering.html) tutorial introduces the different ways of clustering already-clustered OTUs/ASVs. The article [tengeler](https://adrientaudiere.github.io/MiscMetabar/articles/tengeler.html) explores the dataset from Tengeler et al. (2020) using some MiscMetabar functions.

For developers, I also wrote an article describing some [coding rules](https://adrientaudiere.github.io/MiscMetabar/articles/Rules.html).

### Summarize a physeq object

```{r example}
#| fig.alt: >
#|   Four rectangles represent the four components of an example phyloseq
#|   dataset. In each rectangle, some information about the component is
#|   shown.

library("MiscMetabar")
library("phyloseq")
library("magrittr")
data("data_fungi")
summary_plot_pq(data_fungi)
```

### Alpha-diversity analysis

```{r, fig.cap="Hill number 0"}
#| fig.alt: >
#|   Hill number 0 (i.e. richness) plotted as a function of
#|   the Height modality
p <- MiscMetabar::hill_pq(data_fungi, fact = "Height")
p$plot_Hill_0
```

```{r, fig.cap="Result of the Tukey post-hoc test"}
#| fig.alt: >
#|   The result of the Tukey HSD test of Hill numbers by the
#|   Height modality.
p$plot_tuckey
```
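
To make the term "Hill number" concrete, here is a small base R sketch of the textbook definition, where the Hill number of order q is `sum(p^q)^(1 / (1 - q))` for relative abundances `p`. This is only an illustration on made-up abundance vectors; the function `hill_number()` is hypothetical and is not how `hill_pq()` is implemented.

```r
# Hypothetical helper (not part of MiscMetabar): Hill number of order q
# for a vector of taxon abundances.
hill_number <- function(abundances, q) {
  # Relative abundances of the taxa that are actually present
  p <- abundances[abundances > 0] / sum(abundances)
  if (q == 1) {
    # Limit when q tends to 1: exponential of the Shannon entropy
    return(exp(-sum(p * log(p))))
  }
  sum(p^q)^(1 / (1 - q))
}

# Two made-up samples with the same richness but different evenness
even_sample <- c(25, 25, 25, 25)
uneven_sample <- c(97, 1, 1, 1)

hill_number(even_sample, 0)   # order 0: number of taxa (richness)
hill_number(uneven_sample, 0) # same richness as the even sample
hill_number(uneven_sample, 2) # order 2 gives more weight to dominant taxa
```

With q = 0 every present taxon counts equally, which is why Hill number 0 is the richness; higher orders progressively down-weight rare taxa.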

### Beta-diversity analysis

```{r}
#| fig.alt: >
#|   A Venn diagram showing the number of shared ASVs and the percentage
#|   of shared ASVs between the three modalities of Height (low, middle and high).
if (!require("ggVennDiagram", quietly = TRUE)) {
  install.packages("ggVennDiagram")
}
ggvenn_pq(data_fungi, fact = "Height") +
  ggplot2::scale_fill_distiller(palette = "BuPu", direction = 1) +
  ggplot2::labs(title = "Number of ASVs shared among tree Height levels")
```
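
The counts shown in such a Venn diagram boil down to set operations on the ASVs detected in each group. The sketch below illustrates the idea in base R on made-up ASV identifiers; it is an assumption-laden toy example and does not use `data_fungi` or reproduce the internals of `ggvenn_pq()`.

```r
# Toy data (hypothetical): ASV identifiers detected at each Height level
asv_by_height <- list(
  Low = c("ASV1", "ASV2", "ASV3"),
  Middle = c("ASV2", "ASV3", "ASV4"),
  High = c("ASV3", "ASV5")
)

# ASVs found at every level (central region of the Venn diagram)
shared_all <- Reduce(intersect, asv_by_height)

# All ASVs found at one level or more
all_asv <- Reduce(union, asv_by_height)

# Number and percentage of ASVs shared by the three levels
n_shared <- length(shared_all)
pct_shared <- 100 * n_shared / length(all_asv)

# ASVs shared by Low and Middle only (absent from High)
low_middle_only <- setdiff(
  intersect(asv_by_height$Low, asv_by_height$Middle),
  asv_by_height$High
)
```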

### Note for non-Linux users

Some functions may not work on Windows (*e.g.* `track_wkflow()`, `cutadapt_remove_primers()`, `krona()`, `vsearch_clustering()`, ...). One solution is to use a Docker container, for example from the great [rocker project](https://rocker-project.org/).

The following functions have some limitations or do not work at all on Windows:

- `build_phytree_pq()`
- `count_seq()`
- `cutadapt_remove_primers()`
- `krona()`
- `merge_krona()`
- `multipatt_pq()`
- `plot_tsne_pq()`
- `rotl_pq()`
- `save_pq()`
- `tax_datatable()`
- `track_wkflow()`
- `track_wkflow_samples()` 
- `tsne_pq()`
- `venn_pq()`

MiscMetabar is developed under Linux and the vast majority of functions should work on Unix systems, but it has not been tested on macOS.
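
If a script has to run on several operating systems, one option is to check the platform before calling one of the functions listed above. The following base R sketch shows the idea; the helper `is_supported_here()` is hypothetical (it is not part of MiscMetabar) and relies only on the list given in this README.

```r
# Functions listed in this README as limited or not working on Windows
windows_limited <- c(
  "build_phytree_pq", "count_seq", "cutadapt_remove_primers", "krona",
  "merge_krona", "multipatt_pq", "plot_tsne_pq", "rotl_pq", "save_pq",
  "tax_datatable", "track_wkflow", "track_wkflow_samples", "tsne_pq",
  "venn_pq"
)

# Hypothetical helper: FALSE when `fun_name` is in the list above and the
# session runs on Windows, TRUE otherwise.
is_supported_here <- function(fun_name, os_type = .Platform$OS.type) {
  !(os_type == "windows" && fun_name %in% windows_limited)
}

# Example: warn before running a step that may fail on Windows
if (!is_supported_here("track_wkflow")) {
  message("track_wkflow() may not work on this OS; consider a Docker container.")
}
```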

### Installation of other software on Debian-based Linux distributions

If you encounter any errors or have any questions about the installation of these tools, please visit their dedicated websites.

#### [blast+](https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html#downloadblastdata)

```sh
sudo apt-get install ncbi-blast+
```

#### [vsearch](https://github.com/torognes/vsearch)

```sh
sudo apt-get install vsearch
```

Another possibility is to [install vsearch](https://bioconda.github.io/recipes/vsearch/README.html?highlight=vsearch#package-package%20'vsearch') with `conda`.

#### [swarm](https://github.com/torognes/swarm)

```sh
git clone https://github.com/torognes/swarm.git
cd swarm/
make
```

Another possibility is to [install swarm](https://bioconda.github.io/recipes/swarm/README.html?highlight=swarm#package-package%20'swarm') with `conda`.

#### [Mumu](https://github.com/frederic-mahe/mumu)

```sh
git clone https://github.com/frederic-mahe/mumu.git
cd ./mumu/
make
make check
make install  # as root or sudo
```

#### [cutadapt](https://cutadapt.readthedocs.io/en/stable/)

```sh
conda create -n cutadaptenv cutadapt
```
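
After installing the tools above, one way to check from R whether their executables are visible on the `PATH` is `Sys.which()`. This is a minimal sketch: the command names (`blastn`, `vsearch`, `swarm`, `mumu`, `cutadapt`) are assumed to be the default executable names, and a tool installed in a conda environment (such as `cutadaptenv`) will only be found if that environment is active or its `bin` directory is on the `PATH`.

```r
# Assumed executable names for the external tools listed above
tools <- c(
  "blast+" = "blastn",
  vsearch = "vsearch",
  swarm = "swarm",
  mumu = "mumu",
  cutadapt = "cutadapt"
)

# Sys.which() returns the full path of each command, or "" if not found
tool_paths <- Sys.which(tools)

tool_status <- data.frame(
  tool = names(tools),
  command = unname(tools),
  found = unname(nzchar(tool_paths))
)

tool_status
```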

Owner

  • Name: Adrien Taudiere
  • Login: adrientaudiere
  • Kind: user
  • Company: IdEst

Professional website: https://adrientaudiere.com

JOSS Publication

MiscMetabar: an R package to facilitate visualization and reproducibility in metabarcoding analysis
Published
December 19, 2023
Volume 8, Issue 92, Page 6038
Authors
Adrien Taudière ORCID
IdEst, Saint-Bonnet-de-Salendrinque, 30460 France
Editor
Kelly Rowland ORCID
Tags
Bioinformatic Metagenomics Barcoding Reproducibility

Citation (citation.cff)

cff-version: "1.2.0"
authors:
- family-names: Taudière
  given-names: Adrien
  orcid: "https://orcid.org/0000-0003-1088-1182"
doi: 10.5281/zenodo.10370781
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Taudière
    given-names: Adrien
    orcid: "https://orcid.org/0000-0003-1088-1182"
  date-published: 2023-12-19
  doi: 10.21105/joss.06038
  issn: 2475-9066
  issue: 92
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6038
  title: "MiscMetabar: an R package to facilitate visualization and
    reproducibility in metabarcoding analysis"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06038"
  volume: 8
title: "MiscMetabar: an R package to facilitate visualization and
  reproducibility in metabarcoding analysis"

GitHub Events

Total
  • Create event: 5
  • Release event: 5
  • Issues event: 2
  • Watch event: 4
  • Issue comment event: 4
  • Push event: 41
  • Pull request event: 19
Last Year
  • Create event: 5
  • Release event: 5
  • Issues event: 2
  • Watch event: 4
  • Issue comment event: 4
  • Push event: 41
  • Pull request event: 19

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 511
  • Total Committers: 4
  • Avg Commits per committer: 127.75
  • Development Distribution Score (DDS): 0.029
Past Year
  • Commits: 65
  • Committers: 3
  • Avg Commits per committer: 21.667
  • Development Distribution Score (DDS): 0.215
Top Committers
Name Email Commits
Adrien Taudiere a****e@t****m 496
adrien a****n@l****u 8
adrien a****n@l****u 6
Markus Ankenbrand m****s@a****e 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 18
  • Total pull requests: 92
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 3 hours
  • Total issue authors: 8
  • Total pull request authors: 2
  • Average comments per issue: 2.28
  • Average comments per pull request: 0.03
  • Merged pull requests: 86
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 26
  • Average time to close issues: 3 days
  • Average time to close pull requests: 2 minutes
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 25
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • iimog (7)
  • tkchafin (4)
  • adrientaudiere (2)
  • gabyr2 (1)
  • ddsjoberg (1)
  • Julie-lmq (1)
  • RachBioHaz (1)
  • garanceleroy (1)
Pull Request Authors
  • adrientaudiere (110)
  • iimog (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 707 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
cran.r-project.org: MiscMetabar

Miscellaneous Functions for Metabarcoding Analysis

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 707 Last month
Rankings
Dependent packages count: 28.2%
Dependent repos count: 36.2%
Average: 49.7%
Downloads: 84.7%
Maintainers (1)
Last synced: 4 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5.0 depends
  • dada2 * depends
  • ggplot2 * depends
  • magrittr * depends
  • phyloseq * depends
  • lifecycle * imports
  • Biostrings * suggests
  • DECIPHER * suggests
  • DESeq2 * suggests
  • DT * suggests
  • circlize * suggests
  • edgeR * suggests
  • grid * suggests
  • gridExtra * suggests
  • here * suggests
  • lulu * suggests
  • metacoder * suggests
  • methods * suggests
  • mixtools * suggests
  • multcompView * suggests
  • networkD3 * suggests
  • pbapply * suggests
  • plyr * suggests
  • reshape2 * suggests
  • speedyseq * suggests
  • testthat >= 3.0.0 suggests
  • vegan * suggests
  • venneuler * suggests
  • viridis * suggests
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/draft-pdf.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite