mumu

C++ implementation of lulu, an R package for post-clustering curation of metabarcoding data

https://github.com/frederic-mahe/mumu


Repository


Basic Info
  • Host: GitHub
  • Owner: frederic-mahe
  • License: gpl-3.0
  • Language: Shell
  • Default Branch: main
  • Homepage:
  • Size: 259 KB
Statistics
  • Stars: 9
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created almost 6 years ago · Last pushed 10 months ago
Metadata Files

README.md

mumu


Fast and robust C++ implementation of lulu, an R package for post-clustering curation of metabarcoding data.

mumu is not a strict lulu clone. There is a bug in lulu that prevents some merges from happening. Additionally, mumu can chain merges, which lulu cannot. As a result, mumu merges slightly more than lulu (by a few percent).

mumu is fully tested, with 146 carefully crafted individual black-box tests, covering 100% of the application-specific C++ code. Tests are written using common Unix/Linux shell utilities. Some internal C++ tests (assertions) are also used, but these are only active at compile time, or at runtime when mumu is compiled with the debug flag.
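To illustrate what a shell-based black-box test can look like, here is a minimal sketch. It is not taken from mumu's test suite: the helper names (`success`, `failure`, `expect_success`, `expect_failure`) are hypothetical, and `true` and `false` stand in for the program under test.

```sh
#!/bin/sh
# Minimal black-box test helpers (hypothetical, not mumu's actual test suite).
failures=0

success() { printf 'PASS: %s\n' "$1" ; }
failure() { printf 'FAIL: %s\n' "$1" ; failures=$((failures + 1)) ; }

# expect_success DESCRIPTION COMMAND [ARGS...]
# pass if the command exits with status 0
expect_success() {
    description="$1" ; shift
    if "$@" > /dev/null 2>&1 ; then
        success "${description}"
    else
        failure "${description}"
    fi
}

# expect_failure DESCRIPTION COMMAND [ARGS...]
# pass if the command exits with a non-zero status
expect_failure() {
    description="$1" ; shift
    if "$@" > /dev/null 2>&1 ; then
        failure "${description}"
    else
        success "${description}"
    fi
}

# stand-in commands: only the exit status is observed (black-box)
expect_success "true exits with status 0" true
expect_failure "false exits with a non-zero status" false
```

With mumu installed, the stand-ins could be replaced with real invocations, for example `expect_success "help is available" mumu --help`.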

mumu uses C++20 features to make the code simpler, easier to maintain, and easier to port to other systems. The downside is that building mumu requires a recent C++ compiler (GCC 10 or more recent, or clang 17 or more recent). If your system only provides an older compiler, a recipe for a Singularity/Apptainer/Docker image is available.
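A quick way to see whether a local GCC meets the requirement above is to compare its major version with 10. The sketch below is only an illustration: `major_version` and `is_recent_enough` are hypothetical helpers, not part of mumu, and the check assumes that `g++` really is GCC (on some systems `g++` is an alias for clang, whose version numbers follow a different scheme).

```sh
#!/bin/sh
# Hypothetical helpers to compare a compiler version with a minimum.

# print the major number of a version string: "11.2.0" -> "11"
major_version() {
    printf '%s\n' "${1%%.*}"
}

# is_recent_enough VERSION MINIMUM_MAJOR
# succeed if the major number of VERSION is at least MINIMUM_MAJOR
is_recent_enough() {
    [ "$(major_version "$1")" -ge "$2" ]
}

# check the local g++, if any (-dumpversion prints GCC's version)
if command -v g++ > /dev/null 2>&1 ; then
    if is_recent_enough "$(g++ -dumpversion)" 10 ; then
        echo "g++ looks recent enough to build mumu"
    else
        echo "g++ is older than GCC 10, consider the container image"
    fi
fi
```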

About the name of the project, m is simply the next letter after l, hence mumu. Any similarity to actual words is purely coincidental.

Getting Started

```sh
git clone https://github.com/frederic-mahe/mumu.git
cd ./mumu/
make
make check
make install  # as root or sudo
```

  • dependencies are minimal:

    • a GNU/Linux 64-bit system,
    • make (version 4 or more recent),
    • a recent C++ compiler (GCC 10 or more recent, or clang 17 or more recent),
    • GNU Awk and other GNU tools for testing
  • run (see mumu --help and man mumu for details):

```sh
mumu \
    --otu_table OTU.table \
    --match_list matches.list \
    --log /dev/null \
    --new_otu_table new_OTU.table
```

  • alternatively, build an Apptainer (ex-singularity) image for systems with older compilers:

```sh
# build image with singularity 3.8.5
# (Alpine edge with GCC 11.2 [2022-02-25])
singularity \
    build \
    --fakeroot \
    --force mumu-alpine.sif \
    mumu-alpine.recipe

# test (image is appr. 4 MB)
singularity run mumu-alpine.sif --help
```

Native compilation on Windows and BSD systems is a work in progress.
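For a first run of the `mumu` command shown in this section, a pair of toy input files can help. The sketch below makes assumptions that should be checked against `man mumu`: the OTU table is taken to be tab-separated with a header line and OTU names in the first column, and the match list is taken to be tab-separated with three columns (query, hit, similarity percentage). File contents and the `same_field_count` helper are purely illustrative.

```sh
#!/bin/sh
# Toy inputs for a first mumu run. The file layouts used here are
# assumptions (see `man mumu` for the authoritative description).
workdir=$(mktemp -d)

# OTU table: header line, then one OTU per line, one column per sample
printf 'OTUs\ts1\ts2\ts3\n'  > "${workdir}/OTU.table"
printf 'A\t100\t80\t60\n'   >> "${workdir}/OTU.table"
printf 'B\t10\t8\t6\n'      >> "${workdir}/OTU.table"

# match list: query, hit, similarity (%), listed in both directions
printf 'A\tB\t99.0\n'  > "${workdir}/matches.list"
printf 'B\tA\t99.0\n' >> "${workdir}/matches.list"

# succeed if every line of a file has the same number of tab-separated fields
same_field_count() {
    awk -F '\t' 'NR == 1 {n = NF} NF != n {bad = 1} END {exit bad + 0}' "$1"
}
same_field_count "${workdir}/OTU.table"    && echo "OTU table is rectangular"
same_field_count "${workdir}/matches.list" && echo "match list is rectangular"

# run mumu only if it is installed (same options as in the README)
if command -v mumu > /dev/null 2>&1 ; then
    mumu \
        --otu_table "${workdir}/OTU.table" \
        --match_list "${workdir}/matches.list" \
        --log /dev/null \
        --new_otu_table "${workdir}/new_OTU.table"
fi
```

The field-count check is independent of mumu and only catches ragged tables, a common symptom of a table exported with the wrong separator.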

wrapper

  • Adrien Taudière (@adrientaudiere) published mumu_pq, a wrapper that makes it possible to use mumu on phyloseq objects (R).

Roadmap

mumu is currently feature-complete (nothing is missing), but refactoring will continue and new versions will be released as soon as more C++ features (C++20 modules, C++23 ranges, etc.) are standardized and supported by compilers.

  • [x] replicate lulu's results,
  • [x] fix lulu's bug,
  • [x] allow chained merges,
  • [x] high software quality score (softwipe),
  • [x] allow empty input files,
  • [x] allow process substitutions (input/output),
  • [x] compile without warnings with GCC 10 and 11,
  • [x] compile without warnings with GCC 12.2,
  • [ ] compile without warnings with GCC 12.3,
  • [x] compile without warnings with GCC 13 and 14 (alpha),
  • [x] compile with clang-17, 18, 19 and 20 (std::ranges is not supported in clang-16),
  • [x] investigate the five minor failed tests when running on Alpine (as root),
  • [ ] add a row of column headers to the log file? (see issue https://github.com/frederic-mahe/mumu/issues/4),
  • [ ] silently strip quote symbols from input table? Exporters often quote strings, tripping some users,
  • [ ] allow named pipes (input/output),
  • [x] test performance on ARM64 GNU/Linux (Raspberry),
  • [ ] faster output with std::format (in 2025),
  • [ ] native compilation on Windows (issue with getopt.h),
  • [ ] native compilation on BSD (issue with the Makefile),
  • [ ] native compilation on macOS
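Until quote stripping is decided (see the roadmap item on quoted input tables), quotes can be removed before calling mumu. The sketch below is a possible workaround, not a mumu feature: `strip_quotes` is a hypothetical helper, and it deletes every double-quote character, including any that are legitimately part of an OTU or sample name.

```sh
#!/bin/sh
# Remove double-quote characters from a table before passing it to mumu.
# strip_quotes is a hypothetical helper: it reads stdin and writes stdout.
strip_quotes() {
    tr -d '"'
}

# example with a small quoted table
printf '"OTUs"\t"s1"\t"s2"\n"A"\t10\t0\n' | strip_quotes
```

In practice, this could be used as `strip_quotes < quoted_OTU.table > OTU.table` (file names are placeholders). In shells that support process substitution, such as bash, the cleaned table could presumably be passed directly with `--otu_table <(strip_quotes < quoted_OTU.table)`, since process substitutions are listed as supported in the roadmap.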

mumu releases follow the Semantic Versioning 2.0.0 rules.

Owner

  • Name: Frédéric Mahé
  • Login: frederic-mahe
  • Kind: user
  • Location: Montpellier, France
  • Company: Cirad

bioinformatician

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Mahé"
  given-names: "Frédéric"
  orcid: "https://orcid.org/0000-0002-2808-0984"
title: "mumu: post-clustering curation tool for metabarcoding data"
version: 1.0.2
date-released: 2023-03-25
url: "https://github.com/frederic-mahe/mumu"

GitHub Events

Total
  • Watch event: 1
  • Delete event: 1
  • Push event: 4
Last Year
  • Watch event: 1
  • Delete event: 1
  • Push event: 4

Dependencies

.github/workflows/c-cpp.yml actions
  • actions/checkout v2 composite
.github/workflows/coverage.yml actions
  • actions/checkout v2 composite