mumu
C++ implementation of lulu, a R package for post-clustering curation of metabarcoding data
Repository
Statistics
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
mumu
fast and robust C++ implementation of lulu, a R package for post-clustering curation of metabarcoding data
mumu is not a strict lulu clone. There is a bug in lulu that prevents some merges from happening. Additionally, mumu can chain merges, which lulu cannot. As a result, mumu merges slightly more than lulu (by a few percent).
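To picture what chaining merges could mean, here is a minimal sketch. It assumes that chaining works like this: if A is merged into B, and B is in turn merged into C, then A ends up in C. That reading, the `resolve_chains` helper and the two-column `child<TAB>parent` input are illustrative assumptions, not mumu's actual algorithm or file format.

```sh
# Hypothetical illustration of chained merges (not mumu's code).
# Input:  one "child<TAB>parent" pair per line.
# Output: "child<TAB>final parent", after following every chain.
resolve_chains() {
    awk 'BEGIN {FS = OFS = "\t"}
         {parent[$1] = $2 ; order[NR] = $1}
         END {
             for (i = 1 ; i <= NR ; i++) {
                 child = order[i]
                 root = parent[child]
                 steps = 0
                 # follow the chain; the step limit guards against cycles
                 while ((root in parent) && steps < NR) {
                     root = parent[root]
                     steps++
                 }
                 print child, root
             }
         }'
}

# A is merged into B, B is merged into C: both end up in C
printf 'A\tB\nB\tC\n' | resolve_chains
```

Without chaining, A would stay attached to B even though B itself has been merged into C.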
mumu is fully tested, with 146 carefully crafted individual
black-box tests, covering 100% of the application-specific C++
code. Tests are written using common Unix/Linux shell utilities. Some
C++ internal tests are also used (assertions), but these are only
active at compile-time, or at runtime when compiling with the debug
flag.
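As an illustration of black-box testing with shell utilities, here is a minimal sketch of a helper that runs a command and compares its exit status with the expected one. The `expect_status` function is hypothetical and is not taken from mumu's test suite. The commands `true` and `false` are stand-ins for the program under test.

```sh
# Hypothetical helper (not part of mumu's test suite): run a command,
# discard its output, and compare its exit status with the expected one.
expect_status() {
    description="$1"
    expected="$2"
    shift 2
    observed=0
    "$@" > /dev/null 2>&1 || observed=$?
    if [ "${observed}" -eq "${expected}" ] ; then
        printf 'PASS: %s\n' "${description}"
    else
        printf 'FAIL: %s (expected %s, got %s)\n' \
            "${description}" "${expected}" "${observed}"
        return 1
    fi
}

# stand-in commands; a real test would call the program under test
expect_status "true exits with status 0" 0 true
expect_status "false exits with status 1" 1 false
```

A test written this way only looks at the observable behavior of the program (exit status, output), which is what makes it black-box.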
mumu uses C++20 features to make the code simpler, easier to maintain and to port to other systems. The downside is that using mumu requires a recent C++ compiler (GCC 10 or more recent, clang 17 or more recent). If your system only provides an older compiler, a recipe for a singularity/Apptainer/docker image is available.
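To check a compiler against the minimum version stated above before building, something along these lines may help. This is a minimal sketch: the `major_at_least` helper is hypothetical, and it assumes that `g++ -dumpversion` prints a version string starting with the major version number (for example `13` or `11.2.0`).

```sh
# Hypothetical helper: succeed if the major number of a dotted version
# string (e.g. "11.2.0") is at least the given minimum.
major_at_least() {
    major="${1%%.*}"
    [ "${major}" -ge "$2" ]
}

# GCC 10 is the minimum stated above
if command -v g++ > /dev/null 2>&1 ; then
    if major_at_least "$(g++ -dumpversion)" 10 ; then
        echo "g++ is recent enough"
    else
        echo "g++ is too old, consider building the container image"
    fi
fi
```

The same helper could be used with clang and a minimum of 17, provided the version string is extracted in a similar way.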
About the name of the project, m is simply the next letter after l, hence mumu. Any similarity to actual words is purely coincidental.
Getting Started
```sh
git clone https://github.com/frederic-mahe/mumu.git
cd ./mumu/
make
make check
make install  # as root or sudo
```
dependencies are minimal:
- a GNU/Linux 64-bit system,
- `make` (version 4 or more recent),
- a recent C++ compiler (GCC 10 or more recent, clang 17 or more recent),
- GNU Awk and other GNU tools for testing
run (see `mumu --help` and `man mumu` for details):

```sh
mumu \
    --otu_table OTU.table \
    --match_list matches.list \
    --log /dev/null \
    --new_otu_table new_OTU.table
```
- alternatively, build an Apptainer (ex-singularity) image for systems with older compilers:
```sh
# build image with singularity 3.8.5
# (Alpine edge with GCC 11.2 [2022-02-25])
singularity \
    build \
    --fakeroot \
    --force mumu-alpine.sif \
    mumu-alpine.recipe

# test (image is approx. 4 MB)
singularity run mumu-alpine.sif --help
```
Native compilation on Windows and BSD systems is a work in progress.
wrapper
- Adrien Taudière (@adrientaudiere) published `mumu_pq`, a wrapper that makes it possible to use `mumu` on phyloseq objects (R).
Roadmap
mumu is currently feature-complete (nothing is missing), but refactoring will continue and new versions will be released as soon as more C++ features (C++20 modules, C++23 ranges, etc.) are standardized and supported by compilers.
- [x] replicate lulu's results,
- [x] fix lulu's bug,
- [x] allow chained merges,
- [x] high software quality score (softwipe),
- [x] allow empty input files,
- [x] allow process substitutions (input/output),
- [x] compile without warnings with GCC 10 and 11,
- [x] compile without warnings with GCC 12.2,
- [ ] compile without warnings with GCC 12.3,
- [x] compile without warnings with GCC 13 and 14 (alpha)
- [x] compile with clang-17, 18, 19 and 20 (`std::ranges` is not supported in clang-16),
- [x] investigate the five minor failed tests when running on Alpine (as root),
- [ ] add a row of column headers to the log file? (see issue https://github.com/frederic-mahe/mumu/issues/4)
- [ ] silently strip quote symbols from input table? Exporters often quote strings, tripping some users,
- [ ] allow named pipes (input/output),
- [x] test performances on ARM64 GNU/Linux (Raspberry),
- [ ] faster output with `std::format` (in 2025),
- [ ] native compilation on Windows (issue with `getopt.h`),
- [ ] native compilation on BSD (issue with the Makefile),
- [ ] native compilation on macOS
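Until quote symbols are stripped automatically (see the open item above), a table can be cleaned before it is passed to mumu. This is a minimal sketch of such a filter. The `strip_quotes` helper and the `quoted_OTU.table` file name are hypothetical. Note that the filter removes every double quote, including any that legitimately belong to a name.

```sh
# Hypothetical pre-processing filter: delete all double quotes
# from a table read on standard input.
strip_quotes() {
    tr -d '"'
}

# small demonstration on a two-line quoted table
printf '"OTUid"\t"sample1"\n"A"\t10\n' | strip_quotes

# Since process substitutions are listed as supported above, the filter
# could in principle be applied on the fly (in a shell that supports them):
# mumu --otu_table <(strip_quotes < quoted_OTU.table) ...
```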
mumu releases follow the Semantic Versioning 2.0.0 rules.
Owner
- Name: Frédéric Mahé
- Login: frederic-mahe
- Kind: user
- Location: Montpellier, France
- Company: Cirad
- Website: http://mahé.org/
- Repositories: 20
- Profile: https://github.com/frederic-mahe
bioinformatician
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Mahé"
    given-names: "Frédéric"
    orcid: "https://orcid.org/0000-0002-2808-0984"
title: "mumu: post-clustering curation tool for metabarcoding data"
version: 1.0.2
date-released: 2023-03-25
url: "https://github.com/frederic-mahe/mumu"
GitHub Events
Total
- Watch event: 1
- Delete event: 1
- Push event: 4
Last Year
- Watch event: 1
- Delete event: 1
- Push event: 4
Dependencies
- actions/checkout v2 composite