segul

An ultrafast and memory efficient tool for phylogenomics

https://github.com/hhandika/segul

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary

Keywords

alignment bioinformatics command-line-tool phylogenetics phylogenomics rust
Last synced: 6 months ago

Repository

An ultrafast and memory efficient tool for phylogenomics

Basic Info
  • Host: GitHub
  • Owner: hhandika
  • License: mit
  • Language: Rust
  • Default Branch: main
  • Homepage: https://segul.app
  • Size: 1.86 MB
Statistics
  • Stars: 26
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 66
Topics
alignment bioinformatics command-line-tool phylogenetics phylogenomics rust
Created over 4 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog License Code of conduct Citation

README.md

SEGUL

SEGUL simplifies complex, tedious, and error-prone data wrangling and summarization for genomics and Sanger datasets. We designed it to be easy for beginners while providing advanced features for experienced users. SEGUL is also a high-performance, memory-efficient genomic tool. In our tests, it was consistently faster and used less memory than existing applications across a range of genomic tasks (see benchmark).

SEGUL runs on many software platforms, from mobile devices and personal computers to high-performance computing clusters. It is available as a command-line interface (CLI), a graphical user interface (GUI) application, a Rust library, and a Python package (see platform support below). SEGUL is part of our ongoing effort to ensure that genomic software is accessible to everyone, regardless of their bioinformatics skills and computing resources.

Learn more about SEGUL in the documentation. We welcome feedback if you encounter issues or difficulties, or have ideas for improving the app and its documentation (details below).

Citation

Handika, H., and J. A. Esselstyn. 2024. SEGUL: Ultrafast, memory-efficient and mobile-friendly software for manipulating and summarizing phylogenomic datasets. Molecular Ecology Resources. https://doi.org/10.1111/1755-0998.13964.

Supported File Formats

Sequence formats:

  1. NEXUS
  2. Relaxed PHYLIP
  3. FASTA
  4. FASTQ (gzipped and uncompressed)
  5. Multiple Alignment Format (MAF) (In development)
  6. Variant Call Format (VCF) (In development)

All formats are supported in both interleaved and sequential versions. The app supports DNA and amino acid sequences, except for FASTQ, MAF, and VCF, which are DNA only.
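The difference between wrapped (interleaved-style) input and sequential output can be illustrated with a minimal Python sketch. It is not SEGUL's implementation or API: `parse_fasta` and `to_sequential_phylip` are hypothetical helper names, and the sketch assumes a small in-memory alignment in which every sequence has the same length.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text; a sequence may be wrapped across several lines."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence ID.
            name = line[1:].split()[0]
            if name in records:
                raise ValueError(f"duplicate sequence ID: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[name] += line.replace(" ", "")
    return records


def to_sequential_phylip(records: Dict[str, str]) -> str:
    """Write an alignment as sequential relaxed PHYLIP (one line per taxon)."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences in an alignment must have equal length")
    width = max(len(name) for name in records)
    rows = [f"{len(records)} {lengths.pop()}"]
    rows += [f"{name.ljust(width)} {seq}" for name, seq in records.items()]
    return "\n".join(rows) + "\n"


wrapped_fasta = """>taxon_a
ACGT
ACGT
>taxon_b
AC-T
ACGA
"""
print(to_sequential_phylip(parse_fasta(wrapped_fasta.splitlines())))
```

Relaxed PHYLIP differs from strict PHYLIP in that taxon names are not limited to ten characters, so the sketch separates each name from its sequence with whitespace instead of a fixed-width column.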

Alignment partition formats:

  1. RAxML
  2. NEXUS

The NEXUS partition can be written either as a charset block embedded in a NEXUS-formatted sequence file or as a separate file.
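To show what the two partition formats describe, here is a minimal Python sketch that concatenates per-locus alignments and writes the resulting locus boundaries in RAxML style and as a NEXUS charset block. It illustrates the file layouts only and is not SEGUL's code: the function names are hypothetical, and filling absent taxa with `?` is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]
Partition = Tuple[str, int, int]  # (locus name, first site, last site), 1-based


def concatenate(loci: Dict[str, Alignment]) -> Tuple[Alignment, List[Partition]]:
    """Join loci end to end and record where each locus starts and ends."""
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    merged = {taxon: "" for taxon in taxa}
    partitions: List[Partition] = []
    start = 1
    for locus, aln in loci.items():
        length = len(next(iter(aln.values())))
        for taxon in taxa:
            # Assumption: a taxon absent from a locus is padded with '?'.
            merged[taxon] += aln.get(taxon, "?" * length)
        partitions.append((locus, start, start + length - 1))
        start += length
    return merged, partitions


def raxml_partition(partitions: List[Partition], datatype: str = "DNA") -> str:
    """RAxML-style partition file: one 'TYPE, name = start-end' line per locus."""
    return "\n".join(f"{datatype}, {n} = {s}-{e}" for n, s, e in partitions) + "\n"


def nexus_charsets(partitions: List[Partition]) -> str:
    """NEXUS sets block; a standalone file would also begin with '#NEXUS'."""
    lines = ["begin sets;"]
    lines += [f"  charset {n} = {s}-{e};" for n, s, e in partitions]
    lines.append("end;")
    return "\n".join(lines) + "\n"


loci = {
    "locus1": {"a": "ACGT", "b": "ACGA"},
    "locus2": {"a": "GG", "c": "GC"},
}
merged, parts = concatenate(loci)
print(raxml_partition(parts))
print(nexus_charsets(parts))
```

Both outputs carry the same information, the site range of each locus in the concatenated matrix. The NEXUS form can live inside the alignment file itself, whereas the RAxML form is always a separate file.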

Installation

GUI Version

Desktop

  • Microsoft Store
  • Mac App Store
  • Snap Store

Mobile

  • App Store
  • Google Play

Learn more about device requirements and GUI app installation in the documentation.

CLI Version

The CLI app may work on any platform that Rust supports. However, we have tested and officially support only the following platforms:

  • Linux
  • macOS
  • Windows
  • Windows Subsystem for Linux (WSL)

CLI Installation Methods

API version

The API version is available for Rust and other programming languages. Rust users can install it via Cargo:

    cargo add segul

Python

We provide Python bindings (called pysegul). Install and use SEGUL like any other Python package:

    pip install pysegul

Learn more about using the SEGUL API in the documentation.

Features

NOTE: To try beta features, follow the installation instructions for the beta version.

| Features | Supported Input Formats | Guideline Quick Links |
| ------------------------------ | ----------------------- | ------------------------ |
| Alignment concatenation | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Alignment conversion | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Alignment filtering | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Alignment splitting | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Alignment partition conversion | RAxML, NEXUS | CLI / GUI / Python |
| Alignment summary statistics | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Alignment trimming | FASTA, NEXUS, PHYLIP | CLI (Beta) / Coming soon |
| Genomic summary statistics | FASTQ, FASTA (contigs) | CLI / GUI / Python |
| Multiple alignment conversion | MAF | CLI (Beta) / Coming soon |
| Sequence addition | FASTA, NEXUS, PHYLIP | CLI (Beta) / Coming soon |
| Sequence extraction | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Sequence filtering | FASTA, NEXUS, PHYLIP | CLI / Coming soon |
| Sequence ID extraction | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Sequence ID mapping | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Sequence ID renaming | FASTA, NEXUS, PHYLIP | CLI / GUI / Coming soon |
| Sequence removal | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
| Sequence translation | FASTA, NEXUS, PHYLIP | CLI / GUI / Python |
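As a rough illustration of what alignment summary statistics involve, the following Python sketch computes a handful of common values for one small DNA alignment. It is a sketch under stated assumptions, not SEGUL's implementation: `summarize_alignment` is a hypothetical name, the selection of statistics is illustrative, and counting `?`, `-`, and `N` as missing data is an assumption made here.

```python
from typing import Dict

MISSING = set("?-N")  # assumption: gaps, '?' and 'N' all count as missing data


def summarize_alignment(aln: Dict[str, str]) -> Dict[str, float]:
    """Return a few per-alignment statistics for a DNA alignment."""
    seqs = [seq.upper() for seq in aln.values()]
    if not seqs or len({len(seq) for seq in seqs}) != 1:
        raise ValueError("expected a non-empty alignment with equal-length sequences")
    chars = "".join(seqs)
    gc = chars.count("G") + chars.count("C")
    at = chars.count("A") + chars.count("T")
    missing = sum(1 for char in chars if char in MISSING)
    # A site is variable when its column holds more than one non-missing state.
    variable = sum(1 for column in zip(*seqs) if len(set(column) - MISSING) > 1)
    return {
        "taxa": len(seqs),
        "sites": len(seqs[0]),
        "gc_content": gc / (gc + at) if gc + at else 0.0,
        "missing_proportion": missing / len(chars) if chars else 0.0,
        "variable_sites": variable,
    }


example = {"a": "ACGT", "b": "ACGA", "c": "AC-N"}
print(summarize_alignment(example))
```

This version holds the whole alignment in memory. A tool built for thousands of files would more likely process the files one at a time (or in parallel) and accumulate the counts.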

Contribution

We welcome contributions of any kind, from issue reports and ideas for improving the app and documentation to code. For ideas and issue reports, please post on the GitHub issues page. For code contributions, please fork the repository and send pull requests to this repository.

Owner

  • Name: Heru Handika
  • Login: hhandika
  • Kind: user
  • Location: Baton Rouge, USA

A daily field biologist and taxonomist && nightly programmer.

Citation (CITATION.cff)

cff-version: 1.2.0
message: Please cite the following works when using this software.
preferred-citation:
  abstract: >-
    <jats:title>Abstract</jats:title><jats:p>Phylogenetic studies now routinely
    require manipulating and summarizing thousands of data files. For most of
    these tasks, currently available software requires considerable computing
    resources and substantial knowledge of command‐line applications. We develop
    an ultrafast and memory‐efficient software, SEGUL, that performs common
    phylogenomic dataset manipulations and calculates statistics summarizing
    essential data features. Our software is available as standalone
    command‐line interface (CLI) and graphical user interface (GUI)
    applications, and as a library for Rust, R and Python, with possible support
    of other languages. The CLI and library versions run native on Windows,
    Linux and macOS, including Apple ARM Macs. The GUI version extends support
    to include mobile iOS, iPadOS and Android operating systems. SEGUL leverages
    the high performance of the Rust programming language to offer fast
    execution times and low memory footprints regardless of dataset size and
    platform choice. The inclusion of a GUI minimizes bioinformatics barriers to
    phylogenomics while SEGUL's efficiency reduces economic barriers by allowing
    analysis on inexpensive hardware. Our support for mobile operating systems
    further enables teaching phylogenomics where access to computing power is
    limited.</jats:p>
  authors:
    - family-names: Handika
      given-names: Heru
    - family-names: Esselstyn
      given-names: Jacob A.
  doi: 10.1111/1755-0998.13964
  identifiers:
    - type: doi
      value: 10.1111/1755-0998.13964
    - type: url
      value: http://dx.doi.org/10.1111/1755-0998.13964
    - type: other
      value: urn:issn:1755-098X
  title: >-
    SEGUL: Ultrafast, memory‐efficient and mobile‐friendly software for
    manipulating and summarizing phylogenomic datasets
  url: http://dx.doi.org/10.1111/1755-0998.13964
  database: Crossref
  date-published: 2024-04-26
  year: 2024
  month: 4
  issn: 1755-098X
  journal: Molecular Ecology Resources
  languages:
    - en
  publisher:
    name: Wiley
  type: article

GitHub Events

Total
  • Watch event: 3
  • Push event: 40
Last Year
  • Watch event: 3
  • Push event: 40

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 1,405
  • Total Committers: 4
  • Avg Commits per committer: 351.25
  • Development Distribution Score (DDS): 0.005
Past Year
  • Commits: 141
  • Committers: 1
  • Avg Commits per committer: 141.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Heru Handika h****g@g****m 1,398
Jake Esselstyn e****n@l****u 4
Heru Handika 4****a 2
Heru Handika h****a@H****l 1
Committer Domains (Top 20 + Academic)
lsu.edu: 1
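
The committer figures above can be reproduced with a short calculation. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. The listing does not state that definition, but it matches the reported values. The function name is hypothetical.

```python
from typing import Sequence


def development_distribution_score(commit_counts: Sequence[int]) -> float:
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        raise ValueError("at least one commit is required")
    return 1 - max(commit_counts) / total


all_time = [1398, 4, 2, 1]  # commit counts from the Top Committers list
past_year = [141]           # a single committer in the past year

print(sum(all_time))                                        # total commits
print(sum(all_time) / len(all_time))                        # average per committer
print(round(development_distribution_score(all_time), 3))   # all-time DDS
print(development_distribution_score(past_year))            # past-year DDS
```

A score near 0 means that almost all commits come from a single committer, as is the case here.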

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 4
  • Total pull requests: 0
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 2.25
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: about 15 hours
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 4.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • stiatragul (2)
  • hhandika (1)
  • AntonioBaeza (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cargo 67,420 total
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 61
  • Total maintainers: 1
crates.io: segul

An ultrafast and memory-efficient tool for phylogenomics

  • Versions: 61
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 67,420 Total
Rankings
Downloads: 17.9%
Stargazers count: 25.7%
Average: 27.2%
Forks count: 29.1%
Dependent repos count: 29.3%
Dependent packages count: 33.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/release.yml actions
  • actions/checkout v3 composite
  • dtolnay/rust-toolchain stable composite
  • svenstaro/upload-release-action v2 composite
.github/workflows/test.yml actions
  • actions-rs/cargo v1 composite
  • actions/checkout v3 composite
  • dtolnay/rust-toolchain stable composite
Cargo.toml cargo
  • assert_cmd 2.* development
  • predicates 2.* development
  • tempdir 0.3.7 development
  • ahash 0.8.*
  • alphanumeric-sort 1.4.*
  • anyhow 1.0.*
  • bytecount 0.6.*
  • chrono 0.4.*
  • clap 3.*
  • colored 2.0.0
  • dialoguer 0.*
  • glob 0.3.*
  • indexmap 1.*
  • indicatif 0.17.*
  • lazy_static 1.*
  • log 0.*
  • log4rs 1.2.*
  • nom 6.*
  • num-format 0.*
  • rayon 1.*
  • regex 1.*