https://github.com/althonos/pymuscle5

Cython bindings and Python interface to MUSCLE v5, a highly efficient and accurate multiple sequence alignment software.

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

bioinformatics cython-library genomics multiple-sequence-alignment muscle python-bindings python-library sequence-alignment
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: althonos
  • License: gpl-3.0
  • Language: Cython
  • Default Branch: main
  • Homepage:
  • Size: 115 KB
Statistics
  • Stars: 21
  • Watchers: 4
  • Forks: 2
  • Open Issues: 3
  • Releases: 0
Created almost 4 years ago · Last pushed almost 2 years ago
Metadata Files
Readme Contributing License

README.md

pyMUSCLE5

Cython bindings and Python interface to MUSCLE v5, a highly efficient and accurate multiple sequence alignment software.

🗺️ Overview

MUSCLE is widely used software for making multiple alignments of biological sequences. Version 5 of MUSCLE achieves the highest scores on several benchmark tests and scales to thousands of sequences on a commodity desktop computer.

pyMUSCLE5 is a Python module that provides bindings to MUSCLE v5 using Cython. It directly interacts with the MUSCLE internals, which has the following advantages:

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pymuscle5 as a dependency to your project, and stop worrying about the MUSCLE binaries being properly set up on the end user's machine.
  • no intermediate files: Everything happens in memory, in a Python object you fully control, so you don't have to invoke the MUSCLE CLI using a sub-process and temporary files. Sequences can be passed directly as strings or bytes, which avoids the overhead of formatting your input to FASTA for MUSCLE.
  • no OpenMP: The original MUSCLE code uses OpenMP to parallelize embarrassingly parallel tasks. In pyMUSCLE5 the dependency on OpenMP has been removed in favor of the Python threading module for better portability.
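
The last point can be illustrated with a short sketch. It assumes nothing about the pyMUSCLE5 internals: the `hamming` and `pairwise_distances` helpers are hypothetical, and the pure-Python scoring function only stands in for compiled alignment work. It shows how independent tasks can be spread over a standard-library thread pool without any OpenMP runtime.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def hamming(pair):
    """Count mismatching positions between two equal-length sequences."""
    a, b = pair
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(sequences, threads=4):
    """Score every pair of sequences, one independent task per pair.

    Hypothetical helper for illustration; not part of pyMUSCLE5.
    """
    pairs = list(combinations(sequences, 2))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so results line up with `pairs`
        return list(pool.map(hamming, pairs))


distances = pairwise_distances([b"MKTA", b"MKSA", b"LKSA"])
print(distances)
```

In pure Python the global interpreter lock limits the speed-up of such a pool; threads pay off when the per-task work runs in compiled code that releases the lock.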

This library is in a very experimental stage at the moment, and consistency of the results across versions or platforms is not guaranteed yet.

🔧 Installing

At the moment, pyMUSCLE5 is not available on PyPI. You can, however, install it directly from GitHub with:

```console
$ pip install git+https://github.com/althonos/pymuscle5
```

💡 Example

Let's load some sequences from a FASTA file, use an Aligner to align the proteins together, and print the alignment in two-line FASTA format.

🔬 Biopython

```python
import os

import Bio.SeqIO
import pymuscle5

# Load the records from a FASTA file with Biopython
path = os.path.join("pymuscle", "tests", "data", "swissprot-halorhodopsin.faa")
records = list(Bio.SeqIO.parse(path, "fasta"))

# Wrap each record in a pymuscle5.Sequence, passing name and residues as bytes
sequences = [
    pymuscle5.Sequence(record.id.encode(), bytes(record.seq))
    for record in records
]

# Align the sequences in memory
aligner = pymuscle5.Aligner()
msa = aligner.align(sequences)

# Print the alignment in two-line FASTA format
for seq in msa.sequences:
    print(f">{seq.name.decode()}")
    print(seq.sequence.decode())
```
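
If the two-line FASTA output is needed as a string rather than printed, the final loop can be factored into a small helper. This is a minimal sketch: `to_two_line_fasta` is a hypothetical name, not part of pyMUSCLE5, and the sample rows are made-up stand-ins for the name and sequence bytes of an alignment.

```python
def to_two_line_fasta(entries):
    """Format (name, sequence) byte pairs as two-line FASTA text."""
    return "".join(
        f">{name.decode()}\n{sequence.decode()}\n"
        for name, sequence in entries
    )


# Made-up aligned rows standing in for the (name, sequence) bytes of an alignment
aligned = [(b"seq1", b"MK-TA"), (b"seq2", b"MKSTA")]
fasta_text = to_two_line_fasta(aligned)
print(fasta_text, end="")
```

With a real alignment, the pairs could be built as `(seq.name, seq.sequence)` for each `seq` in `msa.sequences`, as in the loop above.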

🧪 Scikit-bio

```python
import os

import skbio.io
import pymuscle5

# Load the records from a FASTA file with scikit-bio
path = os.path.join("pymuscle", "tests", "data", "swissprot-halorhodopsin.faa")
records = list(skbio.io.read(path, "fasta"))

# Wrap each record in a pymuscle5.Sequence, viewing the residues as unsigned bytes
sequences = [
    pymuscle5.Sequence(record.metadata["id"].encode(), record.values.view('B'))
    for record in records
]

# Align the sequences in memory
aligner = pymuscle5.Aligner()
msa = aligner.align(sequences)

# Print the alignment in two-line FASTA format
for seq in msa.sequences:
    print(f">{seq.name.decode()}")
    print(seq.sequence.decode())
```

The call to the view method is needed so that Cython can see the sequence as an array of unsigned char.
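
The same kind of reinterpretation can be tried with only the standard library. The sketch below is an analogy, not what scikit-bio or pyMUSCLE5 do internally: it assumes the sequence is stored as a buffer of one-byte character items, and uses `memoryview.cast` to expose the same memory as unsigned bytes without copying.

```python
raw = b"MKT"

# A buffer of one-byte "char" items, loosely analogous to a character array
chars = memoryview(raw).cast("c")

# Reinterpret the same memory as unsigned char (format "B"), with no copy
octets = chars.cast("B")

print(chars.format, chars.tolist())    # items are length-1 bytes objects
print(octets.format, octets.tolist())  # items are integers in 0..255
```

Only the item format changes between the two views; the underlying bytes are shared.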

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

⚖️ License

This library is provided under the GNU General Public License v3.0. The MUSCLE code was written by Robert Edgar and is distributed under the terms of the GPLv3 as well. See vendor/muscle/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original MUSCLE authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Owner

  • Name: Martin Larralde
  • Login: althonos
  • Kind: user
  • Location: Heidelberg, Germany
  • Company: EMBL / LUMC, @zellerlab

PhD candidate in Bioinformatics, passionate about programming, SIMD-enthusiast, Pythonista, Rustacean. I write poems, and sometimes they are executable.

GitHub Events

Total
  • Watch event: 5
Last Year
  • Watch event: 5

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 36
  • Total Committers: 1
  • Avg Commits per committer: 36.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Martin Larralde m****e@e****e 36
Committer Domains (Top 20 + Academic)
embl.de: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 2.33
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jianshu93 (1)
  • Humpar (1)
  • WanlinLi2021 (1)
Pull Request Authors
Top Labels
Issue Labels
bug (1)
Pull Request Labels

Dependencies

.github/workflows/requirements.txt pypi
  • auditwheel *
  • codecov *
  • coverage *
  • cython *
  • importlib-resources *
  • setuptools >=46.4.0
  • wheel *
.github/workflows/package.yml actions
  • KSXGitHub/github-actions-deploy-aur v2.2.5 composite
  • actions/checkout v2 composite
  • actions/checkout v1 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v2 composite
  • addnab/docker-run-action v2 composite
  • docker/setup-qemu-action v1 composite
  • pypa/gh-action-pypi-publish master composite
  • rasmus-saks/release-a-changelog-action v1.0.1 composite
.github/workflows/test.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
setup.py pypi