Pynteny

Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases - Published in JOSS (2023)

https://github.com/robaina/pynteny

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bioinformatics computational-biology genomics hmm hmmer metagenomics prokaryotic-genomes python synteny synteny-block

Scientific Fields

Engineering Computer Science - 60% confidence
Last synced: 4 months ago

Repository

Query sequence database by HMMs arranged in predefined synteny structure

Statistics
  • Stars: 13
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 4
Created over 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md


Synteny-aware hmm searches made easy

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

1. :bulb: What is Pynteny?

Pynteny is a Python tool to search for synteny blocks in (prokaryotic) sequence data, using HMMER and profile HMMs of the ORFs of interest. By leveraging genomic context information, Pynteny can reduce the uncertainty that paralogs introduce into the functional annotation of unlabelled sequence data. Pynteny can be accessed (i) through the command line or (ii) as a Python module.

Get more info in the documentation pages!

Check out the Pynteny paper in the Journal of Open Source Software!

2. :wrench: Setup

Install with conda:

  1. Pynteny requires Python 3.10. The easiest way to handle dependencies is by creating a dedicated conda environment:

```bash
conda create -n pynteny -c bioconda -c conda-forge python=3.10 pynteny
conda activate pynteny
```

  2. Check that the installation worked:

```bash
pynteny --help
```

2.1. Installing on Windows

Pynteny is designed to run on Linux machines. However, it can be installed within the Windows Subsystem for Linux via conda.

2.2. Installing on MacOS with the latest ARM64 architecture

Pynteny doesn't currently support the ARM64 architecture of Apple silicon processors (e.g. MacBook M1 and M2). If that is your case, you can install Pynteny using the workaround below (based on this post):

```bash
CONDA_SUBDIR=osx-64 conda create -n pynteny_x86 python=3.10
conda activate pynteny_x86
conda config --env --set subdir osx-64
conda install -c bioconda pynteny
```

3. :rocket: Usage

Consider the following toy example of a syntenic block:

synteny example

Here, we are interested in four genes that colocate according to the pattern above: genes A-C occupy consecutive positions on the positive strand, followed by three (untargeted) genes and then by gene D, which is located on the negative strand.
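Pynteny writes this pattern as the string `>gene_A 0 >gene_B 0 >gene_C 3 <gene_D`, which is what the search command takes in `--synteny_struc`. The sketch below shows one way to read that notation in Python. It is not Pynteny's own parser: `SyntenyElement` and `parse_synteny_structure` are hypothetical names, and it assumes that `>` and `<` mark the positive and negative strand and that each integer gives the number of untargeted genes between two targets, as in the toy example. Check the documentation for the exact semantics.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed meaning of the strand markers, following the toy example.
STRANDS = {">": "pos", "<": "neg"}


@dataclass(frozen=True)
class SyntenyElement:
    """One targeted gene in a synteny structure (hypothetical helper type)."""

    name: str                   # gene / HMM label, e.g. "gene_A"
    strand: str                 # "pos" for '>', "neg" for '<'
    gap_to_next: Optional[int]  # untargeted genes before the next target; None for the last


def parse_synteny_structure(structure: str) -> list[SyntenyElement]:
    """Split a string such as '>gene_A 0 >gene_B 3 <gene_C' into its elements."""
    tokens = structure.split()
    # A valid string alternates gene and gap tokens and ends with a gene.
    if len(tokens) % 2 == 0:
        raise ValueError("expected alternating gene and gap tokens, ending with a gene")

    elements = []
    for i in range(0, len(tokens), 2):
        gene = tokens[i]
        if len(gene) < 2 or gene[0] not in STRANDS:
            raise ValueError(f"gene token must start with '>' or '<': {gene!r}")
        gap = None
        if i + 1 < len(tokens):
            if not tokens[i + 1].isdigit():
                raise ValueError(f"gap must be a non-negative integer: {tokens[i + 1]!r}")
            gap = int(tokens[i + 1])
        elements.append(SyntenyElement(gene[1:], STRANDS[gene[0]], gap))
    return elements
```

For the toy example, `parse_synteny_structure(">gene_A 0 >gene_B 0 >gene_C 3 <gene_D")` returns four elements, with a gap of 3 recorded after `gene_C` and `gene_D` on the negative strand.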

Pynteny can be run either as a command line tool or as a Python module. To run pynteny in the command line, execute:

```bash
conda activate pynteny
pynteny <subcommand> <options>
```


There are a number of available subcommands, which can be explored in the documentation pages.

For instance, to first download the PGAP database, which contains a collection of profile HMMs as well as metadata:

```bash
pynteny download --outdir data/hmms --unpack
```

Next, to build a labelled peptide database from DNA assembly data:

```bash
pynteny build \
    --data assembly.fa \
    --outfile labelled_peptides.faa
```

Finally, to search the peptide database for the syntenic structure displayed above, `>gene_A 0 >gene_B 0 >gene_C 3 <gene_D`, using the downloaded PGAP database:

```bash
pynteny search \
    --synteny_struc ">gene_A 0 >gene_B 0 >gene_C 3 <gene_D" \
    --data labelled_peptides.faa \
    --outdir results/ \
    --gene_ids
```
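The three steps above (download, build, search) can also be scripted. The following is a minimal sketch that assembles the same command lines in Python and, by default, only prints them. It uses just the subcommands and flags shown above; the function names `build_pipeline_commands` and `run_pipeline` and the `dry_run` switch are hypothetical and not part of Pynteny.

```python
import shlex
import subprocess


def build_pipeline_commands(
    assembly: str,
    hmm_dir: str = "data/hmms",
    peptides: str = "labelled_peptides.faa",
    outdir: str = "results/",
    structure: str = ">gene_A 0 >gene_B 0 >gene_C 3 <gene_D",
) -> list[list[str]]:
    """Return the download, build and search commands as argument lists."""
    return [
        ["pynteny", "download", "--outdir", hmm_dir, "--unpack"],
        ["pynteny", "build", "--data", assembly, "--outfile", peptides],
        [
            "pynteny", "search",
            "--synteny_struc", structure,
            "--data", peptides,
            "--outdir", outdir,
            "--gene_ids",
        ],
    ]


def run_pipeline(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            # shlex.join quotes the synteny string so the line can be pasted into a shell.
            print(shlex.join(cmd))
        else:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline_commands("assembly.fa"))
```

Passing argument lists to `subprocess.run` avoids shell quoting problems with the `>` and `<` characters in the synteny string. Running with `dry_run=False` requires an activated environment in which `pynteny` is installed.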

4. :notebook_with_decorative_cover: Examples

Here are some Jupyter Notebooks with examples to show how Pynteny works:

You can find more notebooks in the examples directory. Find more info in the documentation.

5. :arrows_counterclockwise: Dependencies

Pynteny would not work without these awesome projects:

Thanks!

6. :octocat: Contributing

Contributions are always welcome! If you don't know where to start, you may find an interesting issue to work on here. Please read our contribution guidelines first.

7. :black_nib: Citation

If you use this software, please cite it as below:

Semidán Robaina Estévez. (2023). Pynteny: synteny-aware hmm searches made easy (Version 1.0.0). Zenodo. https://zenodo.org/record/7696204

Owner

  • Name: Semidán Robaina
  • Login: Robaina
  • Kind: user
  • Location: Atlantic Ocean
  • Company: Hapdera

Computational Biology | Data Science | Python Dev. | Ph.D. Systems Biology

JOSS Publication

Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases
Published
March 22, 2023
Volume 8, Issue 83, Page 5289
Authors
Semidán Robaina-Estévez ORCID
Department of Microbiology. University of La Laguna. Spain.
José M. González ORCID
Department of Microbiology. University of La Laguna. Spain.
Editor
Kevin M. Moerman ORCID
Tags
bioinformatics HMMER synteny HMMs sequencing

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Robaina Estévez"
    given-names: "Semidán"
    orcid: "https://orcid.org/0000-0003-0781-1677"
    affiliation: "Department of Microbiology. University of La Laguna. Spain"
    email: "srobaina@ull.edu.es"
title: "Pynteny: synteny-aware hmm searches made easy"
version: 1.0.0
doi: 10.5281/zenodo.7696204
date-released: 2023-03-03
url: "https://github.com/Robaina/Pynteny"

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 603
  • Total Committers: 2
  • Avg Commits per committer: 301.5
  • Development Distribution Score (DDS): 0.01
Past Year
  • Commits: 117
  • Committers: 1
  • Avg Commits per committer: 117.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Semidán Robaina Estévez s****a@g****m 597
Alex Batisse a****e@h****m 6
Committer Domains (Top 20 + Academic)
hey.com: 1
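
The committer figures above can be reproduced from the per-committer commit counts. The sketch below assumes the commonly used definition of the Development Distribution Score, DDS = 1 - (commits by the top committer / total commits). This page does not state its formula, so treat the definition as an assumption. The function name is hypothetical.

```python
def development_distribution_score(commits_per_committer: list[int]) -> float:
    """DDS under the assumed definition: 1 - (top committer's share of commits)."""
    total = sum(commits_per_committer)
    if total == 0:
        return 0.0
    return 1 - max(commits_per_committer) / total


# Commit counts taken from the "Top Committers" table above.
all_time = [597, 6]
all_time_dds = round(development_distribution_score(all_time), 2)  # 1 - 597/603
all_time_avg = sum(all_time) / len(all_time)                       # 603 / 2

# Past year: a single committer with 117 commits.
past_year_dds = development_distribution_score([117])
```

Under this definition a value near 0 means that almost all commits come from a single committer, which matches the 597 of 603 commits listed for the main author.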

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 55
  • Total pull requests: 33
  • Average time to close issues: 20 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 0.62
  • Average comments per pull request: 1.18
  • Merged pull requests: 32
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Robaina (54)
  • wchow (1)
Pull Request Authors
  • Robaina (32)
  • Batalex (1)
Top Labels
Issue Labels
enhancement (34) bug (8) refactor (8) code review (6) documentation (5) good first issue (1) invalid (1)
Pull Request Labels
enhancement (9) refactor (9) code review (8) bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 20 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
pypi.org: pynteny

Multiple HMM - search via synteny structures in Python

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 20 Last month
Rankings
Dependent packages count: 4.8%
Dependent repos count: 6.3%
Downloads: 13.1%
Average: 14.0%
Stargazers count: 19.3%
Forks count: 26.6%
Last synced: about 1 year ago

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/docs.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
.github/workflows/joss.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
.github/workflows/tests.yml actions
  • actions/checkout v3 composite
  • codecov/codecov-action v3 composite
  • conda-incubator/setup-miniconda v2 composite
.devcontainer/Dockerfile docker
  • mcr.microsoft.com/devcontainers/miniconda 0-3 build
Dockerfile docker
  • mcr.microsoft.com/devcontainers/miniconda 0-3 build
pyproject.toml pypi
  • python ^3.8