lsabgc

lsaBGC - Lineage Specific Analysis of Biosynthetic Gene Clusters

https://github.com/kalan-lab/lsabgc

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary

Keywords

bgc bioinformatics biosynthetic-gene-clusters evolutionary-analysis genomics
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Kalan-Lab
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 319 MB
Statistics
  • Stars: 37
  • Watchers: 3
  • Forks: 4
  • Open Issues: 1
  • Releases: 20
Topics
bgc bioinformatics biosynthetic-gene-clusters evolutionary-analysis genomics
Created about 5 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License Citation

README.md

lsaBGC

:warning: lsaBGC is still supported (we will aim to fix usability issues and bugs) but is no longer under active development, so expect no new features. Please check out lsaBGC-Pan instead: a new version that features many of the core programs from lsaBGC as well as new modules. Note also that the bioconda package lsabgc corresponds to lsaBGC-Pan.

Lineage Specific Analysis (lsa) of Biosynthetic Gene Clusters (BGC)

lsaBGC offers modular programs, as well as workflows, designed for investigating and mining biosynthetic gene cluster diversity across a focal lineage or taxon of interest.

Compatible with BGC predictions from antiSMASH, GECCO, and DeepBGC.

Documentation and How to Get Started:

Documentation can currently be found on this GitHub repo's wiki: https://github.com/Kalan-Lab/lsaBGC/wiki

  1. Background on lsaBGC - what it does and does not do
  2. An Overview of Final Results from lsaBGC
  3. Quick Start - Using the simple lsaBGC-Easy.py (bacterial) and lsaBGC-Euk-Easy.py (fungal) workflows
  4. Modular Usage - Exploring BGCs in Cutibacterium
  5. GSeeF - quick and simple visualization of GCFs/BGCs across a species phylogeny
  6. visualize_BGC-ome - quick and simple visualization of a sample's BGC-ome
  7. new: Investigate a single cluster of related BGCs using the sibling suite zol

IMPORTANT: PLEASE USE v1.52+:

Please make sure to use v1.52+ of the pipeline. If you are using lsaBGC-Easy.py with antiSMASH, note that from v1.38 to v1.51 the default settings for antiSMASH-based BGC prediction mistakenly included the argument --taxon fungi. That argument should only be the default for the analogous lsaBGC-Euk-Easy.py program.
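As a rough illustration of the version note above, the following Python sketch flags release tags that fall in the affected v1.38 to v1.51 range and checks the recommended v1.52 minimum. The helper names (`parse_version`, `has_fungi_default_bug`, `meets_recommended_minimum`) are hypothetical and not part of lsaBGC, and the sketch assumes tags follow the `vMAJOR.MINOR` pattern used in this README.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, int]:
    """Parse a release tag such as 'v1.52' or '1.52' into (major, minor).

    Assumes a 'MAJOR.MINOR' tag, as in the releases named in this README.
    """
    parts = tag.strip().lstrip("vV").split(".")
    return int(parts[0]), int(parts[1])


def has_fungi_default_bug(tag: str) -> bool:
    """Return True for releases v1.38 through v1.51, the range named above."""
    return (1, 38) <= parse_version(tag) <= (1, 51)


def meets_recommended_minimum(tag: str) -> bool:
    """Return True for v1.52 or later, the minimum this README recommends."""
    return parse_version(tag) >= (1, 52)


if __name__ == "__main__":
    for tag in ("v1.37", "v1.38", "v1.51", "v1.52", "v1.55"):
        print(tag, has_fungi_default_bug(tag), meets_recommended_minimum(tag))
```

Note that a release older than v1.38 is not in the affected range but still falls below the recommended minimum, which is why the two checks are kept separate.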

In v1.53, we added missing singleton CDS features that were regarded as faulty by Panaroo analysis but might correspond to BGCs according to BGC prediction software. We also added separate processing of resolved hierarchical ortholog groups from OrthoFinder analysis when working with fungal genomes, so that the results are used more directly as is, rather than going through the customized processing developed for bacterial genomes.

Installation:

Using Conda (for full usage of suite)

Installation can be performed via conda (see below for Docker). It should take ~5 minutes with mamba or ~10-20 minutes with conda, and has been tested on both Linux (specifically Ubuntu) and macOS. If any installation issues arise, please open a GitHub issue and we will be happy to try to address them:

```bash
# 1. download latest release and uncompress
# (-L follows GitHub's redirect to the actual archive)
curl -L -o lsaBGC.tar.gz https://github.com/Kalan-Lab/lsaBGC/archive/refs/tags/v1.55.tar.gz
tar -xzf lsaBGC.tar.gz
cd lsaBGC-1.55/

# 2. create conda environment using yaml file and activate it!
# mamba is much faster than conda here; if mamba is not installed in your
# base conda environment, replace "mamba" with "conda" in the next command
mamba env create -f lsaBGC_env.yml -p /path/to/lsaBGC_conda_env/
conda activate /path/to/lsaBGC_conda_env/

# 3. complete python installation with the following commands:
python setup.py install
pip install -e .
```
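After the installation steps above, one quick sanity check is to confirm that the suite's programs resolve on your `PATH`. The Python sketch below does this with the standard library only. The program list is taken from names mentioned in this README; that the install step places them on `PATH` inside the activated environment is an assumption, and `find_programs` is a hypothetical helper, not part of lsaBGC.

```python
import shutil
from typing import Dict, Iterable, Optional

# Program names mentioned in this README. Whether they are on PATH depends
# on how the installation went (assumption, not verified against the repo).
PROGRAMS = [
    "lsaBGC-Easy.py",
    "lsaBGC-Euk-Easy.py",
    "lsaBGC-AutoAnalyze.py",
    "lsaBGC-Cluster.py",
]


def find_programs(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each program name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_programs(PROGRAMS).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Run it with the conda environment activated; any program reported as `NOT FOUND` suggests the installation step did not complete or the wrong environment is active.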

Optional, but recommended, command to download KOfams + PGAP HMMs + MIBiG protein FASTA for annotation:

```bash
# Warning: can take >10 minutes!
# Can skip to run tests first to make sure things are working properly.
# within lsaBGC Git repo with conda environment activated:
setup_annotation_dbs.py
```

If clustering of BGCs into GCFs using BiG-SCAPE is preferred to lsaBGC-Cluster.py, setup BiG-SCAPE using the following:

setup_bigscape.py

A small test case is provided and can be run after installation with the following command (takes about 7 minutes using 4 CPUs/threads):

```bash
# Warning: uses 4 cpus/threads!
bash run_tests.sh
```

There are also additional test cases to demonstrate usage of individual programs along with expected outputs from commands. We also have a walk-through tutorial Wiki page to showcase the use of the suite and relations between core programs.

The major outputs of the final lsaBGC-AutoAnalyze.py run are in the resulting folder test_case/lsaBGC_AutoAnalyze_Results/Final_Results/ and described on this wiki page. Examples for the final AutoAnalyze results from an lsaBGC-Easy.py run on Cutibacterium avidum can be found here on Google Drive.
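To get a quick look at what the test run produced, the following Python sketch lists every file under the results folder named above, with its size. Only the folder path comes from this README; `list_final_results` is a hypothetical helper, and the sketch makes no assumption about which files lsaBGC actually writes there.

```python
from pathlib import Path
from typing import List, Tuple


def list_final_results(results_dir) -> List[Tuple[str, int]]:
    """Return sorted (relative path, size in bytes) pairs for all files found."""
    root = Path(results_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"results folder not found: {root}")
    return sorted(
        (path.relative_to(root).as_posix(), path.stat().st_size)
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Folder path as given in this README for the bundled test case.
    results = Path("test_case/lsaBGC_AutoAnalyze_Results/Final_Results")
    if results.is_dir():
        for rel_path, size in list_final_results(results):
            print(f"{size:>12d}  {rel_path}")
    else:
        print(f"{results} not found; run the test case first")
```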

Using Docker (for major workflows only)

A Docker image is provided for the lsaBGC-Easy.py and lsaBGC-Euk-Easy.py workflows, together with a wrapper script. The image is large (~26 GB) but includes all the databases and dependencies needed for lsaBGC, BiG-SCAPE, antiSMASH, and GECCO analysis. For lsaBGC, the KOfam database is not included, to save space. For antiSMASH, MEME is not included (for licensing reasons), so RODEO and CASSIS analyses are not available.
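Given the image size quoted above, it can be worth checking free disk space before pulling. The Python sketch below is one way to do that with the standard library. The 26 GB figure is the approximate size from this README; the 1.5x safety margin and the helper name `has_room_for_image` are assumptions for illustration. Docker stores images under its own data directory, so pass that location rather than the current directory if they are on different volumes.

```python
import shutil

IMAGE_SIZE_GB = 26  # approximate image size quoted in this README


def has_room_for_image(path=".", required_gb=IMAGE_SIZE_GB, margin=1.5):
    """Return True if the volume holding `path` has required_gb * margin free.

    The margin is an assumed allowance for extraction overhead, not a
    figure from the lsaBGC documentation.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb * margin


if __name__ == "__main__":
    if has_room_for_image("."):
        print("Enough free space for the image on this volume.")
    else:
        print("Free space looks tight for a ~26 GB image.")
```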

To use the latest Docker image, please: (1) install Docker and (2) download the wrapper script:

```bash
# 1. download wrapper script for running image
wget https://raw.githubusercontent.com/Kalan-Lab/lsaBGC/main/docker/run_LSABGC.sh

# 2. run it
bash run_LSABGC.sh
```

Quick Start - using lsaBGC-Easy.py and lsaBGC-Euk-Easy.py

Check out how to use lsaBGC-Easy.py and lsaBGC-Euk-Easy.py on their wiki page!

Acknowledgements:

We would like to thank members of the Kalan lab, Currie lab, Kwan lab, Anantharaman lab, and Pepperell lab at UW-Madison for feedback on the development of lsaBGC.

Feedback:

Issues or suggestions for new features / changes to approaches? Please create an issue/ticket on GitHub issues and let us know!

License:

```
BSD 3-Clause License

Copyright (c) 2021, Kalan-Lab
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```

Owner

  • Name: Kalan-Lab
  • Login: Kalan-Lab
  • Kind: organization
  • Location: University of Wisconsin-Madison

A Place to Collaborate

GitHub Events

Total
  • Create event: 3
  • Release event: 3
  • Issues event: 6
  • Watch event: 2
  • Issue comment event: 4
  • Push event: 36
  • Pull request event: 2
Last Year
  • Create event: 3
  • Release event: 3
  • Issues event: 6
  • Watch event: 2
  • Issue comment event: 4
  • Push event: 36
  • Pull request event: 2

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 4
  • Total pull requests: 25
  • Average time to close issues: 4 months
  • Average time to close pull requests: 11 minutes
  • Total issue authors: 4
  • Total pull request authors: 2
  • Average comments per issue: 4.5
  • Average comments per pull request: 0.2
  • Merged pull requests: 25
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • devarajarun (1)
  • luisruis (1)
  • raufs (1)
  • Sam-Will (1)
Pull Request Authors
  • raufs (24)
  • althonos (1)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 12 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 10
  • Total maintainers: 1
pypi.org: lsabgc

Suite for comparative genomic, population genetics and evolutionary analysis, as well as metagenomic mining of micro-evolutionary novelty in BGCs all in the context of a single lineage of interest.

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 12 Last month
Rankings
Dependent packages count: 10.0%
Stargazers count: 11.4%
Forks count: 16.8%
Average: 18.9%
Dependent repos count: 21.7%
Downloads: 34.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
  • IPython *
  • nbsphinx *
  • recommonmark *
  • sphinx *
setup.py pypi
docker/Dockerfile docker
  • continuumio/miniconda3 latest build