saccharopolyspora_manuscript

https://github.com/matinnuhamunada/saccharopolyspora_manuscript

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 12 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.4%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: matinnuhamunada
License: mit
Language: HTML
Default Branch: main
Size: 12.2 MB

Statistics

Stars: 1
Watchers: 1
Forks: 2
Open Issues: 3
Releases: 1

Created almost 3 years ago · Last pushed about 2 years ago

Metadata Files


README.md

README

This repository contains the scripts (as Jupyter notebooks) used to generate the figures in the manuscript "BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets".

USAGE

1. Clone this repository

```bash
git clone https://github.com/matinnuhamunada/saccharopolyspora_manuscript.git
```

2. Set up BGCFlow

```bash
# create and activate new conda environment
conda create -n bgcflow pip -y
conda activate bgcflow

# install BGCFlow wrapper
pip install git+https://github.com/NBChub/bgcflow_wrapper.git

# clone BGCFlow to "bgcflow" folder
bgcflow clone bgcflow
```

2. Download the dataset

Download the dataset containing the BGCFlow runs from Zenodo:

```bash
# move to bgcflow dir
cd bgcflow

# download and extract dataset
wget https://zenodo.org/record/8018055/files/saccharopolysporadataset.zip
unzip saccharopolysporadataset.zip
```

3. Set configurations

```bash
# go back to the manuscript dir
cd ../saccharopolyspora_manuscript/

# edit the location of the bgcflow dir to the right directory
nano config.yaml
```

4. Setting up Conda Environments

Install these conda environments:

```bash
mamba env create -f python_notebook.yaml
mamba env create -f r_notebook.yaml
mamba env create -f <bgcflow_dir>/workflow/envs/cblaster.yaml
```
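Before creating the third environment, it may help to confirm that the BGCFlow directory has the expected layout. The following is a minimal sketch, not part of the repository: the function name `check_bgcflow_dir` is hypothetical, and the only path it relies on is the `workflow/envs/cblaster.yaml` location used in this step.

```bash
# Hypothetical helper (not part of the repository): checks that a BGCFlow
# directory contains the cblaster environment file used in this step.
check_bgcflow_dir() {
    local bgcflow_dir="$1"
    local env_file="${bgcflow_dir}/workflow/envs/cblaster.yaml"

    if [ -f "${env_file}" ]; then
        echo "found: ${env_file}"
    else
        echo "missing: ${env_file}" >&2
        return 1
    fi
}

# Example usage (the ../bgcflow path is an assumption about where BGCFlow was cloned):
# check_bgcflow_dir ../bgcflow && mamba env create -f ../bgcflow/workflow/envs/cblaster.yaml
```

A non-zero return status makes the check easy to chain with `&&`, so the environment is only created when the file is actually present.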

5. Run the notebooks

- There are two kinds of notebooks: R (`.R.ipynb`) and Python (`.python.ipynb`).
- Run each notebook using the corresponding conda environment: `python_notebook` or `r_notebook`.
- Start a Jupyter session. For Python, run `conda activate python_notebook` and then `jupyter lab`. For R, run `conda activate r_notebook` and then `jupyter lab`.
- Run the notebooks in order.
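The suffix-to-environment rule above can be sketched as a small shell function. This is only an illustration under stated assumptions: the name `env_for_notebook` is hypothetical and not part of the repository, and the function only prints an environment name without activating anything.

```bash
# Hypothetical helper (not part of the repository): prints the conda
# environment that matches a notebook's file suffix.
env_for_notebook() {
    case "$1" in
        *.R.ipynb)      echo "r_notebook" ;;
        *.python.ipynb) echo "python_notebook" ;;
        *)              echo "unknown notebook type: $1" >&2; return 1 ;;
    esac
}

# Example usage: list each notebook next to its environment, in the shell's
# glob order (whether that matches the intended run order is an assumption).
# for nb in *.ipynb; do
#     echo "${nb} -> $(env_for_notebook "${nb}")"
# done
```

Listing the notebooks this way gives a quick overview of which Jupyter session each one belongs to before opening it.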

Citation

Matin Nuhamunada, Omkar S. Mohite, Patrick V. Phaneuf, Bernhard O. Palsson, and Tilmann Weber. (2023). BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets. bioRxiv 2023.06.14.545018; doi: https://doi.org/10.1101/2023.06.14.545018

Nuhamunada, Matin, & Mohite, Omkar Satyavan. (2023). BGCFlow Analysis of Saccharopolyspora Genomes (0.1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8018055

Owner

Name: Matin Nuhamunada
Login: matinnuhamunada
Kind: user
Location: Copenhagen

Website: https://matinnuhamunada.github.io/
Twitter: matinnuhamunada
Repositories: 3
Profile: https://github.com/matinnuhamunada

PhD Student at DTU Biosustain | Natural Products Genome Mining
