saccharopolyspora_manuscript
https://github.com/matinnuhamunada/saccharopolyspora_manuscript
Repository
Basic Info
- Host: GitHub
- Owner: matinnuhamunada
- License: mit
- Language: HTML
- Default Branch: main
- Size: 12.2 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 2
- Open Issues: 3
- Releases: 1
Metadata Files
README.md
README
This repository contains the scripts (as Jupyter notebooks) used to generate the figures in the manuscript "BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets".
USAGE
1. Clone this repository
```bash
git clone https://github.com/matinnuhamunada/saccharopolyspora_manuscript.git
```
2. Set up BGCFlow
```bash
# create and activate a new conda environment
conda create -n bgcflow pip -y
conda activate bgcflow
# install the BGCFlow wrapper
pip install git+https://github.com/NBChub/bgcflow_wrapper.git
# clone BGCFlow to the "bgcflow" folder
bgcflow clone bgcflow
```
2. Download the dataset
- Download the dataset containing the BGCFlow runs from Zenodo:
```bash
# move to the bgcflow dir
cd bgcflow
# download and extract the dataset
wget https://zenodo.org/record/8018055/files/saccharopolysporadataset.zip
unzip saccharopolysporadataset.zip
```
3. Set configurations
```bash
# go back to the manuscript dir
cd ../saccharopolyspora_manuscript/
# edit the location of the bgcflow dir so it points to the right directory
nano config.yaml
```
4. Setting up Conda Environments
Install these conda environments:
```bash
mamba env create -f python_notebook.yaml
mamba env create -f r_notebook.yaml
mamba env create -f <bgcflow_dir>/workflow/envs/cblaster.yaml
```
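Before creating the cblaster environment, it can help to confirm that the BGCFlow directory you plan to substitute for `<bgcflow_dir>` really contains the environment file. The sketch below is only an illustration: `check_bgcflow_dir` and the `BGCFLOW_DIR` variable are hypothetical names that are not part of BGCFlow or this repository, and the `../bgcflow` default assumes BGCFlow was cloned next to the manuscript directory as in the steps above.

```bash
# Hypothetical helper (not part of BGCFlow): succeed only if the given
# directory contains the cblaster environment file referenced above.
check_bgcflow_dir() {
    if [ -f "$1/workflow/envs/cblaster.yaml" ]; then
        echo "ok: $1"
    else
        echo "missing: $1/workflow/envs/cblaster.yaml" >&2
        return 1
    fi
}

# Assumption: BGCFLOW_DIR is a variable you set yourself; it defaults to
# the sibling "bgcflow" folder created by `bgcflow clone bgcflow`.
check_bgcflow_dir "${BGCFLOW_DIR:-../bgcflow}" \
    || echo "adjust the bgcflow location before continuing"
```

If the check fails, the same path is probably wrong in `config.yaml` as well, so it is worth correcting both before running the notebooks.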
5. Run the notebooks
- There are two kinds of notebooks: R (`.R.ipynb`) and Python (`.python.ipynb`)
- Run each notebook using the corresponding conda environment: `python_notebook` or `r_notebook`
- Start a Jupyter session:
```bash
# for Python
conda activate python_notebook
jupyter lab
```
```bash
# for R
conda activate r_notebook
jupyter lab
```
- Run the notebooks in order
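Because the environment depends only on the notebook's file suffix, the mapping can be sketched as a small shell function. This is a minimal illustration rather than a script shipped with the repository: `env_for_notebook` is a hypothetical name, and the only facts it relies on are the two suffixes and the two environment names given above.

```bash
# Hypothetical helper: print the conda environment that matches a
# notebook's suffix, following the naming convention described above.
env_for_notebook() {
    case "$1" in
        *.R.ipynb)      echo "r_notebook" ;;
        *.python.ipynb) echo "python_notebook" ;;
        *)              echo "unknown notebook type: $1" >&2; return 1 ;;
    esac
}

# List every notebook in the current directory with its environment.
for nb in *.ipynb; do
    [ -e "$nb" ] || continue   # nothing to do when no notebooks match
    printf '%s -> %s\n' "$nb" "$(env_for_notebook "$nb")"
done
```

Run from the manuscript directory, the loop prints one line per notebook, which makes it easy to see which Jupyter session each file should be opened in.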
Citation
Matin Nuhamunada, Omkar S. Mohite, Patrick V. Phaneuf, Bernhard O. Palsson, and Tilmann Weber. (2023). BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets. bioRxiv 2023.06.14.545018; doi: https://doi.org/10.1101/2023.06.14.545018
Nuhamunada, Matin, & Mohite, Omkar Satyavan. (2023). BGCFlow Analysis of Saccharopolyspora Genomes (0.1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8018055
Owner
- Name: Matin Nuhamunada
- Login: matinnuhamunada
- Kind: user
- Location: Copenhagen
- Website: https://matinnuhamunada.github.io/
- Twitter: matinnuhamunada
- Repositories: 3
- Profile: https://github.com/matinnuhamunada
PhD Student at DTU Biosustain | Natural Products Genome Mining