holmgenome
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: TiffanyFeng08
- License: mit
- Language: Python
- Default Branch: main
- Size: 68.6 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
HolmGenome Pipeline
HolmGenome is a one-step pipeline that performs assembly, quality control, and annotation of bacterial genomic data. It takes raw sequencing reads and produces a fully annotated genome in a single run.
Features
- Assembly: Uses SPAdes to assemble raw reads.
- QC: Employs Trimmomatic & FastQC for trimming and quality checks.
- Annotation: Runs Prokka to annotate assembled contigs.
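To make the three features above concrete, here is a minimal sketch of the command lines such a pipeline might build for one paired-end sample. It is not taken from HolmGenome.py: the function name, the stage order (trim, QC, assemble, annotate), the output file names, and the `ILLUMINACLIP` settings are all assumptions made for illustration. The commands are only constructed, not executed.

```python
def build_stage_commands(sample, r1, r2, outdir, trimmomatic_jar, adapters):
    """Return one argument list per stage for a single paired-end sample.

    Hypothetical helper: names, paths and stage order are illustrative only.
    """
    # Trimmomatic PE expects: in1 in2 out1_paired out1_unpaired out2_paired out2_unpaired
    trimmed = [
        f"{outdir}/{sample}_{tag}.fastq.gz"
        for tag in ("R1_paired", "R1_unpaired", "R2_paired", "R2_unpaired")
    ]
    trim = [
        "java", "-jar", trimmomatic_jar, "PE", r1, r2, *trimmed,
        f"ILLUMINACLIP:{adapters}:2:30:10",  # assumed clipping settings
    ]
    # FastQC on the paired, trimmed reads; the output directory must already exist
    qc = ["fastqc", "-o", outdir, trimmed[0], trimmed[2]]
    # SPAdes writes contigs.fasta into its output directory
    spades_dir = f"{outdir}/{sample}_spades"
    assemble = ["spades.py", "-1", trimmed[0], "-2", trimmed[2], "-o", spades_dir]
    # Prokka annotates the assembled contigs
    annotate = [
        "prokka", "--outdir", f"{outdir}/{sample}_prokka",
        "--prefix", sample, f"{spades_dir}/contigs.fasta",
    ]
    return [trim, qc, assemble, annotate]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` so that a failing stage stops the run for that sample.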
Installation
Requirements
- Python Version: Python 3.6 or higher
- External Dependencies:
  - FastQC: 0.12.0
  - Trimmomatic: 0.39
  - SPAdes: 3.15.4
  - Prokka: 1.14.5
  - BBMap: 38.86
  - QUAST: 5.2.0
Development Version
```bash
git clone https://github.com/TiffanyFeng08/HolmGenome.git
```
Check the environment before first use (recommended):
cd HolmGenome
python HolmGenome.py --check
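For readers curious what a dependency check of this kind can look like, the sketch below looks up each tool on `PATH` with `shutil.which`. It is an illustration only: the executable names in `REQUIRED_TOOLS` and the function `find_missing_tools` are assumptions, and the actual `--check` implementation may test different names or also verify versions.

```python
import shutil

# Assumed executable names; the real check may look for different ones.
REQUIRED_TOOLS = ["fastqc", "java", "spades.py", "prokka", "quast.py"]


def find_missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the standard-library function is used.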
Usage
HolmGenome.py detects all `*.fastq` and `*.fastq.gz` files in the input directory and runs the pipeline on every sample for which it can pair the reads.
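The pairing step can be pictured with the following sketch. The README does not state which naming convention HolmGenome expects, so the pattern here (`sample_R1` / `sample_R2`, or `sample_1` / `sample_2`) and the function name `pair_fastq_files` are assumptions for illustration.

```python
import re
from pathlib import Path

# Assumed convention: <sample>_R1.fastq(.gz) / <sample>_R2.fastq(.gz), or _1 / _2.
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R?(?P<read>[12])\.fastq(?:\.gz)?$")


def pair_fastq_files(filenames):
    """Map each sample name to its (forward, reverse) files.

    Samples with only one mate, and files that do not match the
    assumed naming pattern, are skipped.
    """
    samples = {}
    for name in sorted(filenames):
        match = READ_PATTERN.match(Path(name).name)
        if match is None:
            continue
        samples.setdefault(match.group("sample"), {})[match.group("read")] = name
    return {
        sample: (reads["1"], reads["2"])
        for sample, reads in samples.items()
        if "1" in reads and "2" in reads
    }
```

To apply it to a directory, pass the file names, for example `pair_fastq_files(str(p) for p in Path("/path/to/data").iterdir())`.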
```
Usage: HolmGenome.py [OPTIONS]

options:
  -h, --help            show help message and exit
  -i INPUT, --input INPUT
                        Path to the input directory
  -o OUTPUT, --output OUTPUT
                        Path to the output directory
  --trimmomatic_path TRIMMOMATIC_PATH
                        Path to the Trimmomatic executable or JAR
  --adapters_path ADAPTERS_PATH
                        Path to the adapters file
  --prokka_db_path PROKKA_DB_PATH
                        Path to the Prokka database
  --min_contig_length MIN_CONTIG_LENGTH
                        Minimum contig length (default: 1000)
  --check               Check if all dependencies are installed
  -v, --version         Show the pipeline version number and exit
```
python HolmGenome.py -i INPUT -o OUTPUT --trimmomatic_path TRIMMOMATIC_PATH --adapters_path ADAPTERS_PATH --prokka_db_path PROKKA_DB_PATH
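As a rough illustration of how a command-line interface like this can be declared, the sketch below uses `argparse` with the option names from the command above. It is not the project's actual parser: the function name `build_parser` is hypothetical, no option is marked as required (so that `--check` can be used alone), and the `-v/--version` option is left out because the README does not give the version string.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="HolmGenome.py")
    parser.add_argument("-i", "--input", help="Path to the input directory")
    parser.add_argument("-o", "--output", help="Path to the output directory")
    parser.add_argument("--trimmomatic_path",
                        help="Path to the Trimmomatic executable or JAR")
    parser.add_argument("--adapters_path", help="Path to the adapters file")
    parser.add_argument("--prokka_db_path", help="Path to the Prokka database")
    parser.add_argument("--min_contig_length", type=int, default=1000,
                        help="Minimum contig length (default: 1000)")
    parser.add_argument("--check", action="store_true",
                        help="Check if all dependencies are installed")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```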
Example bash script
```
#!/bin/bash
#SBATCH --account=your_account
#SBATCH --time=5:00:00
#SBATCH --job-name=HolmGenome
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=5G
#SBATCH --cpus-per-task=4
#SBATCH --mail-user=your.email@somemail.com
#SBATCH --mail-type=BEGIN,END,FAIL,REQUEUE

module load StdEnv/2020 gcc/9.3.0
module load python/3.10.2
module load scipy-stack/2022a
module load fastqc/0.12.0
module load trimmomatic/0.39
module load spades/3.15.4
module load prokka/1.14.5
module load bbmap/38.86
module load quast/5.2.0
source ~/HolmGenome/bin/activate

python HolmGenome.py -i /path/to/data -o /path/to/output \
    --trimmomatic_path /path/to/trimmomatic-0.39.jar \
    --adapters_path /path/to/adapters.fa \
    --prokka_db_path /path/to/prokka_db
```
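The usage text lists a minimum contig length option with a default of 1000. The sketch below shows one way such a filter can work on FASTA-formatted lines. It is an assumption-laden illustration: the function names are hypothetical, and whether the real pipeline treats the threshold as inclusive (as done here) or delegates the filtering to an external tool is not stated in the README.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines, min_length=1000):
    """Keep contigs whose sequence is at least `min_length` bases long."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) >= min_length]
```

With a file on disk, the lines can come straight from the file handle: `with open("contigs.fasta") as fh: kept = filter_contigs(fh)`.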
Owner
- Name: Zhixuan (Tiffany) Feng
- Login: TiffanyFeng08
- Kind: user
- Repositories: 1
- Profile: https://github.com/TiffanyFeng08
Citation (CITATION.cff)
cff-version: 1.2.0
title: HolmGenome Pipeline
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Zhixuan
    family-names: Feng
  - given-names: Zhangbin
    family-names: Cai
url: 'https://github.com/TiffanyFeng08/HolmGenome'
abstract: >-
  HolmGenome is a one-step pipeline that performs assembly,
  quality control, and annotation of bacteria genomic data.
  It takes raw sequencing reads and produces a fully
  annotated genome in a single run.
license: MIT
GitHub Events
Total
- Watch event: 2
- Push event: 66
- Public event: 1
- Create event: 1
Last Year
- Watch event: 2
- Push event: 66
- Public event: 1
- Create event: 1
Dependencies
- PyYAML *
- argparse *
- logging *