https://github.com/broadinstitute/funpipe

A python3 library for building best practice fungal genomic analysis pipeline

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.6%) to scientific vocabulary

Keywords

bioinformatics-pipeline fungal genetics genomics infectious-disease python-library python3

Keywords from Contributors

transformation
Last synced: 5 months ago

Repository

A python3 library for building best practice fungal genomic analysis pipeline

Basic Info
  • Host: GitHub
  • Owner: broadinstitute
  • License: mit
  • Language: HTML
  • Default Branch: main
  • Homepage:
  • Size: 3.34 MB
Statistics
  • Stars: 6
  • Watchers: 4
  • Forks: 4
  • Open Issues: 37
  • Releases: 1
Topics
bioinformatics-pipeline fungal genetics genomics infectious-disease python-library python3
Created almost 8 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

FunPipe: a Python library for building best-practice fungal genomic analysis pipelines

FunPipe is a Python library designed for efficient implementation of bioinformatic tools and pipelines for fungal genomic analysis. It contains wrapper functions for popular tools, customized functions for specific analysis tasks, and command-line tools built on those functions. The package is being developed to facilitate fungal genomics, but many of its functions are applicable to other genomic analyses as well.

Synopsis

  • funpipe: a directory that contains python library
  • scripts: tools and established pipelines, doc here
  • tests: unit tests
  • docs: API documentation
  • README.md: this file
  • setup.py: pip setup script
  • conda_env.yml: spec file for setting up conda environment
  • Dockerfile: docker images
  • requirements.txt: sphinx requirement file (not requirement for this package)
  • LICENSE: MIT license

Installation

Install with Conda

Installing funpipe via conda is recommended, as it automatically sets up all required bioinformatic tools. This is especially useful on servers or clusters where you lack root privileges. Make sure conda is available in your environment by running `which conda`. If conda is not available on your system, install its Python 3.7 version first.

HTTP errors sometimes occur while the conda environment is being created. If that happens, rerun `conda env create -f conda_env.yml` to continue creating the environment.
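The rerun advice can also be automated. The following is a minimal Python sketch, not part of funpipe: `run_with_retries` is a hypothetical helper, and the attempt limit and delay are arbitrary assumptions.

```python
import subprocess
import time


def run_with_retries(cmd, max_attempts=3, delay_seconds=5.0, runner=subprocess.run):
    """Run cmd until it exits with status 0 or max_attempts is reached.

    Hypothetical helper, not part of funpipe. Returns the 1-based number of
    the attempt that succeeded; raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before rerunning
    raise RuntimeError(f"{cmd!r} failed after {max_attempts} attempts")
```

For example, `run_with_retries(["conda", "env", "create", "-f", "conda_env.yml"])` would rerun the environment creation up to three times. The `runner` parameter exists only so the helper can be exercised without invoking conda.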

```sh
# clone this repo
git clone git@github.com:broadinstitute/funpipe.git

# setup conda environment
cd funpipe
conda env create -f conda_env.yml  # this will take about 10 min
conda list  # verify new environment was installed correctly

# activate funpipe environment
conda activate funpipe

# the latest stable version of funpipe is available in this environment
# to use the latest funpipe version, do
pip install .

# deactivate the environment when done
conda deactivate

# to completely remove the environment
conda remove -n funpipe --all
```

Note: diamond=0.9.22 uses the boost library, which depends on Python 2.7. This conflicts with funpipe's Python version. To use diamond, run it via Docker.

Install via Docker

Docker adds a bit more overhead, but it provides a consistent environment, including the operating system. This is very useful when running funpipe on the cloud.

To use docker:
```
# Download docker
docker pull broadinstitute/funpipe:latest

# Run analysis interactively
docker run --rm -v $path_to_data:/data -t broadinstitute/funpipe \
    /bin/bash -c "/scripts/vcf_qc_metr.py \
        -p prefix --jar /bin/GenomeAnalysisTK.jar \
        --fa /data/reference.fa"
```

You can also build the Docker image from scratch using the Dockerfile: run `cd funpipe`, then `docker build -t funpipe .`

Install with PIP

This approach is for advanced users who prefer not to use conda and want to integrate funpipe into their current working environment. Before starting the pip installation, make sure the following bioinformatic tools (or the subset you are interested in) are properly installed and added to your PATH. Paths to Java tools (JARs) need to be specified when invoking specific functions.

Requirements

  • Python >= 3.7
  • Bioinformatic tool collections (can be installed automatically via conda, see Install with Conda):
    • Basic functions:
      • samtools>=1.9
      • bwa>=0.7.8
      • gatk>=3.8
      • picard>=2.18.17
    • Phylogenetics:
      • raxml>=8.2.12
      • readseq>=2.1.30
    • CNV:
      • breakdancer>=1.4.5
      • cnvnator>=0.3
    • Microbiome:
      • pilon>=1.23
      • diamond>=0.9.22
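Before a pip installation, it can help to confirm which of the required tools are visible on your PATH. The sketch below is illustrative and not part of funpipe: `find_missing_tools` is a hypothetical helper, and the executable names in `REQUIRED_TOOLS` are assumptions that may differ on your system. Java tools distributed as JARs are not discoverable this way.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ["samtools", "bwa", "cnvnator", "diamond"]


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = find_missing_tools(REQUIRED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

This check only confirms that an executable exists; it does not verify the minimum versions listed above.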

To install with pip:
```sh
# install latest stable release
pip install funpipe

# install a specific version
pip install funpipe==0.1.0
```
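To confirm which version pip actually installed, the standard library can query the installed package metadata. This is a minimal sketch with a hypothetical helper name, `installed_version`. It assumes Python 3.8 or newer, because `importlib.metadata` is not in the standard library on Python 3.7.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the funpipe version, or None if funpipe is not installed.
print(installed_version("funpipe"))
```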

To install the latest version from source, run `git clone git@github.com:broadinstitute/funpipe.git`, then `cd funpipe` and `pip install .`

Documentation

Major analysis pipelines/tools:

  • Quality control modules
    • Reference genome quality evaluation with Pilon.
    • FASTQ quality control with fastqc.
    • BAM quality control using Picard.
    • VCF quality control using GATK VariantEval.
  • Variant Annotation with snpEff.
  • Genomic Variation
    • Coverage analysis
    • Mating type analysis
    • Copy number variation with CNVnator
  • Phylogenetic analysis
    • Dating analysis with BEAST.
    • Phylogenetic tree with FastTree, RAxML and IQTREE.
  • GWAS analysis with GEMMA.

Here are the scripts that run each of the above pipelines. Use `<toolname> -h` to see the manuals.
```sh
# Quality control
run_pilon.py          # Evaluate reference genome quality with pilon
fastqc.py             # Fastq quality control
bam_qc_metr.py        # Quality control of BAMs
vcf_qc_metr.py        # Quality control of VCFs

# Variant Annotation
run_snpeff.py         # Annotate genomic variants with snpEff
phylo_analysis.py     # Phylogenetic analysis

# Genomic Variations
coverage_analysis.py  # Hybrid coverage and ploidy analysis
```
You can also use our APIs to build your customized analysis scripts or pipelines. The docs will be available here: https://funpipe.readthedocs.io
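As an illustration of driving the command-line scripts from your own Python code, the sketch below prints the manual of each script it finds on PATH. It does not use the funpipe API, which is not shown here. `help_command` and `show_available_help` are hypothetical helpers, and the script list is a small subset taken from the list above.

```python
import shutil
import subprocess

# A subset of the scripts listed above; extend as needed.
SCRIPTS = ["fastqc.py", "coverage_analysis.py"]


def help_command(script):
    """Build the argument list that asks a script for its manual."""
    return [script, "-h"]


def show_available_help(scripts, which=shutil.which, run=subprocess.run):
    """Print the manual of each script found on PATH; return the skipped ones."""
    skipped = []
    for script in scripts:
        if which(script) is None:
            skipped.append(script)  # not installed or not on PATH
            continue
        run(help_command(script), check=False)
    return skipped
```

Calling `show_available_help(SCRIPTS)` inside the activated funpipe environment prints each manual in turn and returns the names of any scripts that were not found.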

Owner

  • Name: Broad Institute
  • Login: broadinstitute
  • Kind: organization
  • Location: Cambridge, MA

Broad Institute of MIT and Harvard

GitHub Events

Total
  • Delete event: 1
  • Create event: 1
Last Year
  • Delete event: 1
  • Create event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 255
  • Total Committers: 7
  • Avg Commits per committer: 36.429
  • Development Distribution Score (DDS): 0.086
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
xiaolicbs x****s@g****m 233
SizheQiu s****1@D****n 13
xiaoli0 4
francisVieno f****o@1****m 2
Xiao Li x****s 1
Sizhe Qiu 3****u 1
Xiao Li x****i@v****g 1
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 59
  • Total pull requests: 6
  • Average time to close issues: 3 months
  • Average time to close pull requests: 8 minutes
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 0.08
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • xiaoli0 (58)
  • ariasamin (1)
Pull Request Authors
  • dependabot[bot] (3)
  • xiaoli0 (3)
Top Labels
Issue Labels
enhancement (47) low priority (4) bugs (2) wontfix (1)
Pull Request Labels
dependencies (3) enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 15 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 1
  • Total maintainers: 1
pypi.org: funpipe

A pipeline for analyzing fungal genomic data

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 15 Last month
Rankings
Dependent packages count: 10.0%
Forks count: 15.3%
Dependent repos count: 21.8%
Average: 24.6%
Stargazers count: 31.9%
Downloads: 43.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • includeREADME.md *
setup.py pypi
  • argparse >=1.1
  • crimson >=0.4.0
  • matplotlib >=3.0.2
  • pandas >=0.23.4
  • seaborn >=0.9.0
Dockerfile docker
  • ubuntu 16.04 build
docs/source/requirements.txt pypi
  • numpydoc ==1.5.0
  • sphinx-rtd-theme ==1.2.0