https://github.com/brentp/vembrane-benchmark
benchmark workflow for vcfexpress paper
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: ncbi.nlm.nih.gov, zenodo.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 10.8%, to scientific vocabulary)
Repository
Basic Info
- Host: GitHub
- Owner: brentp
- License: mit
- Language: Python
- Default Branch: main
- Size: 128 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of vembrane/vembrane-benchmark
Created over 1 year ago · Last pushed over 1 year ago
https://github.com/brentp/vembrane-benchmark/blob/main/
[DOI](https://zenodo.org/badge/latestdoi/305723104)

This is a fork of the vembrane workflow used to benchmark [vcfexpress](https://github.com/brentp/vcfexpress).

# vembrane-benchmark

This is a snakemake workflow for benchmarking different VCF filtering tools. It makes use of [GIAB](https://www.nist.gov/programs-projects/genome-bottle) samples `HG001`, `HG002`, `HG003` and `HG004` (see [Data sources](#data) below), [restricted to chromosome 1](https://github.com/vembrane/vembrane-benchmark/blob/503a49b46f78b5c0b2515bd6a3979b16dcbe01ba/workflow/rules/download.smk#L127-L139), [normalised with `bcftools norm -N -m-any`](https://github.com/vembrane/vembrane-benchmark/blob/503a49b46f78b5c0b2515bd6a3979b16dcbe01ba/workflow/rules/download.smk#L112-L124) and [annotated](https://github.com/vembrane/vembrane-benchmark/blob/v1.0.0/workflow/rules/annotation.smk) with [SnpEff](https://pcingola.github.io/SnpEff/se_introduction/) and [VEP](https://ensembl.org/info/docs/tools/vep/index.html).

## Usage

Either follow the [instructions in the snakemake workflows catalog](https://snakemake.github.io/snakemake-workflow-catalog?usage=vembrane/vembrane-benchmark) or clone this repository and run [snakemake](https://snakemake.github.io/):

```sh
git clone https://github.com/vembrane/vembrane-benchmark.git
cd vembrane-benchmark
snakemake --use-conda
```

### Configuration

By editing the [`config/config.yaml`](config/config.yaml) file, you can easily adjust or extend the `benchmark`. With the `repeats` keyword, you can adjust the number of repeats run per combination of `filetypes` and scenarios (any of the scenarios under `annotations`) for each of the specified `tool`s. By adding to any of these levels, you can extend the benchmark. When adding filter scenarios and filetypes, please make sure that the respective tool can handle them.
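To get a feel for how quickly the benchmark grows when a level is extended, the following minimal sketch counts the runs implied by a configuration. It is not taken from the workflow: the dictionary is a hypothetical, simplified stand-in for the `benchmark` section of `config/config.yaml`, and the `tools` key and the `enumerate_runs` helper are assumptions made for illustration.

```python
from itertools import product

# Hypothetical, simplified stand-in for the `benchmark` section of
# config/config.yaml. The exact nesting (`repeats`, `tools`) is an
# assumption for illustration, not the workflow's actual schema.
benchmark = {
    "repeats": 3,
    "tools": {
        "vembrane": {
            "annotations": {
                "vep": {"filter_all": "False", "filter_none": "True"},
                "snpeff": {"impact_high": 'ANN["Annotation_Impact"] == "HIGH"'},
            },
            "filetypes": ["bcf_b", "vcf_v"],
        },
    },
}


def enumerate_runs(benchmark):
    """Yield one (tool, annotation, scenario, filetype, repeat) tuple per run."""
    for tool, spec in benchmark["tools"].items():
        # Flatten the scenarios of all annotation sources for this tool.
        scenarios = [
            (annotation, scenario)
            for annotation, filters in spec["annotations"].items()
            for scenario in filters
        ]
        for (annotation, scenario), filetype, repeat in product(
            scenarios, spec["filetypes"], range(benchmark["repeats"])
        ):
            yield tool, annotation, scenario, filetype, repeat


runs = list(enumerate_runs(benchmark))
print(len(runs))  # 3 scenarios x 2 filetypes x 3 repeats = 18
```

Adding one more scenario to this hypothetical entry would add `2 x 3 = 6` runs, so it is worth checking the run count before extending several levels at once.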
When adding a new tool, you also need to provide an `invocation` entry in the configuration file and a corresponding conda environment definition file in [`workflow/envs/`](workflow/envs). This file needs to be named exactly like the tool name you used in the [`config/config.yaml`](config/config.yaml) definition, i.e. `<tool name>.yml`. For example, the entry for vembrane looks like this:

```yaml
# The tool's name
vembrane:
  # Filter scenarios are divided by annotation source
  # which at the moment can only be snpEff or VEP.
  annotations:
    vep:
      filter_all: 'False'
      filter_none: 'True'
      at_least_2_platforms: 'INFO["platforms"] >= 2'
      format_dp: 'any(FORMAT["DP"][s] > 1250 for s in SAMPLES if "DP" in FORMAT)'
      uncertain: '"uncertain_significance" in ANN["CLIN_SIG"] or not (ID and ID.startswith("rs"))'
    snpeff:
      impact_high: 'ANN["Annotation_Impact"] == "HIGH"'

  # Supported input filetypes
  filetypes:
    - "bcf_b"  # compressed BCF
    - "vcf_v"  # plaintext VCF

  # The tool's CLI invocation pattern,
  # where `$FILTER` is a template variable
  # which is replaced by the filter scenario
  # expressions defined above.
  # The input filename will be appended.
  #
  # For example, for scenario `at_least_2_platforms`,
  # this will expand to:
  # `vembrane filter 'INFO["platforms"] >= 2' input.vcf`
  invocation: vembrane filter '$FILTER'
```

## Sources

### Data

| GIAB sample | FTP download link |
| ----------- | ----------------- |
| `HG001` | [ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/GRCh38/HG001_GRCh38_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_PGandRTGphasetransfer.vcf.gz](https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/GRCh38/) |
| `HG002` | [ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG002_NA24385_son/latest/GRCh38/HG002_GRCh38_1_22_v4.1_draft_benchmark.vcf.gz](https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG002_NA24385_son/latest/GRCh38/) |
| `HG003` | [ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG003_NA24149_father/latest/GRCh38/HG003_GRCh38_GIAB_highconf_CG-Illfb-IllsentieonHC-Ion-10XsentieonHC_CHROM1-22_v.3.3.2_highconf.vcf.gz](https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG003_NA24149_father/latest/GRCh38/) |
| `HG004` | [ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG004_NA24143_mother/latest/GRCh38/HG004_GRCh38_GIAB_highconf_CG-Illfb-IllsentieonHC-Ion-10XsentieonHC_CHROM1-22_v.3.3.2_highconf.vcf.gz](https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG004_NA24143_mother/latest/GRCh38/) |

### Tools

At the moment, the following tools are considered:

- [`vembrane`](http://github.com/vembrane/vembrane) (v0.11.1)
- [`bcftools`](https://samtools.github.io/bcftools/bcftools.html#expressions) (v1.15.1)
- [`SnpSift`](https://pcingola.github.io/SnpEff/ss_introduction/) (v5.1)
- [`filter_vep`](https://www.ensembl.org/info/docs/tools/vep/script/vep_filter.html) (v107.0)
- [`slivar`](https://github.com/brentp/slivar) (v0.2.7)
- [`bio-vcf`](https://github.com/vcflib/bio-vcf) (v0.9.5)
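
The `$FILTER` expansion described under Configuration can be illustrated with a short sketch. How the workflow itself performs the substitution is not shown in this README; the use of Python's `string.Template` and the `build_command` helper below are assumptions chosen only to reproduce the documented result for the `at_least_2_platforms` scenario.

```python
import shlex
from string import Template


def build_command(invocation: str, filter_expression: str, input_path: str) -> str:
    """Replace $FILTER with the filter expression and append the input file.

    Hypothetical helper: it mirrors the behaviour documented in the
    configuration comments, not the workflow's actual code.
    """
    command = Template(invocation).substitute(FILTER=filter_expression)
    return f"{command} {input_path}"


invocation = "vembrane filter '$FILTER'"
cmd = build_command(invocation, 'INFO["platforms"] >= 2', "input.vcf")
print(cmd)
# vembrane filter 'INFO["platforms"] >= 2' input.vcf

# The single quotes in the invocation keep the whole expression together
# as one shell argument, double quotes included.
argv = shlex.split(cmd)
print(argv)
```

Because the invocation pattern wraps `$FILTER` in single quotes, a filter expression that itself contains a single quote would break the quoting in this sketch, which is one reason to check that a tool can handle a scenario before adding it.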
Owner
- Name: Brent Pedersen
- Login: brentp
- Kind: user
- Location: Oregon, USA
- Twitter: brent_p
- Repositories: 220
- Profile: https://github.com/brentp
Doing genomics
GitHub Events
Total
- Watch event: 1
- Push event: 1
Last Year
- Watch event: 1
- Push event: 1