https://github.com/bricoletc/simval_gramtools
simulated data-driven validation for gramtools
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (3.9%) to scientific vocabulary
Repository
simulated data-driven validation for gramtools
Basic Info
- Host: GitHub
- Owner: bricoletc
- Language: Python
- Default Branch: master
- Size: 12.7 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
A data-driven testing module for gramtools
Data production
From a fasta reference and a vcf of variants against it, we can:
- Slice both
- Randomly pick derived alleles in the sliced vcf --> makes a mosaic sample
- Randomly mutate the mosaic sample --> makes a mutant sample
- Generate reads from the sample, using art_ilmn.

This can be done for any number of samples, specified at the command line.
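The mosaic and mutant steps listed above can be illustrated with a minimal Python sketch. It is not the repository's implementation: the function names `make_mosaic` and `mutate`, the in-memory `(pos, ref, alts)` records, and the restriction to single-base substitutions are assumptions made for illustration. Slicing, vcf parsing, and read generation are left out.

```python
import random


def make_mosaic(reference, variants, rng):
    """Pick one allele (ref or an alt) at each site and apply it to the reference.

    variants: list of (pos, ref, alts) with 0-based pos, sorted and non-overlapping.
    Returns (mosaic_sequence, chosen), where chosen holds the allele index
    picked at each site (0 means the reference allele).
    """
    pieces = []
    chosen = []
    cursor = 0
    for pos, ref, alts in variants:
        assert reference[pos:pos + len(ref)] == ref, "ref allele must match the reference"
        alleles = [ref] + list(alts)
        index = rng.randrange(len(alleles))
        pieces.append(reference[cursor:pos])  # unchanged sequence before the site
        pieces.append(alleles[index])         # the randomly chosen allele
        chosen.append(index)
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces), chosen


def mutate(sequence, num_snps, rng):
    """Introduce num_snps random substitutions at distinct positions.

    Returns (mutant_sequence, changes), where changes is a list of
    (pos, old_base, new_base) in the coordinates of the input sequence.
    """
    positions = sorted(rng.sample(range(len(sequence)), num_snps))
    bases = list(sequence)
    changes = []
    for pos in positions:
        old = bases[pos]
        new = rng.choice([b for b in "ACGT" if b != old])
        bases[pos] = new
        changes.append((pos, old, new))
    return "".join(bases), changes


# Toy example: one SNP site and one indel site on a 12-base reference.
rng = random.Random(42)
reference = "ACGTACGTACGT"
variants = [(2, "G", ["T"]), (6, "GT", ["G", "GTT"])]
mosaic, chosen = make_mosaic(reference, variants, rng)
mutant, changes = mutate(mosaic, 2, rng)
```

Note that `changes` is expressed in mosaic coordinates. Rebasing those changes against the original (sliced) reference is a separate step that this sketch does not cover.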
Validation
The mosaic sample is stored as a vcf, and can serve to benchmark how well quasimap + infer retrieve the correct mosaic.
The mutant sample is stored as a vcf, rebased against the original (sliced) reference, and can serve to benchmark how well discover retrieves the variation on top of the mosaic.
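One way such a benchmark could be scored is sketched below, assuming both the truth and the tool output have already been reduced to sets of `(chrom, pos, ref, alt)` tuples. The function name `score_calls` and the tuple layout are hypothetical; parsing gramtools output into that form is not shown, and the repository lists automated assessment as a TODO.

```python
def score_calls(truth, called):
    """Compare called variants against the truth set.

    truth, called: iterables of hashable variant keys, e.g. (chrom, pos, ref, alt).
    Returns counts of true/false positives and false negatives,
    plus precision and recall (0.0 when the denominator is empty).
    """
    truth, called = set(truth), set(called)
    true_positives = len(truth & called)
    return {
        "true_positives": true_positives,
        "false_positives": len(called - truth),
        "false_negatives": len(truth - called),
        "precision": true_positives / len(called) if called else 0.0,
        "recall": true_positives / len(truth) if truth else 0.0,
    }


# Toy example: two of three true variants are recovered, plus one spurious call.
truth = {("ref", 3, "G", "T"), ("ref", 7, "GT", "G"), ("ref", 10, "A", "C")}
called = {("ref", 3, "G", "T"), ("ref", 10, "A", "C"), ("ref", 11, "C", "G")}
scores = score_calls(truth, called)
```

Exact tuple matching is the simplest choice; it treats differently represented but equivalent indels as mismatches, so a real assessment would likely need variant normalisation first.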
Pipelines
- fullRun: runs a full build/quasimap/infer/discover on a sample
- Multisample: runs the multi-sample pipeline
TODOs
- [ ] Combining the mutant vcfs + the original (sliced) vcf into a genotyped multi-sample vcf. This can serve as a reference of what the multi-sample pipeline should ideally output.
- [ ] Automated assessment using gramtools outputs versus the truths.
Owner
- Name: Brice Letcher
- Login: bricoletc
- Kind: user
- Company: EMBL-EBI
- Twitter: bricoletc
- Repositories: 2
- Profile: https://github.com/bricoletc
Bioinformatician and early-career researcher - EMBL-EBI and CNRS ~~~~~~ Parsing my way through DNA sequence data
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0