Repository
python utils for genomic variant (WIP)
Basic Info
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Python package for genomic variant analysis
How to install?
pip install variant
How to use?
🧬 The variant motif subcommand fetches the motif sequence around a given site.
```
 Usage: variant motif [OPTIONS]

 Fetch genomic motif.

╭─ Options ─────────────────────────────────────────────────────────────────╮
│   --input        -i  TEXT  Input position file.                           │
│   --output       -o  TEXT  Output annotation file.                        │
│ * --fasta        -f  TEXT  reference fasta file. [required]               │
│   --npad         -n  TEXT  Number of padding base to call motif. If you   │
│                            want to set different left and right pads,     │
│                            use comma to separate them. (eg. 2,3)          │
│   --padding      -p  TEXT  Padding base to use for motif. 'N' by default  │
│                            but can be set to any single letter            │
│   --with-header  -H        With header line in input file.                │
│   --columns      -c  TEXT  Sets columns for site info.                    │
│                            (Chrom,Pos,Strand)                             │
│                            [default: 1,2,3]                               │
│   --to-upper     -u        Convert motif to upper case.                   │
│   --wrap-site    -w        Wrap motif site.                               │
│   --help         -h        Show this message and exit.                    │
╰───────────────────────────────────────────────────────────────────────────╯
```
demo:
Suppose you want the 2 bases before and the 3 bases after each given site, with the site itself wrapped in brackets and the strand information taken into account.
Use -n 2,3 -w.
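To make the padding and strand handling concrete, here is a minimal Python sketch of what such a motif lookup could look like. It is not the package's own code: `fetch_motif` and `reverse_complement` are hypothetical helpers, the chromosome is an in-memory string rather than a FASTA file, positions are assumed to be 1-based, square brackets are assumed as the wrapping characters, and the padding base is assumed to fill positions that fall outside the sequence.

```python
# Hypothetical helpers for illustration; not the `variant` package's own code.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fetch_motif(chrom_seq, pos, strand="+", left=2, right=3, pad="N", wrap=True):
    """Return the motif around a 1-based position (assumed convention)."""
    i = pos - 1  # convert to a 0-based index
    # On the minus strand, "before" the site means higher forward coordinates.
    up, down = (left, right) if strand == "+" else (right, left)
    # Collect the forward-strand window; None marks positions off the sequence.
    window = [
        chrom_seq[j] if 0 <= j < len(chrom_seq) else None
        for j in range(i - up, i + down + 1)
    ]
    if strand == "-":
        # Reverse the window and complement only the real bases.
        window = [b if b is None else b.translate(_COMPLEMENT) for b in reversed(window)]
    bases = [pad if b is None else b for b in window]
    before = "".join(bases[:left])
    site = bases[left]
    after = "".join(bases[left + 1:])
    return f"{before}[{site}]{after}" if wrap else before + site + after


print(fetch_motif("ACGTACGTAC", 5, "+"))  # GT[A]CGT
print(fetch_motif("ACGTACGTAC", 5, "-"))  # CG[T]ACG
```

In this sketch the minus-strand result is the same as reverse-complementing the whole sequence first and then taking 2 bases before and 3 bases after the site.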
🧫 The variant effect subcommand infers the effect of a mutation.
```
 Usage: variant effect [OPTIONS]

 Annotation genomic variant effect.

╭─ Options ──────────────────────────────────────────────────────────────────╮
│   --input                 -i  TEXT     Input position file.                │
│   --output                -o  TEXT     Output annotation file              │
│   --reference             -r  TEXT     reference species                   │
│   --reference-gtf             TEXT     Customized reference gtf file.      │
│   --reference-transcript      TEXT     Customized reference transcript     │
│                                        fasta file.                         │
│   --reference-protein         TEXT     Customized reference protein fasta  │
│                                        file.                               │
│   --release               -e  INTEGER  ensembl release                     │
│   --strandness            -s           Use strand infomation or not?       │
│   --pU-mode               -u           Make rRNA, tRNA, snoRNA into top    │
│                                        priority.                           │
│   --npad                  -n  INTEGER  Number of padding base to call      │
│                                        motif.                              │
│   --all-effects           -a           Output all effects.                 │
│   --with-header           -H           With header line in input file.     │
│   --columns               -c  TEXT     Sets columns for site info.         │
│                                        (Chrom,Pos,Strand,Ref,Alt)          │
│                                        [default: 1,2,3,4,5]                │
│   --help                  -h           Show this message and exit.         │
╰────────────────────────────────────────────────────────────────────────────╯
```
demo:
Store the following table in a file (sites.tsv).

| Chrom | Position  | Strand | Ref | Alt |
| ----- | --------- | ------ | --- | --- |
| chr1  | 230703034 | -      | C   | T   |
| chr12 | 69353439  | +      | A   | T   |
| chr14 | 23645352  | +      | G   | T   |
| chr2  | 215361150 | -      | A   | T   |
| chr2  | 84906537  | +      | C   | T   |
| chr22 | 39319077  | -      | T   | A   |
| chr22 | 39319095  | -      | T   | A   |
| chr22 | 39319098  | -      | T   | A   |
Run command:
variant-effect -i sites.tsv -H -r human -e 108 -t RNA -c 1,2,3
- -i specifies the input file;
- -H means the file has a header line, so the first row will be skipped;
- -r selects the reference genome (default: human);
- -e specifies the Ensembl release version;
- -c selects which columns of the input file to use; by default the first 5 columns are used.
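The -H and -c options described above can be pictured with a small Python sketch. It only illustrates the idea of skipping a header row and picking 1-based columns from a tab-separated file; `read_sites` is a hypothetical name, and the real command may parse its input differently.

```python
import csv
import io


def read_sites(handle, columns="1,2,3,4,5", with_header=False):
    """Read tab-separated site records, keeping the requested 1-based columns."""
    # Turn "1,2,3" into 0-based indices [0, 1, 2].
    indices = [int(c) - 1 for c in columns.split(",")]
    reader = csv.reader(handle, delimiter="\t")
    if with_header:
        next(reader, None)  # skip the header row, as -H does
    sites = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        sites.append(tuple(row[i] for i in indices))
    return sites


example = io.StringIO(
    "Chrom\tPosition\tStrand\tRef\tAlt\n"
    "chr1\t230703034\t-\tC\tT\n"
    "chr12\t69353439\t+\tA\tT\n"
)
print(read_sites(example, columns="1,2,3", with_header=True))
```

Values are kept as strings here; converting the position to an integer is left to the caller.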
You will have this output
| Chrom | Position | Strand | Ref | Alt | mut_type | gene_type | gene_name | gene_pos | transcript_name | transcript_pos | transcript_motif | coding_pos | codon_ref | aa_pos | aa_ref | distance2splice |
| :---- | :-------- | :----- | :-- | :-- | :------------ | :------------- | :---------------------- | :------- | :-------------------------- | :------------- | :-------------------- | :--------- | :-------- | :----- | :----- | --------------- |
| chr1 | 230703034 | - | C | T | ThreePrimeUTR | protein_coding | ENSG00000135744(AGT) | 42543 | ENST00000680041(AGT-208) | 1753 | TGTGTCACCCCCAGTCTCCCA | None | None | None | None | 295 |
| chr12 | 69353439 | + | A | T | ThreePrimeUTR | protein_coding | ENSG00000090382(LYZ) | 5059 | ENST00000261267(LYZ-201) | 695 | TAGAACTAATACTGGTGAAAA | None | None | None | None | 286 |
| chr14 | 23645352 | + | G | T | ThreePrimeUTR | protein_coding | ENSG00000100867(DHRS2) | 15238 | ENST00000344777(DHRS2-202) | 1391 | CTGCCATTCTGCCAGACTAGC | None | None | None | None | 210 |
| chr2 | 215361150 | - | A | T | ThreePrimeUTR | protein_coding | ENSG00000115414(FN1) | 74924 | ENST00000323926(FN1-201) | 8012 | GGCCCGCAATACTGTAGGAAC | None | None | None | None | 476 |
| chr2 | 84906537 | + | C | T | ThreePrimeUTR | protein_coding | ENSG00000034510(TMSB10) | 882 | ENST00000233143(TMSB10-201) | 327 | CCTGGGCACTCCGCGCCGATG | None | None | None | None | 148 |
| chr22 | 39319077 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1313 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
| chr22 | 39319095 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1295 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
| chr22 | 39319098 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1292 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
🧫 The variant coordinate subcommand maps chromosome names and positions between different reference coordinate systems.
```
 Usage: variant coordinate [OPTIONS]

 Fetch genomic motif.

╭─ Options ─────────────────────────────────────────────────────────────────────╮
│   --input              -i  TEXT  Input position file.                         │
│   --output             -o  TEXT  Output annotation file.                      │
│   --reference-mapping  -m  TEXT  Mapping file for chrom name, first column is │
│                                  chrom in the input, second column is chrom   │
│                                  in the reference db (sep by tab)             │
│   --buildin-mapping    -M  TEXT  Build-in mapping for chrom name: U2E (UCSC   │
│                                  to Ensembl), E2U (Ensembl to UCSC)           │
│   --with-header        -H        With header line in input file.              │
│   --columns            -c  TEXT  Sets columns for site info. (Chrom)          │
│                                  [default: 1]                                 │
│   --help               -h        Show this message and exit.                  │
╰───────────────────────────────────────────────────────────────────────────────╯
```
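As a rough illustration of the chromosome-name mapping described in the help text, the Python sketch below shows a simplified U2E / E2U conversion and a loader for a two-column, tab-separated mapping file. All function names are hypothetical, and the rules only cover the common cases (the `chr` prefix and `chrM` versus `MT`); the package's built-in mapping may handle more contigs, such as unplaced scaffolds.

```python
import io


def ucsc_to_ensembl(chrom: str) -> str:
    """Simplified U2E rule: chr1 -> 1, chrX -> X, chrM -> MT."""
    if chrom == "chrM":
        return "MT"
    return chrom[3:] if chrom.startswith("chr") else chrom


def ensembl_to_ucsc(chrom: str) -> str:
    """Simplified E2U rule: 1 -> chr1, X -> chrX, MT -> chrM."""
    if chrom == "MT":
        return "chrM"
    return chrom if chrom.startswith("chr") else "chr" + chrom


def load_mapping(handle) -> dict:
    """Read a mapping file: input chrom in column 1, reference chrom in column 2."""
    mapping = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        source, target = line.split("\t")[:2]
        mapping[source] = target
    return mapping


custom = load_mapping(io.StringIO("chr1\t1\nchrM\tMT\n"))
print(custom.get("chr1"), ucsc_to_ensembl("chr22"), ensembl_to_ucsc("MT"))
```

A custom mapping file is the safer choice when the input uses contig names that simple prefix rules like these cannot translate.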
⏳⏳⏳ more functions will be supported in the future
TODO:
Owner
- Name: Chang Y
- Login: y9c
- Kind: user
- Repositories: 134
- Profile: https://github.com/y9c
(yec)
GitHub Events
Total
- Issues event: 2
- Watch event: 1
- Issue comment event: 2
- Push event: 37
Last Year
- Issues event: 2
- Watch event: 1
- Issue comment event: 2
- Push event: 37
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: about 2 hours
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: about 2 hours
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- yangli04 (1)
Dependencies
- pytest ^5.2 develop
- setuptools 57.4.0 develop
- click ^8.1.3
- pyensembl ^2.0.0
- python ^3.7
- varcode *
- actions/checkout v3 composite
- actions/configure-pages v2 composite
- actions/deploy-pages v1 composite
- actions/jekyll-build-pages v1 composite
- actions/upload-pages-artifact v1 composite
- actions/checkout v3 composite
- actions/setup-python v3 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite