domainanalyzer
Analyze domains with regard to their taxonomic distribution and motifs with other domains, and export their FASTA sequences
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Domain Analyzer
Created 2023 by gaenssle, written in Python 3.8
Process
- The input is an ID, e.g. a domain name (Pfam or UniProt ID)
- Various databases are accessed via Genome.jp, e.g. KEGG or UniProt
- All available protein data associated with the input ID are downloaded
- The downloaded data is extracted, accumulated and counted
- Various output files are generated, including FASTA
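The last step above, writing FASTA output, can be sketched as follows. This is a minimal illustration and not the program's own code: the function name `to_fasta`, the record keys (`id`, `description`, `sequence`) and the line width of 60 are assumptions made for the example.

```python
from typing import Dict, Iterable


def to_fasta(records: Iterable[Dict[str, str]], line_width: int = 60) -> str:
    """Format records as FASTA text, wrapping each sequence at line_width.

    The record keys used here are hypothetical; the real program may store
    its downloaded data differently.
    """
    lines = []
    for rec in records:
        # Header line: '>' followed by the ID and a free-text description
        lines.append(f">{rec['id']} {rec['description']}")
        seq = rec["sequence"]
        # Split the sequence into fixed-width chunks
        for start in range(0, len(seq), line_width):
            lines.append(seq[start:start + line_width])
    return "".join(line + "\n" for line in lines)


# Made-up example entry (not real downloaded data)
example = [
    {"id": "gene1", "description": "hypothetical protein", "sequence": "MKTAYIAKQR"},
]
print(to_fasta(example, line_width=5), end="")
```

Building the whole text in memory keeps the sketch short; for large downloads, writing each record to the file as it is processed would use less memory.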
Data
- Downloaded sequence data:
  - Assigned description
  - Organism and taxonomy
  - Sequence
  - Domain architecture
  - Available IDs from other databases
- Count:
  - Taxonomy (phylum and organism)
  - Domain architecture
- Save data in the following files:
  - Gene IDs
  - Gene details (organism, architecture, sequence, etc.)
  - Summary of domain architecture
  - Summary of taxonomic distribution
  - FASTA files (containing the entire sequence)
  - FASTA files (containing only the target domain sequence)
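The counting step and the difference between the two kinds of FASTA files can be illustrated with a small sketch. Everything here is assumed for the example: the entry keys, the placeholder values, and the convention that domain coordinates are 1-based and inclusive. It uses `collections.Counter` to stay dependency-free, whereas the program itself relies on pandas.

```python
from collections import Counter

# Made-up entries; all names and coordinates are placeholders
entries = [
    {"id": "geneA", "phylum": "PhylumA", "architecture": "DomX+DomY",
     "sequence": "MKTAYIAKQRQISFVK", "domain_start": 3, "domain_end": 10},
    {"id": "geneB", "phylum": "PhylumA", "architecture": "DomX",
     "sequence": "MSLLTEVETPIRNE", "domain_start": 1, "domain_end": 6},
    {"id": "geneC", "phylum": "PhylumB", "architecture": "DomX+DomY",
     "sequence": "MADEEKLPPGWEKR", "domain_start": 5, "domain_end": 9},
]


def count_field(items, field):
    """Count how often each value of `field` occurs across the entries."""
    return Counter(item[field] for item in items)


def domain_sequence(entry):
    """Return only the target-domain part of the sequence.

    Assumes 1-based, inclusive start/end coordinates.
    """
    return entry["sequence"][entry["domain_start"] - 1:entry["domain_end"]]


phylum_counts = count_field(entries, "phylum")
architecture_counts = count_field(entries, "architecture")

print(phylum_counts.most_common())
print(architecture_counts.most_common())
print(domain_sequence(entries[0]))
```

The full-sequence FASTA file would contain `entry["sequence"]` unchanged, while the domain-only file would contain the slice returned by `domain_sequence`.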
Dependencies
- The program uses Python 3.8 and the following modules:
  - pandas
  - argparse
  - multiprocessing (optional, Linux only)
How to use
```
python3 Main.py name [-h] [-m] [-ask] [-db DBLIST] [-a ACTION] [-st SEARCHTYPE]
                     [-c CUTOFF] [-sam SAMPLESIZE] [-f FOLDER] [-cs CLUSTERSIZE]
                     [-ft FILETYPE] [-sep SEPARATOR]

positional arguments:
  name                  name of the domain

optional arguments:
  -h, --help            show this help message and exit
  -m, --multiprocess    turn on multiprocessing (only for Linux)
  -ask, --askoverwrite  ask before overwriting files
  -db DBLIST, --dblist DBLIST
                        list databases to be searched, separated by ','
                        (default: UniProt;KEGG;PDB;swissprot)
  -a ACTION, --action ACTION
                        add actions to be conducted: a=all, i=entry IDs,
                        d=protein data, m=KEGG motif, e=extract (default: a)
  -st SEARCHTYPE, --searchtype SEARCHTYPE
                        type of the searched id (default: pf)
  -c CUTOFF, --cutoff CUTOFF
                        min E-Value of Pfam domains (default: 0.0001)
  -sam SAMPLESIZE, --samplesize SAMPLESIZE
                        max number of downloaded entries (default: 0)
  -f FOLDER, --folder FOLDER
                        name of the parent folder (default: same as 'name')
  -cs CLUSTERSIZE, --clustersize CLUSTERSIZE
                        entries/fragment files (default: 100)
  -ft FILETYPE, --filetype FILETYPE
                        type of the generated files (default: .csv)
  -sep SEPARATOR, --separator SEPARATOR
                        separator between columns in the output files
                        (default: ;)
```
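To show how the options above fit together, here is a sketch of an `argparse` parser reconstructed from the help text only. It is not taken from `Main.py`: the function name `build_parser`, the argument types, and the way the folder default is resolved are assumptions, and the domain ID used in the example call is only an illustrative input.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the documented command-line interface (reconstruction, not the original code)."""
    p = argparse.ArgumentParser(
        prog="Main.py",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    p.add_argument("name", help="name of the domain")
    p.add_argument("-m", "--multiprocess", action="store_true",
                   help="turn on multiprocessing (only for Linux)")
    p.add_argument("-ask", "--askoverwrite", action="store_true",
                   help="ask before overwriting files")
    p.add_argument("-db", "--dblist", default="UniProt;KEGG;PDB;swissprot",
                   help="list databases to be searched")
    p.add_argument("-a", "--action", default="a",
                   help="a=all, i=entry IDs, d=protein data, m=KEGG motif, e=extract")
    p.add_argument("-st", "--searchtype", default="pf",
                   help="type of the searched id")
    # Types below are assumed from the documented defaults
    p.add_argument("-c", "--cutoff", type=float, default=0.0001,
                   help="min E-Value of Pfam domains")
    p.add_argument("-sam", "--samplesize", type=int, default=0,
                   help="max number of downloaded entries")
    p.add_argument("-f", "--folder", default=None,
                   help="name of the parent folder (same as 'name' if omitted)")
    p.add_argument("-cs", "--clustersize", type=int, default=100,
                   help="entries per fragment file")
    p.add_argument("-ft", "--filetype", default=".csv",
                   help="type of the generated files")
    p.add_argument("-sep", "--separator", default=";",
                   help="separator between columns in the output files")
    return p


# Example call: stricter cutoff, at most 50 entries, ask before overwriting
args = build_parser().parse_args(["PF00128", "-c", "1e-5", "-sam", "50", "-ask"])
# Assumed behaviour: fall back to the domain name when no folder is given
folder = args.folder or args.name
print(args.name, args.cutoff, args.samplesize, args.askoverwrite, folder)
```

On the command line, the same settings would correspond to `python3 Main.py PF00128 -c 1e-5 -sam 50 -ask`.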
Owner
- Login: gaenssle
- Kind: user
- Repositories: 1
- Profile: https://github.com/gaenssle
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Domain Analyzer
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - orcid: 'https://orcid.org/0000-0002-9488-5086'
    given-names: Lucie
    family-names: Gaenssle
    email: a.l.o.gaenssle@rug.nl
    affiliation: University of Groningen
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| gaenssle | a****e@g****t | 30 |
| gaenssle | 1****e | 3 |
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0