https://github.com/bebatut/pylprotpredictor
Prediction of PYL proteins
Repository
Basic Info
- Host: GitHub
- Owner: bebatut
- License: apache-2.0
- Language: Python
- Default Branch: master
- Homepage: http://bebatut.fr/PylProtPredictor/
- Size: 5.49 MB
Statistics
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 6
- Releases: 2
Metadata Files
README.md
Detection of pyrrolysine proteins
Context
Pyrrolysine is an amino acid that is used in the biosynthesis of proteins in some methanogenic archaea and bacteria. It is encoded in mRNA by the UAG codon, which in most organisms is the 'amber' stop codon.
Some methanogenic archaea and bacteria have the pylT gene, which encodes an unusual transfer RNA (tRNA) with a CUA anticodon, and the pylS gene, which encodes a class II aminoacyl-tRNA synthetase that charges the pylT-derived tRNA with pyrrolysine. In some proteins of these organisms, the UAG codon can then code for pyrrolysine instead of acting as a STOP codon.
These proteins are difficult to identify because CDS prediction treats every UAG codon as a STOP codon. The predicted CDS are therefore truncated at the first UAG codon.
Here, we propose a solution to detect proteins that use the pyrrolysine amino acid. Have a look at the scheme explaining how the tool works.
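To make the truncation problem concrete, here is a minimal Python sketch. It is not the tool's actual code: the function names, the toy sequence, and the choice to read through only the first TAG are assumptions made for illustration. It shows how a CDS that stops at an in-frame TAG (the DNA spelling of UAG) can be extended to the next stop codon to obtain a candidate pyrrolysine protein.

```python
# Illustrative sketch only, not PylProtPredictor's implementation.
# Stop codons in DNA spelling; TAG corresponds to UAG in mRNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_stop(seq, start):
    """Return the index of the first in-frame stop codon at or after `start`, or None."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def cds_and_pyl_extension(seq, start=0):
    """Return (predicted_cds, extended_cds) for the frame beginning at `start`.

    predicted_cds ends at the first stop codon, as a standard CDS predictor would.
    extended_cds reads through that stop when it is TAG and ends at the next
    stop codon; it is None when the first stop is not TAG or no later stop exists.
    """
    seq = seq.upper()
    first = find_stop(seq, start)
    if first is None:
        return None, None
    predicted = seq[start:first + 3]
    if seq[first:first + 3] != "TAG":
        return predicted, None
    following = find_stop(seq, first + 3)
    if following is None:
        return predicted, None
    return predicted, seq[start:following + 3]


# Hypothetical toy sequence: ATG GCA TAG GGT CCA TAA
toy = "ATGGCATAGGGTCCATAA"
predicted, extended = cds_and_pyl_extension(toy)
print(predicted)  # CDS cut at the TAG codon
print(extended)   # candidate CDS if TAG encodes pyrrolysine
```

A real genome can contain several in-frame TAG codons, each of which may or may not be translated, so an actual predictor has to consider more alternatives than this sketch does.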
Installation
Requirements
The following software is required:
- git
- conda:
```
$ make install-conda
$ make configure-conda
```
Install the tool
Clone this repository (or get the release)
$ git clone https://github.com/bebatut/PylProtPredictor.git
Move into the directory
$ cd pyl_protein_prediction
Prepare the environment (only once)
$ make create-env
Activate the conda environment
$ source activate PylProtPredictor
Build the package
$ make init
Usage
```
$ source activate PylProtPredictor # once to activate the conda environment
$ pylprotpredictor --help
usage: pylprotpredictor [-h] --genome GENOME --output OUTPUT
                        [--referencefastadb REFERENCEFASTADB]
                        [--referencedmnddb REFERENCEDMNDDB]

PylProtPredictor Pipeline

optional arguments:
  -h, --help            show this help message and exit
  --genome GENOME       path to a FASTA file with full or contig sequences of
                        a genome to analyze
  --output OUTPUT       path to the output directory
  --referencefastadb REFERENCEFASTADB
                        path to FASTA file with reference database
  --referencedmnddb REFERENCEDMNDDB
                        path to Diamond formatted file with reference database
```
To exit the environment, you can execute
$ source deactivate
But don't do that before running the analysis.
Database setup
The first run will be long: the reference database must be downloaded and prepared for the similarity search.
If you already have the UniRef90 database on your machine, you can simply point to it when running the main script (see the `--referencefastadb` and `--referencedmnddb` options).
Otherwise, the pipeline will download and format it. Make sure you have at least 25 GB available for the reference database. The download can take several hours, depending on your connection.
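Before the first run, you may want to check that enough disk space is available. This is a small sketch using only the Python standard library; the helper name `has_enough_space` is hypothetical, and reading "25 GB" as 25 × 10^9 bytes is an assumption.

```python
import shutil

# 25 GB as stated above; decimal gigabytes are an assumption.
REQUIRED_BYTES = 25 * 10**9


def has_enough_space(path=".", required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem containing `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


# Check the current directory; replace "." with the location
# where the reference database will be stored.
print(has_enough_space("."))
```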
Support & Bug Reports
You can file a GitHub issue.
Contributing
First off, thanks for taking the time to contribute!
Tests
The code is covered by tests. They are run automatically on CircleCI, but we also recommend running them locally before pushing to GitHub with:
$ make test
Any added code should be covered by new tests.
Documentation
Documentation about PylProtPredictor is available online at http://bebatut.fr/PylProtPredictor
To update it:
- Make the changes in src/docs
- Generate the doc:
$ make doc
- Check it by opening the docs/index.html file in a web browser
- Propose the changes via a Pull Request
Contributors
- Bérénice Batut
- Jean-François Brugère
- Kévin Gravouil
- Cécile Hilpert
- Ylana Sauvaget
Citation
You can cite the latest release on Zenodo
Owner
- Name: Bérénice Batut
- Login: bebatut
- Kind: user
- Location: Clermont-Ferrand, France
- Company: University of Freiburg
- Website: http://research.bebatut.fr/
- Twitter: bebatut
- Repositories: 86
- Profile: https://github.com/bebatut
@galaxyproject training, @usegalaxy-eu, @open-life-science, @StreetScienceCommunity, @gallantries
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 11
- Total pull requests: 12
- Average time to close issues: 10 months
- Average time to close pull requests: 1 day
- Total issue authors: 3
- Total pull request authors: 3
- Average comments per issue: 1.18
- Average comments per pull request: 1.17
- Merged pull requests: 12
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- bebatut (9)
- ylana (1)
- keuv-grvl (1)
Pull Request Authors
- bebatut (9)
- keuv-grvl (2)
- ylana (1)