Syn-CpG-Spacer
Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral genomes with CpG dinucleotides - Published in JOSS (2024)
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to ncbi.nlm.nih.gov, joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
A Panel progressive web app for synonymous recoding of viral genomes with CpG dinucleotides
Basic Info
- Host: GitHub
- Owner: oleksulkowski
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://oleksulkowski.github.io/Syn-CpG-Spacer/app/
- Size: 481 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
Syn-CpG-Spacer
Syn-CpG-Spacer is a Progressive Web App (PWA) for biomedical scientists, written in Python using the Panel, Bokeh and Biopython libraries. It synonymously recodes genetic sequences to increase the frequency of CpG dinucleotides while letting the user set constraints on their spacing. The primary use case is virus attenuation experiments.
The software changes codons along a sequence to synonymous alternatives that form CpG dinucleotides according to the user's settings. This can be done at codon positions 1-2, 2-3 and 3-1 (split over two subsequent codons).
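To make the three codon positions concrete, here is a minimal sketch that labels every CpG in an in-frame sequence by where it falls relative to codon boundaries. The function name `cpg_codon_positions` is hypothetical and is not taken from the app's source code.

```python
def cpg_codon_positions(seq: str) -> list[tuple[int, str]]:
    """Return (index of the C, codon position label) for every CpG.

    Assumes the sequence starts in-frame, so index % 3 gives the
    position of the C within its codon.
    """
    seq = seq.upper()
    # C at codon position 1 -> CpG at 1-2, position 2 -> 2-3,
    # position 3 -> split CpG (3-1) with the next codon.
    labels = {0: "1-2", 1: "2-3", 2: "3-1"}
    hits = []
    for i in range(len(seq) - 1):
        if seq[i] == "C" and seq[i + 1] == "G":
            hits.append((i, labels[i % 3]))
    return hits


# Codons: CGA | ACG | TTC | GAA
print(cpg_codon_positions("CGAACGTTCGAA"))
# [(0, '1-2'), (4, '2-3'), (8, '3-1')]
```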
Using Panel's Pyodide integration, the app is hosted on GitHub Pages from this repository and is available at the following address:
https://oleksulkowski.github.io/Syn-CpG-Spacer/app/
Installation
In browsers such as Chrome, Safari or Edge, it is possible to install the app onto your machine for offline use by clicking the browser prompt after opening the link. An installed app will download and apply updates automatically when they become available.
Usage
The software allows the user to load their own FASTA sequence or to use a pre-loaded sample sequence (part of HIV-1 Gag).
Important: The loaded sequence must start in-frame, contain only codons present in the codon table and be of a length divisible by 3. If more than one sequence is present in the loaded FASTA file, they must all be of equal length. Only the sequence at the top of the file will be recoded.
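The input requirements above can be checked before loading a file into the app. The sketch below is an illustration only: `validate_sequences` is a hypothetical helper, and it assumes a DNA alphabet where every triplet of A, C, G and T counts as a codon present in the codon table. Whether the sequence actually starts in-frame cannot be verified automatically and remains the user's responsibility.

```python
from itertools import product

# Assumption: the codon table covers all 64 DNA triplets.
VALID_CODONS = {"".join(c) for c in product("ACGT", repeat=3)}


def validate_sequences(seqs: list[str]) -> None:
    """Raise ValueError if the sequences break the stated input rules."""
    if not seqs:
        raise ValueError("no sequences found")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must be of equal length")
    if length % 3 != 0:
        raise ValueError("sequence length must be divisible by 3")
    # Only the top sequence is recoded, so only its codons are checked here.
    top = seqs[0].upper()
    bad = [top[i:i + 3] for i in range(0, length, 3)
           if top[i:i + 3] not in VALID_CODONS]
    if bad:
        raise ValueError(f"codons not in the codon table: {bad}")


validate_sequences(["ATGGCC", "ATGGCA"])  # passes silently
```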
The user can then either set a minimum gap between newly added CpG's or set a desired average gap between CpG's. With the latter option, the software uses a binary search algorithm to find the minimum gap that yields an average gap as close as possible to the user's setting.
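The average-gap option can be pictured with the following sketch. It is not the app's implementation: `closest_min_gap` and the callback `average_gap_for` are hypothetical names, the search bounds are arbitrary, and the sketch assumes that the resulting average gap never decreases as the minimum gap grows, which is what makes a binary search applicable.

```python
from typing import Callable


def closest_min_gap(
    target_avg: float,
    average_gap_for: Callable[[int], float],
    low: int = 0,
    high: int = 100,
) -> int:
    """Find the minimum gap whose resulting average gap is closest to target.

    average_gap_for(min_gap) stands in for a full recoding run that
    returns the average CpG gap; it is assumed to be non-decreasing.
    """
    lo, hi = low, high
    while lo < hi:
        mid = (lo + hi) // 2
        if average_gap_for(mid) < target_avg:
            lo = mid + 1  # average too small: a larger minimum gap is needed
        else:
            hi = mid      # mid may already be the answer
    # lo is the first gap reaching the target (or the upper bound);
    # its lower neighbour may still be closer to the target.
    candidates = [g for g in (lo - 1, lo) if g >= low]
    return min(candidates, key=lambda g: abs(average_gap_for(g) - target_avg))


# Toy stand-in for a recoding run: average gap grows linearly with min gap.
print(closest_min_gap(10.5, lambda g: 2.0 * g + 3))  # 4
```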
The program allows protecting a set number of initial and final nucleotides from changes, which might be biologically relevant. As increasing the CpG content can decrease the frequency of A in a sequence, the user can also decide to make the remaining sequence synonymously A-rich after CpG's have been added.
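As a rough illustration of the A-enrichment option, the sketch below swaps each codon for its most A-rich synonym, but only when the swap leaves the set of CpG positions unchanged. It assumes the standard genetic code, ignores protected terminal regions, and uses hypothetical names (`enrich_a`, `cpg_starts`); the app's own selection rules may differ.

```python
# Standard genetic code in the usual compact TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cpg_starts(seq: str) -> list[int]:
    """Indices of the C in every CpG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def enrich_a(seq: str) -> str:
    """Recode codons to A-richer synonyms without touching any CpG."""
    seq = seq.upper()
    original_cpgs = cpg_starts(seq)
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        synonyms = [c for c, aa in CODON_TABLE.items()
                    if aa == CODON_TABLE[codon]]
        # Try the most A-rich synonyms first.
        for alt in sorted(synonyms, key=lambda c: c.count("A"), reverse=True):
            if alt.count("A") <= codon.count("A"):
                break  # no remaining synonym adds an A
            candidate = seq[:start] + alt + seq[start + 3:]
            if cpg_starts(candidate) == original_cpgs:
                seq = candidate
                break
    return seq


# Leu-Arg: CTG becomes TTA; CGT becomes CGA because AGA would remove the CpG.
print(enrich_a("CTGCGT"))  # TTACGA
```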
Every new recoded sequence requires input of a unique ID. The sequences are displayed on an interactive alignment view that highlights CpG dinucleotides. A table shows statistical data. The user can adjust the settings and compare the sequences. When finished, the user can download the outputs as a FASTA file.
Algorithm outline
1. The user configures the minimum CpG gap and the protected terminal nucleotide length, and chooses whether to make the sequence A-rich after adding CpG's.
   - If the user sets a target average gap instead, a binary search algorithm repeats the steps below to find the minimum CpG gap that results in the average CpG gap closest to the desired one.
2. Codon instances are generated for every codon along the sequence. Each codon is checked for whether it already contains a CpG or forms a split CpG with the next codon.
3. The codons that can potentially be transformed into CpG-forming alternatives are determined based on their position in the sequence. The criterion is being at least the minimum CpG gap away from existing CpG's.
4. The initial and final nucleotides are protected against changes, if specified by the user.
5. Codons are mutated to synonymous CpG-forming alternatives along the sequence. The minimum CpG gap between newly added CpG's is enforced.
6. The sequences' synonymity is checked, along with the preservation of terminal signals and adherence to the minimum gap settings.
7. If the A-enrichment option is selected, the rest of the sequence is synonymously recoded into more A-rich codons, without impacting CpG's.
8. The same checks as those described in step 6 are performed.
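The CpG-adding and checking steps of the outline can be sketched as a simple left-to-right greedy pass. This is an illustration under stated assumptions, not the app's code: the names (`add_cpgs`, `gaps_ok`, `translate`) are hypothetical, the standard genetic code is assumed, the gap is taken to be the number of nucleotides between the G of one CpG and the C of the next, and the first synonym that satisfies the constraints is accepted.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cpg_starts(seq: str) -> list[int]:
    """Indices of the C in every CpG, including CpGs split across codons."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def translate(seq: str) -> str:
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def gaps_ok(cpgs: set[int], new: set[int], min_gap: int) -> bool:
    """Every newly added CpG must be at least min_gap from its neighbours."""
    ordered = sorted(cpgs)
    for left, right in zip(ordered, ordered[1:]):
        if (left in new or right in new) and right - (left + 2) < min_gap:
            return False
    return True


def add_cpgs(seq: str, min_gap: int,
             protect_start: int = 0, protect_end: int = 0) -> str:
    seq = seq.upper()
    added: set[int] = set()
    for start in range(0, len(seq), 3):
        # Skip codons that overlap a protected terminal region.
        if start < protect_start or start + 3 > len(seq) - protect_end:
            continue
        codon = seq[start:start + 3]
        current = set(cpg_starts(seq))
        for alt, aa in CODON_TABLE.items():
            if aa != CODON_TABLE[codon] or alt == codon:
                continue
            candidate = seq[:start] + alt + seq[start + 3:]
            after = set(cpg_starts(candidate))
            gained = after - current
            # Accept only if a CpG is gained, none is lost, and gaps hold.
            if gained and current <= after and gaps_ok(after, added | gained, min_gap):
                seq, added = candidate, added | gained
                break
    return seq


original = "GCAGCAGCAGCA"
recoded = add_cpgs(original, min_gap=4)
assert translate(recoded) == translate(original)  # synonymity check
print(recoded)  # GCCGCAGCCGCA
```

In this toy run the two new CpG's are both split (3-1) CpG's, four nucleotides apart, and the encoded peptide is unchanged.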
Development
Use the environment.yml file to create an environment with all the dependencies:
conda env create -f environment.yml
conda activate Syn-CpG-Spacer
As per Panel documentation, develop locally in index.py using
panel serve index.py --autoreload
After making changes, convert index.py to the Pyodide PWA:
panel convert index.py --to pyodide-worker --out docs/app --title Syn-CpG-Spacer --pwa
You can run the Pyodide app locally on http://localhost:8000/docs/app by using
python3 -m http.server
Tests
Syn-CpG-Spacer uses Pytest to check whether code changes have introduced errors into the recoding algorithm by comparing the new output to a set of validated sequences. This is hooked up to GitHub Actions CI. Run the tests using
pytest
Within the app, each algorithm run is checked to ensure correct application of user-defined variables.
Community contributions
Please use the issues tab for bug reports and feature requests.
Acknowledgements
The Bokeh sequence viewer is based on code by Damien Farrell (@dmnfarrell).
Owner
- Name: Aleksander Sułkowski
- Login: oleksulkowski
- Kind: user
- Repositories: 1
- Profile: https://github.com/oleksulkowski
JOSS Publication
Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral genomes with CpG dinucleotides
Tags
virology, molecular biology, genetics, dinucleotide, CpG, genome recoding
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Sulkowski
  given-names: Aleksander
  orcid: "https://orcid.org/0000-0002-0624-428X"
- family-names: Bouton
  given-names: Clément
  orcid: "https://orcid.org/0000-0001-9607-6533"
- family-names: Swanson
  given-names: Chad
  orcid: "https://orcid.org/0000-0002-6650-3634"
contact:
- family-names: Sulkowski
  given-names: Aleksander
  orcid: "https://orcid.org/0000-0002-0624-428X"
doi: 10.5281/zenodo.10781374
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Sulkowski
    given-names: Aleksander
    orcid: "https://orcid.org/0000-0002-0624-428X"
  - family-names: Bouton
    given-names: Clément
    orcid: "https://orcid.org/0000-0001-9607-6533"
  - family-names: Swanson
    given-names: Chad
    orcid: "https://orcid.org/0000-0002-6650-3634"
  date-published: 2024-04-03
  doi: 10.21105/joss.06332
  issn: 2475-9066
  issue: 96
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6332
  title: "Syn-CpG-Spacer: A Panel web app for synonymous recoding of
    viral genomes with CpG dinucleotides"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06332"
  volume: 9
title: "Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral
  genomes with CpG dinucleotides"
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| mroleksul | s****0@g****m | 48 |
| Aleksander Sułkowski | 6****i | 20 |
| Aleksander Sułkowski | 6****l | 4 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 5
- Total pull requests: 0
- Average time to close issues: 5 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- babinyurii (5)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v4 composite
- actions/upload-artifact v1 composite
- openjournals/openjournals-draft-action master composite
