Syn-CpG-Spacer

Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral genomes with CpG dinucleotides - Published in JOSS (2024)

https://github.com/oleksulkowski/syn-cpg-spacer

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

dinucleotide-frequencies genetics genome-browser molecular-biology virology
Last synced: 6 months ago

Repository

A Panel progressive web app for synonymous recoding of viral genomes with CpG dinucleotides

Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

Syn-CpG-Spacer

Syn-CpG-Spacer is a Progressive Web App (PWA) for biomedical scientists, written in Python using the Panel, Bokeh, and Biopython libraries. It synonymously recodes genetic sequences to increase the frequency of CpG dinucleotides, subject to user-defined constraints on their spacing. The primary use case is virus attenuation experiments.

The software changes codons along a sequence to synonymous alternatives that form CpG dinucleotides according to the user's settings. A CpG can be formed at codon positions 1-2, 2-3, or 3-1 (split across two consecutive codons).
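To make the three codon positions concrete, here is a minimal sketch (not the app's own code) that labels every CpG in an in-frame sequence by where it falls relative to the reading frame. The function name `classify_cpgs` is hypothetical.

```python
def classify_cpgs(seq: str) -> list[tuple[int, str]]:
    """Return (0-based index of the C, codon-position label) for each CpG.

    Assumes the sequence starts in-frame, so index % 3 gives the codon position.
    """
    seq = seq.upper()
    labels = {0: "1-2", 1: "2-3", 2: "3-1"}  # 3-1 spans two consecutive codons
    hits = []
    start = seq.find("CG")
    while start != -1:
        hits.append((start, labels[start % 3]))
        start = seq.find("CG", start + 1)
    return hits


# Codons: CGA | ACG | AAC | GAA
print(classify_cpgs("CGAACGAACGAA"))
# [(0, '1-2'), (4, '2-3'), (8, '3-1')]
```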

Using Panel's Pyodide integration, the app is hosted on GitHub Pages from this repository and is available at the following address:

https://oleksulkowski.github.io/Syn-CpG-Spacer/app/

Installation

In browsers such as Chrome, Safari or Edge, it is possible to install the app onto your machine for offline use by clicking the browser prompt after opening the link. An installed app will download and apply updates automatically when they become available.

Usage

The software allows the user to load their own FASTA sequence or to use a pre-loaded sample sequence (part of HIV-1 Gag).

Important: The loaded sequence must start in-frame, contain only codons present in the codon table and be of a length divisible by 3. If more than one sequence is present in the loaded FASTA file, they must all be of equal length. Only the sequence at the top of the file will be recoded.
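The input requirements above can be checked before loading a file into the app. The following is a rough sketch in plain Python, not the app's validation code: `parse_fasta` and `validate_records` are hypothetical names, and the standard genetic code (including stop codons) is assumed as the codon table, which may differ from the table the app uses. Only the first record is checked for codon validity, since only that sequence is recoded.

```python
from itertools import product

# Standard genetic code in TCAG order (an assumption about the codon table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped lines."""
    records: list[tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:], ""))
        elif records:
            name, seq = records[-1]
            records[-1] = (name, seq + line.upper())
    return records


def validate_records(records: list[tuple[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    if not records:
        return ["no sequences found"]
    problems = []
    if len({len(seq) for _, seq in records}) > 1:
        problems.append("sequences differ in length")
    first = records[0][1]  # only the top sequence is recoded
    if len(first) % 3 != 0:
        problems.append("length is not divisible by 3")
    codons = [first[i:i + 3] for i in range(0, len(first) - len(first) % 3, 3)]
    unknown = sorted({c for c in codons if c not in CODON_TABLE})
    if unknown:
        problems.append("codons not in codon table: " + ", ".join(unknown))
    return problems
```

For example, `validate_records(parse_fasta(">seq1\nATGCGA\n"))` returns an empty list, while a sequence containing `NNN` is reported as having an unknown codon.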

The user can then either set a minimum gap between newly added CpGs or set a desired average gap between CpGs. With the latter option, the software uses a binary search algorithm to find the minimum gap that yields an average gap as close as possible to the user's setting.
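The average-gap option can be pictured as a binary search over integer minimum gaps. The sketch below is illustrative only: `closest_min_gap` and the callback `average_gap_for` are hypothetical names, the search bounds are arbitrary, and the search assumes that the resulting average gap never decreases as the minimum gap grows, which may not hold exactly for real sequences.

```python
from typing import Callable


def closest_min_gap(
    target_avg: float,
    average_gap_for: Callable[[int], float],
    lo: int = 0,
    hi: int = 1000,
) -> int:
    """Find the integer minimum gap whose resulting average gap is closest to target_avg.

    average_gap_for(min_gap) stands in for a full recoding run that reports
    the average CpG gap obtained with that minimum gap.
    """
    best = lo
    best_err = abs(average_gap_for(lo) - target_avg)
    while lo <= hi:
        mid = (lo + hi) // 2
        avg = average_gap_for(mid)
        err = abs(avg - target_avg)
        if err < best_err:
            best, best_err = mid, err
        if avg < target_avg:
            lo = mid + 1      # average too small: allow larger minimum gaps
        elif avg > target_avg:
            hi = mid - 1      # average too large: try smaller minimum gaps
        else:
            return mid        # exact match
    return best


# Toy stand-in for a recoding run: the average gap grows linearly with the minimum gap.
print(closest_min_gap(26, lambda g: 10 + 1.5 * g))  # 11
```

In the toy model, a minimum gap of 11 gives an average of 26.5, which is closer to 26 than the 25 obtained with a minimum gap of 10.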

The program can protect a set number of initial and final nucleotides from changes, which may be biologically relevant. Because increasing the CpG content can decrease the frequency of A in a sequence, the user can also choose to make the remaining sequence synonymously A-rich after CpGs have been added.
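As an illustration of terminal protection combined with A-enrichment, the sketch below swaps each unprotected codon for its most A-rich synonym, but only if the set of CpG positions stays exactly the same. This is one possible reading of the behaviour described above, not the app's implementation: `enrich_a`, `synonyms_by_a_content`, and the parameter names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cpg_positions(seq: str) -> list[int]:
    """0-based index of the C of every CpG, including CpGs split across codons."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def synonyms_by_a_content(codon: str) -> list[str]:
    """All codons for the same amino acid, most A-rich first."""
    amino_acid = CODON_TABLE[codon]
    synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    return sorted(synonyms, key=lambda c: c.count("A"), reverse=True)


def enrich_a(seq: str, protect_start: int = 0, protect_end: int = 0) -> str:
    """Make unprotected codons more A-rich without adding or removing any CpG."""
    seq = seq.upper()
    original_cpgs = cpg_positions(seq)
    for start in range(0, len(seq), 3):
        # Skip codons that overlap the protected nucleotides at either end.
        if start < protect_start or start + 3 > len(seq) - protect_end:
            continue
        codon = seq[start:start + 3]
        for candidate in synonyms_by_a_content(codon):
            if candidate.count("A") <= codon.count("A"):
                break  # no remaining candidate is more A-rich
            trial = seq[:start] + candidate + seq[start + 3:]
            if cpg_positions(trial) == original_cpgs:
                seq = trial
                break
    return seq


# Leu-Arg-Gly: the Arg codon keeps its CpG (CGT -> CGA rather than AGA).
print(enrich_a("CTGCGTGGG"))                   # TTACGAGGA
print(enrich_a("CTGCGTGGG", protect_start=3))  # CTGCGAGGA
```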

Each newly recoded sequence must be given a unique ID. The sequences are displayed in an interactive alignment view that highlights CpG dinucleotides, and a table shows summary statistics. The user can adjust the settings and compare the sequences. When finished, the user can download the outputs as a FASTA file.

Algorithm outline

  1. The user configures the minimum CpG gap and the protected terminal nucleotide length, and chooses whether to make the sequence A-rich after adding CpGs.
    • If the user sets a target average gap, a binary search algorithm repeats the steps below to find the minimum CpG gap that results in an average CpG gap closest to the desired one.
  2. Codon instances are generated for every codon along the sequence. Each codon is checked for whether it already contains a CpG or forms a split CpG with the next codon.
  3. The codons that can potentially be transformed into CpG-forming alternatives are determined based on their position in the sequence. The criterion is that a codon must be at least the minimum CpG gap away from existing CpGs.
  4. The initial and final nucleotides are protected against changes, if specified by the user.
  5. Codons are mutated to synonymous CpG-forming alternatives along the sequence. The minimum CpG gap between newly added CpGs is enforced.
  6. The recoded sequence is checked for synonymity with the original, along with the preservation of the protected terminal nucleotides and adherence to the minimum gap setting.
  7. If the A-enrichment option is selected, the rest of the sequence is synonymously recoded into more A-rich codons, without affecting CpGs.
  8. The same checks as those described in step 6 are performed.
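Steps 2 to 5 can be sketched as a single left-to-right greedy pass. The code below is a simplified illustration under stated assumptions, not the published algorithm: `add_cpgs` and its helpers are hypothetical names, the standard genetic code is assumed, and the gap between two CpGs is taken to be the number of nucleotides strictly between them, which may not match the app's definition.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cpg_positions(seq: str) -> list[int]:
    """0-based index of the C of every CpG, including split (3-1) CpGs."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def translate(seq: str) -> str:
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def gap_ok(added: set[int], all_positions: set[int], min_gap: int) -> bool:
    """Every added CpG must have at least min_gap nucleotides to any other CpG."""
    return all(abs(q - p) - 2 >= min_gap for p in added for q in all_positions if q != p)


def add_cpgs(seq: str, min_gap: int, protect_start: int = 0, protect_end: int = 0) -> str:
    """Greedily swap codons for synonyms that create suitably spaced CpGs."""
    seq = seq.upper()
    for start in range(0, len(seq), 3):
        if start < protect_start or start + 3 > len(seq) - protect_end:
            continue  # codon overlaps a protected terminal region
        codon = seq[start:start + 3]
        before = set(cpg_positions(seq))
        for candidate, aa in CODON_TABLE.items():
            if aa != CODON_TABLE[codon] or candidate == codon:
                continue  # only synonymous alternatives are considered
            trial = seq[:start] + candidate + seq[start + 3:]
            after = set(cpg_positions(trial))
            added = after - before
            if not added or before - after:
                continue  # must create a CpG and must not destroy one
            if gap_ok(added, after, min_gap):
                seq = trial
                break
    return seq


original = "GCAGCAGCAGCA"  # four alanine codons, no CpG
recoded = add_cpgs(original, min_gap=3)
print(recoded, cpg_positions(recoded))             # GCCGCAGCCGCA [2, 8]
print(translate(original) == translate(recoded))   # True
```

In this example both new CpGs are split across codon boundaries (position 3-1), and the second codon is left unchanged because any CpG there would fall within three nucleotides of the first.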

Development

Use the environment.yml file to create an environment with all the dependencies:

conda env create -f environment.yml
conda activate Syn-CpG-Spacer

As per the Panel documentation, develop locally in index.py using:

panel serve index.py --autoreload

After making changes, convert index.py to the Pyodide PWA:

panel convert index.py --to pyodide-worker --out docs/app --title Syn-CpG-Spacer --pwa

You can run the Pyodide app locally at http://localhost:8000/docs/app by running:

python3 -m http.server

Tests

Syn-CpG-Spacer uses pytest to check whether code changes have introduced errors into the recoding algorithm, by comparing the new output to a set of validated sequences. The tests also run in GitHub Actions CI. Run the tests using

pytest

Within the app, each algorithm run is checked to ensure correct application of user-defined variables.
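The kind of per-run check described here might look like the sketch below, which verifies synonymity, protected terminal nucleotides, and the minimum gap around newly added CpGs. It is an assumption-laden illustration rather than the app's code: `check_recoding` is a hypothetical name, the standard genetic code is assumed, and the gap is counted as the number of nucleotides strictly between two CpGs.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cpg_positions(seq: str) -> list[int]:
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def translate(seq: str) -> str:
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def check_recoding(original: str, recoded: str, min_gap: int,
                   protect_start: int = 0, protect_end: int = 0) -> list[str]:
    """Return a list of violated constraints; an empty list means all checks pass."""
    if len(original) != len(recoded):
        return ["sequence length changed"]
    problems = []
    if translate(original) != translate(recoded):
        problems.append("protein sequence changed")
    if original[:protect_start] != recoded[:protect_start]:
        problems.append("protected initial nucleotides changed")
    if protect_end and original[-protect_end:] != recoded[-protect_end:]:
        problems.append("protected final nucleotides changed")
    existing = set(cpg_positions(original))
    all_cpgs = cpg_positions(recoded)
    for p in all_cpgs:
        if p in existing:
            continue  # only newly added CpGs must respect the minimum gap
        if any(q != p and abs(q - p) - 2 < min_gap for q in all_cpgs):
            problems.append(f"new CpG at index {p} violates the minimum gap")
    return problems


print(check_recoding("GCAGCAGCAGCA", "GCCGCAGCCGCA", min_gap=3))  # []
```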

Community contributions

Please use the issues tab for bug reports and feature requests.

Acknowledgements

The Bokeh sequence viewer is based on code by Damien Farrell (@dmnfarrell).

Owner

  • Name: Aleksander Sułkowski
  • Login: oleksulkowski
  • Kind: user

JOSS Publication

Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral genomes with CpG dinucleotides
Published
April 03, 2024
Volume 9, Issue 96, Page 6332
Authors
Aleksander Sulkowski
Department of Infectious Diseases, King's College London, London, United Kingdom
Clément Bouton
Department of Infectious Diseases, King's College London, London, United Kingdom
Chad Swanson
Department of Infectious Diseases, King's College London, London, United Kingdom
Editor
Frederick Boehm
Tags
virology molecular biology genetics dinucleotide CpG genome recoding

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Sulkowski
  given-names: Aleksander
  orcid: "https://orcid.org/0000-0002-0624-428X"
- family-names: Bouton
  given-names: Clément
  orcid: "https://orcid.org/0000-0001-9607-6533"
- family-names: Swanson
  given-names: Chad
  orcid: "https://orcid.org/0000-0002-6650-3634"
contact:
- family-names: Sulkowski
  given-names: Aleksander
  orcid: "https://orcid.org/0000-0002-0624-428X"
doi: 10.5281/zenodo.10781374
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Sulkowski
    given-names: Aleksander
    orcid: "https://orcid.org/0000-0002-0624-428X"
  - family-names: Bouton
    given-names: Clément
    orcid: "https://orcid.org/0000-0001-9607-6533"
  - family-names: Swanson
    given-names: Chad
    orcid: "https://orcid.org/0000-0002-6650-3634"
  date-published: 2024-04-03
  doi: 10.21105/joss.06332
  issn: 2475-9066
  issue: 96
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6332
  title: "Syn-CpG-Spacer: A Panel web app for synonymous recoding of
    viral genomes with CpG dinucleotides"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06332"
  volume: 9
title: "Syn-CpG-Spacer: A Panel web app for synonymous recoding of viral
  genomes with CpG dinucleotides"

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 72
  • Total Committers: 3
  • Avg Commits per committer: 24.0
  • Development Distribution Score (DDS): 0.333
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
mroleksul s****0@g****m 48
Aleksander Sułkowski 6****i 20
Aleksander Sułkowski 6****l 4

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 5
  • Total pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • babinyurii (5)

Dependencies

.github/workflows/CI.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/draft-pdf.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite