Science Score: 62.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
    Organization saliba-lab has institutional domain (www.helmholtz-hiri.de)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: saliba-lab
  • Language: R
  • Default Branch: main
  • Size: 1.12 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 7 months ago
Metadata Files
Readme Citation

README.md

Append Editing (ADPr-TAE) sequencing data analysis

This repository contains scripts for analysis and visualization of specific edits observed using Nanopore and Illumina (NGS) sequencing technologies. They were used in the following paper:

Targeted DNA ADP-ribosylation drives distinct editing outcomes in bacteria and eukaryotes (2024)

Constantinos Patinios, Darshana Gupta, Harris V. Bassett, Scott P. Collins, Charlotte Kamm, Anuja Kibe, Christophe Toussaint, Katie Vollen, Chengsong Zhao, Yanyan Wang, Thuan Nguyen, Alessandro Del Re, Irene Calvin, Tatjana Achmedov, Kathryn Polkoff, Angela Migur, Emmanuel Saliba, Nathan Crook, Anna Stepanova, Jose M. Alonso, Chase L. Beisel

Data Accessibility

NGS and Nanopore raw sequencing data used in the paper are available at SRA.

Examples of processed data for running each script in this repository are available in the data folder.

Repository Structure

data: Directory containing example data.

analysis: Directory containing the R scripts.

outputs: This directory is created when running the scripts. It will contain the processed data and different tables/plots.

Code Execution

1- Download repository

Option 1: Manually download the repository as a ZIP archive and extract it on your computer

Option 2: Clone the repository

    git clone https://github.com/saliba-lab/MBE_analysis.git
    cd MBE_analysis/analysis

2- Install R dependencies

See Dependencies section.

3- Description of R scripts in the analysis directory

Make sure to set the analysis directory as the working directory when running the scripts.
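As a minimal sketch of that check, the snippet below tests whether the current working directory is named analysis before any script is run. The helper name `in_analysis_dir` is hypothetical and not part of the repository.

```r
# Hypothetical helper: TRUE when the given path ends in a folder named "analysis".
in_analysis_dir <- function(path = getwd()) {
  basename(path) == "analysis"
}

# Print a reminder instead of failing, so the check is safe to run anywhere.
if (!in_analysis_dir()) {
  message("Working directory is not 'analysis'; use setwd() to change it.")
}
```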

Scripts 1 to 3 process data obtained by Illumina sequencing. They take as input the AllelefrequencytablearoundsgRNA_ files (in .txt format) generated by CRISPResso2. They also require sample-specific metadata, which are provided in sample sheets (also located in the analysis directory). The metadata specify the minimum read count required to include a sample in the analysis (Thresholdreadcounts), the position of interest in the sequencing read at which to look for mutations (MutationPosition), or replicate information (Replicategroup).
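The read-count threshold can be illustrated with a small base R sketch. It is not the repository's code. It assumes the allele table is tab-separated with `#Reads` and `%Reads` columns, that the threshold is compared against the total read count of a sample, and that the sample sheet has a `Sample` column. The `Sample` column, the toy data, and the `TotalReads` and `Keep` columns are all hypothetical.

```r
# Toy stand-in for one allele frequency table (tab-separated text).
allele_txt <- paste(
  "Aligned_Sequence\tReference_Sequence\t#Reads\t%Reads",
  "ACGTAC\tACGTAC\t900\t90",
  "ACGAAC\tACGTAC\t100\t10",
  sep = "\n"
)

# check.names = FALSE keeps the "#Reads" and "%Reads" headers unchanged.
alleles <- read.delim(text = allele_txt, check.names = FALSE,
                      stringsAsFactors = FALSE)

# Hypothetical sample sheet; only the metadata names come from the README.
sample_sheet <- data.frame(
  Sample              = c("sampleA", "sampleB"),
  Thresholdreadcounts = c(500, 5000),
  MutationPosition    = c(4, 4),
  Replicategroup      = c("rep1", "rep1"),
  stringsAsFactors    = FALSE
)

# For illustration, both samples reuse the same toy table (1000 reads in total).
total_reads <- c(sampleA = sum(alleles[["#Reads"]]),
                 sampleB = sum(alleles[["#Reads"]]))
sample_sheet$TotalReads <- unname(total_reads[sample_sheet$Sample])

# Keep only samples whose total read count reaches their threshold.
sample_sheet$Keep <- sample_sheet$TotalReads >= sample_sheet$Thresholdreadcounts
print(sample_sheet[, c("Sample", "TotalReads", "Thresholdreadcounts", "Keep")])
```

In this toy case sampleA (threshold 500) is kept and sampleB (threshold 5000) is excluded.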

  1. Script1: This script sums up %Reads containing nucleotides different from the reference, at a specified position.

  2. Script2: This script sums up %Reads containing a specific nucleotide for all positions along the length of the read.

  3. Script3: This script sums up %Reads containing nucleotides different from the reference, at two positions specified in the sample sheet, and produces graphical representations for visualization.
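The core calculation described for Script1 and Script3 can be sketched in base R as follows. This illustrates the idea and is not the scripts' actual implementation. The function name `percent_non_reference` and the toy table are hypothetical, and the sketch assumes that aligned and reference sequences have equal length, so a position indexes an alignment column.

```r
# Hypothetical helper: percent of reads whose base at `pos` differs from the reference.
percent_non_reference <- function(alleles, pos) {
  aligned_base   <- substr(alleles$Aligned_Sequence, pos, pos)
  reference_base <- substr(alleles$Reference_Sequence, pos, pos)
  # Sum %Reads over alleles that mismatch the reference at this position.
  sum(alleles[["%Reads"]][aligned_base != reference_base])
}

# Toy allele table; check.names = FALSE keeps the "%Reads" column name.
alleles <- data.frame(
  Aligned_Sequence   = c("ACGTAC", "ACGAAC", "ACGCAC", "ACGTAT"),
  Reference_Sequence = rep("ACGTAC", 4),
  `%Reads`           = c(85, 8, 5, 2),
  check.names        = FALSE,
  stringsAsFactors   = FALSE
)

# One position (as described for Script1).
single_position <- percent_non_reference(alleles, pos = 4)

# Two positions (as described for Script3).
two_positions <- sapply(c(4, 6), function(p) percent_non_reference(alleles, p))
```

With the toy data, position 4 gives 13 (8 + 5) and position 6 gives 2.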

Scripts 4 to 6 are related to data obtained by Nanopore sequencing.

  1. Script4: Related to figure 1h and S3. This script reads BAM files, then takes two actions. First, it calculates and plots the fraction of unedited, edited, and ambiguous reads in a region. Second, it calculates and plots SNVs at a specific position in an otherwise unedited region, as a percent of all reads.

  2. Script5: Related to figure 1i. This script reads a CSV file, then calculates and plots frequency of SNVs that exceed filtering criteria.

  3. Script6: Related to figure S5. This script reads a CSV file, then takes two actions. First, it plots individual growth curves. Second, it plots final values of absorbance at 600 nm.
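As an illustration of the second action of Script6, the sketch below extracts the final absorbance value of each growth curve in base R. The column names `Well`, `Time_h`, and `OD600` and the toy measurements are assumptions. The layout of the actual CSV file is not described here.

```r
# Toy growth-curve table in long format (hypothetical layout).
growth <- data.frame(
  Well   = rep(c("A1", "A2"), each = 3),
  Time_h = rep(c(0, 6, 12), times = 2),
  OD600  = c(0.05, 0.40, 1.10, 0.05, 0.20, 0.60),
  stringsAsFactors = FALSE
)

# For each well, keep the measurement taken at the latest time point.
final_od <- do.call(rbind, lapply(split(growth, growth$Well), function(d) {
  d[which.max(d$Time_h), c("Well", "OD600")]
}))
rownames(final_od) <- NULL
print(final_od)
```

The resulting table has one row per well and could be passed to a plotting function such as ggplot2's `geom_col()` or `geom_point()`.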

Dependencies

R version and R packages needed to run the scripts:

  • R 4.0.3
  • dplyr 1.0.10
  • openxlsx 4.2.3
  • ggplot2 3.4.0
  • cowplot 1.1.1
  • tidyverse 1.3.0
  • GenomicAlignments 1.40.0
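To compare a local installation against the versions listed above, a sketch such as the following can help. The function name `compare_versions` is hypothetical. Packages that are not installed are reported as `NA` rather than raising an error. GenomicAlignments is distributed through Bioconductor rather than CRAN.

```r
# Versions as listed in the Dependencies section.
listed <- c(dplyr = "1.0.10", openxlsx = "4.2.3", ggplot2 = "3.4.0",
            cowplot = "1.1.1", tidyverse = "1.3.0",
            GenomicAlignments = "1.40.0")

# Hypothetical helper: report installed versions next to the listed ones.
compare_versions <- function(listed) {
  installed <- vapply(names(listed), function(pkg) {
    # system.file() returns "" when the package is not installed.
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  installed <- unname(installed)
  wanted    <- unname(listed)
  data.frame(
    package   = names(listed),
    listed    = wanted,
    installed = installed,
    matches   = !is.na(installed) & installed == wanted,
    stringsAsFactors = FALSE
  )
}

report <- compare_versions(listed)
print(report)
print(R.version.string)  # compare with the listed R version
```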

Owner

  • Name: Saliba Lab
  • Login: saliba-lab
  • Kind: organization
  • Email: infection-atlas@helmholtz-hiri.de
  • Location: Würzburg

Single Cell Analysis (HIRI)

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If this repository is of interest to you, please cite the associated paper. Cite it as below if you need only the GitHub repository."
authors:
- family-names: "Toussaint"
  given-names: "Christophe"
  orcid: "https://orcid.org/0000-0003-3419-490X"
- family-names: "Gupta"
  given-names: "Darshana"
  orcid: "https://orcid.org/0009-0005-5930-6279"
- family-names: "Bassett"
  given-names: "Harris"
title: "Append Editing (ADPr-TAE) sequencing data analysis"
date-released: 2025-06-11
url: "https://github.com/saliba-lab/ADPr_TAE_analysis"

GitHub Events

Total
  • Push event: 1
  • Public event: 1
Last Year
  • Push event: 1
  • Public event: 1