https://github.com/casper-schutte/fantastic-lamp

Repository for the FantasticLamp pipeline

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (Found codemeta.json file)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (Low similarity (14.2%) to scientific vocabulary)

Keywords

bioinformatics bioinformatics-pipeline pangenome-graph

Last synced: 9 months ago

Repository

Repository for the FantasticLamp pipeline

Basic Info

Host: GitHub
Owner: casper-schutte
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 9.48 MB

Statistics

Stars: 6
Watchers: 3
Forks: 1
Open Issues: 0
Releases: 1

Topics

bioinformatics bioinformatics-pipeline pangenome-graph

Created about 4 years ago · Last pushed about 2 years ago

Metadata Files

Readme License

Fantastic-lamp:

Project Description:

This project aims to evaluate the success of genome editing by measuring the coverage of reads mapping to edited regions compared to the corresponding reference sequence. This is accomplished by aligning reads from the edited genome to a genome graph built from the reference and intended edits. The pipeline can simultaneously calculate the coverage of multiple populations, making it an efficient tool for quantifying the success of novel editing methods or verifying multiple edits. The efficacy of the edits can be inferred from the output, which is a TSV file containing the list of intended edits, their homology coverage, and their reference coverage.

Installation and Dependencies:

This pipeline is initiated with the "find_coverage.sh" script from the command line and requires no explicit installation. All the dependencies can be installed with Conda from the "environment.yaml" file. This pipeline was developed and tested on Ubuntu 20.04 with Python 3.10, although earlier versions of Python may also be compatible. For more information about the system configuration that has been confirmed to run this pipeline correctly, please refer to the "Test.yml" file in the .github/workflows directory.

Verification and testing:

The following files from the /Test folder are strictly necessary:
- DesignLibraryDetailsODD126.withEditWindow.csv
- Datanames.txt
- environment.yaml
- refandmt.fna
- simple_test.fastq.gz

Copy the following scripts from the repository's main folder into the Test folder:
- find_coverage.sh
- compare_coverage_read_info.py

ODD126augmentedCB39.fasta is not strictly necessary, but there will be an error message if the pipeline does not find it. However, the pipeline will still run correctly, as this test does not include reads from a vector plasmid sequence.
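Before running the test, it can help to confirm that the Test folder holds everything listed above. The following is a minimal sketch, not part of the pipeline: the function name `check_test_folder` is hypothetical, and the file names are copied from this README as written, so adjust them if the files in the repository are named differently.

```python
from pathlib import Path

# File names as listed in this README (assumed to match the repository exactly).
REQUIRED = [
    "DesignLibraryDetailsODD126.withEditWindow.csv",
    "Datanames.txt",
    "environment.yaml",
    "refandmt.fna",
    "simple_test.fastq.gz",
    "find_coverage.sh",
    "compare_coverage_read_info.py",
]
# Missing optional files only cause an error message; the test still runs.
OPTIONAL = ["ODD126augmentedCB39.fasta"]


def check_test_folder(folder):
    """Return (missing required files, missing optional files) for a folder."""
    folder = Path(folder)
    missing = [name for name in REQUIRED if not (folder / name).is_file()]
    missing_optional = [name for name in OPTIONAL if not (folder / name).is_file()]
    return missing, missing_optional


if __name__ == "__main__":
    missing, missing_optional = check_test_folder("Test")
    for name in missing:
        print(f"missing required file: {name}")
    for name in missing_optional:
        print(f"missing optional file (an error message is expected): {name}")
```

An empty first list means the strictly necessary files and both scripts are in place.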

The pipeline needs to be run with Conda.

Install deps:

conda env update --file environment.yaml

The Python script "compare_coverage_read_info.py" and the bash script "find_coverage.sh" need to be copied to the Test folder. In the main folder, run the following command:

cp compare_coverage_read_info.py find_coverage.sh Test/

Run pipeline (use this exact command):
```
conda run -n fantastic-lamp bash find_coverage.sh
```

Descriptions of steps and files used by the pipeline:

1) Homology arms (homarms) and the reference sequence for each homology arm (refhomarms) are extracted from DesignLibraryDetailsODD126.withEditWindow.csv and combined into a single file: ODD126refandhomarms.fa

2) minimap2 is used to map the homarms and refhomarms to the reference sequence (refandmt.fna), which includes both the reference sequence and its mitochondrial (mtDNA) sequence. The alignment is saved as ODD126refandhomarms.paf

3) refandmt.fna and ODD126refandhomarms.fa are combined with ODD126augmentedCB39.fasta (the plasmid sequence) to make yeast+edits.fa

4) Using yeast+edits.fa and the alignment from step 2, seqwish is used to create the variation graph (yeast+edits.gfa).

5) The graph is sorted and chopped using odgi and then converted into xg format (yeast+edits.og.gfa.xg), before finally being indexed (yeast+edits.og.gfa.gcsa).

6) The file Datanames.txt contains the names of the files that contain the sequencing reads, which are in .fastq.gz format. The script can handle paired-end reads; this can be changed in the bash script (the files need to be named appropriately). Datanames.txt is iterated over, and for each line (file) steps 7 and 8 are executed:

7) The reads from both files are mapped onto the graph (yeast+edits.og.gfa.xg), creating "filename".gaf

8) The Python script "comparecoverage.py" is called with the .gaf file and the yeast+edits.og file as input. From the yeast+edits.og file, a dictionary is created mapping node IDs to path names. From this dictionary, the homology arm and reference homology arm paths are reconstructed, and edges are created from consecutive nodes. Edges that are shared between the refhomarms and homarms are discarded. Then, the number of reads mapping to each edge within the homarms and refhomarms is counted (from the .gaf file) and stored in a dictionary mapping edges to read counts. Coverage for a path is calculated as the sum of the read counts of the edges in the path, divided by the number of edges in the path. These coverages are written to a .tsv file.
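The coverage calculation in step 8 can be illustrated with a short sketch. This is not the pipeline's script: the function names are hypothetical, paths are assumed to be written as oriented node walks such as `>1>2>3` (the form used in the sixth column of a GAF record), each read is assumed to count at most once per edge, and the number of edges is assumed to be taken after shared edges are discarded.

```python
import re
from collections import Counter

# One oriented step in a walk, e.g. ">12" or "<7".
STEP = re.compile(r"([<>])([^<>\s]+)")


def flip(orientation):
    return "<" if orientation == ">" else ">"


def canonical_edge(a, b):
    """Give a -> b and its reverse-complement traversal the same key."""
    reverse = ((flip(b[0]), b[1]), (flip(a[0]), a[1]))
    return min((a, b), reverse)


def path_edges(path):
    """Split a walk like '>1>2>3' into edges between consecutive nodes."""
    steps = STEP.findall(path)
    return [canonical_edge(a, b) for a, b in zip(steps, steps[1:])]


def read_gaf_paths(lines):
    """Yield the path column (6th field) of each GAF record that has one."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 5 and fields[5] != "*":
            yield fields[5]


def path_coverage(hom_path, ref_path, read_paths):
    """Return (homology arm coverage, reference coverage)."""
    hom = set(path_edges(hom_path))
    ref = set(path_edges(ref_path))
    shared = hom & ref  # edges in both paths cannot tell the two apart
    hom -= shared
    ref -= shared
    # Count how many reads traverse each edge (once per read).
    counts = Counter(e for p in read_paths for e in set(path_edges(p)))

    def coverage(edges):
        return sum(counts[e] for e in edges) / len(edges) if edges else 0.0

    return coverage(hom), coverage(ref)


if __name__ == "__main__":
    # Toy data: two reads support the edit, one supports the reference.
    reads = [">2>3>5", "<5<3<2", ">1>2>4", ">1>2"]
    print(path_coverage(">1>2>3>5", ">1>2>4>5", reads))
```

In the toy example the edge between nodes 1 and 2 is shared by both paths and is therefore ignored, so only reads crossing the edited or the reference-specific edges contribute to either coverage value.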

Compiling the paper:

Download the /paper/ folder.
In the folder, run:
```
make clean
make
```
This will compile main.pdf.

Owner

Login: casper-schutte
Kind: user

Repositories: 1
Profile: https://github.com/casper-schutte

Dependencies

.github/workflows/Test.yml actions

actions/checkout v3 composite
actions/setup-python v3 composite

.github/workflows/draft-pdf.yml actions

actions/checkout v3 composite
actions/upload-artifact v1 composite
openjournals/openjournals-draft-action master composite

Dockerfile docker

continuumio/miniconda3 latest build

Test/environment.yaml pypi

environment.yaml pypi
