https://github.com/akikuno/dajin
One-step genotyping software using a long-read sequencer
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 4 DOI reference(s) in README
- ✓ Academic publication links: links to plos.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: akikuno
- License: other
- Language: Python
- Default Branch: master
- Homepage: https://doi.org/10.1371/journal.pbio.3001507
- Size: 36.3 MB
Statistics
- Stars: 5
- Watchers: 2
- Forks: 2
- Open Issues: 4
- Releases: 0
Metadata Files
README.md
[!CAUTION]
DAJIN is deprecated. Please use DAJIN2, the successor to DAJIN.
DAJIN is genotyping software for genome-edited organisms that uses a Nanopore sequencer.
DAJIN is named after 一網打尽 (Ichimou DAJIN or Yī Wǎng Dǎ jìn), meaning catching all in one net.
Features
- Capturing mutations from SNV to SV, such as a point mutation, knock-out, knock-in, and inversion.
- Automatic allele clustering and annotation.
- Genotyping ~100 samples with different genome-editing designs in a single run.
Setup
We recommend a Linux OS with an NVIDIA GPU to reduce computation time.
If you use a Windows PC with an NVIDIA GPU, please follow these instructions.
We have confirmed that DAJIN works in these environments.
1. Install conda
```bash
# Install miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local/
```
2. Clone this repository
```sh
git clone https://github.com/akikuno/DAJIN.git
```
3. Prepare design files and fastq directory
We recommend the following directory tree.
You can rename design.txt, design.fasta and fastq.
```
├── DAJIN
├── design.txt
├── design.fasta
├── fastq
│   ├── barcode01.fastq
│   ├── barcode02.fastq
│   ├── barcode03.fastq
│   ├── ......
```
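Before a run, it can help to confirm which barcodes are present in the FASTQ directory. The following is a minimal sketch, not part of DAJIN: `list_barcodes` is a hypothetical helper, and it assumes that a barcode ID (such as the `control` value in design.txt) is simply the FASTQ file name without its `.fastq` suffix.

```python
import tempfile
from pathlib import Path


def list_barcodes(fastq_dir):
    """Return sorted barcode IDs found in fastq_dir.

    Assumption: a barcode ID is the file name without its .fastq suffix.
    """
    return sorted(path.stem for path in Path(fastq_dir).glob("*.fastq"))


# Demo on a throwaway directory that mimics the recommended layout.
with tempfile.TemporaryDirectory() as tmp:
    fastq_dir = Path(tmp) / "fastq"
    fastq_dir.mkdir()
    for name in ("barcode02.fastq", "barcode01.fastq", "notes.txt"):
        (fastq_dir / name).touch()
    barcodes = list_barcodes(fastq_dir)
    print(barcodes)
```

Files that do not end in `.fastq` are ignored, so stray notes or logs in the directory do not show up as barcodes.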
Prepare inputs
design.txt
design.txt is formatted as below:
design=DAJIN/example/example.fa
input_dir=DAJIN/example/fastq
control=barcode01
grna=CCTGTCCAGAGTGGGAGATAGCC,CCACTGCTAGCTGTGGGTAACCC
genome=mm10
output_dir=DAJIN_example
threads=10
- design: PATH to a FASTA file including sequences of each genotype. ">wt" and ">target" must be included.
- input_dir: PATH to a directory containing demultiplexed FASTQ files.
- control: control barcode ID.
- grna: gRNA sequence(s) with PAM. Multiple gRNA sequences must be delimited by commas.
- genome (optional): reference genome, e.g. hg38, mm10 (not GRCh38, GRCm38).
- output_dir (optional): output directory name. Default is DAJIN_results.
- threads (optional: integer): Default is 2/3 of available CPU threads.
- filter (optional: on or off): set filter to remove very minor alleles (less than 3%). Default is on.

design, input_dir, control, genome, and grna are required, but they can appear in any order.
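The key=value layout above can be read with a few lines of Python. This is a minimal sketch under stated assumptions, not DAJIN's own parser: `parse_design` and `REQUIRED_KEYS` are hypothetical names, the required keys follow the closing sentence of the list (the genome entry is also marked optional, so adjust the tuple if needed), and rounding 2/3 of the CPU threads down is a guess.

```python
import os

# Assumption: taken from the sentence listing the required keys.
REQUIRED_KEYS = ("design", "input_dir", "control", "genome", "grna")


def parse_design(text):
    """Parse key=value lines into a dict; key order does not matter."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        settings[key.strip()] = value.strip()

    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))

    # Documented defaults for the optional keys.
    settings.setdefault("output_dir", "DAJIN_results")
    settings.setdefault("filter", "on")
    # Assumption: 2/3 of the available threads, rounded down, at least 1.
    default_threads = max(1, (os.cpu_count() or 1) * 2 // 3)
    settings["threads"] = int(settings.get("threads", default_threads))
    # Multiple gRNA sequences are comma-delimited.
    settings["grna"] = settings["grna"].split(",")
    return settings


example = """design=DAJIN/example/example.fa
input_dir=DAJIN/example/fastq
control=barcode01
grna=CCTGTCCAGAGTGGGAGATAGCC,CCACTGCTAGCTGTGGGTAACCC
genome=mm10
output_dir=DAJIN_example
threads=10
"""
settings = parse_design(example)
print(settings["threads"], settings["filter"], len(settings["grna"]))
```

Validating the file this way before launching a run surfaces a missing key immediately rather than partway through the analysis.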
design.fasta
design.fasta is a multi-FASTA file, which contains a WT and target sequence, as well as byproducts.
This is an example file of flox design.
In flox design, 6 allele types (WT, Target, Left LoxP, Right LoxP, flox deletion, and Inversion) may be produced.
DAJIN labels any allele that differs from these alleles as an SV (structural variant) allele.
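Because the design FASTA must contain ">wt" and ">target" records, a quick header check can catch a malformed file early. The sketch below is illustrative only: `fasta_names` and `missing_required` are hypothetical helpers, the record name is assumed to be the first word after ">", and the sequences in the demo are placeholders rather than a real design.

```python
def fasta_names(fasta_text):
    """Return record names (first word after '>') in file order."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names


def missing_required(fasta_text, required=("wt", "target")):
    """Return the required record names that are absent from the FASTA."""
    names = set(fasta_names(fasta_text))
    return [name for name in required if name not in names]


# Placeholder sequences for illustration only.
demo_fasta = """>wt
ACGTACGTACGT
>target
ACGTTTTTACGT
>inversion
ACGTGCATACGT
"""
print(fasta_names(demo_fasta))
print(missing_required(demo_fasta))
```

An empty list from `missing_required` means both mandatory records are present; byproduct records such as the inversion allele are optional.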
Usage
```sh
./DAJIN/DAJIN -i design.txt
```
Example usage
You can test DAJIN with a small example dataset.
```sh
./DAJIN/DAJIN -i DAJIN/example/design.txt
```
Outputs
DAJIN outputs two files and two folders: Details.csv, Details.pdf, Consensus, BAM.
Details.csv
Details.csv contains allele information.
The allele with target mutation is labeled + in the Design column.
| Sample    | Allele ID | % of reads | Allele type   | Indel | Large indel | Design |
| --------- | --------- | ---------- | ------------- | ----- | ----------- | ------ |
| barcode01 | 1         | 100        | wt            | -     | -           | -      |
| barcode02 | 1         | 11.8       | SV            | +     | +           | -      |
| barcode02 | 2         | 88.2       | target        | -     | -           | +      |
| barcode03 | 1         | 9.9        | SV            | +     | +           | -      |
| barcode03 | 2         | 38.5       | SV            | +     | +           | -      |
| barcode03 | 3         | 51.6       | flox_deletion | -     | -           | -      |
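To pull out only the alleles that carry the intended mutation, the Design column can be filtered for "+". The following is a minimal sketch: `designed_alleles` is a hypothetical helper, and it assumes that Details.csv is comma-separated with exactly the column headers shown in the table above. The inline data mirrors the example rows.

```python
import csv
import io

# Inline copy of the example rows; a real run would open Details.csv instead.
DETAILS_CSV = """Sample,Allele ID,% of reads,Allele type,Indel,Large indel,Design
barcode01,1,100,wt,-,-,-
barcode02,1,11.8,SV,+,+,-
barcode02,2,88.2,target,-,-,+
barcode03,1,9.9,SV,+,+,-
barcode03,2,38.5,SV,+,+,-
barcode03,3,51.6,flox_deletion,-,-,-
"""


def designed_alleles(handle):
    """Return (sample, allele ID, % of reads) for rows labeled + in Design."""
    return [
        (row["Sample"], row["Allele ID"], float(row["% of reads"]))
        for row in csv.DictReader(handle)
        if row["Design"] == "+"
    ]


hits = designed_alleles(io.StringIO(DETAILS_CSV))
print(hits)
```

With the example rows, only allele 2 of barcode02 is returned, matching the single + in the Design column.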
Details.pdf
The output directory contains a figure of whole-allelic profile.
This is an example result of three samples.
The barcode01 is a wild-type mouse as a control, whereas the barcode02 and barcode03 are genome-edited founder mice with a flox knock-in design.
This result shows that most Nanopore reads from barcode02 are labeled "intact target" (flox), which indicates that barcode02 is a candidate for the desired homozygous mouse.
Consensus
The Consensus folder includes FASTA and HTML files that display the consensus sequence of each allele.
Here is an example of DAJIN's consensus sequence using the point mutation.
BAM
The BAM folder includes BAM files for all reads and for each allele.
The BAM files can be visualized with IGV.
License
This project is under the MIT License; see the LICENSE file for details.
Citation
@article{Kuno_2022,
title={DAJIN enables multiplex genotyping to simultaneously validate intended and unintended target genome editing outcomes},
volume={20},
url={https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001507},
DOI={10.1371/journal.pbio.3001507},
number={1},
journal={PLOS Biology},
author={Kuno, Akihiro and Ikeda, Yoshihisa and Ayabe, Shinya and Kato, Kanako and Sakamoto, Kotaro and Suzuki, Sayaka R. and Morimoto, Kento and Wakimoto, Arata and Mikami, Natsuki and Ishida, Miyuki and et al.},
year={2022},
month={Jan},
pages={e3001507}
}
Owner
- Name: Akihiro Kuno
- Login: akikuno
- Kind: user
- Location: Tsukuba, Ibaraki, Japan
- Company: University of Tsukuba
- Website: https://researchmap.jp/7000027584/?lang=en
- Twitter: akikuno_sh
- Repositories: 12
- Profile: https://github.com/akikuno
Bioinformatician working at the Laboratory Animal Resource Center
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 21
- Total pull requests: 27
- Average time to close issues: 6 months
- Average time to close pull requests: about 6 hours
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 0.57
- Average comments per pull request: 0.11
- Merged pull requests: 26
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- akikuno (19)
- amv33576 (1)
- chengarthur (1)
Pull Request Authors
- akikuno (26)
- dependabot[bot] (1)