https://github.com/ax-ekk/proseq_alignment.sh
Pipeline shell script for aligning paired-end PRO-seq data with spike-in and UMIs
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: zenodo.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.3%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: ax-ekk
- License: MIT
- Default Branch: master
- Size: 14.6 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of JAJ256/PROseq_alignment.sh
Created over 3 years ago
Last pushed almost 6 years ago
https://github.com/ax-ekk/PROseq_alignment.sh/blob/master/
# PROseq_alignment.sh

[DOI](https://zenodo.org/badge/latestdoi/254700530)

This is a pipeline script for aligning paired-end PRO-seq data in which cells of a different species have been spiked in for normalization, and which carries random UMI sequences on the ligation end of the 5' adapter, the 3' adapter, or both.

Run this script in a directory that has one folder named "fastq" which contains the data. The two fastq files of each pair must have identical names apart from the endings _R1.fastq and _R2.fastq.

# Parameters

These parameters are at the top of the file and can be changed depending on your data.

- THREADS: (Integer) Number of threads to spawn for each step in the process.
- UMI_LEN: (Integer) Length of the UMI in base pairs. If both 5' and 3' UMIs are used, this is the length for both. Fastp cannot handle multiple UMIs of different lengths.
- FIVEP_UMI: (String) Set to "Y" if the UMI is on the 5' adapter.
- THREEP_UMI: (String) Set to "Y" if the UMI is on the 3' adapter. Note: both FIVEP_UMI and THREEP_UMI can be "Y" if there are UMIs on both sides of the insert.
- ADAPTOR_1 and ADAPTOR_2: (String) Adapter sequences to trim. The defaults are the TruSeq Small RNA sequences. These sequences are only a backup in case fastp cannot automatically determine the adapter sequences by overlap analysis.
- GENOME_EXP: (String) Path to the bowtie2 index for your experimental genome.
- GENOME_SPIKE: (String) Path to the bowtie2 index for your spike-in genome. To prepare a spike-in genome, combine your experimental genome with a repeat-masked version of your spike-in organism's genome. You must first modify the chromosome labels of the spike-in genome so that alignments can be sorted later.
- SPIKE_PREFIX: (String) The prefix you have used on your spike-in chromosomes, e.g. >spikechr1.
- RDNA: (String) Path to the bowtie2 index for the rDNA repeat for your organism(s).
- MAPQ: (Integer) MAPQ score cutoff for filtering multimappers.
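
The two preparation steps described above (relabeling the spike-in chromosomes and naming the fastq pairs consistently) could be sketched roughly as follows. This is not code from the pipeline script itself: the function names `prefix_fasta_headers` and `check_fastq_pairs` and the FASTA file names in the comments are hypothetical, and the prefix is assumed to be plain alphanumeric text such as `spike`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for preparing inputs; not part of PROseq_alignment.sh.

# Prepend a prefix to every FASTA header read from stdin, so that
# ">chr1" becomes ">spikechr1". Sequence lines pass through unchanged.
# Assumes the prefix contains no characters special to sed (such as / or &).
prefix_fasta_headers() {
    local prefix="$1"
    sed "s/^>/>${prefix}/"
}

# Check that every *_R1.fastq file in a directory (default: ./fastq)
# has a mate with the same name ending in _R2.fastq.
# Returns non-zero if a mate is missing or no R1 files are found.
check_fastq_pairs() {
    local dir="${1:-fastq}" r1 status=0
    for r1 in "$dir"/*_R1.fastq; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$r1" ] || { echo "no *_R1.fastq files in $dir" >&2; return 1; }
        if [ ! -e "${r1%_R1.fastq}_R2.fastq" ]; then
            echo "missing mate for $r1" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage (file names are placeholders):
#   prefix_fasta_headers spike < spike_masked.fa > spike_prefixed.fa
#   cat experimental.fa spike_prefixed.fa > combined.fa
#   bowtie2-build combined.fa combined
#   check_fastq_pairs fastq
```

With a combined index built this way, SPIKE_PREFIX would be set to the same prefix passed to `prefix_fasta_headers`, and GENOME_SPIKE to the index base name.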
Owner
- Login: ax-ekk
- Kind: user
- Repositories: 1
- Profile: https://github.com/ax-ekk