https://github.com/broadinstitute/broad-epi-repeats-analysis
Workflow to quantify transposable element expression.
Transposable Elements Pipeline Analysis
Introduction
This repository contains a collection of pipelines and scripts to analyze RNA-seq and ChIP-seq data, taking multi-mapping reads into account for accurate quantification of transposable elements (TEs).
RNA
The TEtranscripts toolkit is used for multi-mapper-aware family-level quantification and for differential analysis.
TElocal is used for multi-mapper-aware locus-level quantification.
Histone modifications
SmartMap is used for multi-mapper-aware quantification.
Getting started
The first step is to download this repository to your computer using the following commands:
```bash
$ git clone git@github.com:broadinstitute/broad-epi-repeats-analysis.git
$ cd broad-epi-repeats-analysis
```
Genome, genes, and repeats annotations
The script src/bash/create-mus-musculus-annotations-mm10.bash downloads and creates all the necessary annotations.
The annotations are placed in a new folder, mm10, with three sub-folders called genome, genes, and repeats.
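As a quick sanity check after running the annotation script, the folder layout described above can be verified with a small helper. This is only a sketch: the function name `check_annotations` is hypothetical, and it assumes the mm10 folder sits in the current directory unless another path is given.

```bash
#!/usr/bin/env bash
# Verify that the annotation folder contains the three expected sub-folders.
# check_annotations is a hypothetical helper, not part of the repository.
check_annotations() {
  local root="${1:-mm10}"   # assumed default location of the annotations
  local missing=0
  local sub
  for sub in genome genes repeats; do
    if [ ! -d "${root}/${sub}" ]; then
      echo "missing: ${root}/${sub}" >&2
      missing=1
    fi
  done
  # Returns 0 when all three sub-folders exist, 1 otherwise.
  return "${missing}"
}
```

Calling `check_annotations mm10` returns a non-zero status and lists the missing sub-folders on stderr if the layout is incomplete.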
Genome index
Using the genome FASTA file inside mm10/genome, you can build the index for STAR or bowtie2 using the star-build-index WDL or the bowtie2-build-index WDL, respectively. Both are located in the workflows folder.
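For orientation, these are the standard command-line calls that index-building workflows of this kind typically wrap. The sketch below only prints the commands (a dry run). The FASTA path, output folders, and thread count are placeholders, and the exact options used inside the repository's WDL files may differ.

```bash
#!/usr/bin/env bash
# Print (do not run) the usual STAR and bowtie2 index-building commands.
# print_index_commands is a hypothetical helper; all paths are placeholders.
print_index_commands() {
  local fasta="$1"              # genome FASTA, e.g. a file under mm10/genome
  local outdir="${2:-indexes}"  # assumed output folder
  local threads="${3:-4}"       # assumed thread count
  echo "STAR --runMode genomeGenerate --runThreadN ${threads} --genomeDir ${outdir}/star --genomeFastaFiles ${fasta}"
  echo "bowtie2-build --threads ${threads} ${fasta} ${outdir}/bowtie2/genome"
}
```

Printing the commands first makes it easy to review paths and thread counts before committing to a long index build.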
Quantifications
The workflow/align-quantify-repeats.wdl workflow aligns the FASTQs to the genome using STAR and computes gene and TE quantifications using TElocal and TEcount. Four count files are reported:
- family-level-unique-counts
- family-level-multimappers-counts
- loci-level-unique-counts
- loci-level-multimappers-counts
The bowtie2-repeats-quantification workflow aligns the FASTQs to the genome using bowtie2 and generates a weighted bedgraph that accounts for multi-mappers.
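To illustrate what "weighted" means here, the sketch below apportions each read uniformly across its alignments, so a read with N alignments contributes 1/N to each. This uniform scheme is a deliberate simplification chosen for illustration and is not SmartMap's own weighting. The input format (tab-separated read_id, chrom, start, end) and the function name are assumptions.

```bash
#!/usr/bin/env bash
# Assign each alignment a weight of 1/N, where N is the number of
# alignments reported for its read. Uniform apportioning only;
# this is NOT the SmartMap algorithm.
# Input file (assumed format): read_id <TAB> chrom <TAB> start <TAB> end
weight_alignments() {
  # The file is read twice: first to count alignments per read,
  # then to emit bedgraph-style records with the per-alignment weight.
  awk 'BEGIN { FS = OFS = "\t" }
       NR == FNR { n[$1]++; next }
       { printf "%s\t%s\t%s\t%.4f\n", $2, $3, $4, 1 / n[$1] }' "$1" "$1"
}
```

A uniquely mapped read keeps a weight of 1, while a read with two alignments contributes 0.5 to each location.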
gs://encode-pipeline-genome-data/mm10/bowtie2index/ENCFF309GLL.tar.gz
gs://encode-pipeline-genome-data/mm10/mm10no_alt.chrom.sizes.tsv
The smartmapprep output is a bed.gz file whose name ends with "coord_scores.bed.gz" and a folder called splits.
TODO:
- Report a unique-mappers track for ChIP.
- Create tracks for RNA using unique-mappers and apportioned multi-mappers. The counts will not match those in the count files, but the tracks will give us an idea.
- Annotate each locus with the percentage of counts gained when including multi-mappers. A locus going from 3 unique counts to 100 with multi-mappers is suspicious compared to one going from 50 to 100.
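The last TODO item can be sketched as follows. The helper joins a unique-counts table with a multi-mapper-inclusive table and reports, for each locus, the share of the final count that comes from multi-mappers. The two-column, header-less, tab-separated input format and the function name are assumptions, as is the choice to express the gain relative to the multi-mapper-inclusive count.

```bash
#!/usr/bin/env bash
# Report the percentage of counts gained when multi-mappers are included.
# multimapper_gain is a hypothetical helper; the input format is assumed:
#   feature_id <TAB> count   (no header line)
# $1: unique counts table, $2: multi-mapper-inclusive counts table
# Note: $1 is expected to be non-empty for the two-file awk idiom to work.
multimapper_gain() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == FNR { u[$1] = $2; next }
       {
         uniq = ($1 in u) ? u[$1] : 0
         # Share of the final count contributed by multi-mappers.
         gain = ($2 > 0) ? 100 * ($2 - uniq) / $2 : 0
         printf "%s\t%s\t%s\t%.1f\n", $1, uniq, $2, gain
       }' "$1" "$2"
}
```

With the numbers from the TODO, a locus going from 3 to 100 counts gets a gain of 97.0, while one going from 50 to 100 gets 50.0, which makes the suspicious loci easy to filter on.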
Owner
- Name: Broad Institute
- Login: broadinstitute
- Kind: organization
- Location: Cambridge, MA
- Website: http://www.broadinstitute.org/
- Twitter: broadinstitute
- Repositories: 1,083
- Profile: https://github.com/broadinstitute
Broad Institute of MIT and Harvard