Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: pubmed.ncbi, ncbi.nlm.nih.gov, nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: UK-SBCoA-EbbertLab
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 125 MB
Statistics
  • Stars: 12
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 1 year ago
Metadata Files

README.md

Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq

Link to Article: https://doi.org/10.1038/s41587-024-02245-9

This repository contains all code and documentation used for the analysis contained in the article above.

Repository structure:

article_analysis - Scripts used for data analysis and figure generation for the article. Uses the output of illumina_pipeline and nanopore_pipeline. Some figures were created using the scripts in website.

illumina_pipeline - In-house Nextflow pipeline optimized for analysis of Illumina paired-end short-read sequencing data.

nanopore_pipeline - In-house Nextflow pipeline optimized for analysis of Oxford Nanopore PCR-amplified cDNA sequencing data.

proteomics - Analysis pipeline to validate new transcripts at the protein level using publicly available mass spectrometry data. Also explains the downstream analysis steps and contains the custom scripts used for downstream analysis and figure generation.

singularity_containers - Directory with container definition files and pull commands. With the exception of the FragPipe pipeline (proteomics_pipeline) and the Rshiny web app (website), all the software used in this GitHub repository is in these Singularity containers.

website - Contains the Rshiny app scripts that allow users to perform gene queries and visualize RNA isoform expression from the data used in this publication. Access website here
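
The kind of gene query the website offers can be sketched in a few lines of Python. This is not the Rshiny app's code: the table layout, the column names (`gene_id`, `transcript_id`, `cpm`), and the example values are all made up for illustration, and the real output files may be organized differently.

```python
import csv
import io

# Made-up transcript-level expression table (tab-separated).
# Column names and values are hypothetical, not taken from the study's files.
EXAMPLE_TSV = """gene_id\ttranscript_id\tcpm
GENE_A\tTX_A1\t12.5
GENE_A\tTX_A2\t3.0
GENE_B\tTX_B1\t7.25
"""


def isoforms_for_gene(tsv_text, gene_id):
    """Return (transcript_id, cpm) pairs for one gene, highest expression first."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = [
        (row["transcript_id"], float(row["cpm"]))
        for row in reader
        if row["gene_id"] == gene_id
    ]
    # Sort by expression so the dominant isoform is listed first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


for transcript_id, cpm in isoforms_for_gene(EXAMPLE_TSV, "GENE_A"):
    print(f"{transcript_id}\t{cpm}")
```

A gene that is absent from the table simply yields an empty list, which a caller could use to report that no isoforms were found.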

Data availability

Raw nanopore sequencing fastq files generated in this study are available here. They are also available through the NIH SRA (accession number: SRP456327).

Proteomics (mass spec) data from the cell lines used in this experiment are publicly available here. For more information about these data, see: https://pubmed.ncbi.nlm.nih.gov/36959352/

Proteomics (mass spec) data from round 2 of the ROSMAP TMT brain proteomics are publicly available here. For more information about these data, see: https://www.nature.com/articles/s41597-020-00650-8

Final output files from the transcriptomics/RNAseq and proteomics analyses, along with the annotations/references used in this study, are available here.

GTEx long-read RNAseq data used to validate our study results are available here.

ROSMAP short-read RNAseq data used to validate our study results are available here.

More information

Each directory within this GitHub repository contains documentation for the analysis performed in that directory. If you have any questions, please submit an issue.

Owner

  • Name: UK Sanders-Brown Center on Aging - Ebbert Lab
  • Login: UK-SBCoA-EbbertLab
  • Kind: organization

Ebbert BioInformatics Lab

Citation (CITATIONS.md)

# Citations

## Main

[Nextflow](https://www.nextflow.io/docs/latest/index.html)

[Singularity](https://docs.sylabs.io/guides/latest/user-guide/)


## Pre-processing

[Guppy](https://timkahlke.github.io/LongRead_tutorials/BS_G.html)

[Pychopper](https://github.com/epi2me-labs/pychopper)


## Quality Control

[PycoQC](https://github.com/a-slide/pycoQC)

[RSeQC](https://rseqc.sourceforge.net/)

[MultiQC](https://multiqc.info/)


## Mapping

[Minimap2](https://github.com/lh3/minimap2)


## Transcriptomics

[GFFread](https://github.com/gpertea/gffread)

[GFFcompare](https://ccb.jhu.edu/software/stringtie/gffcompare.shtml)

[StringTie](https://github.com/gpertea/stringtie)

[FLAIR](https://github.com/BrooksLabUCSC/flair)

[Bambu](https://github.com/GoekeLab/bambu)


## Other Genomics Tools

[Bedtools](https://github.com/arq5x/bedtools2)

[Samtools](https://github.com/samtools/samtools)

[MEME-SUITE](https://meme-suite.org/meme/)


## R Packages

[BioConductor](https://www.bioconductor.org/)

[Bambu](https://github.com/GoekeLab/bambu)

[ggtranscript](https://github.com/dzhang32/ggtranscript)

[tidyverse](https://www.tidyverse.org/)

[Rshiny](https://shiny.rstudio.com/)

[RnaSeqSampleSize](https://bioconductor.org/packages/release/bioc/html/RnaSeqSampleSize.html)

[DESeq2](https://bioconductor.org/packages/release/bioc/html/DESeq2.html)

## Python Packages

[numpy](https://numpy.org/)

[pandas](https://pandas.pydata.org/)

[regex](https://docs.python.org/3/library/re.html)

[plotly](https://plotly.com/python/)

[matplotlib](https://matplotlib.org/)

[seaborn](https://seaborn.pydata.org/)

[matplotlib_venn](https://pypi.org/project/matplotlib-venn/)

[wordcloud](https://pypi.org/project/wordcloud/)

[notebook](https://pypi.org/project/notebook/)

## Other

[Conda](https://docs.conda.io/en/latest/)

[Bioconda](https://bioconda.github.io/)

[pip](https://pypi.org/project/pip/)

GitHub Events

Total
  • Issues event: 1
  • Watch event: 7
  • Issue comment event: 1
  • Push event: 1
Last Year
  • Issues event: 1
  • Watch event: 7
  • Issue comment event: 1
  • Push event: 1