xpresspipe
An alignment and analysis pipeline for Ribosome Profiling and RNA-seq data
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to biorxiv.org, ncbi.nlm.nih.gov, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: XPRESSyourself
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Homepage: https://xpresspipe.readthedocs.io/en/latest/
- Size: 418 MB
Statistics
- Stars: 13
- Watchers: 0
- Forks: 4
- Open Issues: 2
- Releases: 25
Metadata Files
README.md
An alignment and analysis pipeline for RNAseq data
Please refer to the documentation for more in-depth details.
Citation:
Berg JA, et al. (2020). XPRESSyourself: Enhancing, standardizing, and
automating ribosome profiling computational analyses yields improved insight
into data. PLoS Comp Biol. doi: https://doi.org/10.1371/journal.pcbi.1007625
Installation:
Installing from source
The following is a short tutorial showing you how to install XPRESSpipe:
NOTE: Previous versions utilized the pip install . command to install. Users of >= v0.6.3 should instead use bash install.sh
- Make sure you let Anaconda set up the PATH info for you.
- If the help menu is not displayed when testing, try adding the path where you installed XPRESSpipe to the system PATH:
  $ echo 'export PATH=$PATH:/path/to/xpresspipe' >> ~/.bash_profile
- If you do not have a file named ~/.bash_profile, try looking for one called ~/.profile
- The commands used in the video above are summarized here:
  $ curl -L -O https://github.com/XPRESSyourself/XPRESSpipe/archive/refs/tags/v0.6.3.zip
  $ unzip XPRESSpipe-v0.6.3.zip
  $ cd XPRESSpipe-v0.6.3/
  $ conda install -c conda-forge mamba
  $ mamba env create -f requirements.yml # Or requirements_frozen.yml for a recent working dependency set
  $ conda activate xpresspipe
  $ bash install.sh
  $ xpresspipe -h
  $ xpresspipe test
- Be sure to specify the correct release version in the first URL
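If the help menu does not appear, a quick way to confirm whether the executable is visible on the PATH is sketched below. This is only an illustrative check that uses the Python standard library; the helper name `is_on_path` is hypothetical and not part of XPRESSpipe.

```python
import shutil


def is_on_path(executable: str = "xpresspipe") -> bool:
    """Return True if `executable` can be located on the current PATH.

    Hypothetical helper for illustration; it only looks the name up,
    it does not run XPRESSpipe.
    """
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if is_on_path():
        print("xpresspipe found on PATH")
    else:
        print("xpresspipe not found; add its install directory to PATH")
```

If the check fails after installation, the PATH export shown above is the likely fix.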
QuickStart:
You can also use the XPRESSpipe command builder and executor for reference curation or running the pipeline by executing the following:
$ xpresspipe build
Important Notes:
Basic Starting Input
- An input directory with raw sequence data
  - Sequence data files should be FASTQ format and end in .fastq or .fq, and can be .zip or .gz compressed
- An empty output directory
- A reference directory (see documentation for curateReference for more details)
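The input requirements above can be checked before starting a run. The following is a minimal sketch, assuming a case-sensitive match on the listed extensions; the function names `is_valid_fastq_name` and `check_directories` are hypothetical helpers, not part of the XPRESSpipe API.

```python
from pathlib import Path
from typing import List

# Extensions taken from the requirements listed above.
FASTQ_SUFFIXES = (".fastq", ".fq")
COMPRESSION_SUFFIXES = (".gz", ".zip")


def is_valid_fastq_name(name: str) -> bool:
    """Check that a file name ends in .fastq or .fq, optionally compressed."""
    for comp in COMPRESSION_SUFFIXES:
        if name.endswith(comp):
            name = name[: -len(comp)]  # drop the compression suffix
            break
    return name.endswith(FASTQ_SUFFIXES)


def check_directories(input_dir: str, output_dir: str) -> List[str]:
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    input_path, output_path = Path(input_dir), Path(output_dir)

    if not input_path.is_dir():
        problems.append(f"input directory not found: {input_dir}")
    else:
        for entry in sorted(input_path.iterdir()):
            if not is_valid_fastq_name(entry.name):
                problems.append(f"unexpected file in input directory: {entry.name}")

    if not output_path.is_dir():
        problems.append(f"output directory not found: {output_dir}")
    elif any(output_path.iterdir()):
        problems.append(f"output directory is not empty: {output_dir}")

    return problems
```

Calling `check_directories("/path/to/input", "/path/to/output")` and printing the returned list gives a quick pre-flight report.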
Naming Conventions
To get ordered output after alignment (except for generation of a raw counts table), follow the recommended file naming conventions.
- Download your raw sequence data and place in a folder -- this folder should contain all the sequence data and nothing else.
- Make sure files follow a consistent naming pattern. For example, if you had 3 genetic backgrounds of ribosome profiling data, the naming scheme would be as follows:
  ExperimentName_BackgroundA_FP.fastq(.gz)
  ExperimentName_BackgroundA_RNA.fastq(.gz)
  ExperimentName_BackgroundB_FP.fastq(.gz)
  ExperimentName_BackgroundB_RNA.fastq(.gz)
  ExperimentName_BackgroundC_FP.fastq(.gz)
  ExperimentName_BackgroundC_RNA.fastq(.gz)
- If the samples are replicates, their replicate number needs to be indicated.
- If you want the final count table to be in a particular order and that order is not alphabetical, add a letter in front of the sample name to force this ordering.
  ExperimentName_a_WT.fastq(.gz)
  ExperimentName_a_WT.fastq(.gz)
  ExperimentName_b_exType.fastq(.gz)
  ExperimentName_b_exType.fastq(.gz)
- If you have replicates:
  ExperimentName_a_WT_1.fastq(.gz)
  ExperimentName_a_WT_1.fastq(.gz)
  ExperimentName_a_WT_2.fastq(.gz)
  ExperimentName_a_WT_2.fastq(.gz)
  ExperimentName_b_exType_1.fastq(.gz)
  ExperimentName_b_exType_1.fastq(.gz)
  ExperimentName_b_exType_2.fastq(.gz)
  ExperimentName_b_exType_2.fastq(.gz)
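The letter-prefix convention above can be illustrated with a short sketch that generates names which sort alphabetically in the intended order. The function `ordered_sample_names` is a hypothetical helper written for this example; XPRESSpipe does not provide it, and the exact name layout is an assumption based on the examples above.

```python
import string
from typing import List


def ordered_sample_names(
    experiment: str,
    conditions: List[str],
    replicates: int = 1,
    suffix: str = ".fastq",
) -> List[str]:
    """Build file names whose alphabetical order matches the order of `conditions`.

    A letter (a, b, c, ...) is placed before each condition name, and a
    replicate number is appended when there is more than one replicate.
    """
    if len(conditions) > len(string.ascii_lowercase):
        raise ValueError("this sketch supports at most 26 conditions")

    names = []
    for letter, condition in zip(string.ascii_lowercase, conditions):
        for rep in range(1, replicates + 1):
            stem = f"{experiment}_{letter}_{condition}"
            if replicates > 1:
                stem += f"_{rep}"
            names.append(stem + suffix)
    return names


if __name__ == "__main__":
    for name in ordered_sample_names("ExperimentName", ["WT", "exType"], replicates=2):
        print(name)
```

Because the letter comes before the condition name, sorting the generated names keeps the conditions in the order they were supplied.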
Running a test dataset:
- We can run a test dataset as in the associated manuscript by downloading the FASTQ files from GSE65778 using the SRA Toolkit.
- We can curate the reference like so:
  $ xpresspipe curateReference -o /path/to/reference -f /path/to/reference/genome_fastas -g /path/to/reference/transcripts.gtf -p -t --sjdbOverhang 49
- And we can process the dataset like so:
  $ xpresspipe riboseq -i /path/to/input -o /path/to/output -r /path/to/reference/ --gtf /path/to/reference/transcripts_CT.gtf -e isrib_test_study -a CTGTAGGCACCATCAAT --sjdbOverhang 49
- The above steps will be very computationally intensive, so we recommend running this on a supercomputing cluster
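When submitting the processing step from a script (for example, on a cluster), it can help to assemble the command as an argument list. The sketch below only uses the flags that appear in the riboseq command above; the function `build_riboseq_command` is a hypothetical wrapper, and the call that would actually launch the pipeline is left commented out.

```python
import shlex
from typing import List


def build_riboseq_command(
    input_dir: str,
    output_dir: str,
    reference_dir: str,
    gtf: str,
    experiment: str,
    adapter: str,
    sjdb_overhang: int,
) -> List[str]:
    """Assemble the riboseq command shown above as an argument list."""
    return [
        "xpresspipe", "riboseq",
        "-i", input_dir,
        "-o", output_dir,
        "-r", reference_dir,
        "--gtf", gtf,
        "-e", experiment,
        "-a", adapter,
        "--sjdbOverhang", str(sjdb_overhang),
    ]


if __name__ == "__main__":
    cmd = build_riboseq_command(
        input_dir="/path/to/input",
        output_dir="/path/to/output",
        reference_dir="/path/to/reference/",
        gtf="/path/to/reference/transcripts_CT.gtf",
        experiment="isrib_test_study",
        adapter="CTGTAGGCACCATCAAT",
        sjdb_overhang=49,
    )
    # Print the command for inspection before running it.
    print(shlex.join(cmd))
    # To run it (requires XPRESSpipe to be installed):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems if any path contains spaces.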
Scripts used to analyze this data can be found here and here and here
Alternatively, smaller test datasets can be found within the XPRESSpipe
tests folder and an outline of commands to run can be found here
Updates
Information on updates to the software can be found here.
Owner
- Name: XPRESSyourself
- Login: XPRESSyourself
- Kind: organization
- Location: University of Utah
- Repositories: 1
- Profile: https://github.com/XPRESSyourself
Ribosome Profiling and RNA-seq processing and analysis made easy
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- actions/checkout v3 composite
- conda-incubator/setup-miniconda v2 composite
