Science Score: 65.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 7 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ✓ Institutional organization owner: Organization zavolanlab has institutional domain (www.biozentrum.unibas.ch)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.0%) to scientific vocabulary
Repository
The Zavolab Automated RNA-seq Pipeline
Basic Info
- Host: GitHub
- Owner: zavolanlab
- License: apache-2.0
- Language: Python
- Default Branch: dev
- Homepage: https://zavolanlab.github.io/zarp/
- Size: 27 MB
Statistics
- Stars: 35
- Watchers: 6
- Forks: 5
- Open Issues: 14
- Releases: 5
Metadata Files
README.md
ZARP (Zavolab Automated RNA-seq Pipeline) is a generic RNA-seq analysis workflow that allows users to process and analyze Illumina short-read sequencing libraries with minimal effort. Better yet, with our companion ZARP-cli command line interface, you can start ZARP runs with the simplest and most intuitive commands.
RNA-seq analysis doesn't get simpler than that!
The workflow is developed in Snakemake, a workflow management system widely used in the bioinformatics community. ZARP will pre-process, align and quantify your single- or paired-end stranded bulk RNA-seq libraries with publicly available state-of-the-art bioinformatics tools. ZARP's browser-based rich reports and visualizations will give you meaningful initial insights into the quality and composition of your sequencing experiments, fast and simple. Whether you are an experimentalist struggling with large-scale data analysis or an experienced bioinformatician, when there's RNA-seq data to analyze, just zarp 'em!
Documentation
For the full documentation, please visit the [ZARP website](https://zavolanlab.github.io/zarp/).
Quick installation
IMPORTANT: Rather than installing the ZARP workflow as described in this section, we recommend installing ZARP-cli for most use cases! If you follow its installation instructions, you can skip the instructions below.
Quick installation requires the following:
- Linux
- Git
- Conda >= 24.11.3
- Optional: Apptainer >= 1.3.6 (required only if you want to use containers to manage tool environments)
```bash
git clone https://github.com/zavolanlab/zarp.git
cd zarp
conda env create -f install/environment.yml
conda activate zarp
```
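Before creating the environment, it can help to confirm that the prerequisites listed above are available. The following is a minimal sketch in Python and is not part of ZARP: the helper names `parse_version`, `meets_minimum` and `check_tool` are made up for illustration, and the sketch assumes that each tool prints a dotted version number when called with `--version`.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum):
    """Compare a reported version string against a minimum such as '24.11.3'."""
    return parse_version(version_text) >= parse_version(minimum)


def check_tool(name, minimum=None):
    """Return a one-line status for a command-line tool (hypothetical helper)."""
    if shutil.which(name) is None:
        return f"{name}: not found on PATH"
    if minimum is None:
        return f"{name}: found"
    # Assumption: the tool reports its version via `--version`.
    result = subprocess.run(
        [name, "--version"], capture_output=True, text=True, check=False
    )
    reported = (result.stdout or result.stderr).strip()
    try:
        ok = meets_minimum(reported, minimum)
    except ValueError:
        return f"{name}: could not read a version from {reported!r}"
    status = "OK" if ok else f"older than {minimum}"
    return f"{name}: {reported} ({status})"


if __name__ == "__main__":
    print(check_tool("git"))
    print(check_tool("conda", "24.11.3"))
    print(check_tool("apptainer", "1.3.6"))  # optional, only needed for containers
```

Versions are compared as integer tuples rather than as strings, so that, for example, `24.9.0` is correctly treated as older than `24.11.3`.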
Basic usage
You can trigger ZARP without ZARP-cli. This is convenient for users who have some experience with Snakemake and don't want to use a CLI to trigger their runs. The usage documentation covers all options in detail; the basic steps to trigger a run are listed below.
Assuming that your current directory is the workflow repository's root directory, create a directory for your workflow run and move into it with:
```bash
mkdir config/my_run
cd config/my_run
```
Create an empty sample table and a workflow configuration file:
```bash
touch samples.tsv
touch config.yaml
```
Use your editor of choice to populate these files with appropriate values. Have a look at the examples in the `tests/` directory to see what the files should look like, specifically:
- [samples.tsv](https://github.com/zavolanlab/zarp/blob/dev/tests/input_files/samples.tsv)
- [config.yaml](https://github.com/zavolanlab/zarp/blob/dev/tests/input_files/config.yaml)
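After filling in the sample table, a quick structural check can catch editing slips before Snakemake does. The sketch below is not part of ZARP; `check_sample_table` is a hypothetical helper that only verifies that the table is tab-separated and rectangular. It deliberately assumes nothing about ZARP's actual column names, for which the linked example files remain the reference.

```python
import csv
from pathlib import Path


def check_sample_table(lines):
    """Return a list of structural problems found in a tab-separated table.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    rows = list(csv.reader(lines, delimiter="\t"))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if len(set(header)) != len(header):
        problems.append("header contains duplicate column names")
    if len(rows) == 1:
        problems.append("no sample rows below the header")
    # Every data row should have exactly as many fields as the header.
    for line_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_number}: expected {len(header)} fields, found {len(row)}"
            )
    return problems


if __name__ == "__main__":
    table = Path("samples.tsv")  # run from the run directory, e.g. config/my_run
    if not table.exists():
        print("samples.tsv not found in the current directory")
    else:
        with table.open(newline="") as handle:
            problems = check_sample_table(handle)
        print("\n".join(problems) if problems else "samples.tsv looks rectangular")
```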
Create a runner script. Pick one of the following choices for either local or cluster execution. Before executing the respective command, update the argument of the `--apptainer-args` option of the respective profile (file: `profiles/{profile}/config.yaml`) so that it contains a comma-separated list of all directories containing input data files (samples, annotation files, etc.) required for your run.
Runner script for local execution:
```bash
cat << "EOF" > run.sh
#!/bin/bash
snakemake \
    --profile="../../profiles/local-apptainer" \
    --configfile="config.yaml"
EOF
```
OR
Runner script for Slurm cluster execution (note that you may need to modify the arguments to `--jobs` and `--cores` in the file `profiles/slurm-apptainer/config.yaml`, depending on your HPC and workload manager configuration):
```bash
cat << "EOF" > run.sh
#!/bin/bash
mkdir -p logs/cluster_log
snakemake \
    --profile="../../profiles/slurm-apptainer" \
    --configfile="config.yaml"
EOF
```
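The comma-separated directory list required for the `--apptainer-args` option can be derived from the input files themselves. The following is a minimal sketch, not part of ZARP: `bind_directories` is a hypothetical helper and the file paths in the example are placeholders. How the resulting string is embedded in the option's value depends on the profile file, so check the existing entry in `profiles/{profile}/config.yaml` before pasting it in.

```python
import os
from pathlib import Path


def bind_directories(input_files):
    """Return the unique parent directories of input_files, comma-separated.

    Paths are made absolute but symlinks are not resolved, so the list
    reflects the paths as you would type them.
    """
    parents = {
        str(Path(os.path.abspath(os.path.expanduser(path))).parent)
        for path in input_files
    }
    return ",".join(sorted(parents))


if __name__ == "__main__":
    # Placeholder paths; replace with the samples and annotation files of your run.
    files = [
        "/data/run1/sample_1.fq.gz",
        "/data/run1/sample_2.fq.gz",
        "/refs/annotation.gtf",
    ]
    print(bind_directories(files))
```

Collecting parent directories in a set removes duplicates when several input files share a directory, and sorting keeps the output stable between runs.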
Note: When running the pipeline with Conda, you should use the `local-conda` and `slurm-conda` profiles instead.
Note: The Slurm profiles are adapted to a cluster that uses the quality-of-service (QOS) keyword. If QOS is not supported by your Slurm instance, you have to remove all the lines with "qos" in `profiles/slurm-config.json`.
Start your workflow run:
```bash
bash run.sh
```
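Regarding the QOS note above: deleting lines from a JSON file by hand can leave a dangling trailing comma, which makes the file invalid. The sketch below is one possible alternative and is not part of ZARP. It assumes that `"qos"` appears as a key name somewhere in `profiles/slurm-config.json`, and both `drop_key` and `strip_key_from_file` are hypothetical helpers; nothing here relies on the file's actual layout.

```python
import json
from pathlib import Path


def drop_key(node, key="qos"):
    """Return a copy of a parsed JSON structure without any entries named `key`."""
    if isinstance(node, dict):
        return {k: drop_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [drop_key(item, key) for item in node]
    return node


def strip_key_from_file(path, key="qos"):
    """Rewrite the JSON file at `path` in place without `key` entries."""
    path = Path(path)
    cleaned = drop_key(json.loads(path.read_text()), key)
    path.write_text(json.dumps(cleaned, indent=2) + "\n")
```

From the repository root, this could be applied with `strip_key_from_file("profiles/slurm-config.json")`. Keeping a copy of the original file (or relying on Git to restore it) is advisable, since the file is rewritten in place and its original formatting is not preserved.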
Contributing
This project lives off your contributions, be it in the form of bug reports, feature requests, discussions, or fixes and other code changes. Please refer to the contributing guidelines if you are interested in contributing. Please mind the code of conduct for all interactions with the community.
Contact
For questions or suggestions regarding the code, please use the [issue tracker](https://github.com/zavolanlab/zarp/issues). For any other inquiries, please contact us by email.
Owner
- Name: Zavolan Lab
- Login: zavolanlab
- Kind: organization
- Location: Basel
- Website: https://www.biozentrum.unibas.ch/research/groups-platforms/overview/unit/zavolan/
- Repositories: 15
- Profile: https://github.com/zavolanlab
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "ZARP: A user-friendly and versatile RNA-seq analysis workflow"
version: 1.0.0
doi: 10.12688/f1000research.149237.1
date-released: 2024-05-24
url: "https://github.com/zavolanlab/zarp"
preferred-citation:
  type: article
  authors:
    - family-names: "Katsantoni"
      given-names: "Maria"
      orcid: "https://orcid.org/0000-0003-1961-0223"
    - family-names: "Gypas"
      given-names: "Foivos"
      orcid: "https://orcid.org/0000-0002-7233-8794"
    - family-names: "Herrmann"
      given-names: "Christina J."
      orcid: "https://orcid.org/0000-0003-1649-3463"
    - family-names: "Burri"
      given-names: "Dominik"
      orcid: "https://orcid.org/0000-0002-8131-9309"
    - family-names: "Bak"
      given-names: "Maciej"
      orcid: "https://orcid.org/0000-0003-1361-7301"
    - family-names: "Iborra"
      given-names: "Paula"
      orcid: "https://orcid.org/0000-0003-0504-3029"
    - family-names: "Agarwal"
      given-names: "Krish"
      orcid: "https://orcid.org/0000-0001-6809-5024"
    - family-names: "Ataman"
      given-names: "Meric"
      orcid: "https://orcid.org/0000-0002-7942-9226"
    - family-names: "Balajti"
      given-names: "Máté"
      orcid: "https://orcid.org/0009-0000-3932-3964"
    - family-names: "Pozzan"
      given-names: "Noè"
    - family-names: "Schlusser"
      given-names: "Niels"
    - family-names: "Moon"
      given-names: "Youngbin"
      orcid: "https://orcid.org/0009-0001-5728-3959"
    - family-names: "Mironov"
      given-names: "Aleksei"
    - family-names: "Boersch"
      given-names: "Anastasiya"
      orcid: "https://orcid.org/0000-0003-3392-5272"
    - family-names: "Zavolan"
      given-names: "Mihaela"
      orcid: "https://orcid.org/0000-0002-8832-2041"
    - family-names: "Kanitz"
      given-names: "Alexander"
      orcid: "https://orcid.org/0000-0002-3468-0652"
  doi: "10.12688/f1000research.149237.1"
  journal: "F1000Research"
  month: 05
  title: "ZARP: A user-friendly and versatile RNA-seq analysis workflow"
  year: 2024
GitHub Events
Total
- Create event: 6
- Release event: 1
- Issues event: 16
- Watch event: 1
- Delete event: 5
- Issue comment event: 20
- Push event: 119
- Pull request event: 10
- Pull request review comment event: 49
- Pull request review event: 44
Last Year
- Create event: 6
- Release event: 1
- Issues event: 16
- Watch event: 1
- Delete event: 5
- Issue comment event: 20
- Push event: 119
- Pull request event: 10
- Pull request review comment event: 49
- Pull request review event: 44
Dependencies
- actions/checkout v3 composite
- conda-incubator/setup-miniconda v2 composite
- Requarks/changelog-action v1 composite
- actions/checkout v2 composite
- ncipollo/release-action v1 composite
- stefanzweifel/git-auto-commit-action v4 composite
- srvaroa/labeler master composite