Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to nature.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: magdalenahuebner
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Size: 2.41 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
MagPipe
MagPipe is an analysis pipeline for differential phosphosite occupancy analysis of phosphoproteomics data. It was tailored to process the dataset published in the following paper, which explores phosphorylation patterns across various cell lines and kinase perturbations:
Hijazi M, Smith R, Rajeeve V, Bessant C, Cutillas PR. Reconstructing kinase network topologies from phosphoproteomics data reveals cancer-associated rewiring. Nat Biotechnol. 2020 Apr
However, MagPipe can also be used to analyse other phosphoproteomics datasets, as it includes functionalities for harmonising protein IDs, quality control, quantile normalisation, batch correction, and differential phosphosite occupancy analysis. Unique features of the MagPipe package are its method for estimating missing fold changes and the calculation of signal-intensity dependent confidence scores.
MagPipe was created with cookiecutter-bioinformatics-project, a bioinformatics pipeline template provided by the Max Planck Institute of Immunobiology and Epigenetics, which builds on cookiecutter-fair-data-science and Snakemake. This ensures a streamlined, reproducible workflow for phosphoproteomics analysis.
Getting Started
Dependencies
- Python 3.8+
- R 4.0+
- Required Python packages: snakemake, uniprot-id-mapper (see environment.yaml for a complete list)
- Required R packages: limma, reshape2, plyr
Cloning and Running the Snakemake Project
- Clone the MagPipe repository:

      git clone git@github.com:magdalenahuebner/magpipe.git
      cd magpipe

- Create and activate the Conda environment:

      conda env create -f environment.yaml
      conda activate magpipe

- Import your data: Add your phosphoproteomics data and corresponding metadata to resources/raw_data. Make sure the data is formatted correctly (see Input File Formatting). Edit the Snakefile in workflow/Snakefile and replace phosphodata_aup.tsv and metadata.tsv with the names of your data and metadata files.

- Execute the pipeline:

      snakemake --cores 1
Alternative: Using MagPipe as a Module
MagPipe can also be integrated into your Python projects as a module:
- Install MagPipe:

      python -m pip install git+https://github.com/magdalenahuebner/magpipe.git

- Import as library:

      import magpipe

Input File Formatting
Input files should be in TSV format. The phosphoproteomics dataset file should contain the quantified phosphosites (e.g. area under the peak values from MS1 spectra) formatted with phosphosite names as rows and samples as columns. The metadata file should contain a column named 'sampleID', with sample names/IDs corresponding to the column names of the phosphoproteomics data.
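The formatting rules above can be checked before running the pipeline. The following is a minimal sketch, not part of MagPipe: the helper names `read_tsv` and `check_sample_ids`, the example phosphosite labels, and the assumption that the first data column holds the phosphosite names are all illustrative. It uses only the Python standard library and compares the sample columns of the data file against the 'sampleID' column of the metadata file.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text into a header list and a list of row lists."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return rows[0], rows[1:]


def check_sample_ids(data_text, metadata_text, id_column="sampleID"):
    """Return (missing_in_metadata, missing_in_data) as sorted lists."""
    data_header, _ = read_tsv(data_text)
    meta_header, meta_rows = read_tsv(metadata_text)
    if id_column not in meta_header:
        raise ValueError(f"metadata has no '{id_column}' column")
    # Assumption: the first data column holds phosphosite names,
    # so the sample columns are everything after it.
    data_samples = set(data_header[1:])
    idx = meta_header.index(id_column)
    meta_samples = {row[idx] for row in meta_rows if row}
    return (sorted(data_samples - meta_samples),
            sorted(meta_samples - data_samples))


# Tiny made-up example: two phosphosites, three samples.
data_tsv = (
    "phosphosite\tS1\tS2\tS3\n"
    "AKT1(S473)\t1200.5\t980.1\t1500.0\n"
    "MAPK1(T185)\t300.2\t\t410.7\n"
)
metadata_tsv = (
    "sampleID\ttreatment\n"
    "S1\tcontrol\n"
    "S2\tinhibitor\n"
    "S4\tinhibitor\n"
)

missing_in_metadata, missing_in_data = check_sample_ids(data_tsv, metadata_tsv)
print("Samples without metadata:", missing_in_metadata)
print("Metadata rows without data:", missing_in_data)
```

In this example S3 has data but no metadata row, and S4 has a metadata row but no data column, so both lists are non-empty. For real files, the same function can be given the contents of the two TSV files read from resources/raw_data.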
Workflow Overview
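The MagPipe workflow covers several steps: harmonising protein IDs, quality control, quantile normalisation, batch correction, and differential phosphosite occupancy analysis.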

A more detailed explanation of the workflow is provided in a collection of Jupyter notebooks, which can be found in notebooks/.
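To illustrate one of the steps MagPipe lists among its functionality, here is a minimal sketch of generic quantile normalisation in plain Python. It is not taken from MagPipe's code and may differ from what the pipeline does: the function name `quantile_normalise` is hypothetical, ties are broken by row order, and missing values are not handled. Each sample column is sorted, the values at each rank are averaged across samples, and every value is replaced by the mean for its rank.

```python
def quantile_normalise(columns):
    """Quantile-normalise equal-length columns of numbers.

    columns: list of lists, one inner list per sample.
    Returns new columns that all share the same set of values.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all columns must have the same length")

    # Sort each column, then average across columns at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[r] for col in sorted_cols) / len(columns)
        for r in range(n_rows)
    ]

    normalised = []
    for col in columns:
        # Row indices ordered by value; position in this order is the rank.
        order = sorted(range(n_rows), key=lambda i: col[i])
        new_col = [0.0] * n_rows
        for rank, row_index in enumerate(order):
            new_col[row_index] = rank_means[rank]
        normalised.append(new_col)
    return normalised


# Two made-up samples measured on three phosphosites.
samples = [
    [5.0, 2.0, 3.0],
    [4.0, 1.0, 6.0],
]
result = quantile_normalise(samples)
print(result)
```

After normalisation both samples contain the same values (1.5, 3.5, 5.5), assigned according to each phosphosite's original rank within its sample.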
Owner
- Name: Magdalena Huebner
- Login: magdalenahuebner
- Kind: user
- Repositories: 1
- Profile: https://github.com/magdalenahuebner
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Magdalena Huebner
    orcid: https://orcid.org/0009-0006-7054-3491
title: magpipe
version: 0.1.0
date-released: 2024-00-03/07/24
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- bioconductor-limma
- ipykernel
- matplotlib
- numpy
- pandas
- pip
- python 3.*
- r-plyr
- r-reshape2
- requests
- scikit-learn
- scipy
- snakemake
- tqdm