Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.3%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: yslproteomics
  • License: gpl-3.0
  • Language: R
  • Default Branch: main
  • Size: 44.2 MB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created over 1 year ago · Last pushed 8 months ago

README.md

KdeggeR

Overview

The KdeggeR package estimates peptide and protein turnover rates from dynamic SILAC labeling (pulse-SILAC, or pSILAC) proteomic experiments analyzed using multiplex DIA-MS. The package is optimized for DIA-MS data processed with common raw MS data processing tools such as Spectronaut, DIA-NN, or Fragpipe, but it can also handle data measured using DDA-MS and analyzed with software tools like MaxQuant. It offers optimized input data filtering, several kloss estimation methods for precursor- and peptide-level data, and functions for protein-level kloss and kdeg aggregation. The curve fits can also be inspected and exported using the provided functions.

Documentation

See the vignette for detailed instructions and example code.

The .pdf version of the vignette is provided here.

See the KdeggeR paper for a more detailed description of the package's functions and applications.

Hardware Requirements

This package requires only a standard computer with enough RAM to support the operations defined by the user.

The package was developed and tested on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.

Software requirements

This package was tested on macOS and Windows operating systems. The development version of the package has been tested on the following systems:

  • macOS Sonoma, version 14.3.1
  • Microsoft Windows 11, version 10.0.22631

Before setting up this package, users should have R version 4.3.0 or higher, and install the dependencies as specified below.
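The version requirement above can be checked from the R console before installing anything. The following is a minimal sketch using base R only; the variable names `min_version` and `version_ok` are illustrative and not part of KdeggeR.

```r
# Minimum R version stated in this README
min_version <- "4.3.0"

# getRversion() returns a version object that can be compared with a string
version_ok <- getRversion() >= min_version

if (version_ok) {
  message("R ", as.character(getRversion()), " meets the requirement (>= ", min_version, ").")
} else {
  warning("R ", as.character(getRversion()), " is older than ", min_version, "; please upgrade R first.")
}
```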

Installation Guide

To install all dependencies:

```{r}
# Required packages
if(!require(pacman)) install.packages("pacman")
pacman::p_load(dplyr, purrr, stringr, tibble, outliers)

# Optional R package for robust linear model fitting
install.packages("MASS")
```

To install the package:

```{r}
library("devtools")
install_github("yslproteomics/KdeggeR", build_vignettes = TRUE)
```

To open the vignette with detailed instructions and example code:

```{r}
vignette("KdeggerUserManual", package = "KdeggeR")
```

The installation takes about 30 seconds on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.

How to Cite

If you use the KdeggeR package in your work, please cite the following preprint.

A Comprehensive and Robust Multiplex-DIA Workflow Profiles Protein Turnover Regulations Associated with Cisplatin Resistance.
Barbora Salovska, Wenxue Li, Oliver M. Bernhardt, Pierre-Luc Germain, Tejas Gandhi, Lukas Reiter, Yansheng Liu.
bioRxiv 2024.10.28.620709; doi: https://doi.org/10.1101/2024.10.28.620709

Quick demo analysis guide

This guide outlines a general workflow for running the demo analysis using the provided datasets. For more details and options, please see the vignette and documentation.

The example dataset is a data.frame with the first 20,000 unique precursors from the A2780Cis and A2780 parental cell line dataset, analyzed using the labeled workflow in Spectronaut v19 with Group Q-value filtering. The full dataset will be available through the PRIDE repository with the dataset identifier PXD057632.

Example data

To list the example data with description, run the following code.

```{r}
data(package = "KdeggeR")
```

See the example input from the analysis in Spectronaut v19.

```{r}
KdeggeR::example_spectronaut %>% dplyr::glimpse()
```

See the example design tables.

```{r}
# Design table without replicate design
KdeggeR::example_spectronaut_design %>% dplyr::glimpse()

# Design table with replicate design
KdeggeR::example_spectronaut_design_replicates %>% dplyr::glimpse()
```

See the example output pSILAC object.

```{r}
KdeggeR::example_spectronaut_pSILAC_object %>% View()
```

1. Generate pSILAC class object

```{r}
# Analysis without replicate design
input_data <- KdeggeR::example_spectronaut
input_design <- KdeggeR::example_spectronaut_design

pSILAC_object <- KdeggeR::generatepSILACObject(dataset = input_data,
                                               design = input_design,
                                               inputDataType = "spectronaut",
                                               aggregate.replicates = NA, # if NA, replicates are not aggregated
                                               filterPeptides = TRUE,
                                               ncores = NULL,
                                               noiseCutoff = 8)

# Analysis with replicate design
input_data <- KdeggeR::example_spectronaut
input_design <- KdeggeR::example_spectronaut_design_replicates

pSILAC_object_replicates <- KdeggeR::generatepSILACObject(dataset = input_data,
                                                          design = input_design,
                                                          inputDataType = "spectronaut",
                                                          aggregate.replicates = "mean", # can be "mean" or "median"
                                                          filterPeptides = TRUE,
                                                          ncores = NULL,
                                                          noiseCutoff = 8)
```

This step takes about 6 seconds without replicate aggregation and about 13 seconds with it, on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.

2. Data quality filtering

```{r}
# Filter based on valid values
pSILAC_object <- KdeggeR::filterValidValues(pSILAC_object, values_cutoff = 2, skip_time_point = 1)

# Filter based on monotone trend
pSILAC_object <- KdeggeR::filterMonotone(pSILAC_object, skip_time_point = 1)
pSILAC_object <- KdeggeR::filterMonotoneTimePoint1(pSILAC_object)

# Filter based on linear regression
pSILAC_object <- KdeggeR::filterLinearRegression(pSILAC_object, skip_time_point = 1, R2_cutoff = 0.9, p_cutoff = 0.05)
```

This step takes about 1.5 min on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.
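To illustrate what a linear-regression filter with an R² cutoff of 0.9 and a p-value cutoff of 0.05 checks, here is a minimal base R sketch on a single simulated time course. It is not the code behind `filterLinearRegression`. The response variable (assumed here to be ln(H/L + 1)), the data values, and all variable names are assumptions made for illustration.

```r
# Simulated time course for one precursor; values are made up, not package data
time_h <- c(1, 4, 8, 12, 24)              # labeling time in hours
y <- c(0.05, 0.21, 0.39, 0.62, 1.19)      # assumed response, e.g. ln(H/L + 1)

# Ordinary least-squares fit of the response against time
fit <- lm(y ~ time_h)
fit_summary <- summary(fit)

# Goodness of fit and significance of the slope
r2 <- fit_summary$r.squared
p_slope <- fit_summary$coefficients["time_h", "Pr(>|t|)"]

# Keep the precursor only if both criteria are met
keep <- r2 >= 0.9 && p_slope <= 0.05
```

A precursor whose curve is noisy or flat would fail one of the two criteria and be dropped before rate estimation.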

3. Calculate k_loss

This convenient wrapper function performs precursor-level k_loss estimation using all three models, followed by protein-level k_loss aggregation as a weighted average of the precursor-level k_loss values, all with default parameters.

```{r}
pSILAC_object <- KdeggeR::calcAllRates(pSILAC_object, method = "RIA", ag.metric = "mean", ag.weights = "both")
```

This step takes about 1.6 min on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.
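For intuition about rate estimation from RIA values, the sketch below fits a single-exponential decay to one simulated precursor using base R `nls`. It assumes the light-channel RIA follows RIA(t) = exp(-k_loss * t). Whether `calcAllRates` uses exactly this form is not stated here, so treat the model, the simulated values, and the variable names as assumptions.

```r
# Simulated light-channel RIA for one precursor; illustrative values only
time_h <- c(1, 4, 8, 12, 24)                     # labeling time in hours
noise <- c(0.01, -0.01, 0.005, -0.005, 0.002)    # small fixed perturbation
ria <- exp(-0.05 * time_h) + noise               # decay simulated with k_loss = 0.05 / h

# Starting value from a log-linear fit through the origin
k_start <- -coef(lm(log(ria) ~ 0 + time_h))[[1]]

# Nonlinear least-squares fit of RIA(t) = exp(-k * t)
fit <- nls(ria ~ exp(-k * time_h), start = list(k = k_start))
k_loss_hat <- coef(fit)[["k"]]
```

The small perturbation is deliberate: `nls` is documented as unsuitable for artificial zero-residual data.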

4. Calculate protein degradation rates (k_deg) and halflives

See the example k_cd table (experimentally-determined cell division rates).

```{r}
KdeggeR::example_kcd %>% dplyr::glimpse()
```

Calculate k_deg:

```{r}
# Using experimentally-derived k_cd
input_kcd <- KdeggeR::example_kcd
pSILAC_object <- KdeggeR::calcKdeg(pSILAC_object, rate_df = input_kcd, type = "kcd")

# Using estimated k_cd (k_perc)
pSILAC_object <- KdeggeR::calcKdeg(pSILAC_object, rate_df = NULL, type = "kperc", perc_neg = 0.01)
```

Calculate t(1/2):

```{r}
pSILAC_object <- KdeggeR::calcHalflife(pSILAC_object)
```

These steps take less than 1 s on a computer with the following specs: RAM 32 GB, CPU: 6 cores, 3.2 GHz/core.
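The arithmetic behind this step can be sketched in a few lines of base R. The relations used, k_deg = k_loss - k_cd and t(1/2) = ln(2) / k_deg, are the ones commonly applied in pSILAC analyses of dividing cells. The numbers and variable names below are hypothetical, and the handling of non-positive rates is an assumption rather than the behavior of `calcKdeg` or `calcHalflife`.

```r
# Hypothetical protein-level loss rates (1/h) and cell division rate (1/h)
k_loss <- c(P1 = 0.060, P2 = 0.035, P3 = 0.028)
k_cd <- 0.027

# Degradation rate: loss rate corrected for dilution by cell division
k_deg <- k_loss - k_cd

# Assumption: non-positive rates have no meaningful half-life, so mark them NA
k_deg[k_deg <= 0] <- NA_real_

# Half-life in hours
halflife_h <- log(2) / k_deg
```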

5. Export protein degradation rates

```{r}
# Protein degradation rates are stored in the protein.kdeg dataframe
protein_table <- pSILAC_object$protein.kdeg %>%
  tibble::rownames_to_column("Protein_ID") %>%
  dplyr::glimpse()
```
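Once the rates are in a regular data frame, they can be written to disk with base R. The sketch below uses a small stand-in table with hypothetical values and column names, not real `protein.kdeg` output, and writes to a temporary directory. Substitute your own table and path.

```r
# Stand-in for a protein-level rate table; values and column names are hypothetical
protein_table <- data.frame(
  Protein_ID = c("P1", "P2"),
  sample_A = c(0.033, 0.008)
)

# Write to a CSV file without row names, then read it back as a sanity check
out_file <- file.path(tempdir(), "protein_kdeg.csv")
write.csv(protein_table, out_file, row.names = FALSE)
reloaded <- read.csv(out_file)
```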

6. Visualize results

Plot precursor RIA model:

```{r}
KdeggeR::plotPeptideRIA(pSILAC_object, peptide = "_MLIPYIEHWPR_.3")
```

Plot precursor ln(H/L +1) model:

```{r}
KdeggeR::plotPeptideHoL(pSILAC_object, peptide = "_MLIPYIEHWPR_.3")
```
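The RIA and ln(H/L + 1) views are two representations of the same decay under one simplifying assumption. As a sketch, assume the light-channel RIA is defined as L / (L + H) and decays as a single exponential with rate k_loss. Both the definition and the model are assumptions here and are not taken from the package documentation.

$$
\mathrm{RIA}(t) = \frac{L}{L + H} = e^{-k_{\mathrm{loss}}\, t}
\quad\Longrightarrow\quad
\frac{H}{L} + 1 = \frac{L + H}{L} = e^{k_{\mathrm{loss}}\, t}
\quad\Longrightarrow\quad
\ln\!\left(\frac{H}{L} + 1\right) = k_{\mathrm{loss}}\, t
$$

Under this assumption, the RIA plot shows an exponential decay, while the ln(H/L + 1) plot shows a straight line through the origin with slope k_loss.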

Plot protein RIA:

```{r}
KdeggeR::plotProteinRIA(pSILAC_object, protein = "O75208")
```

Plot protein HoL:

```{r}
KdeggeR::plotProteinHol(pSILAC_object, protein = "O75208")
```

Plot protein summary:

```{r}
KdeggeR::plotProtein(pSILAC_object, protein = "O75208")
```

Owner

  • Login: yslproteomics
  • Kind: user

Citation (CITATION)

citeHeader("to cite this package in a publication, please use:")

citation <- bibentry(
  bibtype = "Article",
  author = person("Barbora", "Salovska"),
  title = "A Comprehensive and Robust Multiplex-DIA Workflow Profiles Protein Turnover Regulations Associated with Cisplatin Resistance",
  journal = "bioRxiv",
  year = "2024",
  doi = "10.1101/2024.10.28.620709"
)

citEntry(citation)

GitHub Events

Total
  • Issues event: 1
  • Watch event: 3
  • Public event: 2
  • Push event: 26
Last Year
  • Issues event: 1
  • Watch event: 3
  • Public event: 2
  • Push event: 26

Dependencies

DESCRIPTION cran
  • R >= 2.10 depends
  • dplyr * imports
  • knitr * imports
  • magrittr * imports
  • purrr * imports
  • MASS * suggests
  • limma * suggests
  • rmarkdown * suggests
  • rmdformats * suggests