https://github.com/barbarahelena/olink-analyses-charite

Olink data analyses

https://github.com/barbarahelena/olink-analyses-charite

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

Olink data analyses

Basic Info
  • Host: GitHub
  • Owner: barbarahelena
  • Language: R
  • Default Branch: main
  • Size: 1.41 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme

README.md

Analysis Olink data

Set-up mouse experiment

There were two conditions at 1 timepoint: - Control + CI = Carotid Injury + PBS, this is the control, N = 5 - Pneumonia + CI = Carotid Injury + streptococcus pneumonia, this is the infection group, N = 5

Olink panel

The Olink mouse exploratory panel (ww.olink.com/mouse-exploratory) was performed using EDTA plasma from the mice (40 ul samples). This panel includes 92 proteins of which the list can be found in the data folder. All samples passed quality control (QC).

Explanation on data processing

In the data processing script (230421_olink.R), first we open the sample data frame and protein expression data frame. From the protein expression table, we extract the proteins and their metadata (such as olink ID and Uniprot ID).

For determining significant group differences, I used 2-sided t-tests for the 92 proteins. I applied FDR (Benjamini-Hochberg) multiple testing correction on the p-values to generate a q-value (= adjusted p-value). Thereafter, we calculate the group means, sd and n per protein. The table with the results from the t-tests and the means per group was saved as proteins_ttest_diff.csv in the results folder.

The expression data, so the protein expression values for all samples and the 92 proteins, were saved in proteins_expression_data.csv. For the heatmaps and boxplots I used scaled expression data, so the expression that was measured scaled to a mean of 0 and sd of 1, to make the proteins easier to visualize in for instance a heatmap. For the PCA plot i used unscaled data, as scaling is already performed in the PCA function itself.

PCA

For the principal component analysis (PCA) in 230425_pca.R, I used the prcomp function with scaling and centering of the data. The first two principal components are plotted. I made a version with labels to make it easier to assess which samples are more alike.

Heatmaps

Heatmaps were drawn using the ComplexHeatmap R package (v2.12.1), see script 230425_heatmaps.R. I made different versions: different selections of proteins (all proteins, or sig based on p-value or q-value (= FDR-adjusted)); showing a scale for significance and without.

Boxplots

Boxplots were drawn with ggplot2 while comparing the two groups with stat_compare_means from the ggpubr package (v0.6.0) using t-tests. See script 230425_boxplots.R.

Combination plots

The three plots above were combined in one plot in the 230425_combineplots.R script.

R packages

To improve reproducibility, we created a lockfile in August 2024 and reran all the analyses to ensure that everything still works. The renv lockfile, activate.R and Rprofile can be found in this repository. Please refer to the renv documentation for instructions on how to use these to recreate the R environment with all packages needed.

Owner

  • Name: Barbara Verhaar
  • Login: barbarahelena
  • Kind: user
  • Location: Amsterdam

PhD candidate @ Amsterdam UMC Vascular medicine

GitHub Events

Total
Last Year