https://github.com/cwieder/pathintegrate_scripts

Benchmarking and application scripts for PathIntegrate

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
○
codemeta.json file
○
.zenodo.json file
✓
DOI references
Found 2 DOI reference(s) in README
✓
Academic publication links
Links to: ncbi.nlm.nih.gov
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (8.6%) to scientific vocabulary

Last synced: 9 months ago · JSON representation

Repository

Benchmarking and application scripts for PathIntegrate

Basic Info

Host: GitHub
Owner: cwieder
License: gpl-3.0
Language: Jupyter Notebook
Default Branch: main
Size: 10.2 MB

Statistics

Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 2 years ago · Last pushed over 2 years ago

https://github.com/cwieder/PathIntegrate_scripts/blob/main/

# PathIntegrate_scripts

This repository contains benchmarking and application scripts for the PathIntegrate manuscript. Any analyses using the COVID-19 data by Su et al. 2020 ([DOI](https://doi.org/10.1016/j.cell.2020.10.037)) can be replicated using code and data within this repository. Any analyses requiring the COPDgene multi-omics data require access to the COPDgene consortium data* (see data availability section in the manuscript).

**This is not the repository for the PathIntegrate package. The PathIntegrate package can be found at [PathIntegrate](https://github.com/cwieder/PathIntegrate)**

>*The COPDgene multi-omics data can be found at the following sources: Clinical Data and SOMAScan data are available through COPDGene (https://www.ncbi.nlm.nih.gov/gap/, ID: phs000179.v6.p2). RNA-Seq data is available through dbGaP (https://www.ncbi.nlm.nih.gov/gap/, ID: phs000765.v3.p2). Metabolon data is available at Metabolomics Workbench (https://www.metabolomicsworkbench.org/ ID: PR000907).

## Prerequisites

- All scripts are run using Python 10
- Dependencies are listed in the `requirements.txt` file
- Benchmarking scripts are developed for running on a PBS HPC cluster
- Application scripts are run within Jupyter notebooks

## Installation

- Clone this respoitory to access the scripts and data

## Data and pathways

- COVID_data: contains the COVID-19 metabolomics and proteomics data from Su et al. 2020
- Pathway_databases: contains the KEGG and Reactome pathway databases for metabolites and proteins. Gene pathways should be downloaded from https://reactome.org/download-data or using the sspa package.

## Benchmarking scripts

- Sim0: scripts for univariate simulations comparing sensitivity of detection of pathway vs. molecular-level signals
- Sim1: scripts for comparing PathIntegrate predictive performance to molecular level models and DIABLO
- Sim2: script for benchmarking detection of target enriched pathway using
- Sim3: script for benchmarking predictive performance at the molecular vs pathway level
- Sim4: script for benchmarking predictive performance with varying sub-sample sizes
- MBPLS_Permutation_Testing.py - script for performing permutation testing on the MBPLS model

## Contact

Email cw2019@ic.ac.uk for troubleshooting or questions about the scripts.

Owner

Login: cwieder
Kind: user
Location: London, UK
Company: Imperial College London

Repositories: 15
Profile: https://github.com/cwieder

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

https://github.com/cwieder/pathintegrate_scripts

Science Score: 23.0%

Repository

Basic Info

Statistics

https://github.com/cwieder/PathIntegrate_scripts/blob/main/

Owner

GitHub Events

Total

Last Year