bloodbased-pancancer-diagnosis

Benchmarking study of feature extraction methods for cancer diagnosis using blood-based biomarkers. Feature extraction methods are compared in terms of both performance and robustness.

https://github.com/abhivij/bloodbased-pancancer-diagnosis

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: abhivij
Language: R
Default Branch: master
Homepage:
Size: 6.26 MB

Statistics

Stars: 2
Watchers: 0
Forks: 2
Open Issues: 0
Releases: 0

Created almost 6 years ago · Last pushed over 2 years ago

Metadata Files

Readme Citation

Blood-based transcriptomic signature panel identification for cancer diagnosis: Benchmarking of feature extraction methods

Citation

If you use this repository, please cite our publication in Briefings in Bioinformatics: Blood-based transcriptomic signature panel identification for cancer diagnosis: benchmarking of feature extraction methods

Problem

Compare feature extraction methods for binary classification of cancer types and subtypes using blood-based biomarkers.

Approach

Build a generic pipeline that runs any biomarker dataset through multiple feature extraction methods and classification models.
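
To make the idea of crossing feature extraction methods with classification models concrete, here is a minimal base-R sketch. It is not the FEMPipeline implementation: the data are simulated, and the helper names (fems, classifiers, t_test_top5, nearest_centroid) are hypothetical and chosen only for illustration.

```r
# Illustrative sketch only: simulated data and hypothetical helper names,
# not the FEMPipeline implementation.
set.seed(1)

# Simulated data: 60 samples x 20 features; the first 3 features carry signal.
n <- 60
p <- 20
labels <- rep(c(0, 1), each = n / 2)
x <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, paste0("f", 1:p)))
x[labels == 1, 1:3] <- x[labels == 1, 1:3] + 1.5

# Feature extraction methods: each returns the names of the features to keep,
# computed from training data only.
fems <- list(
  all_features = function(x, y) colnames(x),
  t_test_top5 = function(x, y) {
    pvals <- apply(x, 2, function(col) t.test(col[y == 1], col[y == 0])$p.value)
    names(sort(pvals))[1:5]
  }
)

# Classifiers: each is fitted on the training split and returns 0/1 predictions.
classifiers <- list(
  logistic = function(x_tr, y_tr, x_te) {
    fit <- suppressWarnings(
      glm(y ~ ., data = data.frame(y = y_tr, x_tr), family = binomial)
    )
    as.integer(predict(fit, newdata = data.frame(x_te), type = "response") > 0.5)
  },
  nearest_centroid = function(x_tr, y_tr, x_te) {
    c0 <- colMeans(x_tr[y_tr == 0, , drop = FALSE])
    c1 <- colMeans(x_tr[y_tr == 1, , drop = FALSE])
    d0 <- rowSums(sweep(x_te, 2, c0)^2)
    d1 <- rowSums(sweep(x_te, 2, c1)^2)
    as.integer(d1 < d0)
  }
)

# One train/test split with 20 training samples per class
train_idx <- c(sample(which(labels == 0), 20), sample(which(labels == 1), 20))
test_idx <- setdiff(seq_len(n), train_idx)

# Evaluate every (feature extraction method, classifier) combination
results <- expand.grid(fem = names(fems), classifier = names(classifiers),
                       stringsAsFactors = FALSE)
results$accuracy <- mapply(function(fem, clf) {
  keep <- fems[[fem]](x[train_idx, ], labels[train_idx])
  pred <- classifiers[[clf]](x[train_idx, keep, drop = FALSE], labels[train_idx],
                             x[test_idx, keep, drop = FALSE])
  mean(pred == labels[test_idx])
}, results$fem, results$classifier)
print(results)
```

Keeping each method behind a common function signature is what lets new feature extraction methods or classifiers be added to the grid without touching the evaluation loop.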

Type of data used

microRNAs from extracellular vesicles
Total RNA from tumour-educated platelets
microRNAs from blood
microRNAs from serum

Pipeline

[Figure: pipeline diagram]

The Feature Extraction Method comparison pipeline code is made available as an R package, inside the directory FEMPipeline.

To use this in your project:

    devtools::install_github("abhivij/bloodbased-pancancer-diagnosis/FEMPipeline")

And within R:

    library(FEMPipeline)

The function to call the pipeline is execute_pipeline.

To obtain information regarding the arguments, within R, use ?execute_pipeline

Main inputs to the pipeline are:
* Read count file in (transcripts x samples) format. Other omics datasets can also be used.
* Phenotype file: a tab-separated file with a column named 'Sample' listing each of the samples in the read count file, together with their corresponding metadata, including a classification criteria column.
* Classification criteria column name.
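
The input formats described above can be mocked up in a few lines of R. The sketch below only builds and checks toy inputs; it does not call execute_pipeline, whose argument names should be taken from ?execute_pipeline. The sample IDs, transcript names, file name and the "Condition" column are hypothetical.

```r
# Illustrative sketch: toy inputs in the formats described above.
# Sample IDs, transcript names, the file name and the "Condition" column
# are hypothetical.

# Read count table in (transcripts x samples) format
read_counts <- data.frame(
  S1 = c(10, 0, 35),
  S2 = c(12, 3, 30),
  S3 = c(50, 1, 2),
  S4 = c(47, 0, 5),
  row.names = c("miR-21", "miR-155", "miR-16")
)

# Phenotype table: one row per sample, a 'Sample' column, plus metadata
phenotype <- data.frame(
  Sample = c("S1", "S2", "S3", "S4"),
  Condition = c("Cancer", "Cancer", "Control", "Control"),
  Age = c(61, 58, 49, 66)
)

# Write and re-read the phenotype file as tab-separated text
phenotype_path <- file.path(tempdir(), "phenotype.txt")
write.table(phenotype, phenotype_path, sep = "\t", quote = FALSE, row.names = FALSE)
phenotype_in <- read.table(phenotype_path, sep = "\t", header = TRUE,
                           stringsAsFactors = FALSE)

# Sanity checks before handing the inputs to a pipeline
classification_criteria <- "Condition"
stopifnot(
  "Sample" %in% colnames(phenotype_in),
  classification_criteria %in% colnames(phenotype_in),
  setequal(colnames(read_counts), phenotype_in$Sample)
)
```

Checking that the sample names in the two files agree before running anything tends to catch the most common input mismatch early.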

Code & Directory Structure

The R script files outside the FEMPipeline directory call the FEMPipeline package for the datasets relevant to this study.

pipeline_executor.R : starting point to call pipeline
dataset_pipeline_arguments.R : list of datasets and their metadata, used by pipeline_executor.R as arguments to call pipeline
katana_scripts/ : scripts to call pipeline_executor.R in Katana computational cluster
data/ : contains source data, extracted data and preprocessed data
phenotype_info/ : contains currently used phenotype files and the script used in some steps of phenotype file creation
data_extraction/ : data extraction step in the pipeline
results_processing/ : scripts to generate plots from results, statistically analyze results, compute pairwise Jaccard Index, combine results, and analyze results specific to the Ranger feature selection method
install.R : list of packages to be installed to run this pipeline
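
As a rough illustration of the pairwise Jaccard Index mentioned under results_processing/, the sketch below compares sets of selected features. It assumes the index is taken between feature sets chosen in different iterations, which the text above does not spell out; the function name jaccard_index and the feature sets are hypothetical and are not taken from the repository scripts.

```r
# Jaccard index of two feature sets: size of intersection / size of union
jaccard_index <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  if (length(a) == 0 && length(b) == 0) return(NA_real_)
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical feature sets selected in three resampling iterations
selected <- list(
  iter1 = c("miR-21", "miR-155", "miR-16"),
  iter2 = c("miR-21", "miR-16", "miR-122"),
  iter3 = c("miR-21", "miR-155", "miR-16")
)

# All unordered pairs of iterations and their Jaccard index
pairs <- combn(names(selected), 2)
pairwise <- data.frame(
  set1 = pairs[1, ],
  set2 = pairs[2, ],
  jaccard = apply(pairs, 2, function(p) {
    jaccard_index(selected[[p[1]]], selected[[p[2]]])
  })
)
print(pairwise)

# Average pairwise similarity as a single robustness summary
mean(pairwise$jaccard)
```

A value near 1 means a method keeps selecting the same features across iterations, while a value near 0 means the selected features change substantially.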

Owner

Name: Abhishek Vijayan
Login: abhivij
Kind: user
Location: Sydney
Company: UNSW

Repositories: 9
Profile: https://github.com/abhivij

Research Associate, BABS, UNSW

Citation (CITATION.cff)

authors:
  - family-names: Vijayan
    given-names: Abhishek
cff-version: 1.0.0
message: "If you use this software, please cite both the article from preferred-citation and the software itself."
title: "Blood-based transcriptomic signature panel identification for cancer diagnosis: Benchmarking of feature extraction methods"
version: 1.0.0
doi: 10.5281/zenodo.6300985
preferred-citation:
  authors:
    - family-names: Vijayan
      given-names: Abhishek
      orcid: https://orcid.org/0000-0001-9877-9080
    - family-names: Fatima
      given-names: Shadma
      orcid: https://orcid.org/0000-0002-3583-1301
    - family-names: Sowmya
      given-names: Arcot
      orcid: https://orcid.org/0000-0001-9236-5063
    - family-names: Vafaee
      given-names: Fatemeh
      orcid: https://orcid.org/0000-0002-7521-2417
  title: "Blood-based transcriptomic signature panel identification for cancer diagnosis: benchmarking of feature extraction methods"
  type: article
  doi: 10.1093/bib/bbac315
  url: https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbac315/6658855
