bloodbased-pancancer-diagnosis

Benchmarking study of feature extraction methods for cancer diagnosis using blood-based biomarkers. Feature extraction methods are compared in terms of both their performance and their robustness.

https://github.com/abhivij/bloodbased-pancancer-diagnosis

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.0%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: abhivij
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 6.26 MB
Statistics
  • Stars: 2
  • Watchers: 0
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Created over 5 years ago · Last pushed about 2 years ago
Metadata Files
Readme Citation

README.md

Blood-based transcriptomic signature panel identification for cancer diagnosis: Benchmarking of feature extraction methods

Citation

If you use this repository, please cite our publication in Briefings in Bioinformatics: "Blood-based transcriptomic signature panel identification for cancer diagnosis: benchmarking of feature extraction methods"

Problem

Compare feature extraction methods for binary classification of cancer types and subtypes using blood-based biomarkers.

Approach

Build a generic pipeline that can run any biomarker dataset through multiple feature extraction methods and classification models.
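The general idea of crossing every feature extraction method with every classification model can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code in FEMPipeline: the simulated data, the two stand-in feature selection steps (keep all features; a t-test filter), the nearest-centroid classifier and all function and variable names are hypothetical.

```r
# Minimal sketch: evaluate every (feature selection, classifier) pair.
# All data, names and methods here are hypothetical stand-ins.
set.seed(42)
n <- 60
x <- matrix(rnorm(n * 10), nrow = n, dimnames = list(NULL, paste0("f", 1:10)))
y <- rep(c(0, 1), each = n / 2)
x[y == 1, 1:2] <- x[y == 1, 1:2] + 2  # make f1 and f2 informative

# Simple train/test split.
train <- sample(n, 40)
test <- setdiff(seq_len(n), train)

# Feature selection steps: each returns the names of the selected features.
selectors <- list(
  all = function(x, y) colnames(x),
  t_test = function(x, y, k = 3) {
    p <- apply(x, 2, function(col) t.test(col[y == 1], col[y == 0])$p.value)
    names(sort(p))[seq_len(k)]
  }
)

# Classifiers: each fits on training data and returns 0/1 labels for test data.
classifiers <- list(
  nearest_centroid = function(x_train, y_train, x_test) {
    c0 <- colMeans(x_train[y_train == 0, , drop = FALSE])
    c1 <- colMeans(x_train[y_train == 1, , drop = FALSE])
    d0 <- rowSums(sweep(x_test, 2, c0)^2)  # squared distance to class 0 centroid
    d1 <- rowSums(sweep(x_test, 2, c1)^2)  # squared distance to class 1 centroid
    as.integer(d1 < d0)
  }
)

# Run every combination; features are selected on the training split only.
results <- expand.grid(selector = names(selectors),
                       classifier = names(classifiers),
                       stringsAsFactors = FALSE)
results$accuracy <- NA_real_
for (r in seq_len(nrow(results))) {
  features <- selectors[[results$selector[r]]](x[train, , drop = FALSE], y[train])
  pred <- classifiers[[results$classifier[r]]](
    x[train, features, drop = FALSE], y[train],
    x[test, features, drop = FALSE]
  )
  results$accuracy[r] <- mean(pred == y[test])
}
print(results)
```

Keeping the methods in named lists means a further method or model can be added without touching the evaluation loop.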

Type of data used

  • microRNAs from extracellular vesicles
  • Total RNA from tumour-educated platelets
  • microRNAs from blood
  • microRNAs from serum

Pipeline

[Figure: pipeline]

The Feature Extraction Method comparison pipeline code is made available as an R package, inside the directory FEMPipeline.

To use this in your project, install the package from GitHub:

    devtools::install_github("abhivij/bloodbased-pancancer-diagnosis/FEMPipeline")

Then, within R, load it:

    library(FEMPipeline)

The function to call the pipeline is execute_pipeline.

To obtain information regarding the arguments, within R, use ?execute_pipeline

Main inputs to the pipeline are:

  • Read count file in (transcripts x samples) format. Other omics datasets can also be used.
  • Phenotype file: a tab-separated file with a column named 'Sample' listing each of the samples in the read count file, together with their corresponding metadata, which includes a classification criteria column
  • Classification criteria column name
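As a rough illustration of the input layout described above, the following sketch builds a toy read count table and a matching phenotype file using base R only. The file names, sample IDs, transcript IDs and the `Condition` column are hypothetical, and writing the read counts as a tab-separated file is an assumption. The actual argument names of execute_pipeline are not shown here and should be taken from ?execute_pipeline.

```r
# Toy inputs in the layout described above (all names are hypothetical).
set.seed(1)
samples <- paste0("S", 1:6)
transcripts <- paste0("miR-", 1:4)

# Read counts: rows are transcripts, columns are samples.
read_counts <- matrix(
  rpois(length(transcripts) * length(samples), lambda = 50),
  nrow = length(transcripts),
  dimnames = list(transcripts, samples)
)

# Phenotype table: a 'Sample' column plus a classification criteria column.
phenotype <- data.frame(
  Sample = samples,
  Condition = rep(c("Cancer", "Control"), each = 3),
  stringsAsFactors = FALSE
)

# Write both tables as tab-separated files into a temporary directory.
out_dir <- tempdir()
read_count_path <- file.path(out_dir, "read_counts.tsv")
phenotype_path <- file.path(out_dir, "phenotype.tsv")
write.table(read_counts, read_count_path, sep = "\t", quote = FALSE,
            col.names = NA)
write.table(phenotype, phenotype_path, sep = "\t", quote = FALSE,
            row.names = FALSE)

# Sanity checks: every sample in the read counts has a phenotype entry,
# and the classification column contains exactly two classes.
pheno_in <- read.delim(phenotype_path, stringsAsFactors = FALSE)
counts_in <- read.delim(read_count_path, row.names = 1, check.names = FALSE)
stopifnot(all(colnames(counts_in) %in% pheno_in$Sample))
stopifnot(length(unique(pheno_in$Condition)) == 2)
```

With files laid out like this, the classification criteria column name passed to the pipeline would be "Condition" in this toy example.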

Code & Directory Structure

The R script files outside the FEMPipeline directory call the FEMPipeline package for the datasets relevant to this study:

  • pipeline_executor.R : starting point to call pipeline
  • dataset_pipeline_arguments.R : list of datasets and their metadata, used by pipeline_executor.R as arguments to call pipeline
  • katana_scripts/ : scripts to call pipeline_executor.R in Katana computational cluster
  • data/ : contains source data, extracted data and preprocessed data
  • phenotype_info/ : contains currently used phenotype files and the script used in some steps of phenotype file creation
  • data_extraction/ : data extraction step in the pipeline
  • results_processing/ : scripts to generate plots from results, statistically analyze results, compute pairwise Jaccard Index, combine results, and analyze results specific to the Ranger feature selection method
  • install.R : list of packages to be installed to run this pipeline
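The pairwise Jaccard Index mentioned for results_processing/ measures the overlap between two sets of selected features as the size of their intersection divided by the size of their union. A minimal base R sketch is shown below. It is not the repository's implementation, and the function names and example feature sets are hypothetical.

```r
# Jaccard index of two feature sets: |A intersect B| / |A union B|.
jaccard_index <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  union_size <- length(union(a, b))
  if (union_size == 0) return(NA_real_)  # undefined for two empty sets
  length(intersect(a, b)) / union_size
}

# Pairwise Jaccard matrix over a named list of feature sets.
pairwise_jaccard <- function(feature_sets) {
  n <- length(feature_sets)
  result <- matrix(NA_real_, n, n,
                   dimnames = list(names(feature_sets), names(feature_sets)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      result[i, j] <- jaccard_index(feature_sets[[i]], feature_sets[[j]])
    }
  }
  result
}

# Hypothetical feature sets selected by three methods.
selected <- list(
  method_a = c("f1", "f2", "f3"),
  method_b = c("f2", "f3", "f4"),
  method_c = c("f5")
)
jaccard_matrix <- pairwise_jaccard(selected)
print(jaccard_matrix)
```

In this toy example, method_a and method_b share two of four distinct features, so their index is 0.5, while method_c shares nothing with either and scores 0.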

Owner

  • Name: Abhishek Vijayan
  • Login: abhivij
  • Kind: user
  • Location: Sydney
  • Company: UNSW

Research Associate, BABS, UNSW

Citation (CITATION.cff)

authors:
  - family-names: Vijayan
    given-names: Abhishek
cff-version: 1.0.0
message: "If you use this software, please cite both the article from preferred-citation and the software itself."
title: "Blood-based transcriptomic signature panel identification for cancer diagnosis: Benchmarking of feature extraction methods"
version: 1.0.0
doi: 10.5281/zenodo.6300985
preferred-citation:
  authors:
    - family-names: Vijayan
      given-names: Abhishek
      orcid: https://orcid.org/0000-0001-9877-9080
    - family-names: Fatima
      given-names: Shadma
      orcid: https://orcid.org/0000-0002-3583-1301
    - family-names: Sowmya
      given-names: Arcot
      orcid: https://orcid.org/0000-0001-9236-5063
    - family-names: Vafaee
      given-names: Fatemeh
      orcid: https://orcid.org/0000-0002-7521-2417
  title: "Blood-based transcriptomic signature panel identification for cancer diagnosis: benchmarking of feature extraction methods"
  type: article
  doi: 10.1093/bib/bbac315
  url: https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbac315/6658855
