gsea_refiner

A Python tool for interpreting pathway enrichment results using NLP techniques.

https://github.com/fesedebe/gsea_refiner

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary
Last synced: 9 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: fesedebe
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 3.09 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme Citation

README.md

gsea_refiner

A Python tool for transformer-based categorization and interpretation of pathway enrichment results.

Overview

Pathway enrichment analysis often returns thousands of significant gene sets with overlapping themes, making interpretation difficult. GSEA-refiner uses a fine-tuned transformer model (e.g., BioBERT) to assign biologically meaningful categories to enriched pathways. The tool also identifies top-ranked genes within each category and generates visualizations to summarize enrichment trends across transcriptomic datasets.
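The top-gene step mentioned above can be illustrated with a minimal sketch. It assumes pathways have already been assigned categories and that each pathway comes with a gene list; the function name `rank_genes_by_category` and both input dictionaries are hypothetical and are not taken from the gsea_refiner codebase. This version ranks by frequency only.

```python
from collections import Counter


def rank_genes_by_category(pathway_categories, pathway_genes, top_n=3):
    """Rank genes per category by the number of pathways that contain them.

    pathway_categories: dict mapping pathway name -> category (hypothetical input)
    pathway_genes: dict mapping pathway name -> list of gene symbols (hypothetical input)
    """
    counts = {}
    for pathway, category in pathway_categories.items():
        counter = counts.setdefault(category, Counter())
        # set() so a gene listed twice in one pathway is counted once
        counter.update(set(pathway_genes.get(pathway, [])))

    ranked = {}
    for category, counter in counts.items():
        # Sort by descending count, then alphabetically so ties are stable
        ordered = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
        ranked[category] = ordered[:top_n]
    return ranked


# Toy input for illustration only
example_categories = {"PATHWAY_A": "Cell cycle", "PATHWAY_B": "Cell cycle", "PATHWAY_C": "Immune"}
example_genes = {
    "PATHWAY_A": ["CDK1", "CCNB1"],
    "PATHWAY_B": ["CDK1", "MKI67"],
    "PATHWAY_C": ["STAT1"],
}
example_ranking = rank_genes_by_category(example_categories, example_genes)
```

Here `example_ranking["Cell cycle"]` starts with `("CDK1", 2)` because CDK1 appears in both cell-cycle pathways. A significance-aware variant would need per-gene or per-pathway statistics, which this sketch does not model.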

Features

  • Categorizes enriched pathways using a fine-tuned transformer model
  • Predicts biological categories based on the semantic content
  • Derives training data using rule-based keyword matching and enrichment scoring
  • Ranks top genes within categories based on frequency and significance
  • Provides modular components for preprocessing, classification, visualization, etc.
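As an illustration of the rule-based keyword matching used to derive training data, the sketch below assigns weak labels to pathway names. The category names, the keyword lists, and the functions `normalize`, `label_pathway`, and `build_training_rows` are all hypothetical; the actual rules in the repository may differ, and the enrichment-scoring part is not covered here.

```python
import re

# Hypothetical category -> keyword map, for illustration only
CATEGORY_KEYWORDS = {
    "Cell cycle": ["cell cycle", "mitotic", "g2m", "e2f"],
    "Immune": ["immune", "interferon", "inflammatory", "cytokine"],
    "Metabolism": ["metabolism", "glycolysis", "oxidative phosphorylation"],
}


def normalize(pathway_name):
    """Lowercase a gene set name and turn underscores into spaces."""
    return re.sub(r"[_\s]+", " ", pathway_name).strip().lower()


def label_pathway(pathway_name, keywords=CATEGORY_KEYWORDS):
    """Return the first category with a keyword in the name, else None."""
    text = normalize(pathway_name)
    for category, terms in keywords.items():
        # Word boundaries avoid matching a keyword inside a longer token
        if any(re.search(rf"\b{re.escape(term)}\b", text) for term in terms):
            return category
    return None


def build_training_rows(pathway_names):
    """Keep only pathways that matched a rule; these serve as weak labels."""
    rows = []
    for name in pathway_names:
        category = label_pathway(name)
        if category is not None:
            rows.append({"text": normalize(name), "label": category})
    return rows


example_rows = build_training_rows(
    ["HALLMARK_G2M_CHECKPOINT", "HALLMARK_INTERFERON_GAMMA_RESPONSE", "KEGG_RIBOSOME"]
)
```

Pathways that match no rule (here `KEGG_RIBOSOME`) are left unlabeled, which is the set a fine-tuned classifier would then be asked to categorize.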

Workflow

[Figure: GSEA-refiner workflow diagram]

Documentation

See gsea-refiner-doc.md for full technical details.

Installation

Clone the repository and install:

For development (editable mode)

    git clone https://github.com/fesedebe/gsea_refiner.git
    cd gsea_refiner
    pip install -e .

Dependencies

See requirements.txt for the full list.

  • transformers
  • datasets
  • scikit-learn
  • pandas
  • torch

Owner

  • Login: fesedebe
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: gsea_refiner
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Favour
    family-names: Esedebe
    orcid: 'https://orcid.org/0000-0001-6167-0647'

GitHub Events

Total
  • Push event: 20
  • Create event: 1
Last Year
  • Push event: 20
  • Create event: 1

Dependencies

requirements.txt pypi
  • contourpy ==1.3.1
  • cycler ==0.12.1
  • fonttools ==4.55.3
  • joblib ==1.4.2
  • kiwisolver ==1.4.7
  • matplotlib ==3.10.0
  • numpy ==2.2.1
  • packaging ==24.2
  • pandas ==2.2.3
  • pillow ==11.0.0
  • pyparsing ==3.2.0
  • python-dateutil ==2.9.0.post0
  • pytz ==2024.2
  • scikit-learn ==1.6.0
  • scipy ==1.14.1
  • six ==1.17.0
  • threadpoolctl ==3.5.0
  • tzdata ==2024.2