topictcga
Notebooks for "A topic model analysis of TCGA transcriptomic data of breast and lung cancer"
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: links to mdpi.com, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.4%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 2
Metadata Files
README.md
A topic model analysis of TCGA
Notebooks and libraries for "A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data"
Analyse results
To analyse the results and reproduce the plots in the paper without rerunning hSBM, use the notebook hSBM_postprocess.ipynb.
Following the structure of the paper, this repository is divided into three parts. See the Readme.md in each folder for a detailed description of its pipeline.
breast
Breast cancer analyses: stochastic block modelling and predictor.
lung
Lung cancer analyses: stochastic block modelling, survival analysis and predictor.
unified lung
Lung data from the unified dataset, as discussed in the paper.
tree plotter
A submodule for plotting hierarchies.
Run
You can run a Docker container with all dependencies installed:
bash
docker run -v $PWD:/home/jovyan/work -p 8888:8888 --rm -it --name topic_tcga docker.pkg.github.com/fvalle1/topictcga/topic:latest
Then point your browser to localhost:8888, the port published by the command above.
hSBM_Topicmodel
The run_graph.ipynb notebook can be used to run hierarchical Stochastic Block Modelling.
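hSBM treats the expression table as a bipartite network that links samples to genes, with edge weights given by the expression counts. The sketch below illustrates, under assumptions, how such an edge list could be built from a small table. The function name `build_bipartite_edges`, the nested-dictionary layout, the toy gene names and the `threshold` parameter are all hypothetical and are not taken from run_graph.ipynb.

```python
from typing import Dict, List, Tuple


def build_bipartite_edges(
    counts: Dict[str, Dict[str, int]],
    threshold: int = 0,
) -> List[Tuple[str, str, int]]:
    """Return (sample, gene, weight) edges for every count above `threshold`.

    `counts` maps a sample id to a {gene: count} dictionary
    (a hypothetical layout chosen for this illustration).
    """
    edges = []
    for sample, gene_counts in sorted(counts.items()):
        for gene, count in sorted(gene_counts.items()):
            if count > threshold:
                edges.append((sample, gene, count))
    return edges


# Toy data: two samples, three genes (values are made up for illustration).
toy_counts = {
    "sample_A": {"GENE1": 5, "GENE2": 0, "GENE3": 2},
    "sample_B": {"GENE1": 0, "GENE2": 7, "GENE3": 1},
}
edges = build_bipartite_edges(toy_counts)
for sample, gene, weight in edges:
    print(sample, gene, weight)
```

Zero counts produce no edge, so the resulting network stays sparse. The actual notebook may filter or normalise the data differently.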
Data
Data used in our analysis that are not available through git can be retrieved with Data Version Control (DVC):
bash
dvc pull -r mydrive name_of_the_file_to_download.dvc
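When several files are needed, the same command can be assembled in a loop. The following is a minimal sketch, assuming the remote is called `mydrive` as in the command above. The helper `dvc_pull_command` and the `targets` list are hypothetical, and the snippet only prints the commands instead of running them.

```python
import shlex
from typing import List


def dvc_pull_command(target: str, remote: str = "mydrive") -> List[str]:
    """Build the argument list for `dvc pull -r <remote> <target>`."""
    return ["dvc", "pull", "-r", remote, target]


# Placeholder target: replace with the .dvc files you actually need.
targets = ["name_of_the_file_to_download.dvc"]
commands = [dvc_pull_command(target) for target in targets]

for cmd in commands:
    # Print the shell-quoted command. To download for real, import
    # subprocess and call subprocess.run(cmd, check=True), which
    # requires dvc to be installed and access to the remote.
    print(shlex.join(cmd))
```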
License
Please see LICENSE
Owner
- Name: Filippo Valle
- Login: fvalle1
- Kind: user
- Location: Turin, Italy
- Company: @Elemento-Modular-Cloud
- Website: https://fvalle.online
- Repositories: 69
- Profile: https://github.com/fvalle1
Chief Technology Officer of @Elemento-Modular-Cloud | Complex Systems researcher @BioPhys-Turin
Dependencies
- cairocffi *
- gobject *
- gseapy *
- jupyter *
- matplotlib *
- numpy *
- pandas *
- scanpy *
- scipy *
- seaborn *
- sklearn *
- tensorflow *
- topicpy *
- watermark *