https://github.com/akikuno/tsumugi-dev
Web tool for visualizing phenotype-similarity gene networks
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.5%) to scientific vocabulary
Repository
Web tool for visualizing phenotype-similarity gene networks
Basic Info
- Host: GitHub
- Owner: akikuno
- License: mit
- Language: JavaScript
- Default Branch: main
- Homepage: https://larc-tsukuba.github.io/tsumugi/
- Size: 7.08 MB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 21
- Releases: 13
Metadata Files
README.md
TSUMUGI (Trait-driven Surveillance for Mutation-based Gene module Identification) is a web tool that leverages knockout (KO) mouse phenotype data from the International Mouse Phenotyping Consortium (IMPC) to extract and visualize gene modules based on phenotypic similarity.
The tool is publicly available online for anyone to use 👇️
🔗 https://larc-tsukuba.github.io/tsumugi/
TSUMUGI derives from the Japanese concept of "weaving together gene groups that form phenotypes."
📖 How to Use TSUMUGI
💬 Top Page
TSUMUGI supports three types of input:
1. Phenotype
When you input a phenotype of interest, TSUMUGI searches for gene groups with similar overall phenotype profiles among genes whose KO mice exhibit that phenotype.
Phenotype names are based on Mammalian Phenotype Ontology (MPO).
List of currently searchable phenotypes in TSUMUGI:
👉 Phenotype List
2. Gene
When you specify a single gene, TSUMUGI searches for other gene groups whose KO mice have similar phenotype profiles to that gene's KO mice.
Gene names follow gene symbols registered in MGI.
List of currently searchable gene names in TSUMUGI:
👉 Gene List
3. Gene List
Accepts input of multiple genes.
Gene lists should be entered separated by line breaks.
[!NOTE] Gene List differs from single Gene input in that it extracts phenotypically similar genes among the genes within the list.
[!CAUTION] If no phenotypically similar genes are found, the alert "No similar phenotypes were found among the entered genes." is displayed and processing stops.
If more than 200 phenotypically similar genes are found, the alert "Too many genes submitted. Please limit the number to 200 or fewer." is displayed and processing stops to prevent browser overload.
📥 Raw Data Download (TSUMUGI_{version}_raw_data)
You can download raw data of phenotypic similarity between gene pairs (in Gzip-compressed CSV format or Parquet format).
Contents include:
- Paired gene names (Gene1, Gene2)
- Phenotypic similarity between pairs (Jaccard Similarity)
- Number of shared phenotypes between pairs (Number of shared phenotype)
- List of shared phenotypes between pairs (List of shared phenotype)
[!CAUTION] File size is approximately 50-100MB. Download may take some time.
We recommend using Parquet format when working with Polars or Pandas.
You can load the data as follows:
Polars
```bash
# Install Polars and PyArrow using conda
conda create -y -n env-tsumugi polars pyarrow
conda activate env-tsumugi
```
```python
# Load the Parquet file using Polars
# (replace {version} with the version of the file you downloaded)
import polars as pl

df_tsumugi = pl.read_parquet("TSUMUGI_{version}_raw_data.parquet")
```
Pandas
```bash
# Install Pandas and PyArrow using conda
conda create -y -n env-tsumugi pandas pyarrow
conda activate env-tsumugi
```
```python
# Load the Parquet file using Pandas
# (replace {version} with the version of the file you downloaded)
import pandas as pd

df_tsumugi = pd.read_parquet("TSUMUGI_{version}_raw_data.parquet")
```
🌐 Network Visualization
After you submit an input, the page transitions to the network view and the network is drawn automatically.
[!IMPORTANT] Gene pairs with 2 or more shared abnormal phenotypes AND phenotypic similarity of 0.2 or higher are subject to visualization.
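The visualization criteria above can be sketched as a simple filter over gene-pair records. This is a minimal illustration, not TSUMUGI's actual code: the rows are made-up placeholder data, and the dictionary keys are assumed to match the column names listed in the raw data description, which may differ in the real file.

```python
# Hypothetical gene-pair records; keys are assumed to mirror the raw-data columns.
pairs = [
    {"Gene1": "GeneA", "Gene2": "GeneB", "Jaccard Similarity": 0.50, "Number of shared phenotype": 2},
    {"Gene1": "GeneA", "Gene2": "GeneC", "Jaccard Similarity": 0.10, "Number of shared phenotype": 3},
    {"Gene1": "GeneB", "Gene2": "GeneD", "Jaccard Similarity": 0.25, "Number of shared phenotype": 1},
]

MIN_SHARED = 2        # 2 or more shared abnormal phenotypes
MIN_SIMILARITY = 0.2  # phenotypic similarity of 0.2 or higher


def is_visualized(row):
    """Return True when a gene pair satisfies BOTH criteria."""
    return (
        row["Number of shared phenotype"] >= MIN_SHARED
        and row["Jaccard Similarity"] >= MIN_SIMILARITY
    )


visible_pairs = [(r["Gene1"], r["Gene2"]) for r in pairs if is_visualized(r)]
print(visible_pairs)  # [('GeneA', 'GeneB')]
```

Only the first pair passes: the second fails the similarity threshold and the third has too few shared phenotypes.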
Network Panel
Nodes (Points)
Each node represents one gene.
Clicking displays a list of abnormal phenotypes observed in that gene's KO mice.
You can freely adjust positions by dragging.
Edges (Lines)
Clicking an edge shows details of shared phenotypes.
Control Panel
The left control panel allows you to adjust network display.
Filter by Phenotypic Similarity
The Phenotypes similarity slider allows you to set thresholds for gene pairs displayed in the network based on edge phenotypic similarity (Jaccard coefficient).
Similarity minimum and maximum values are converted to a 1-10 scale, allowing 10-level filtering.
[!NOTE] For details on phenotypic similarity, please see:
👉 🔍 Calculation Method for Phenotypically Similar Gene Groups
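One way to picture the 1-10 scale is as a linear mapping between slider levels and the observed similarity range. The sketch below assumes a linear mapping, which the text does not confirm; `level_to_threshold` and `filter_edges` are hypothetical helper names, and the edge values are placeholders.

```python
def level_to_threshold(level, sim_min, sim_max, levels=10):
    """Map a slider level (1..levels) linearly onto [sim_min, sim_max]."""
    if not 1 <= level <= levels:
        raise ValueError(f"level must be between 1 and {levels}")
    if level == levels:
        return sim_max  # avoid floating-point drift at the upper end
    fraction = (level - 1) / (levels - 1)
    return sim_min + (sim_max - sim_min) * fraction


def filter_edges(edges, low_level, high_level):
    """Keep edges whose similarity lies between the two slider levels."""
    similarities = [s for _, _, s in edges]
    sim_min, sim_max = min(similarities), max(similarities)
    low = level_to_threshold(low_level, sim_min, sim_max)
    high = level_to_threshold(high_level, sim_min, sim_max)
    return [e for e in edges if low <= e[2] <= high]


# Hypothetical edges: (gene, gene, Jaccard similarity)
edges = [("GeneA", "GeneB", 0.25), ("GeneB", "GeneC", 0.5), ("GeneC", "GeneD", 1.0)]

print(filter_edges(edges, 1, 10))  # full range keeps every edge
print(filter_edges(edges, 5, 10))  # raising the lower handle drops weaker edges
```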
Filter by Phenotype Severity
The Phenotype severity slider allows you to adjust node display based on phenotype severity (effect size) in KO mice.
Higher effect sizes indicate stronger phenotypic impact.
This also scales the effect size range to 1-10, allowing 10-level filtering.
[!NOTE] The Phenotypes severity slider is not available when the IMPC phenotype evaluation is binary (present/absent), e.g., abnormal embryo development (list of binary phenotypes available here), or when a gene name is given as input.
Specify Genotype
You can specify the genotype of KO mice exhibiting phenotypes:
- Homo: Phenotypes seen in homozygous mice
- Hetero: Phenotypes seen in heterozygous mice
- Hemi: Phenotypes seen in hemizygous mice
Specify Sex
You can extract sex-specific phenotypes:
- Female: Female-specific phenotypes
- Male: Male-specific phenotypes
Specify Life Stage
You can specify life stages when phenotypes appear:
- Embryo: Phenotypes appearing during the embryonic stage
- Early: Phenotypes appearing at 0-16 weeks of age
- Interval: Phenotypes appearing at 17-48 weeks of age
- Late: Phenotypes appearing at 49+ weeks of age
Markup Panel
Highlight Human Disease-Related Genes (Highlight: Human Disease)
You can highlight genes related to human diseases.
The relationship between KO mice and human diseases uses public data from IMPC Disease Models Portal.
Search Gene Names (Search: Specific Gene)
You can search for gene names included in the network.
Adjust Network Display Style (Layout & Display)
You can adjust the following elements:
- Network layout (layout)
- Font size (Font size)
- Edge thickness (Edge width)
- Distance between nodes (*Cose layout only) (Node repulsion)
Export
You can export current network images and data in PNG, CSV and GraphML formats.
CSV includes connected component (module) IDs and lists of phenotypes shown by each gene's KO mice.
GraphML is a format compatible with the desktop version of Cytoscape, allowing you to import the network into Cytoscape for further analysis.
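The connected component (module) IDs in the CSV export correspond to groups of genes that are linked to each other through edges. As a rough sketch of that idea, the following assigns an ID to each connected component with a depth-first traversal. The function name `assign_module_ids`, the numbering scheme, and the gene names are assumptions for illustration; TSUMUGI's own export may number modules differently.

```python
from collections import defaultdict


def assign_module_ids(edges):
    """Label every gene with the ID of the connected component it belongs to."""
    adjacency = defaultdict(set)
    for gene_a, gene_b in edges:
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)

    module_of = {}
    next_id = 1
    for start in sorted(adjacency):
        if start in module_of:
            continue  # already reached from an earlier gene
        module_of[start] = next_id
        stack = [start]
        while stack:
            gene = stack.pop()
            for neighbor in adjacency[gene]:
                if neighbor not in module_of:
                    module_of[neighbor] = next_id
                    stack.append(neighbor)
        next_id += 1
    return module_of


# Hypothetical network with two separate modules
edges = [("GeneA", "GeneB"), ("GeneB", "GeneC"), ("GeneD", "GeneE")]
modules = assign_module_ids(edges)
for gene in sorted(modules):
    print(gene, modules[gene])
```

Here GeneA, GeneB, and GeneC share one module ID, while GeneD and GeneE form a second module.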
🔍 Calculation Method for Phenotypically Similar Gene Groups
Data Source
The IMPC dataset used is statistical-results-ALL.csv.gz from Release-23.0.
Information about columns included in the dataset: Data fields
Preprocessing
Extract gene-phenotype pairs where KO mouse phenotype P-values (p_value, female_ko_effect_p_value, or male_ko_effect_p_value) are 0.0001 or below.
- Genotype-specific phenotypes are annotated with homo, hetero, or hemi
- Sex-specific phenotypes are annotated with female or male
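The P-value filter can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual pipeline: the three P-value column names come from the text above, while `marker_symbol` and `mp_term_name` are assumed names for the gene and phenotype columns, and the rows are made-up sample data. Empty P-value fields are skipped.

```python
import csv
import io

P_VALUE_COLUMNS = ("p_value", "female_ko_effect_p_value", "male_ko_effect_p_value")
THRESHOLD = 1e-4  # 0.0001 or below


def is_significant(row):
    """Return True if any available P-value is at or below the threshold."""
    for column in P_VALUE_COLUMNS:
        raw = row.get(column)
        if raw is None or raw == "":
            continue  # value not reported for this row
        if float(raw) <= THRESHOLD:
            return True
    return False


# Hypothetical excerpt standing in for statistical-results-ALL.csv.gz
sample_csv = """marker_symbol,mp_term_name,p_value,female_ko_effect_p_value,male_ko_effect_p_value
GeneA,abnormal heart morphology,0.00001,,
GeneB,abnormal kidney morphology,0.5,0.00005,0.3
GeneC,abnormal lung morphology,0.2,,0.4
"""

rows = csv.DictReader(io.StringIO(sample_csv))
significant_pairs = [
    (row["marker_symbol"], row["mp_term_name"]) for row in rows if is_significant(row)
]
print(significant_pairs)
```

GeneA passes on the overall P-value and GeneB passes on the female-specific P-value; GeneC has no P-value at or below the threshold and is dropped.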
Phenotypic Similarity Calculation
Jaccard coefficient is used as the phenotypic similarity metric.
This is a similarity measure that expresses the proportion of shared phenotypes as a 0-1 numerical value.
Jaccard(A, B) = |A ∩ B| / |A ∪ B|
For example, suppose gene A and gene B KO mice have the following abnormal phenotypes:
A: {abnormal embryo development, abnormal heart morphology, abnormal kidney morphology}
B: {abnormal embryo development, abnormal heart morphology, abnormal lung morphology}
In this case, there are 2 shared phenotypes and 4 total unique phenotypes, so the Jaccard coefficient is calculated as follows:
Jaccard(A, B) = 2 / 4 = 0.5
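The same worked example can be reproduced with Python sets. This is a minimal sketch of the Jaccard coefficient itself, not TSUMUGI's implementation; returning 0.0 for two empty sets is an assumption made here to avoid division by zero.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty phenotype sets
    return len(a & b) / len(union)


gene_a = {"abnormal embryo development", "abnormal heart morphology", "abnormal kidney morphology"}
gene_b = {"abnormal embryo development", "abnormal heart morphology", "abnormal lung morphology"}

print(sorted(gene_a & gene_b))  # the 2 shared phenotypes
print(len(gene_a | gene_b))     # 4 unique phenotypes in total
print(jaccard(gene_a, gene_b))  # 0.5
```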
✉️ Contact
For questions or requests, please feel free to contact us:
Google Form
👉 Contact Form
For GitHub account holders
👉 GitHub Issue
Owner
- Name: Akihiro Kuno
- Login: akikuno
- Kind: user
- Location: Tsukuba, Ibaraki, Japan
- Company: University of Tsukuba
- Website: https://researchmap.jp/7000027584/?lang=en
- Twitter: akikuno_sh
- Repositories: 12
- Profile: https://github.com/akikuno
Bioinformatician working at the Laboratory Animal Resource Center
GitHub Events
Total
- Create event: 17
- Release event: 8
- Issues event: 100
- Watch event: 2
- Delete event: 8
- Member event: 2
- Issue comment event: 65
- Push event: 270
- Pull request event: 20
Last Year
- Create event: 17
- Release event: 8
- Issues event: 100
- Watch event: 2
- Delete event: 8
- Member event: 2
- Issue comment event: 65
- Push event: 270
- Pull request event: 20
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 59
- Total pull requests: 10
- Average time to close issues: about 1 month
- Average time to close pull requests: less than a minute
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.73
- Average comments per pull request: 0.0
- Merged pull requests: 8
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 57
- Pull requests: 10
- Average time to close issues: 30 days
- Average time to close pull requests: less than a minute
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.72
- Average comments per pull request: 0.0
- Merged pull requests: 8
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- akikuno (55)
Pull Request Authors
- akikuno (9)