https://github.com/akikuno/mieru-splicing-project
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to ncbi.nlm.nih.gov, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: akikuno
- License: mit
- Language: R
- Default Branch: main
- Size: 69.8 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
- Releases: 1
Metadata Files
README.md
Mieru Splicing Project
This repository contains comprehensive RNA splicing analysis scripts and data for studying the effects of RNA-binding protein (RBP) knockouts on alternative splicing patterns in mouse embryonic stem cells. The project analyzes differential splicing events across 11 different RBP knockout lines compared to MIERU control cells.
Project Overview
The analysis investigates how specific RBP knockouts affect:
- Alternative splicing patterns (5 event types: SE, A3SS, A5SS, MXE, RI)
- Gene expression changes
- Protein complex composition
- Functional pathway enrichment
Key RBPs analyzed: Cd2bp2, Qk, Rbm24, Rpl22l1, Spen, Strap, Tra2b, Trim71, Ubr5, Wt1, Ybx1
Requirements
- Unix environment (WSL2 with Ubuntu or macOS recommended)
- conda package manager via miniforge
- ~50GB storage space for genome indices and intermediate files
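Before downloading genome indices, it may help to confirm that the target filesystem has roughly the space listed above. The following is a minimal sketch, not part of the repository: the function name `has_free_space` and its arguments are hypothetical, and the 50 GB default simply mirrors the estimate in the requirements.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
# has_free_space is a hypothetical helper, not a script from this project.
has_free_space() {
    dir="${1:-.}"
    need_gb="${2:-50}"
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 of
    # the second line is the space available to the current user.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

For example, `has_free_space . 50 || echo "less than 50 GB free"` reports a shortfall in the current directory's filesystem.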
Installation
Create and activate the conda environment with all required bioinformatics tools:
bash
conda env create -f environment.yml
conda activate mieru
[!IMPORTANT] To ensure full reproducibility of this project's analysis results, the following files are provided:
- environment.yml: Complete conda environment (including build numbers)
- environment-no-builds.yml: Cross-platform compatible version
- conda-packages-list.txt: Detailed list of installed packages
- R-session-info.txt: R session information
- REPRODUCIBILITY.md: Step-by-step instructions for reproducibility
For details, see REPRODUCIBILITY.md.
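Build-pinned conda environments often fail to resolve on a platform other than the one they were exported from, which is presumably why a no-builds variant is provided. As a sketch only, the choice between the two files could be automated as below. The helper `choose_env_file` is hypothetical, and the assumption that the pinned `environment.yml` was exported on Linux is not confirmed by the source; REPRODUCIBILITY.md remains the authoritative guide.

```shell
# Pick which environment file to hand to `conda env create -f`.
# ASSUMPTION: the build-pinned environment.yml was exported on Linux,
# so every other platform falls back to environment-no-builds.yml.
choose_env_file() {
    os="${1:-$(uname -s)}"   # optional argument overrides the detected OS
    case "$os" in
        Linux) echo "environment.yml" ;;
        *)     echo "environment-no-builds.yml" ;;
    esac
}

# Usage (requires conda on PATH):
#   conda env create -f "$(choose_env_file)"
```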
Dataset Access
Raw sequencing data and processed results are publicly available:
Fig2 data: Control samples and marker gene analysis
- FASTQ files: GSE291522
- Location: Fig2/data/fastq/
Fig3-6 data: RBP knockout comparative analysis
- FASTQ files: GSE291672
- rMATS splicing results: GSE291672_rmats.zip
- Location: Fig3-6/data/fastq/ and Fig3-6/data/rmats/original_output/
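Because the scripts expect downloaded files at the locations listed above, a quick layout check before running anything can save a failed run. The sketch below assumes the three paths are relative to the repository root; `check_data_layout` is a hypothetical helper, not a script shipped with the project.

```shell
# Report which of the data directories named in the README are missing.
# Takes the repository root as an optional argument (default: current
# directory). Prints each missing path to stderr and returns 1 if any
# directory is absent.
check_data_layout() {
    root="${1:-.}"
    missing=0
    for d in \
        "Fig2/data/fastq" \
        "Fig3-6/data/fastq" \
        "Fig3-6/data/rmats/original_output"
    do
        if [ ! -d "$root/$d" ]; then
            echo "missing: $root/$d" >&2
            missing=1
        fi
    done
    return "$missing"
}
```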
Pipeline Overview
The analysis pipeline follows a modular, standardized organization:
Fig2 Analysis (Control Characterization)
- Fluorescent Analysis (025-fluorescents/): Fluorescent protein expression quantification
- Marker Gene Analysis (035-marker_genes/): Validation of cell line markers
Fig3-6 Analysis (Main Pipeline)
- Setup (00-setup/): System dependencies, directory creation, and genome data download
- Preprocessing (01-preprocessing/): Quality control, trimming, genome alignment, read counting, and differential splicing with rMATS
- Quality Control (02-quality-control/): Exon characteristics analysis and quality metrics
- Event Analysis (04-event-analysis/): Alternative splicing event frequency and ΔPSI distribution analysis (Figure 3)
- DEG Comparison (05-deg-comparison/): Integration with differential gene expression (Figure 4)
- Complex Analysis (06-complex-analysis/): Protein complex enrichment using CORUM/ComplexTab databases (Figure 5)
- Heatmap Analysis (07-heatmap-analysis/): GO term and pathway visualization (Figure 6)
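For readers unfamiliar with the ΔPSI values used in the event analysis, the quantity can be written as follows. This uses the standard length-normalized definition of percent spliced in that rMATS reports. The symbols are notation introduced here, and the sign convention (knockout minus control) is an assumption that depends on the group order passed to rMATS in this project.

```latex
% I, S: read counts supporting the inclusion and skipping isoforms
% \ell_I, \ell_S: effective lengths of the two isoforms
\psi = \frac{I / \ell_I}{I / \ell_I + S / \ell_S},
\qquad
\Delta\psi = \psi_{\mathrm{KO}} - \psi_{\mathrm{control}}
```

Under this convention, a positive ΔPSI means the knockout shows more inclusion of the event than the control.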
Legacy Structure
- Original organization preserved in _past/ directories for backward compatibility
- Legacy scripts (e.g., 015-preprocess/, Fig3-event-frequency/) remain functional
Usage
- Download raw data from GEO repositories
- Activate conda environment: conda activate mieru
- Run preprocessing scripts in numerical order within each analysis directory
- Scripts must be executed from their respective directories due to relative paths
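The last two steps above can be sketched as a small wrapper that enters a directory and runs its numbered scripts in order. This is an illustration under stated assumptions, not the project's own runner: `run_in_order` is a hypothetical name, and the `[0-9]*.sh` pattern assumes shell scripts with zero-padded numeric prefixes. If the repository's scripts are R files, the pattern and interpreter would need to change (for example to `Rscript`).

```shell
# Run every numbered script in one analysis directory, in order.
# ASSUMPTION: scripts are named like 01-something.sh. Glob results are
# sorted lexically, which matches numerical order only when the numeric
# prefixes are zero-padded.
run_in_order() {
    dir="$1"
    (
        # Subshell: the cd does not leak into the caller's shell, and
        # relative paths inside each script resolve from its directory.
        cd "$dir" || exit 1
        for script in [0-9]*.sh; do
            [ -e "$script" ] || continue   # pattern matched nothing
            echo "Running $dir/$script"
            bash "$script" || exit 1       # stop at the first failure
        done
    )
}
```

A call such as `run_in_order 01-preprocessing` would then process one stage; the wrapper returns a non-zero status as soon as any script fails.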
Owner
- Name: Akihiro Kuno
- Login: akikuno
- Kind: user
- Location: Tsukuba, Ibaraki, Japan
- Company: University of Tsukuba
- Website: https://researchmap.jp/7000027584/?lang=en
- Twitter: akikuno_sh
- Repositories: 12
- Profile: https://github.com/akikuno
Bioinformatician working at the Laboratory Animal Resource Center
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 1
- Push event: 53
Last Year
- Create event: 1
- Release event: 1
- Issues event: 1
- Push event: 53
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- akikuno (1)