Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.2%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Kataoka-K-Lab
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 40.5 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 3
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

Celline - Single Cell RNA-seq Analysis Pipeline

Celline is a comprehensive, interactive pipeline for single-cell RNA sequencing (scRNA-seq) analysis, designed to streamline the workflow from raw data to biological insights. It provides both command-line and web-based interfaces for flexible analysis workflows.

📖 Detailed Documentation: Celline Docs

Features

  • 🔄 Automated Data Processing: From raw FASTQ files to expression matrices
  • ✅ Quality Control: Built-in QC metrics and filtering with Scrublet doublet detection
  • 📊 Dimensionality Reduction: PCA, t-SNE, and UMAP implementations
  • 🔍 Clustering Analysis: Multiple clustering algorithms
  • 🧬 Cell Type Prediction: Automated cell type annotation using scPred
  • ⚖️ Batch Effect Correction: Multiple methods for data integration (Seurat, scVI)
  • 🌐 Interactive Visualization: Web-based interface for data exploration
  • 🔧 Flexible Execution: Support for local multithreading and PBS cluster execution
  • 📁 Database Integration: Built-in support for SRA, GEO, and CNCB data repositories
  • 🔬 R Integration: Seamless R/Seurat integration for advanced analysis

System Requirements

Required Dependencies

  • Python: ≥3.10
  • R: ≥4.0 with Seurat and other required packages
  • Cell Ranger: For 10x Genomics data processing
  • SRA Toolkit: For downloading SRA data (fastq-dump)
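
The requirements above can be checked before installation with a short script. This is a minimal sketch, not part of Celline: the executable names `R` and `fastq-dump` follow this README, while `cellranger` is assumed to be the name of the Cell Ranger binary on your system, and both helper functions are hypothetical.

```python
import shutil
import sys

# Executable names are assumptions: "R" and "fastq-dump" follow the README;
# "cellranger" is assumed to be the Cell Ranger binary name.
REQUIRED_TOOLS = ("R", "cellranger", "fastq-dump")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_supported(minimum=(3, 10)):
    """Check the running interpreter against the minimum Python version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python >= 3.10:", python_is_supported())
    print("Missing tools:", find_missing_tools() or "none")
```

The script only reports whether each executable is on `PATH`; it does not verify tool versions or the installed R packages.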

Python Dependencies

All Python dependencies are installed automatically by pip. Key packages include:

  • scanpy - Single-cell analysis
  • pandas, polars - Data manipulation
  • fastapi, uvicorn - Web API
  • rich - Enhanced CLI interface
  • pysradb - SRA database access

Installation

Option 1: Install from PyPI

```bash
pip install celline
```

Option 2: Install from Source

```bash
git clone https://github.com/your-repo/Celline.git
cd Celline
pip install -e .
```

Option 3: Development Installation

```bash
git clone https://github.com/your-repo/Celline.git
cd Celline
pip install -e ".[dev]"
```

Quick Start

1. Initialize Your Project

Start by initializing a new project. This will validate system dependencies and create configuration files:

```bash
celline init
```

This command will:

  • Check for required system dependencies (R, Cell Ranger, SRA Toolkit)
  • Set up R environment configuration
  • Create project configuration files
  • Prompt for project name and settings

2. Configure Execution Settings (Optional)

Configure execution parameters for your system:

```bash
# Interactive configuration
celline config

# Or set specific options
celline config --system multithreading --nthread 8
celline config --system PBS --pbs-server your-cluster-name
```

3. Explore Available Functions

List all available analysis functions:

```bash
celline list
```

Get detailed help for specific functions:

```bash
celline help download
celline help preprocess
```

4. Basic Analysis Workflow

Download Public Data

```bash
# Download from SRA/GEO
celline run download --accession GSE123456
celline run download --accession SRR123456

# Download from CNCB
celline run download --accession CRA123456
```
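
The example accessions above differ only in their prefix, which identifies the source repository. The sketch below illustrates that mapping. It is an assumption drawn from the three examples in this README, not Celline's own resolution logic, and `guess_repository` is a hypothetical helper.

```python
# Prefix-to-repository mapping inferred from the README examples only;
# real repositories use more prefixes than the three listed here.
PREFIX_TO_REPOSITORY = {
    "GSE": "GEO",
    "SRR": "SRA",
    "CRA": "CNCB",
}


def guess_repository(accession: str) -> str:
    """Return the repository suggested by the accession's three-letter prefix."""
    prefix = accession.strip().upper()[:3]
    if prefix not in PREFIX_TO_REPOSITORY:
        raise ValueError(f"Unrecognized accession prefix: {accession!r}")
    return PREFIX_TO_REPOSITORY[prefix]


for accession in ("GSE123456", "SRR123456", "CRA123456"):
    print(accession, "->", guess_repository(accession))
```

A check like this can catch a mistyped accession before a long download is started.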

Data Preprocessing

```bash
# Quality control and preprocessing
celline run preprocess --input raw_data/ --output processed/

# Gene expression counting (10x data)
celline run count --input cellranger_output/ --output counts/
```

Create Seurat Objects

```bash
# Create Seurat object for downstream analysis
celline run create_seurat --input counts/ --output seurat_object.rds
```

Advanced Analysis

```bash
# Dimensionality reduction
celline run reduce --input seurat_object.rds --methods pca,umap,tsne

# Cell type prediction
celline run predict_celltype --input seurat_object.rds --reference ref_data/

# Batch effect correction
celline run integrate --input multiple_samples/ --method seurat
```
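
The workflow steps in this section can be chained from Python. The following is a minimal sketch rather than a supported Celline API: the command strings are copied from the examples in this README and have not been verified against the CLI, and `run_workflow` is a hypothetical helper. It defaults to a dry run, so nothing is executed unless `dry_run=False` is passed.

```python
import shlex
import subprocess

# Command lines copied from the README examples; adjust paths and
# accessions for your own project.
WORKFLOW = [
    "celline run download --accession GSE123456",
    "celline run preprocess --input raw_data/ --output processed/",
    "celline run count --input cellranger_output/ --output counts/",
    "celline run create_seurat --input counts/ --output seurat_object.rds",
    "celline run reduce --input seurat_object.rds --methods pca,umap,tsne",
]


def run_workflow(commands, dry_run=True):
    """Split each command into argv form and optionally execute it in order.

    With dry_run=False, execution stops at the first failing command
    because subprocess.run is called with check=True.
    """
    planned = []
    for command in commands:
        argv = shlex.split(command)
        if not dry_run:
            subprocess.run(argv, check=True)
        planned.append(argv)
    return planned


for argv in run_workflow(WORKFLOW):
    print(" ".join(argv))
```

Stopping at the first failure keeps a later step from running on incomplete output from an earlier one.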

5. Interactive Web Interface

Launch the interactive web interface for visual analysis:

```bash
celline interactive
```

This will:

  • Start the FastAPI backend server
  • Launch the Vue.js frontend
  • Open your web browser automatically
  • Provide interactive data exploration tools

6. API Server Only (for Development)

Start only the API server for testing:

```bash
celline api
```

Available Functions

| Function | Description | Usage Example |
|----------|-------------|---------------|
| init | Initialize project and validate dependencies | celline init |
| download | Download scRNA-seq data from public repositories | celline run download --accession GSE123456 |
| preprocess | Quality control and preprocessing | celline run preprocess |
| count | Gene expression quantification | celline run count |
| create_seurat | Create Seurat objects | celline run create_seurat |
| reduce | Dimensionality reduction (PCA, UMAP, t-SNE) | celline run reduce |
| integrate | Batch effect correction and data integration | celline run integrate |
| predict_celltype | Automated cell type annotation | celline run predict_celltype |
| batch_cor | Batch correlation analysis | celline run batch_cor |
| interactive | Launch web interface | celline interactive |
| sync_DB | Update local databases | celline run sync_DB |
| info | Show system information | celline info |

Project Structure

When you initialize a project, Celline creates the following structure:

```
your_project/
├── setting.toml    # Project configuration
├── data/           # Raw and processed data
├── results/        # Analysis results
├── scripts/        # Generated analysis scripts
└── logs/           # Execution logs
```
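
To confirm that a directory matches this layout, a small check can be written with `pathlib`. This is a sketch that assumes the file and directory names listed above; `missing_entries` is a hypothetical helper and not part of Celline.

```python
from pathlib import Path

# Names taken from the project layout described in the README.
EXPECTED_FILES = ("setting.toml",)
EXPECTED_DIRS = ("data", "results", "scripts", "logs")


def missing_entries(project_root):
    """Return the expected files and directories absent from project_root."""
    root = Path(project_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    print(missing_entries(".") or "Project layout looks complete")
```

Directories are reported with a trailing slash so they are easy to tell apart from files in the output.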

Configuration

Celline uses setting.toml files for configuration:

```toml
[project]
name = "my_project"
version = "0.01"

[execution]
system = "multithreading"  # or "PBS"
nthread = 8
pbs_server = "your-cluster"  # for PBS system

[R]
r_path = "/usr/local/bin/R"

[fetch]
wait_time = 4  # seconds between API calls
```

Advanced Usage

Running on HPC Clusters

For PBS/Torque clusters:

```bash
celline config --system PBS --pbs-server your-cluster-name
celline run preprocess  # Will submit PBS jobs automatically
```

Custom Analysis Scripts

Celline generates executable scripts in the scripts/ directory that can be run independently or modified for custom workflows.

R Integration

Access Seurat objects and run custom R analysis:

```bash
# R scripts are available in template/hook/R/
# Custom R functions can be added to the pipeline
```

Troubleshooting

Common Issues

  1. Missing Dependencies: Run celline init to validate all dependencies
  2. R Package Issues: Ensure Seurat and required R packages are installed
  3. Memory Issues: Adjust thread count with celline config --nthread <number>
  4. Web Interface Not Loading: Check that ports 8000 and 3000 are available
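
For the last item, port availability can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The port numbers come from the list above, `port_is_free` is a hypothetical helper, and the check covers the local loopback address only.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for port in (8000, 3000):
        status = "free" if port_is_free(port) else "in use"
        print(f"Port {port}: {status}")
```

A port reported as in use is held by another process, which needs to be stopped before the web interface is launched.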

Getting Help

```bash
# General help
celline help

# Function-specific help
celline help <function>

# System information
celline info

# List all functions
celline list
```

Contributing

We welcome contributions! Please see our contributing guidelines for more information.

Citation

If you use Celline in your research, please cite:

[Citation information to be added]

License

This project is licensed under the MIT License - see the LICENSE file for details.

Owner

  • Name: Kataoka Lab
  • Login: Kataoka-K-Lab
  • Kind: organization
  • Location: Japan

Citation (CITATION.cff)

cff-version: 1.0.0
message: "Cite as"
authors:
  - family-names: Sato
    given-names: Yuya
    affiliation: "The University of Waseda"
title: "Celline"
doi: 10.5281/zenodo.15795373

GitHub Events

Total
  • Release event: 4
  • Push event: 23
  • Public event: 1
  • Pull request event: 2
  • Create event: 4
Last Year
  • Release event: 4
  • Push event: 23
  • Public event: 1
  • Pull request event: 2
  • Create event: 4

Dependencies

.github/workflows/publish-testpypi.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
src/celline/frontend/package.json npm
  • @babel/core ^7.12.16 development
  • @babel/eslint-parser ^7.12.16 development
  • @vue/cli-plugin-babel ~5.0.0 development
  • @vue/cli-plugin-eslint ~5.0.0 development
  • @vue/cli-plugin-typescript ^5.0.8 development
  • @vue/cli-service ~5.0.0 development
  • eslint ^7.32.0 development
  • eslint-plugin-vue ^8.0.3 development
  • typescript ^5.1.6 development
  • @types/axios ^0.14.0
  • axios ^1.4.0
  • core-js ^3.8.3
  • vue ^3.2.13
  • vue-router ^4.2.4
  • vuetify ^3.3.13
src/celline/frontend/yarn.lock npm
  • 857 dependencies
pyproject.toml pypi
  • argparse >=1.4.0
  • continuousvi >=0.1.5
  • inquirer >=3.4.0
  • iprogress >=0.4
  • ipywidgets >=8.1.5
  • multipledispatch >=1.0.0
  • pandas >=2.2.3
  • polars >=1.26.0
  • pyarrow >=19.0.1
  • pyper >=1.1.2
  • pysradb >=2.2.2
  • pyyaml >=6.0.2
  • requests-html >=0.10.0
  • rich >=14.0.0
  • scanpy >=1.11.1
  • scrublet >=0.2.3
  • toml >=0.10.2
  • tqdm >=4.67.1
  • varname >=0.14.0
uv.lock pypi
  • 197 dependencies
frontend/package.json npm
  • @nuxt/devtools latest development
  • nuxt ^3.8.0 development
  • @nuxtjs/axios ^5.13.6
  • @pinia/nuxt ^0.5.1
  • pinia ^2.1.7
src/celline/frontend/package-lock.json npm
  • 856 dependencies