https://github.com/bayer-group/cellenium

Cellenium is a FAIR and scalable interactive visual analytics app for scRNA-Seq data (single-cell RNA sequencing).

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
✓ DOI references: found 2 DOI reference(s) in README
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (13.2%) to scientific vocabulary

Keywords

bioinformatics dataviz scrna-seq transcriptomics

Keywords from Contributors

interactive archival projection generic sequences observability autograding hacking shellcodes modular

Last synced: 5 months ago

Repository

Basic Info

Host: GitHub
Owner: Bayer-Group
License: mit
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 12.4 MB

Statistics

Stars: 27
Watchers: 5
Forks: 5
Open Issues: 18
Releases: 0

Topics

bioinformatics dataviz scrna-seq transcriptomics

Created about 3 years ago · Last pushed about 2 years ago

Metadata Files

Readme Contributing License Codeowners

cellenium

Cellenium is a FAIR and scalable interactive visual analytics app for scRNA-Seq data. It allows you to:

* organize and semantically find scRNA studies with ontologized metadata for tissues and diseases
* explore cell types and other cell annotations in UMAP space
* find differentially expressed genes based on clusters of annotated cells
* view the expression of a single gene (or a few selected genes) in the UMAP plot or as grouped violin plots
* draw coexpression plots for pairs of genes and explore the cell types contained in the plots
* add new cell annotations based on plot selections and see differentially expressed genes for a selected group of cells
* find genes whose expression is highly correlated with a query gene
* find marker genes in all imported studies and qualitatively compare gene expression across studies

Link to publication: https://doi.org/10.1093/bioinformatics/btad349

Link to showcase: https://youtu.be/U71qIK-Mqlc

UMAP projection cell type plot of the public study example blood_covid.ipynb

System Overview

Cellenium imports scRNA expression data and cell annotations in H5AD format. We provide Jupyter notebooks that download some publicly available scRNA studies, normalize the data if necessary, and calculate differentially expressed genes, a UMAP projection, and other study data needed for Cellenium's features to work.

Cellenium is a web application that accesses a PostgreSQL database via a GraphQL API. Some API features, like server-side rendered plots, depend on Python stored procedures. The graphql_api_usage folder contains a couple of example queries to illustrate the API capabilities.

Cellenium architecture

The setup steps below automate the download and creation of appropriate H5AD files, docker image build, database schema setup and data ingestion.

Setting up

Preparation of CellO data files (workaround for https://github.com/deweylab/CellO/issues/29 ):

    mkdir scratch/cello_resources
    curl https://deweylab.biostat.wisc.edu/cell_type_classification/resources_v2.0.0.tar.gz >scratch/cello_resources/resources_v2.0.0.tar.gz
    tar -C scratch/cello_resources -zxf scratch/cello_resources/resources_v2.0.0.tar.gz

Cellenium setup, including execution of study data processing notebooks (initially, this will take a couple of hours to run).

```bash
# builds docker images and runs the whole stack
# until you run the "make reset_database" step below, error messages about the missing "postgraphile" user pile up... you can ignore them for now.
docker compose up
conda env create -f data_import/environment.yml
conda activate cellenium_import

# 'test_studydata' should contain data to cover all application features, but is small enough to be imported in a few minutes
make reset_database test_studydata_import

# 'normal_studydata': real life studies (i.e. with full amount of cells and genes)
make normal_studydata_import

# we have one for atac
make atac_studydata_import

# and one for cite
make cite_studydata_import
```

The GraphQL API explorer is available at http://localhost:5000/postgraphile/graphiql . Postgraphile will listen to changes in the database schema and the updated API is visible immediately.
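As a quick way to see what the API exposes without opening the explorer, the sketch below sends a standard GraphQL introspection query from Python using only the standard library. It assumes the GraphQL endpoint lives at `/postgraphile/graphql` next to the GraphiQL URL above; that path, and the helper names `build_request` and `list_query_fields`, are illustrative assumptions rather than something this README states.

```python
import json
import urllib.request

# Assumption: the GraphQL endpoint sits next to the GraphiQL explorer URL.
# Adjust the path if your PostGraphile configuration differs.
GRAPHQL_URL = "http://localhost:5000/postgraphile/graphql"

# Standard GraphQL introspection; it uses no Cellenium-specific field names.
INTROSPECTION_QUERY = "{ __schema { queryType { fields { name } } } }"


def build_request(query, url=GRAPHQL_URL):
    """Wrap a GraphQL query string in a JSON POST request (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_query_fields(url=GRAPHQL_URL):
    """Return the sorted names of the top-level query fields the API exposes."""
    request = build_request(INTROSPECTION_QUERY, url)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    fields = payload["data"]["__schema"]["queryType"]["fields"]
    return sorted(field["name"] for field in fields)
```

Once the stack is running, `print(list_query_fields())` should list the query fields generated from the current database schema. Because Postgraphile watches the schema, rerunning it after a schema change would be expected to show the updated list.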

The Cellenium webapp 'production build' static site is hosted in the 'client' container, see http://localhost:6002/ . For development, run `(cd client && yarn && yarn start)` to install the webapp's dependencies and start a hot-reloading webapp.
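Since the containers take a while to come up, a small check like the following can confirm that the two ports mentioned above accept connections before you open the URLs. This is only a sketch: the port numbers are taken from the URLs in this README, while the service labels and function names are made up for illustration.

```python
import socket

# Ports taken from the URLs above: the GraphQL API (5000) and the client container (6002).
# The dictionary keys are illustrative labels, not names defined by the project.
SERVICES = {"postgraphile": 5000, "client": 6002}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", services=SERVICES):
    """Map each service label to whether its port currently accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}
```

Calling `check_services()` returns a dictionary keyed by service label with `True` or `False` values. An open port only shows that something is listening; it does not prove the database has been reset or that study data has been imported.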

Before you process and import the huge example study (there are two additional make targets for that), edit the beginning of heart_failure_reichart2022*.ipynb and define the download URL as described in the notebooks.

manually executing the study data preparation jupyter notebooks

The notebooks are run in headless mode by make. To create new notebooks and explore datasets:

    (cd data_import && PYTHONPATH=$(pwd) jupyter-lab)

Owner

Name: Bayer Open Source
Login: Bayer-Group
Kind: organization

Website: https://bayer.com/
Repositories: 98
Profile: https://github.com/Bayer-Group

Science for a better life

GitHub Events

Total

Issues event: 1
Watch event: 2

Last Year

Issues event: 1
Watch event: 2

Committers

Last synced: about 1 year ago

All Time

Total Commits: 466
Total Committers: 5
Avg Commits per committer: 93.2
Development Distribution Score (DDS): 0.425

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Carsten Jahn	c**n@b**m	268
andreassteffen	a**n@g**m	121
Dan Plischke	d**e@b**m	58
Mahmoud Ibrahim	m**m@b**m	17
dependabot[bot]	4****]	2

Committer Domains (Top 20 + Academic)

bayer.com: 3

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 26
Total pull requests: 21
Average time to close issues: about 2 months
Average time to close pull requests: 18 days
Total issue authors: 3
Total pull request authors: 2
Average comments per issue: 0.27
Average comments per pull request: 0.05
Merged pull requests: 9
Bot issues: 0
Bot pull requests: 12

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: 27 days
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

andreassteffen (16)
carsten-jahn (8)
qiong-lin1 (1)

Pull Request Authors

dependabot[bot] (11)
carsten-jahn (9)

Top Labels

Issue Labels

enhancement (1)

Pull Request Labels

dependencies (11)

Dependencies

client/Dockerfile docker

nginx stable-alpine build
node 16 build

postgraphile/Dockerfile docker

node 16 build

postgres/Dockerfile docker

postgres 15.1 build

client/package.json npm

@graphql-codegen/cli ^2.16.2 development
@graphql-codegen/client-preset ^1.2.4 development
@graphql-codegen/typescript-react-apollo ^3.3.7 development
@apollo/client ^3.7.3
@emotion/react ^11.10.5
@mantine/core ^5.10.0
@mantine/form ^5.10.4
@mantine/hooks ^5.10.0
@mantine/modals ^5.10.0
@mantine/notifications ^5.10.0
@nivo/core ^0.80.0
@nivo/sankey ^0.80.0
@tabler/icons ^1.119.0
@testing-library/jest-dom ^5.14.1
@testing-library/react ^13.0.0
@testing-library/user-event ^13.2.1
@types/jest ^27.0.1
@types/lodash ^4.14.191
@types/memoize-one ^5.1.2
@types/node ^16.7.13
@types/react ^18.0.0
@types/react-dom ^18.0.0
@types/react-plotly.js ^2.5.2
add ^2.0.6
arquero ^5.1.0
graphql ^16.6.0
lodash ^4.17.21
memoize-one ^6.0.0
plotly.js ^2.17.0
react ^18.2.0
react-data-table-component ^7.5.3
react-dom ^18.2.0
react-medium-image-zoom ^5.1.2
react-plotly.js ^2.6.0
react-router-dom ^6.6.1
react-scripts 5.0.1
react-vega ^7.6.0
recoil ^0.7.6
styled-components ^5.3.6
typescript ^4.9.4
vega ^5.22.1
vega-lite ^5.6.0
web-vitals ^2.1.0
yarn ^1.22.19

client/yarn.lock npm

1834 dependencies

postgraphile/package.json npm

@graphile-contrib/pg-simplify-inflector ^6.1.0
express ^4.18.2
lodash ^4.17.21
lodash.set ^4.3.2
postgraphile ^4.13.0
postgraphile-plugin-connection-filter ^2.3.0

postgraphile/yarn.lock npm

127 dependencies

data_import/study_import_requirements.txt pypi

boto3 *
leidenalg *
muon *
numpy *
pandas *
psycopg2-binary *
scanpy *
smart-open *
sqlalchemy *

data_import/environment.yml conda

graphviz
pip
pyjaspar
python 3.10.*

data_import/Dockerfile docker

python 3.10-slim build

docker-compose.yml docker

terraform/modules/batch/Dockerfile docker

python 3.10 build
python 3.10-slim build

postgres/environment.yml pypi

terraform/modules/lambda/lambda_layer/requirements.txt pypi

SQLAlchemy *
furl *
psycopg2-binary *
