https://github.com/cellgeni/cellranger_cite_hash

Processing CITE-seq and cell hashing datasets with Cell Ranger

Repository

Basic Info
  • Host: GitHub
  • Owner: cellgeni
  • License: gpl-3.0
  • Language: Shell
  • Default Branch: main
  • Size: 23.4 KB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 4 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.md

Processing CITE-seq and cell hashing datasets with Cell Ranger

Feature Barcode technology is a method for adding extra layers of information to cells by running single-cell gene expression in parallel with other assays.

While we process most of our 10X scRNA-seq experiments with STARsolo, it is currently impossible to process antibody-based assays with it. Three main types of experiment are used: 1) CITE-seq, which uses antibodies against specific surface proteins to measure their abundance at the protein level; 2) cell hashing, which uses relatively non-specific antibodies to label individual samples, in order to "supercharge" a 10X run; 3) CRISPR perturbation analysis.

The method uses antibodies linked to a particular DNA barcode. Thus, the library structure for feature barcoding is as follows:

- R1 contains the typical 10X droplet barcode (from one of the whitelists) and the UMI;
- R2 contains the antibody barcode, which can be located at the beginning of the read or at some fixed position relative to the 5' end of the read. For example, TotalSeq-B barcodes start at the 11th nucleotide of read 2.

Barcode sequences and locations are specified in the CSV file (see below).
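To make the barcode location concrete, here is a minimal sketch of a feature reference CSV written from the shell. The column names follow the Cell Ranger feature reference format as we understand it; the antibody names and 15 bp sequences are made-up placeholders, and the pattern shown for a barcode starting at base 11 should be checked against the documentation for your antibody panel.

```shell
# Write a hypothetical feature reference CSV (placeholder names and barcodes).
# Pattern "5P(BC)" would anchor the barcode at the 5' start of R2;
# "5PNNNNNNNNNN(BC)" skips 10 bases first, so the barcode starts at base 11.
cat > feature_ref.csv <<'EOF'
id,name,read,pattern,sequence,feature_type
AB1_example,AB1_example,R2,5PNNNNNNNNNN(BC),AAAACCCCGGGGTTT,Antibody Capture
AB2_example,AB2_example,R2,5PNNNNNNNNNN(BC),TTTTGGGGCCCCAAA,Antibody Capture
EOF
```

For real experiments, use the barcode sequences supplied with the antibodies rather than these placeholders.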

The antibodies used, together with the sequence and location of each barcode, are usually provided as a feature reference CSV file. Some examples of CSV files we have used in the past are in this repo under /csvs. In general, you need to obtain the CSV file, but sometimes you can reconstruct it from the 15 bp sequences observed in the R2 reads.
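When the CSV is missing, one way to see which barcodes are present is to tally the most frequent 15 bp sequences at the expected position in R2. The function below is a rough sketch, not part of this repo: the function name and the example file name are hypothetical, and it assumes a standard four-line-per-record gzipped FASTQ.

```shell
# Tally the most frequent sequences at a fixed offset in R2.
# Arguments: gzipped R2 FASTQ, 1-based start position (default 1), length (default 15).
# The body runs in a subshell so the variables do not leak into the caller.
top_r2_barcodes() (
  fastq="$1"
  start="${2:-1}"
  len="${3:-15}"
  gzip -dc "$fastq" \
    | awk -v s="$start" -v l="$len" 'NR % 4 == 2 { print substr($0, s, l) }' \
    | sort | uniq -c | sort -k1,1nr | head -n 20
)

# Example (hypothetical file name), barcode starting at base 11:
# top_r2_barcodes sample_S1_L001_R2_001.fastq.gz 11 15
```

The top entries can then be compared against the vendor's barcode list to identify the antibodies. Sequencing errors will show up as low-count variants of the real barcodes.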

The overall analysis workflow for feature barcoding experiments with Cell Ranger is described in the 10x Genomics documentation. Briefly:

- we use the cellranger count command to process gene expression (regular scRNA-seq) and feature barcoding (antibody) experiments simultaneously;
- a libraries CSV file specifies where the fastq files for both libraries are located. Fastq files need to be named according to Cell Ranger requirements (i.e. should look like <sample>_S*_L00*_R1_001.fastq.gz);
- a feature reference CSV file needs to be provided (see above).
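The steps above can be sketched as follows. This is an illustration under assumptions, not the exact script used in this repo: the sample ID, all paths, and the name feature_ref.csv are placeholders, and recent Cell Ranger releases may require additional options (for example --create-bam), so check cellranger count --help for your version.

```shell
SAMPLE=sample   # hypothetical sample ID; also becomes the output directory name

# Libraries CSV: one row per library, pointing at the directory with its FASTQ files.
cat > libraries.csv <<EOF
fastqs,sample,library_type
/path/to/gex_fastqs,${SAMPLE}_GEX,Gene Expression
/path/to/adt_fastqs,${SAMPLE}_ADT,Antibody Capture
EOF

# Count gene expression and antibody libraries in a single run.
# All paths are placeholders; feature_ref.csv is your feature reference CSV.
run_count() {
  cellranger count \
    --id="$SAMPLE" \
    --libraries=libraries.csv \
    --feature-ref=feature_ref.csv \
    --transcriptome=/path/to/reference
}

# Uncomment on a machine where Cell Ranger and the inputs above are available:
# run_count
```

The sample column must match the prefix of the corresponding FASTQ file names, which is why the naming convention matters.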

If the experiment is cell hashing, once the counting is complete, one can use Solo to demultiplex the individual hashed samples. I will add an example notebook later, I promise.

The output is written to sample/outs inside the directory where the cellranger command was executed.
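As a quick sanity check that the antibody features made it into the output, one can count the feature types in the matrix directory. This is a small sketch with a hypothetical function name; it assumes the usual layout in which the third tab-separated column of features.tsv.gz holds the feature type.

```shell
# Count features of each type in a feature-barcode matrix directory.
# Runs in a subshell; prints one line per feature type with its count.
count_feature_types() (
  gzip -dc "$1/features.tsv.gz" | cut -f3 | sort | uniq -c
)

# Example (path assumes the sample ID was "sample"):
# count_feature_types sample/outs/filtered_feature_bc_matrix
```

If no "Antibody Capture" line appears, the feature reference or libraries CSV is the first place to look.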

Please note that to install cellranger you have to provide your institute details to get a personal download link, so in the Dockerfile that link has been replaced with the placeholder "personal login link".

Owner

  • Name: Cellular Genetics Informatics
  • Login: cellgeni
  • Kind: organization
  • Location: United Kingdom

Wellcome Sanger Institute

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Dependencies

Dockerfile docker
  • ubuntu $ubuntu_version build