cy

Python module to analyse CRISPR-based libraries

https://github.com/emanuelgoncalves/crispy

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, zenodo.org
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Keywords

crispr gaussian python sklearn
Last synced: 6 months ago

Repository

Python module to analyse CRISPR-based libraries

Basic Info
  • Host: GitHub
  • Owner: EmanuelGoncalves
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 601 MB
Statistics
  • Stars: 13
  • Watchers: 0
  • Forks: 4
  • Open Issues: 1
  • Releases: 0
Topics
crispr gaussian python sklearn
Created over 8 years ago · Last pushed almost 5 years ago
Metadata Files
Readme License

README.md


Module with utility functions to process CRISPR-based screens and a method to correct gene-independent copy-number effects.

Description

Crispy uses the scikit-learn implementation of Gaussian Process Regression, fitting each sample independently.
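To illustrate what fitting each sample independently can look like, here is a minimal sketch using scikit-learn's `GaussianProcessRegressor`. The helper name `fit_copy_number_trend`, the kernel choice, and the simulated data are assumptions made for this example; they are not taken from Crispy's source.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def fit_copy_number_trend(ratio, fold_change, random_state=0):
    """Fit one GPR of fold change on copy-number ratio (hypothetical helper)."""
    # Assumed kernel: smooth trend (constant * RBF) plus a noise term.
    kernel = ConstantKernel() * RBF() + WhiteKernel()
    gpr = GaussianProcessRegressor(
        kernel=kernel, normalize_y=True, random_state=random_state
    )
    x = np.asarray(ratio, dtype=float).reshape(-1, 1)
    y = np.asarray(fold_change, dtype=float)
    return gpr.fit(x, y)


# Simulated data: two samples with different copy-number effects.
rng = np.random.default_rng(0)
samples = {}
for name, slope in [("sample_A", -0.5), ("sample_B", -0.1)]:
    ratio = rng.uniform(0.5, 4.0, size=60)
    fold_change = slope * ratio + rng.normal(0.0, 0.1, size=60)
    samples[name] = (ratio, fold_change)

# One model per sample: no information is shared between samples.
models = {name: fit_copy_number_trend(r, fc) for name, (r, fc) in samples.items()}

# Predicted fold change at copy-number ratios 1 and 4 for sample_A.
grid = np.array([[1.0], [4.0]])
trend_a = models["sample_A"].predict(grid)
```

Keeping a separate model per sample lets the estimated copy-number trend differ between cell lines, at the cost of one fit per sample.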

Install

Install pybedtools and then install Crispy

```
conda install -c bioconda pybedtools

pip install cy
```

Examples

Support for library imports:

```python
from crispy.CRISPRData import Library

# Master Library, standardised assembly of KosukeYusa V1.1, Avana, Brunello and TKOv3
# CRISPR-Cas9 libraries.
master_lib = Library.load_library("MasterLib_v1.csv.gz")

# Genome-wide minimal CRISPR-Cas9 library.
minimal_lib = Library.load_library("MinLibCas9.csv.gz")

# Some of the most broadly adopted CRISPR-Cas9 libraries:
# 'Avana_v1.csv.gz', 'Brunello_v1.csv.gz', 'GeCKO_v2.csv.gz', 'Manjunath_Wu_v1.csv.gz',
# 'TKOv3.csv.gz', 'Yusa_v1.1.csv.gz'
brunello_lib = Library.load_library("Brunello_v1.csv.gz")
```

Select sgRNAs (across multiple CRISPR-Cas9 libraries) for a given gene:

```python
from crispy.GuideSelection import GuideSelection

# sgRNA selection class
gselection = GuideSelection()

# Select 5 optimal sgRNAs for MCL1 across multiple libraries
gene_guides = gselection.select_sgrnas(
    "MCL1", n_guides=5, offtarget=[1, 0], jacks_thres=1, ruleset2_thres=.4
)

# Perform different rounds of sgRNA selection with increasingly relaxed efficiency thresholds
gene_guides = gselection.selection_rounds(
    "TRIM49", n_guides=5, do_amber_round=True, do_red_round=True
)
```
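The idea of selection rounds with increasingly relaxed efficiency thresholds can be sketched in plain Python. This is a conceptual illustration only: the `Guide` class, the `select_in_rounds` function, the "green" round label, and the threshold values are all hypothetical and do not reflect Crispy's internal implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    sgrna_id: str
    efficiency: float  # hypothetical on-target score in [0, 1], higher is better


# Hypothetical rounds, strictest first; the thresholds are made up for this example.
ROUNDS = [("green", 0.6), ("amber", 0.4), ("red", 0.0)]


def select_in_rounds(guides, n_guides=5, rounds=ROUNDS):
    """Pick up to n_guides, relaxing the efficiency threshold round by round."""
    selected = []  # list of (guide, round label) pairs
    chosen_ids = set()
    for label, min_efficiency in rounds:
        # Guides passing this round's threshold that were not picked earlier.
        candidates = [
            g for g in guides
            if g.efficiency >= min_efficiency and g.sgrna_id not in chosen_ids
        ]
        candidates.sort(key=lambda g: g.efficiency, reverse=True)
        for guide in candidates:
            if len(selected) == n_guides:
                return selected
            selected.append((guide, label))
            chosen_ids.add(guide.sgrna_id)
    return selected


guides = [
    Guide("g1", 0.9), Guide("g2", 0.7), Guide("g3", 0.5),
    Guide("g4", 0.45), Guide("g5", 0.2), Guide("g6", 0.1),
]
picked = select_in_rounds(guides, n_guides=5)
for guide, label in picked:
    print(guide.sgrna_id, label)
```

Recording the round in which each guide was picked makes it easy to see which guides only qualified under relaxed thresholds.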

Copy-number correction:

```python
import crispy as cy
import matplotlib.pyplot as plt
from crispy.CRISPRData import ReadCounts, Library

"""
Import sample data
"""
rawcounts, copynumber = cy.Utils.get_example_data()

"""
Import CRISPR-Cas9 library

Important:
    Library has to have the following columns: "Chr", "Start", "End", "Approved_Symbol"
    Library and segments have to have consistent "Chr" formatting: "Chr1" or "chr1" or "1"
    Guarantee that "Start" and "End" columns are int
"""
lib = Library.load_library("Yusa_v1.1.csv.gz")

lib = lib.rename(
    columns=dict(start="Start", end="End", chr="Chr", Gene="Approved_Symbol")
).dropna(subset=["Chr", "Start", "End"])

lib["Chr"] = "chr" + lib["Chr"]

lib["Start"] = lib["Start"].astype(int)
lib["End"] = lib["End"].astype(int)

"""
Calculate fold-change
"""
plasmids = ["ERS717283"]
rawcounts = ReadCounts(rawcounts).remove_low_counts(plasmids)
sgrna_fc = rawcounts.norm_rpm().foldchange(plasmids)

"""
Correct CRISPR-Cas9 sgRNA fold changes
"""
crispy = cy.Crispy(
    sgrna_fc=sgrna_fc.mean(1), copy_number=copynumber, library=lib.loc[sgrna_fc.index]
)

# Fold-changes and correction integrated function.
# Output is a modified/expanded BED formatted data-frame with sgRNA and segments information
# n_sgrna: represents the minimum number of sgRNAs required per segment to consider in the fit.
# Recommended default values range between 4-10.
bed_df = crispy.correct(n_sgrna=10)
print(bed_df.head())

# Gaussian Process Regression is stored
crispy.gpr.plot(x_feature="ratio", y_feature="fold_change")
plt.show()
```

[Figure: GPR]
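For readers unfamiliar with the fold-change step, the following standard-library sketch shows one common way to compute reads-per-million (RPM) normalised log2 fold changes against a plasmid control. The function names, the pseudocount of 1, and the toy counts are assumptions for illustration; Crispy's own `ReadCounts` methods may differ in detail.

```python
import math


def norm_rpm(counts, pseudocount=1.0):
    """Scale raw sgRNA counts to reads per million after adding a pseudocount."""
    shifted = {sgrna: count + pseudocount for sgrna, count in counts.items()}
    total = sum(shifted.values())
    return {sgrna: value * 1e6 / total for sgrna, value in shifted.items()}


def log2_fold_change(sample_rpm, plasmid_rpm):
    """log2 ratio of sample over plasmid abundance for each sgRNA."""
    return {
        sgrna: math.log2(sample_rpm[sgrna] / plasmid_rpm[sgrna])
        for sgrna in sample_rpm
    }


# Toy counts: sg1 drops out in the sample, sg2 is enriched.
plasmid = {"sg1": 100, "sg2": 100, "sg3": 100}
sample = {"sg1": 50, "sg2": 200, "sg3": 100}

plasmid_rpm = norm_rpm(plasmid)
sample_rpm = norm_rpm(sample)
fold_changes = log2_fold_change(sample_rpm, plasmid_rpm)

for sgrna, fc in fold_changes.items():
    print(f"{sgrna}: {fc:.2f}")
```

Normalising to RPM before taking the ratio removes differences in sequencing depth between the sample and the plasmid, so a negative value reflects relative depletion of that sgRNA rather than a smaller library.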

Credits and License

Developed at the Wellcome Sanger Institute (2017-2020).

For citation please refer to:

Gonçalves E, Behan FM, Louzada S, Arnol D, Stronach EA, Yang F, Yusa K, Stegle O, Iorio F, Garnett MJ (2019) Structural rearrangements generate cell-specific, gene-independent CRISPR-Cas9 loss of fitness effects. Genome Biol 20: 27

Gonçalves E, Thomas M, Behan FM, Picco G, Pacini C, Allen F, Parry-Smith D, Iorio F, Parts L, Yusa K, Garnett MJ (2019) Minimal genome-wide human CRISPR-Cas9 library. bioRxiv

Owner

  • Name: Emanuel
  • Login: EmanuelGoncalves
  • Kind: user
  • Location: Lisbon
  • Company: Instituto Superior Técnico (IST)

Assistant Professor

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 415
  • Total Committers: 2
  • Avg Commits per committer: 207.5
  • Development Distribution Score (DDS): 0.335
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Emanuel Goncalves e****4@s****k 276
eg14 e****s@g****m 139
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 5
  • Total pull requests: 0
  • Average time to close issues: 21 days
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 1.8
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • laulabbumc (3)
  • eshinesimida (1)
  • agiraldeztrujillo (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 221 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 4
  • Total versions: 42
  • Total maintainers: 1
pypi.org: cy

Modelling CRISPR dropout data

  • Versions: 42
  • Dependent Packages: 0
  • Dependent Repositories: 4
  • Downloads: 221 Last month
Rankings
Dependent repos count: 7.5%
Dependent packages count: 10.0%
Downloads: 10.3%
Average: 12.5%
Forks count: 16.8%
Stargazers count: 17.7%
Maintainers (1)
Last synced: 6 months ago