cy

Python module to analyse CRISPR-based libraries

https://github.com/emanuelgoncalves/crispy

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
✓ DOI references: found 6 DOI reference(s) in README
✓ Academic publication links: biorxiv.org, zenodo.org
✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (10.3%) to scientific vocabulary

Keywords

crispr gaussian python sklearn

Last synced: 6 months ago

Repository


Basic Info

Host: GitHub
Owner: EmanuelGoncalves
License: bsd-3-clause
Language: Python
Default Branch: master
Homepage:
Size: 601 MB

Statistics

Stars: 13
Watchers: 0
Forks: 4
Open Issues: 1
Releases: 0

Topics

crispr gaussian python sklearn

Created over 8 years ago · Last pushed almost 5 years ago

Metadata Files

Readme, License

README.md

Module with utility functions to process CRISPR-based screens and a method to correct gene-independent copy-number effects.

Description

Crispy uses the scikit-learn implementation of Gaussian Process Regression, fitting each sample independently.
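To illustrate what fitting each sample independently can look like, here is a minimal sketch using scikit-learn's `GaussianProcessRegressor` on synthetic data. The helper name `fit_sample_gpr`, the kernel choice, the sample names and the simulated values are all assumptions made for illustration; they are not taken from Crispy's source.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel


def fit_sample_gpr(copy_ratio, fold_change, random_state=0):
    """Fit one GPR for a single sample: fold-change as a function of copy-number ratio."""
    x = np.asarray(copy_ratio, dtype=float).reshape(-1, 1)
    y = np.asarray(fold_change, dtype=float)
    # Assumed kernel (illustrative only): smooth trend plus a white-noise term.
    kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=random_state)
    gpr.fit(x, y)
    return gpr


# Synthetic data for two hypothetical samples with different copy-number effects.
rng = np.random.default_rng(0)
samples = {}
for name, slope in [("sample_A", -0.3), ("sample_B", -0.1)]:
    ratio = rng.uniform(0.5, 6.0, size=60)
    fc = slope * ratio + rng.normal(0.0, 0.2, size=60)
    samples[name] = (ratio, fc)

# Each sample gets its own, independently fitted model.
models = {name: fit_sample_gpr(ratio, fc) for name, (ratio, fc) in samples.items()}

# Predict the expected fold-change on a grid of copy-number ratios.
grid = np.linspace(0.5, 6.0, 5).reshape(-1, 1)
predictions = {name: model.predict(grid) for name, model in models.items()}
```

Because no parameters are shared between models, a sample with a strong copy-number effect does not influence the fit of a sample with a weak one.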

Install

Install pybedtools and then install Crispy:

```
conda install -c bioconda pybedtools

pip install cy
```

Examples

Support for library imports:

```python
from crispy.CRISPRData import Library

# Master Library, standardised assembly of KosukeYusa V1.1, Avana, Brunello and TKOv3
# CRISPR-Cas9 libraries.
master_lib = Library.load_library("MasterLib_v1.csv.gz")

# Genome-wide minimal CRISPR-Cas9 library.
minimal_lib = Library.load_library("MinLibCas9.csv.gz")

# Some of the most broadly adopted CRISPR-Cas9 libraries:
# 'Avana_v1.csv.gz', 'Brunello_v1.csv.gz', 'GeCKO_v2.csv.gz', 'Manjunath_Wu_v1.csv.gz',
# 'TKOv3.csv.gz', 'Yusa_v1.1.csv.gz'
brunello_lib = Library.load_library("Brunello_v1.csv.gz")
```

Select sgRNAs (across multiple CRISPR-Cas9 libraries) for a given gene:

```python
from crispy.GuideSelection import GuideSelection

# sgRNA selection class
gselection = GuideSelection()

# Select 5 optimal sgRNAs for MCL1 across multiple libraries
gene_guides = gselection.select_sgrnas(
    "MCL1", n_guides=5, offtarget=[1, 0], jacks_thres=1, ruleset2_thres=.4
)

# Perform different rounds of sgRNA selection with increasingly relaxed efficiency thresholds
gene_guides = gselection.selection_rounds(
    "TRIM49", n_guides=5, do_amber_round=True, do_red_round=True
)
```
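The idea of selection rounds with increasingly relaxed thresholds can be sketched in plain Python. This is only an illustration of the general strategy: the function `select_in_rounds`, the `score` field and the threshold values are hypothetical and do not reflect how `GuideSelection` is implemented.

```python
def select_in_rounds(guides, n_guides, thresholds):
    """Pick up to n_guides, relaxing the minimum score round by round.

    `thresholds` is ordered from strictest to most relaxed (hypothetical values).
    """
    ranked = sorted(guides, key=lambda g: g["score"], reverse=True)
    selected = []
    for threshold in thresholds:
        for guide in ranked:
            if len(selected) == n_guides:
                return selected
            # Keep guides that pass the current round and were not picked earlier.
            if guide["score"] >= threshold and guide not in selected:
                selected.append(guide)
    return selected


# Hypothetical guides with a single efficiency score each.
guides = [
    {"id": "g1", "score": 0.9},
    {"id": "g2", "score": 0.7},
    {"id": "g3", "score": 0.45},
    {"id": "g4", "score": 0.2},
]

# Two guides pass the strict round; the third is recovered in the relaxed round.
picked = select_in_rounds(guides, n_guides=3, thresholds=[0.6, 0.4, 0.0])
print([g["id"] for g in picked])
```

Guides chosen in a strict round are never displaced by later rounds, so relaxing the thresholds only fills the remaining slots.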

Copy-number correction:

```python
import crispy as cy
import matplotlib.pyplot as plt
from crispy.CRISPRData import ReadCounts, Library

"""
Import sample data
"""
rawcounts, copynumber = cy.Utils.get_example_data()

"""
Import CRISPR-Cas9 library

Important:
    Library has to have the following columns: "Chr", "Start", "End", "Approved_Symbol"
    Library and segments have to have consistent "Chr" formatting: "Chr1" or "chr1" or "1"
    Guarantee that "Start" and "End" columns are int
"""
lib = Library.load_library("Yusa_v1.1.csv.gz")

lib = lib.rename(
    columns=dict(start="Start", end="End", chr="Chr", Gene="Approved_Symbol")
).dropna(subset=["Chr", "Start", "End"])

lib["Chr"] = "chr" + lib["Chr"]

lib["Start"] = lib["Start"].astype(int)
lib["End"] = lib["End"].astype(int)

"""
Calculate fold-change
"""
plasmids = ["ERS717283"]
rawcounts = ReadCounts(rawcounts).remove_low_counts(plasmids)
sgrna_fc = rawcounts.norm_rpm().foldchange(plasmids)

"""
Correct CRISPR-Cas9 sgRNA fold changes
"""
crispy = cy.Crispy(
    sgrna_fc=sgrna_fc.mean(1), copy_number=copynumber, library=lib.loc[sgrna_fc.index]
)

# Fold-changes and correction integrated function.
# Output is a modified/expanded BED-formatted data-frame with sgRNA and segments information.
# n_sgrna: represents the minimum number of sgRNAs required per segment to consider in the fit.
#   Recommended default values range between 4-10.
bed_df = crispy.correct(n_sgrna=10)
print(bed_df.head())

# Gaussian Process Regression is stored
crispy.gpr.plot(x_feature="ratio", y_feature="fold_change")
plt.show()
```

[Figure: GPR]
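The fold-change step above can be pictured as reads-per-million scaling followed by a log-ratio against the plasmid control. The sketch below reproduces that idea with pandas on a toy count matrix. The function names, the pseudocount of 1 and the toy values are assumptions for illustration, and Crispy's own `ReadCounts` methods may differ in detail.

```python
import numpy as np
import pandas as pd


def norm_rpm(counts, scale=1e6):
    """Scale each sample (column) to reads per million."""
    return counts / counts.sum(axis=0) * scale


def log2_fold_change(counts, control, pseudocount=1.0):
    """log2 ratio of each sample against the mean of the control columns.

    The pseudocount is an assumed value that avoids taking log2 of zero.
    """
    log_rpm = np.log2(norm_rpm(counts) + pseudocount)
    control_mean = log_rpm[control].mean(axis=1)
    return log_rpm.drop(columns=control).sub(control_mean, axis=0)


# Toy counts: rows are sgRNAs, columns are a plasmid control and two replicates.
counts = pd.DataFrame(
    {
        "plasmid": [100, 200, 300, 400],
        "rep1": [50, 200, 300, 450],
        "rep2": [100, 200, 300, 400],
    },
    index=["sg1", "sg2", "sg3", "sg4"],
)

fc = log2_fold_change(counts, control=["plasmid"])
print(fc.round(3))
```

Here `rep2` matches the plasmid exactly, so its fold-changes are zero, while `sg1` is depleted in `rep1` and gets a negative value.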

Credits and License

Developed at the Wellcome Sanger Institute (2017-2020).

For citation please refer to:

Gonçalves E, Behan FM, Louzada S, Arnol D, Stronach EA, Yang F, Yusa K, Stegle O, Iorio F, Garnett MJ (2019) Structural rearrangements generate cell-specific, gene-independent CRISPR-Cas9 loss of fitness effects. Genome Biol 20: 27

Gonçalves E, Thomas M, Behan FM, Picco G, Pacini C, Allen F, Parry-Smith D, Iorio F, Parts L, Yusa K, Garnett MJ (2019) Minimal genome-wide human CRISPR-Cas9 library. bioRxiv

Owner

Name: Emanuel
Login: EmanuelGoncalves
Kind: user
Location: Lisbon
Company: Instituto Superior Técnico (IST)

Website: https://emanuelgoncalves.github.io/
Twitter: emanuelvgo
Repositories: 7
Profile: https://github.com/EmanuelGoncalves

Assistant Professor

GitHub Events

Total

Watch event: 3

Last Year

Watch event: 3

Committers

Last synced: over 1 year ago

All Time

Total Commits: 415
Total Committers: 2
Avg Commits per committer: 207.5
Development Distribution Score (DDS): 0.335

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Emanuel Goncalves	e**4@s**k	276
eg14	e**s@g**m	139

Committer Domains (Top 20 + Academic)

sanger.ac.uk: 1

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 5
Total pull requests: 0
Average time to close issues: 21 days
Average time to close pull requests: N/A
Total issue authors: 3
Total pull request authors: 0
Average comments per issue: 1.8
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

laulabbumc (3)
eshinesimida (1)
agiraldeztrujillo (1)


Packages

Total packages: 1
Total downloads:
- pypi 221 last-month

Total dependent packages: 0
Total dependent repositories: 4
Total versions: 42
Total maintainers: 1

pypi.org: cy

Modelling CRISPR dropout data

Homepage: https://github.com/EmanuelGoncalves/crispy
Documentation: https://cy.readthedocs.io/
License: BSD License
Latest release: 0.5.8 (published almost 5 years ago)

Versions: 42
Dependent Packages: 0
Dependent Repositories: 4
Downloads: 221 Last month

Rankings

Dependent repos count: 7.5%

Dependent packages count: 10.0%

Downloads: 10.3%

Average: 12.5%

Forks count: 16.8%

Stargazers count: 17.7%

Maintainers (1)

EmanuelG

Last synced: 6 months ago