beditor

A Computational Workflow for Designing Libraries of sgRNAs for CRISPR-Mediated Base Editing, and much more

https://github.com/rraadd88/beditor

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Keywords

base-editing crispr genome-wide-targeted-mutagenesis guide-rna-library

Repository


Basic Info
  • Host: GitHub
  • Owner: rraadd88
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 1.87 MB
Statistics
  • Stars: 19
  • Watchers: 5
  • Forks: 5
  • Open Issues: 0
  • Releases: 0
Topics
base-editing crispr genome-wide-targeted-mutagenesis guide-rna-library
Created over 7 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

beditor(v2)

A Computational Workflow for Designing Libraries of sgRNAs for CRISPR-Mediated Base Editing, and much more

Usage

🖱️ GUI-mode

beditor gui

Note: GUI is recommended for designing small libraries and prioritization of the guides.

▶️ CLI-mode

beditor cli --editor BE1 -m path/to/mutations.tsv -o path/to/output_directory/ --species human --ensembl-release 110

or

beditor cli -c beditor_config.yml

Parameters

usage: beditor cli [--editor EDITOR] [-m MUTATIONS_PATH] [-o OUTPUT_DIR_PATH] [--species SPECIES] [--ensembl-release ENSEMBL_RELEASE] [--genome-path GENOME_PATH] [--gtf-path GTF_PATH] [-r RNA_PATH] [-p PRT_PATH] [-c CONFIG_PATH] [--search-window SEARCH_WINDOW] [-n] [-w WD_PATH] [-t THREADS] [-k KERNEL_NAME] [-v VERBOSE] [-i IGV_PATH_PREFIX] [--ext EXT] [-f] [-d] [--skip SKIP]

optional arguments:

  • -h, --help: show this help message and exit.
  • --editor EDITOR: base-editing method; available methods can be listed using the command 'beditor resources'.
  • -m MUTATIONS_PATH, --mutations-path MUTATIONS_PATH: path to the mutation file, the format of which is available at https://github.com/rraadd88/beditor/README.md#Input-format.
  • -o OUTPUT_DIR_PATH, --output-dir-path OUTPUT_DIR_PATH: path to the directory where the outputs should be saved.
  • --species SPECIES: species name.
  • --ensembl-release ENSEMBL_RELEASE: Ensembl release number.
  • --genome-path GENOME_PATH: path to the genome file, which is not available on Ensembl.
  • --gtf-path GTF_PATH: path to the gene annotations file, which is not available on Ensembl.
  • -r RNA_PATH, --rna-path RNA_PATH: path to the transcript sequences file, which is not available on Ensembl.
  • -p PRT_PATH, --prt-path PRT_PATH: path to the protein sequences file, which is not available on Ensembl.
  • --search-window SEARCH_WINDOW: number of bases to search on either side of a target; if not specified, it is inferred by beditor.
  • -n, --not-be: do not process as a base editor. Default: False.
  • -c CONFIG_PATH, --config-path CONFIG_PATH: path to the configuration file.
  • -w WD_PATH, --wd-path WD_PATH: path to the working directory.
  • -t THREADS, --threads THREADS: number of threads for parallel processing. Default: 1.
  • -k KERNEL_NAME, --kernel-name KERNEL_NAME: name of the jupyter kernel. Default: 'beditor'.
  • -v VERBOSE, --verbose VERBOSE: verbose, logging levels: DEBUG > INFO > WARNING > ERROR (default) > CRITICAL. Default: 'WARNING'.
  • -i IGV_PATH_PREFIX, --igv-path-prefix IGV_PATH_PREFIX: prefix to be added to the IGV URL.
  • --ext EXT: file extensions of the output tables.
  • -f, --force: Default: False.
  • -d, --dbug: Default: False.
  • --skip SKIP: skip sections of the workflow.

Notes: Required parameters for assigning a species: species and ensembl_release, or genome_path, gtf_path, rna_path and prt_path.
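The note on required species parameters can be read as an either/or rule. The following is a minimal sketch of that rule, not beditor's own validation code: the function name `species_source` and the choice to prefer the Ensembl pair when both sets are given are assumptions made for illustration.

```python
def species_source(params: dict) -> str:
    """Decide which set of species parameters is complete.

    Hypothetical helper, not part of beditor. Returns 'ensembl' when
    species and ensembl_release are given, 'custom' when all four file
    paths are given, and raises ValueError otherwise.
    """
    ensembl_keys = ("species", "ensembl_release")
    custom_keys = ("genome_path", "gtf_path", "rna_path", "prt_path")

    def complete(keys):
        return all(params.get(key) is not None for key in keys)

    if complete(ensembl_keys):  # assumed precedence when both sets are present
        return "ensembl"
    if complete(custom_keys):
        return "custom"
    raise ValueError(
        "provide species and ensembl_release, or "
        "genome_path, gtf_path, rna_path and prt_path"
    )


print(species_source({"species": "human", "ensembl_release": 110}))
```

A check like this could run before launching a long job, so that an incomplete parameter set fails early.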

Installation

Virtual environment and naming the kernel (recommended)

conda create -n beditor python=3.9 # options: conda/mamba, python=3.9/3.8
python -m ipykernel install --user --name beditor

Installation of the package

pip install beditor[all]

Optional dependencies, as required:

pip install beditor # only cli
pip install beditor[gui] # plus gui

For fast processing of large genomes (highly recommended for human genome):

conda install bioconda::ucsc-fatotwobit bioconda::ucsc-twobittofa bioconda::ucsc-twobitinfo # options: conda/mamba

Else, for moderately fast processing,

conda install bioconda::bedtools # options: conda/mamba

Input format

Note: The coordinates are 1-based (i.e. X:1-1 instead of X:0-1) and IDs correspond to the chosen genome assemblies (e.g. from Ensembl).

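Point mutations

| chrom | start | end | strand | mutation |
|-------|-------|------|--------|----------|
| 5 | 1123 | 1123 | + | C |

Position scanning

| chrom | start | end | strand |
|-------|-------|------|--------|
| 5 | 1123 | 1123 | + |

Region scanning

| chrom | start | end | strand |
|-------|-------|------|--------|
| 5 | 1123 | 2123 | + |

Protein point mutations

| protein id | aa pos | mutation |
|------------|--------|----------|
| ENSP1123 | 43 | S |

Protein position scanning

| protein id | aa pos |
|------------|--------|
| ENSP1123 | 43 |

Protein region scanning

| protein id | aa start | aa end |
|------------|----------|--------|
| ENSP1123 | 43 | 143 |

Note: Ensembl protein IDs are used.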
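A point-mutation input file can be produced programmatically. The sketch below assumes the file is tab-separated (the CLI example uses a `.tsv` path) and that the header names match the point-mutations table exactly. The helper name `write_point_mutations` is hypothetical.

```python
import csv
import io


def write_point_mutations(rows, handle):
    """Write point mutations as a tab-separated table with a header row."""
    columns = ["chrom", "start", "end", "strand", "mutation"]
    writer = csv.DictWriter(
        handle, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Input coordinates are 1-based, so start must not exceed end.
        if row["start"] > row["end"]:
            raise ValueError(f"start > end in row: {row}")
        writer.writerow(row)


# In practice, pass an open file instead of an in-memory buffer.
buffer = io.StringIO()
write_point_mutations(
    [{"chrom": "5", "start": 1123, "end": 1123, "strand": "+", "mutation": "C"}],
    buffer,
)
print(buffer.getvalue())
```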

Output format

Note: the output uses 0-based coordinates.

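| guide sequence | guide locus | offtargets | score | {columns in the input} |
|----------------|-------------|------------|-------|------------------------|
| AGCGTTTGGCAAATCAAACAAAA | 4:1003215-1003238(+) | 0 | 1 | .. |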
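Because inputs are 1-based and outputs are 0-based, converting between the two is a common step. The sketch below assumes that output loci are 0-based and half-open. This is consistent with the example locus above, where 1003238 - 1003215 = 23 matches the 23-nt sequence, but the convention is not stated explicitly. Both function names are hypothetical.

```python
def to_zero_based(start: int, end: int) -> tuple:
    """Convert a 1-based inclusive interval to a 0-based half-open one."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based interval: {start}-{end}")
    return start - 1, end


def to_one_based(start: int, end: int) -> tuple:
    """Convert a 0-based half-open interval to a 1-based inclusive one."""
    if start < 0 or end <= start:
        raise ValueError(f"invalid 0-based interval: {start}-{end}")
    return start + 1, end


print(to_zero_based(1123, 1123))
print(to_one_based(1003215, 1003238))
```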

Supported base editing methods

| method | nucleotide | nucleotide mutation | window start | window end | guide length | PAM | PAM position |
|-------------|------------|---------------------|--------------|------------|--------------|--------|--------------|
| A3A-BE3 | C | T | 4 | 8 | 20 | NGG | down |
| ABE7.10 | A | G | 4 | 7 | 20 | NGG | down |
| ABE7.10* | A | G | 4 | 8 | 20 | NGG | down |
| ABE7.9 | A | G | 5 | 8 | 20 | NGG | down |
| ABESa | A | G | 6 | 12 | 21 | NNGRRT | down |
| BE-PLUS | C | T | 4 | 14 | 20 | NGG | down |
| BE1 | C | T | 4 | 8 | 20 | NGG | down |
| BE2 | C | T | 4 | 8 | 20 | NGG | down |
| BE3 | C | T | 4 | 8 | 20 | NGG | down |
| BE4-Gam | C | T | 4 | 8 | 20 | NGG | down |
| BE4/BE4max | C | T | 4 | 8 | 20 | NGG | down |
| Cas12a-BE | C | T | 10 | 12 | 23 | TTTV | up |
| eA3A-BE3 | C | T | 4 | 8 | 20 | NGG | down |
| EE-BE3 | C | T | 5 | 6 | 20 | NGG | down |
| HF-BE3 | C | T | 4 | 8 | 20 | NGG | down |
| Sa(KKH)-ABE | A | G | 6 | 12 | 21 | NNNRRT | down |
| SA(KKH)-BE3 | C | T | 3 | 12 | 21 | NNNRRT | down |
| SaBE3 | C | T | 3 | 12 | 21 | NNGRRT | down |
| SaBE4 | C | T | 3 | 12 | 21 | NNGRRT | down |
| SaBE4-Gam | C | T | 3 | 12 | 21 | NNGRRT | down |
| Target-AID | C | T | 2 | 4 | 20 | NGG | down |
| Target-AID | C | T | 2 | 4 | 20 | NG | down |
| VQR-ABE | A | G | 4 | 6 | 20 | NGA | down |
| VQR-BE3 | C | T | 4 | 11 | 20 | NGAN | down |
| VRER-ABE | A | G | 4 | 6 | 20 | NGCG | down |
| VRER-BE3 | C | T | 3 | 10 | 20 | NGCG | down |
| xBE3 | C | T | 4 | 8 | 20 | NG | down |
| YE1-BE3 | C | T | 5 | 7 | 20 | NGG | down |
| YE2-BE3 | C | T | 5 | 6 | 20 | NGG | down |
| YEE-BE3 | C | T | 5 | 6 | 20 | NGG | down |
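The window columns describe where in the guide an editor is active. As a rough sketch, assuming window positions are 1-based and counted from the 5' end of the guide (the table does not state the convention), the editable nucleotides of a guide could be listed as follows. The function `editable_positions` is hypothetical and not part of beditor.

```python
def editable_positions(
    guide: str, nucleotide: str, window_start: int, window_end: int
) -> list:
    """List 1-based guide positions inside the window that carry `nucleotide`."""
    return [
        pos
        for pos in range(window_start, window_end + 1)
        if pos <= len(guide) and guide[pos - 1] == nucleotide
    ]


# Values from the BE3 row: C is edited, window 4 to 8, guide length 20.
guide = "AAACAAC" + "A" * 13  # made-up 20-nt guide with C at positions 4 and 7
print(editable_positions(guide, "C", 4, 8))
```

More than one hit inside the window means that an edit at the intended base may be accompanied by edits at the neighbouring ones.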

Favorite base editor not listed?
Please send the required info using a PR, or an issue.

Change log

v2

New features:
1. Design libraries for base or amino acid mutational scanning, at defined positions and regions.
2. The gui contains library filtering and prioritization options.
3. Non-base editing applications, e.g. CRISPR-tiling, using not_be option.

Key updates:
1. Quicker installation due to a reduced number of dependencies (bwa comes in the package, and samtools is not needed).
2. Faster run-time, compared to v1, because of improvements in the dependencies, e.g. pandas.
3. Faster run-time on large genomes, e.g. the human genome, because of the use of 2bit tools.
4. Direct command-line options to use non-model species, e.g. those not indexed on Ensembl.
5. Configuration made optional.

Technical updates:
1. The gui is powered by mercury, thus overcoming the limitations of v1.
2. Use of one base editor (method) per run, instead of multiple.
3. Due to overall faster run-times, parallelization within a run is disabled. However, multiple runs can be parallelized externally, e.g. using Python's built-in multiprocessing.
4. Only the sgRNAs for which the target lies within the optimal activity window are reported. Therefore, the penalty for the target not being in the activity window is no longer used, but the options are retained for backward compatibility.
5. Many refactored functions can now be imported and executed independently for "much more" applications.
6. Reports are generated for each run in the form of a jupyter notebook.
7. Automated testing on GitHub for continuous integration.
8. The cli is compatible with python 3.8 and 3.9 (and possibly higher, untested versions); however, the gui is not supported on python 3.7 due to a lack of dependencies.
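External parallelization of several runs could look like the sketch below, which uses `multiprocessing.Pool` to launch one `beditor cli -c ...` process per configuration file. The config file names are hypothetical, and the function names `beditor_command` and `run_all` are introduced here for illustration only.

```python
import multiprocessing
import subprocess


def beditor_command(config_path: str) -> list:
    """Build the argument list for one run driven by a configuration file."""
    return ["beditor", "cli", "-c", config_path]


def run_all(config_paths: list, processes: int = 2) -> list:
    """Run one beditor process per config file and return the exit codes."""
    commands = [beditor_command(path) for path in config_paths]
    with multiprocessing.Pool(processes=processes) as pool:
        # subprocess.call blocks until the command finishes and returns its exit code.
        return pool.map(subprocess.call, commands)


if __name__ == "__main__":
    # Hypothetical config files; only the commands are printed here.
    print([beditor_command(path) for path in ["run1.yml", "run2.yml"]])
    # exit_codes = run_all(["run1.yml", "run2.yml"])  # requires beditor to be installed
```

Each run should write to its own output directory so that parallel runs do not overwrite each other's files.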

Future directions, for which contributions are welcome:

  • [ ] Adding option to provide 0-based co-ordinates in the input.

Similar projects:

  • http://www.rgenome.net/be-designer/
  • http://yang-laboratory.com/BEable-GPS
  • https://github.com/maxwshen/bepredictbystander
  • https://github.com/maxwshen/bepredictefficiency
  • https://fgcz-shiny.uzh.ch/PnBDesigner/

How to cite?

v2

  1. Using BibTeX:
    @software{Dandage_beditor,
      title = {beditor: A Computational Workflow for Designing Libraries of sgRNAs for CRISPR-Mediated Base Editing},
      author = {Dandage, Rohan},
      year = {2024},
      url = {https://doi.org/10.5281/zenodo.10648264},
      version = {v2.0.1},
      note = {The URL is a DOI link to the permanent archive of the software.},
    }

  2. DOI link: DOI, or

  3. Using citation information from CITATION.CFF file.

v1

  1. Using BibTeX:

```
@software{Dandage_beditorv1,
  title = {beditor: A Computational Workflow for Designing Libraries of sgRNAs for CRISPR-Mediated Base Editing},
  author = {Dandage, Rohan},
  year = {2019},
  url = {https://doi.org/10.1534/genetics.119.302089},
  version = {v1},
}
```

API

module beditor.lib.get_mutations

Mutation co-ordinates using pyensembl


function get_protein_cds_coords

python get_protein_cds_coords(annots, protein_id: str) → DataFrame

Get protein CDS coordinates

Args:

  • annots: pyensembl annotations
  • protein_id (str): protein ID

Returns:

  • pd.DataFrame: output table

function get_protein_mutation_coords

python get_protein_mutation_coords(data: DataFrame, aapos: int, test=False) → tuple

Get protein mutation coordinates

Args:

  • data (pd.DataFrame): input table
  • aapos (int): amino acid position
  • test (bool, optional): test-mode. Defaults to False.

Raises:

  • ValueError: invalid positions

Returns:

  • tuple: aapos,start,end,seq
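The arithmetic behind mapping an amino acid position to its codon can be sketched as follows. This is a simplified illustration that works on offsets within a CDS string. Mapping to genomic coordinates also has to account for exon structure and strand, which is not shown. The function name `codon_coords` is hypothetical.

```python
def codon_coords(cds: str, aapos: int) -> tuple:
    """Return (aapos, start, end, codon) for a 1-based amino acid position.

    start and end are 0-based, half-open offsets within the CDS.
    """
    if aapos < 1 or 3 * aapos > len(cds):
        raise ValueError(f"invalid amino acid position: {aapos}")
    start = 3 * (aapos - 1)
    end = start + 3
    return aapos, start, end, cds[start:end]


cds = "ATGTCTGAA"  # made-up CDS encoding Met-Ser-Glu
print(codon_coords(cds, 2))  # (2, 3, 6, 'TCT')
```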

function map_coords

python map_coords(df_: DataFrame, df1_: DataFrame, verbose: bool = False) → DataFrame

Map coordinates

Args:

  • df_ (pd.DataFrame): input table

Returns:

  • pd.DataFrame: output table

function get_mutation_coords_protein

python get_mutation_coords_protein( df0: DataFrame, annots, search_window: int, outd: str = None, force: bool = False, verbose: bool = False ) → DataFrame

Get mutation coordinates for protein

Args:

  • df0 (pd.DataFrame): input table
  • annots (type): pyensembl annotations
  • search_window (int): search window length on either side of the target
  • outd (str, optional): output directory path. Defaults to None.
  • force (bool, optional): force. Defaults to False.
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • pd.DataFrame: output table

function get_mutation_coords

python get_mutation_coords( df0: DataFrame, annots, search_window: int, verbose: bool = False, **kws_protein ) → DataFrame

Get mutation coordinates

Args:

  • df0 (pd.DataFrame): input table
  • annots (type): pyensembl annotation
  • search_window (int): search window length on either side of the target
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • pd.DataFrame: output table

module beditor.lib.get_scores

Scores


function get_ppamdist

python get_ppamdist( guide_length: int, pam_len: int, pam_pos: str, ppamdist_min: int ) → DataFrame

Get penalties set based on distances of the mismatch/es from PAM

:param guide_length: length of guide sequence
:param pam_len: length of PAM sequence
:param pam_pos: PAM location 3' or 5'
:param ppamdist_min: minimum penalty
:param pmutatpam: penalty for mismatch at PAM

TODOs: Use different scoring function for different methods.


function get_beditorscore_per_alignment

python get_beditorscore_per_alignment( NM: int, alignment: str, pam_len: int, pam_pos: str, pentalty_genic: float = 0.5, pentalty_intergenic: float = 0.9, pentalty_dist_from_pam: float = 0.1, verbose: bool = False ) → float

Calculates beditor score per alignment between guide and genomic DNA.

:param NM: Hamming distance
:param mismatches_max: Maximum mismatches allowed in alignment
:param alignment: Symbol '|' means a match, '.' means mismatch and ' ' means gap. e.g. |||||.||||||||||.||||.|
:param pentalty_genic: penalty for genic alignment
:param pentalty_intergenic: penalty for intergenic alignment
:param pentalty_dist_from_pam: maximum penalty for a mismatch at PAM ()
:returns: beditor score per alignment.


function get_beditorscore_per_guide

python get_beditorscore_per_guide( guide_seq: str, strategy: str, align_seqs_scores: DataFrame, dBEs: DataFrame, penalty_activity_window: float = 0.5, test: bool = False ) → float

Calculates beditor score per guide.

:param guide_seq: guide sequence 23nts
:param strategy: strategy string eg. ABE;+;@-14;ACT:GCT;T:A;
:param align_seqs_scores: list of beditor scores per alignments for all the alignments between guide and genomic DNA
:param penalty_activity_window: if editable base is not in activity window, penalty_activity_window=0.5
:returns: beditor score per guide.


function revcom

python revcom(s)
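No description is given for `revcom`. Judging by the name, it returns the reverse complement of a sequence. A minimal stand-alone sketch of that operation, written under this assumption and not taken from beditor's source, is:

```python
# Translation table for DNA bases; N is its own complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AAACG"))  # CGTTT
```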


function calc_cfd

python calc_cfd(wt, sg, pam)


function get_cfdscore

python get_cfdscore(wt, off)

module beditor.lib.get_specificity

Specificities


function run_alignment

python run_alignment( src_path: str, genomep: str, guidesfap: str, guidessamp: str, guidel: int, mismatches_max: int = 2, threads: int = 1, force: bool = False, verbose: bool = False ) → str

Run alignment

Args:

  • src_path (str): source path
  • genomep (str): genome path
  • guidesfap (str): guide fasta path
  • guidessamp (str): guide sam path
  • threads (int, optional): threads. Defaults to 1.
  • force (bool, optional): force. Defaults to False.
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: alignment file.

function read_sam

python read_sam(align_path: str) → DataFrame

read alignment file

Args:

  • align_path (str): path to the alignment file

Returns:

  • pd.DataFrame: output table

Notes:

| Tag | Meaning |
|-----|---------|
| NM | Edit distance |
| MD | Mismatching positions/bases |
| AS | Alignment score |
| BC | Barcode sequence |
| X0 | Number of best hits |
| X1 | Number of suboptimal hits found by BWA |
| XN | Number of ambiguous bases in the reference |
| XM | Number of mismatches in the alignment |
| XO | Number of gap opens |
| XG | Number of gap extensions |
| XT | Type: Unique/Repeat/N/Mate-sw |
| XA | Alternative hits; format: (chr,pos,CIGAR,NM;)* |
| XS | Suboptimal alignment score |
| XF | Support from forward/reverse alignment |
| XE | Number of supporting seeds |

Reference: https://bio-bwa.sourceforge.net/bwa.shtml
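In a SAM record, these tags follow the 11 mandatory columns as `TAG:TYPE:VALUE` fields. The sketch below shows one way to collect them into a dictionary with the standard library. It is not beditor's `read_sam`, and the example record is made up for illustration.

```python
def parse_sam_tags(line: str) -> dict:
    """Parse the optional TAG:TYPE:VALUE fields of one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are mandatory
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            value = int(value)
        elif type_code == "f":
            value = float(value)
        tags[tag] = value
    return tags


# Hypothetical record for a perfectly matching, unique 23-nt alignment.
record = "\t".join(
    ["guide1", "0", "4", "1003216", "37", "23M", "*", "0", "0",
     "AGCGTTTGGCAAATCAAACAAAA", "*",
     "XT:A:U", "NM:i:0", "X0:i:1", "X1:i:0", "MD:Z:23"]
)
print(parse_sam_tags(record))
```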


function parse_XA

python parse_XA(XA: str) → DataFrame

Parse XA tags

Args:

  • XA (str): XA tag

Notes:

format: (chr,pos,CIGAR,NM;)

Example: XA='4,+908051,23M,0;4,+302823,23M,0;4,-183556,23M,0;4,+1274932,23M,0;4,+207765,23M,0;4,+456906,23M,0;4,-1260135,23M,0;4,+454215,23M,0;4,-1177442,23M,0;4,+955254,23M,1;4,+1167921,23M,1;4,-613257,23M,1;4,+857893,23M,1;4,-932678,23M,2;4,-53825,23M,2;4,+306783,23M,2;'
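Given the format `(chr,pos,CIGAR,NM;)*`, an XA string can be split into one record per alternative hit. The following sketch returns plain dictionaries rather than a DataFrame and is an illustration only, not beditor's `parse_XA`. The sign of `pos` is stored separately as the strand.

```python
def parse_xa(xa: str) -> list:
    """Split an XA tag value into a list of alternative-hit records."""
    hits = []
    for entry in xa.strip(";").split(";"):
        if not entry:
            continue  # empty XA value
        chrom, pos, cigar, nm = entry.split(",")
        hits.append(
            {
                "chrom": chrom,
                "strand": pos[0],  # leading '+' or '-'
                "pos": int(pos[1:]),
                "CIGAR": cigar,
                "NM": int(nm),
            }
        )
    return hits


hits = parse_xa("4,+908051,23M,0;4,-183556,23M,0;4,+955254,23M,1;")
print(hits)
```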


function get_extra_alignments

python get_extra_alignments( df1: DataFrame, genome: str, bed_path: str, alignments_max: int = 10, threads: int = 1 ) → DataFrame

Get extra alignments

Args:

  • df1 (pd.DataFrame): input table
  • alignments_max (int, optional): alignments max. Defaults to 10.
  • threads (int, optional): threads. Defaults to 1.

Returns:

  • pd.DataFrame: output table

TODOs: 1. apply parallel processing to get_seq


function to_pam_coord

python to_pam_coord( pam_pos: str, pam_len: int, align_start: int, align_end: int, strand: str ) → tuple

Get PAM coords

Args:

  • pam_pos (str): PAM position
  • pam_len (int): PAM length
  • align_start (int): alignment start
  • align_end (int): alignment end
  • strand (str): strand

Returns:

  • tuple: start,end

function get_alignments

python get_alignments( align_path: str, genome: str, alignments_max: int, pam_pos: str, pam_len: int, guide_len: int, pam_pattern: str, pam_bed_path: str, extra_bed_path: str, **kws_xa ) → DataFrame

Get alignments

Args:

  • align_path (str): alignment path
  • genome (str): genome path
  • pam_pos (str): PAM position
  • pam_len (int): PAM length
  • guide_len (int): sgRNA length
  • pam_pattern (str): PAM pattern
  • pam_bed_path (str): PAM bed path

Returns:

  • pd.DataFrame: output table

function get_penalties

python get_penalties( aligns: DataFrame, guides: DataFrame, annots: DataFrame ) → DataFrame

Get penalties

Args:

  • aligns (pd.DataFrame): alignments
  • guides (pd.DataFrame): guides
  • annots (pd.DataFrame): annotations

Returns:

  • pd.DataFrame: output table

function score_alignments

python score_alignments( df4: DataFrame, pam_len: int, pam_pos: str, pentalty_genic: float = 0.5, pentalty_intergenic: float = 0.9, pentalty_dist_from_pam: float = 0.1, verbose: bool = False ) → tuple

Score alignments

Args:

  • df4 (pd.DataFrame): input table
  • pam_pos (str): PAM position
  • pentalty_genic (float, optional): penalty for offtarget in genic locus. Defaults to 0.5.
  • pentalty_intergenic (float, optional): penalty for offtarget in intergenic locus. Defaults to 0.9.
  • pentalty_dist_from_pam (float, optional): penalty for offtarget wrt distance from PAM. Defaults to 0.1.
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • tuple: tables

Note:

  1. Low value corresponds to high penalty and vice versa, because values are multiplied.
  2. High penalty means consequential offtarget alignment and vice versa.
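The multiplicative convention in the note can be illustrated with a small sketch. It only demonstrates why a lower factor acts as a stronger penalty. It is not beditor's scoring formula, and the function name `combine_penalties` is hypothetical. The factors 0.5 and 0.9 are the documented defaults for genic and intergenic off-targets.

```python
import math


def combine_penalties(penalties: list) -> float:
    """Multiply penalty factors; with no off-targets the result stays at 1."""
    return math.prod(penalties)


no_offtargets = combine_penalties([])
one_intergenic = combine_penalties([0.9])
one_genic = combine_penalties([0.5])
genic_and_intergenic = combine_penalties([0.5, 0.9])

# The genic off-target (factor 0.5) lowers the product more than the intergenic one (0.9).
print(no_offtargets, one_intergenic, one_genic, genic_and_intergenic)
```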

function score_guides

python score_guides( guides: DataFrame, scores: DataFrame, not_be: bool = False ) → DataFrame

Score guides

Args:

  • guides (pd.DataFrame): guides
  • scores (pd.DataFrame): scores
  • not_be (bool, optional): not a base editor. Defaults to False.

Returns:

  • pd.DataFrame: output table

Changes: penalty_activity_window disabled as only the sgRNAs with target in the window are reported.

module beditor.lib.io

Input/Output


function download_annots

python download_annots(species_name: str, release: int) → bool

Download annotations using pyensembl

Args:

  • species_name (str): species name
  • release (int): release number

Returns:

  • bool: whether annotation is downloaded or not

function cache_subdirectory

python cache_subdirectory( reference_name: str = None, annotation_name: str = None, annotation_version: int = None, CACHE_BASE_SUBDIR: str = 'beditor' ) → str

Which cache subdirectory to use for a given annotation database over a particular reference. All arguments can be omitted to just get the base subdirectory for all pyensembl cached datasets.

Args:

  • reference_name (str, optional): reference name. Defaults to None.
  • annotation_name (str, optional): annotation name. Defaults to None.
  • annotation_version (int, optional): annotation version. Defaults to None.
  • CACHE_BASE_SUBDIR (str, optional): cache path. Defaults to 'beditor'.

Returns:

  • str: output path

function cached_path

python cached_path(path_or_url: str, cache_directory_path: str)

When downloading remote files, the default behavior is to name local files the same as their remote counterparts.


function to_downloaded_cached_path

python to_downloaded_cached_path( url: str, annots=None, reference_name: str = None, annotation_name: str = 'ensembl', ensembl_release: str = None, CACHE_BASE_SUBDIR: str = 'pyensembl' ) → str

To downloaded cached path

Args:

  • url (str): URL
  • annots (optional): pyensembl annotation. Defaults to None.
  • reference_name (str, optional): reference name. Defaults to None.
  • annotation_name (str, optional): annotation name. Defaults to 'ensembl'.
  • ensembl_release (str, optional): ensembl release. Defaults to None.
  • CACHE_BASE_SUBDIR (str, optional): cache path. Defaults to 'pyensembl'.

Returns:

  • str: output path

function download_genome

python download_genome( species: str, ensembl_release: int, force: bool = False, verbose: bool = False ) → str

Download genome

Args:

  • species (str): species name
  • ensembl_release (int): release
  • force (bool, optional): force. Defaults to False.
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: output path

function read_genome

python read_genome(genome_path: str, fast=True)

Read genome

Args:

  • genome_path (str): genome path
  • fast (bool, optional): fast mode. Defaults to True.

function to_fasta

python to_fasta( sequences: dict, output_path: str, molecule_type: str, force: bool = True, **kws_SeqRecord ) → str

Save fasta file.

Args:

  • sequences (dict): dictionary mapping the sequence name to the sequence.
  • output_path (str): path of the fasta file.
  • force (bool): overwrite if file exists.

Returns:

  • output_path (str): path of the fasta file
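The signature suggests that `to_fasta` builds on Biopython's `SeqRecord`. For readers without that dependency, writing a name-to-sequence dictionary as FASTA can be sketched with the standard library alone. The helper `write_fasta` and its `width` parameter are assumptions, not beditor's API.

```python
import io


def write_fasta(sequences: dict, handle, width: int = 60) -> None:
    """Write {name: sequence} records in FASTA format, wrapping lines at `width`."""
    for name, seq in sequences.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


fasta = io.StringIO()
write_fasta({"guide1": "AGCGTTTGGCAAATCAAACAAAA"}, fasta)
print(fasta.getvalue())
```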

function to_2bit

python to_2bit( genome_path: str, src_path: str = None, force: bool = False, verbose: bool = False ) → str

To 2bit

Args:

  • genome_path (str): genome path
  • src_path (str, optional): source path. Defaults to None.
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: output path

function to_fasta_index

python to_fasta_index( genome_path: str, bgzip: bool = False, bgzip_path: str = None, threads: int = 1, verbose: bool = True, force: bool = False, indexed: bool = False ) → str

To fasta index

Args:

  • genome_path (str): genome path
  • bgzip_path (str, optional): bgzip path. Defaults to None.
  • threads (int, optional): threads. Defaults to 1.
  • verbose (bool, optional): verbose. Defaults to True.
  • force (bool, optional): force. Defaults to False.
  • indexed (bool, optional): indexed or not. Defaults to False.

Returns:

  • str: output path

function to_bed

python to_bed( df: DataFrame, outp: str, cols: list = ['chrom', 'start', 'end', 'locus', 'score', 'strand'] ) → str

To bed path

Args:

  • df (pd.DataFrame): input table
  • outp (str): output path
  • cols (list, optional): columns. Defaults to ['chrom','start','end','locus','score','strand'].

Returns:

  • str: output path
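The default column order corresponds to a six-column BED layout. As a sketch, using plain dictionaries instead of a DataFrame and a hypothetical helper name, such rows could be written as tab-separated lines without a header:

```python
import csv
import io

BED_COLUMNS = ["chrom", "start", "end", "locus", "score", "strand"]


def write_bed(rows, handle, cols=BED_COLUMNS) -> None:
    """Write rows as tab-separated BED lines (no header), in `cols` order."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([row[col] for col in cols])


bed = io.StringIO()
write_bed(
    [{"chrom": "4", "start": 1003215, "end": 1003238,
      "locus": "4:1003215-1003238(+)", "score": 1, "strand": "+"}],
    bed,
)
print(bed.getvalue())
```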

function read_bed

python read_bed( p: str, cols: list = ['chrom', 'start', 'end', 'locus', 'score', 'strand'] ) → DataFrame

Read bed file

Args:

  • p (str): path
  • cols (list, optional): columns. Defaults to ['chrom','start','end','locus','score','strand'].

Returns:

  • pd.DataFrame: output table

function to_viz_inputs

python to_viz_inputs( gtf_path: str, genome_path: str, output_dir_path: str, output_ext: str = 'tsv', threads: int = 1, force: bool = False ) → dict

To viz inputs for the IGV

Args:

  • gtf_path (str): GTF path
  • genome_path (str): genome path
  • output_dir_path (str): output directory path
  • output_ext (str, optional): output extension. Defaults to 'tsv'.
  • threads (int, optional): threads. Defaults to 1.
  • force (bool, optional): force. Defaults to False.

Returns:

  • dict: configuration

function to_igv_path_prefix

python to_igv_path_prefix() → str

Get IGV path prefix

Returns:

  • str: URL

function to_session_path

python to_session_path(p: str, path_prefix: str = None, outp: str = None) → str

To session path

Args:

  • p (str): session configuration path
  • path_prefix (str, optional): path prefix. Defaults to None.
  • outp (str, optional): output path. Defaults to None.

Returns:

  • str: output path

function read_cytobands

python read_cytobands( cytobands_path: str, col_chrom: str = 'chromosome', remove_prefix: str = 'chr' ) → DataFrame

Read cytobands

Args:

  • cytobands_path (str): path
  • col_chrom (str, optional): column with contig. Defaults to 'chromosome'.

Returns:

  • pd.DataFrame: output table

function to_output

python to_output(inputs: DataFrame, guides: DataFrame, scores: DataFrame) → DataFrame

To output table

Args:

  • inputs (pd.DataFrame): inputs
  • guides (pd.DataFrame): guides
  • scores (pd.DataFrame): scores

Returns:

  • pd.DataFrame: output table

module beditor.lib.make_guides

Designing the sgRNAs


function get_guide_pam

python get_guide_pam( match: str, pam_stream: str, guidel: int, seq: str, pos_codon: int = None )


function get_pam_searches

python get_pam_searches(dpam: DataFrame, seq: str, pos_codon: int) → DataFrame

Search PAM occurrence

:param dpam: dataframe with PAM sequences
:param seq: target sequence
:param pos_codon: reading frame
:param test: debug mode on
:returns dpamsearches: dataframe with positions of pams


function get_guides

python get_guides( data: DataFrame, dpam: DataFrame, guide_len: int, base_fraction_max: float = 0.8 ) → DataFrame

Get guides

Args:

  • data (pd.DataFrame): input table
  • dpam (pd.DataFrame): table with PAM info
  • guide_len (int): guide length
  • base_fraction_max (float, optional): base fraction max. Defaults to 0.8.

Returns:

  • pd.DataFrame: output table
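The general idea of guide design for a downstream PAM can be sketched as a regex search followed by extraction of the bases immediately 5' of each PAM. The example below covers the + strand only and uses a hard-coded pattern for NGG. It is a simplified illustration with hypothetical names, not beditor's `get_guides`.

```python
import re


def find_pam_sites(seq: str, pam_regex: str = "[ACGT]GG", guide_len: int = 20) -> list:
    """Find PAMs on the + strand and return the guide upstream of each one."""
    sites = []
    # A lookahead lets overlapping PAM occurrences be reported as well.
    for match in re.finditer(f"(?=({pam_regex}))", seq):
        pam_start = match.start()
        guide_start = pam_start - guide_len
        if guide_start < 0:
            continue  # not enough sequence upstream for a full-length guide
        sites.append(
            {
                "guide": seq[guide_start:pam_start],
                "pam": match.group(1),
                "pam_start": pam_start,  # 0-based offset within seq
            }
        )
    return sites


seq = "AAAC" + "A" * 16 + "TGG"  # made-up 20-nt guide followed by a TGG PAM
sites = find_pam_sites(seq)
print(sites)
```

Searching the - strand can be done by running the same function on the reverse complement of the sequence and converting the offsets back.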

function to_locusby_pam

python to_locusby_pam( chrom: str, pam_start: int, pam_end: int, pam_position: str, strand: str, length: int, start_off: int = 0 ) → str

To locus by PAM from PAM coords.

Args:

  • chrom (str): chrom
  • pam_start (int): PAM start
  • pam_end (int): PAM end
  • pam_position (str): PAM position
  • strand (str): strand
  • length (int): length

Returns:

  • str: locus

function to_pam_coord

python to_pam_coord( startf: int, endf: int, startp: int, endp: int, strand: str ) → tuple

To PAM coordinates

Args:

  • startf (int): flank start
  • endf (int): flank end
  • startp (int): PAM start
  • endp (int): PAM end
  • strand (str): strand

Returns:

  • tuple: start,end

function get_distances

python get_distances(df2: DataFrame, df3: DataFrame, cfg_method: dict) → DataFrame

Get distances

Args:

  • df2 (pd.DataFrame): input table #1
  • df3 (pd.DataFrame): input table #2
  • cfg_method (dict): config for the method

Returns:

  • pd.DataFrame: output table

function get_windows_seq

python get_windows_seq(s: str, l: str, wl: str, verbose: bool = False) → str

Sequence by guide strand

Args:

  • s (str): sequence
  • l (str): locus
  • wl (str): window locus
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: window sequence

function filter_guides

python filter_guides( df1: DataFrame, cfg_method: dict, verbose: bool = False ) → DataFrame

Filter sgRNAs

Args:

  • df1 (pd.DataFrame): input table
  • cfg_method (dict): config of the method
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • pd.DataFrame: output table

function get_window_target_overlap

python get_window_target_overlap( tstart: int, tend: int, wl: str, ws: str, nt: str, verbose: bool = False ) → tuple

Get window target overlap

Args:

  • tstart (int): target start
  • tend (int): target end
  • wl (str): window locus
  • ws (str): window sequence
  • nt (str): nucleotide
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • tuple: window_overlaps_the_target, wts, nt_in_overlap, wtl

function get_mutated_codon

python get_mutated_codon( ts: str, tl: str, tes: str, tel: str, strand: str, verbose: bool = False ) → str

Get mutated codon

Args:

  • ts (str): target sequence
  • tl (str): target locus
  • tes (str): target edited sequence
  • tel (str): target edited locus
  • strand (str): strand
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: mutated codon

function get_coedits_base

python get_coedits_base( ws: str, wl: str, wts: str, wtl: str, nt: str, verbose: bool = False ) → str

Get co-edited bases

Args:

  • ws (str): window sequence
  • wl (str): window locus
  • wts (str): window target overlap sequence
  • wtl (str): window target overlap locus
  • nt (str): nucleotide
  • verbose (bool, optional): verbose. Defaults to False.

Returns:

  • str: coedits

module beditor.lib

module beditor.lib.methods

Global Variables

  • multint2reg
  • multint2regcomplement

function dpam2dpam_strands

python dpam2dpam_strands(dpam: DataFrame, pams: list) → DataFrame

Duplicates dpam dataframe to be compatible for searching PAMs on - strand

Args:

  • dpam (pd.DataFrame): dataframe with pam information
  • pams (list): pams to be used for actual designing of guides.

Returns:

  • pd.DataFrame: table

function get_be2dpam

python get_be2dpam( din: DataFrame, methods: list = None, test: bool = False, cols_dpam: list = ['PAM', 'PAM position', 'guide length'] ) → dict

Make BE to dpam mapping i.e. dict

Args:

  • din (pd.DataFrame): table with BE and PAM info all cols_dpam needed
  • methods (list, optional): method names. Defaults to None.
  • test (bool, optional): test-mode. Defaults to False.
  • cols_dpam (list, optional): columns to be used. Defaults to ['PAM', 'PAM position', 'guide length'].

Returns:

  • dict: output dictionary.

module beditor.lib.utils

Utilities

Global Variables

  • cols_muts
  • multint2reg
  • multint2regcomplement

function get_src_path

python get_src_path() → str

Get the beditor source directory path.

Returns:

  • str: path

function runbashcmd

python runbashcmd(cmd: str, test: bool = False, logf=None)

Run a bash command

Args:

  • cmd (str): command
  • test (bool, optional): test-mode. Defaults to False.
  • logf (optional): log file instance. Defaults to None.

function log_time_elapsed

python log_time_elapsed(start)

Log time elapsed.

Args:

  • start (datetime): start time

Returns:

  • datetime: difference in time.

function rescale

python rescale(a: np.array, mn: float = None) → np.array

Rescale a vector.

Args:

  • a (np.array): vector.
  • mn (float, optional): minimum value. Defaults to None.

Returns:

  • np.array: output vector

function get_nt2complement

python get_nt2complement()


function s2re

python s2re(s: str, ss2re: dict) → str

String to regex patterns

Args:

  • s (str): string
  • ss2re (dict): substrings to regex patterns.

Returns:

  • str: string with regex patterns.
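One use for this kind of substitution is turning a PAM written with IUPAC ambiguity codes into a regular expression. The sketch below does a per-character substitution with a small, partial IUPAC mapping. The names `IUPAC_TO_REGEX` and `substitute` are hypothetical, and beditor's own mapping (for example `multint2reg`) may differ.

```python
import re

# Partial IUPAC mapping covering the codes used in the PAMs listed in this README.
IUPAC_TO_REGEX = {"N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]"}


def substitute(s: str, mapping: dict) -> str:
    """Replace each character found in `mapping`; keep other characters as-is."""
    return "".join(mapping.get(ch, ch) for ch in s)


pattern = substitute("NNGRRT", IUPAC_TO_REGEX)
print(pattern)  # [ACGT][ACGT]G[AG][AG]T
print(bool(re.fullmatch(pattern, "ACGAGT")))  # True
```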

function parse_locus

python parse_locus(s: str, zero_based: bool = True) → tuple

Parse locus

Args:

  • s (str): location string.
  • zero_based (bool, optional): zero-based coordinates. Defaults to True.

Returns:

  • tuple: chrom, start, end, strand

Notes:

beditor outputs (including bed files) use 0-based loci. pyensembl and IGV use 1-based locations.
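Locus strings in the output follow the pattern `chrom:start-end(strand)`, as in `4:1003215-1003238(+)`. A stand-alone parser for that pattern can be sketched with a regular expression. This is an illustration with a hypothetical function name. It does not convert between coordinate systems and is not beditor's `parse_locus`.

```python
import re

LOCUS_PATTERN = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)\((?P<strand>[+-])\)$"
)


def parse_locus_string(s: str) -> tuple:
    """Split 'chrom:start-end(strand)' into (chrom, start, end, strand)."""
    match = LOCUS_PATTERN.match(s)
    if match is None:
        raise ValueError(f"not a valid locus string: {s!r}")
    return match["chrom"], int(match["start"]), int(match["end"]), match["strand"]


print(parse_locus_string("4:1003215-1003238(+)"))  # ('4', 1003215, 1003238, '+')
```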


function get_pos

python get_pos(s: str, l: str, reverse: bool = True, zero_based: bool = True) → Series

Expand locus to positions mapped to nucleotides.

Args:

  • s (str): sequence
  • l (str): locus
  • reverse (bool, optional): reverse the - strand. Defaults to True.
  • zero_based (bool, optional): zero based coordinates. Defaults to True.

Returns:

  • pd.Series: output.

function get_seq

python get_seq( genome: str, contig: str, start: int, end: int, strand: str, out_type: str = 'str', verbose: bool = False ) → str

Extract a sequence from a genome file based on start and end positions using streaming.

Args:

  • genome (str): The path to the genome file in FASTA format.
  • contig (str): chrom
  • start (int): start
  • end (int): end
  • strand (str): strand
  • out_type (str, optional): type of the output. Defaults to 'str'.
  • verbose (bool, optional): verbose. Defaults to False.

Raises:

  • ValueError: invalid strand.

Returns:

  • str: The extracted sequence.

function read_fasta

python read_fasta( fap: str, key_type: str = 'id', duplicates: bool = False, out_type='dict' ) → dict

Read fasta

Args:

  • fap (str): path
  • key_type (str, optional): key type. Defaults to 'id'.
  • duplicates (bool, optional): duplicates present. Defaults to False.

Returns:

  • dict: data.

Notes:

  1. If duplicates are present, key_type is set to description instead of id.
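For readers unfamiliar with the FASTA layout, a minimal standard-library sketch of reading records into a dictionary is given here. It is not the `read_fasta` implementation; the name `read_fasta_sketch` and the choice to take an open file handle instead of a path are assumptions made to keep the example self-contained.

```python
import io


def read_fasta_sketch(handle, key_type: str = "id") -> dict:
    """Read FASTA records from a text handle into {key: sequence}.

    key_type="id" keeps the first word of the header line;
    key_type="description" keeps the whole header line.
    """
    records = {}
    key = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if key is not None:
                records[key] = "".join(chunks)
            header = line[1:].strip()
            if key_type == "id" and header:
                key = header.split()[0]
            else:
                key = header
            chunks = []
        elif key is not None:
            # Sequence lines of one record may span several lines.
            chunks.append(line)
    if key is not None:
        records[key] = "".join(chunks)
    return records


example = ">seq1 first record\nACGT\nAC\n>seq2 second record\nGGTT\n"
by_id = read_fasta_sketch(io.StringIO(example))
by_description = read_fasta_sketch(io.StringIO(example), key_type="description")
print(by_id)
```

Keying by the full description is one way to keep records apart when several of them share the same id, which appears to be the motivation for the note above.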

function format_coords

python format_coords(df: DataFrame) → DataFrame

Format coordinates

Args:

  • df (pd.DataFrame): table

Returns:

  • pd.DataFrame: formatted table

function fetch_sequences_bp

python fetch_sequences_bp(p: str, genome: str) → DataFrame

Fetch sequences using biopython.

Args:

  • p (str): path to the bed file.
  • genome (str): genome path.

Returns:

  • pd.DataFrame: sequences.

function fetch_sequences

python fetch_sequences( p: str, genome_path: str, outp: str = None, src_path: str = None, revcom: bool = True, method='2bit', out_type='df' ) → DataFrame

Fetch sequences

Args:

  • p (str): path to the bed file
  • genome_path (str): genome path
  • outp (str, optional): output path for fasta file. Defaults to None.
  • src_path (str, optional): source path. Defaults to None.
  • revcom (bool, optional): reverse-complement. Defaults to True.
  • method (str, optional): method name. Defaults to '2bit'.
  • out_type (str, optional): type of the output. Defaults to 'df'.

Returns:

  • pd.DataFrame: sequences.

function get_sequences

python get_sequences( df1: DataFrame, p: str, genome_path: str, outp: str = None, src_path: str = None, revcom: bool = True, out_type: str = 'df', renames: dict = {}, **kws_fetch_sequences ) → DataFrame

Get sequences for the loci in a table

Args:

  • df1 (pd.DataFrame): input table
  • p (str): path to the bed file
  • outp (str, optional): output path. Defaults to None.
  • src_path (str, optional): source path. Defaults to None.
  • revcom (bool, optional): reverse complement. Defaults to True.
  • out_type (str, optional): output type. Defaults to 'df'.
  • renames (dict, optional): renames. Defaults to {}.

Returns:

  • pd.DataFrame: output sequences

Notes:

Input is 1-based and output is 0-based. Saves a bed file and gets the sequences.


function to_locus

python to_locus( chrom: str = 'chrom', start: str = 'start', end: str = 'end', strand: str = 'strand', x: Series = None ) → str

To locus

Args:

  • chrom (str, optional): chrom. Defaults to 'chrom'.
  • start (str, optional): start. Defaults to 'start'.
  • end (str, optional): end. Defaults to 'end'.
  • strand (str, optional): strand. Defaults to 'strand'.
  • x (pd.Series, optional): row of the dataframe. Defaults to None.

Returns:

  • str: locus

function get_flanking_seqs

python get_flanking_seqs( df1: DataFrame, targets_path: str, flanks_path: str, genome: str = None, search_window: list = None ) → DataFrame

Get flanking sequences

Args:

  • df1 (pd.DataFrame): input table
  • targets_path (str): target sequences path
  • flanks_path (str): flank sequences path
  • genome (str, optional): genome path. Defaults to None.
  • search_window (list, optional): search window around the target. Defaults to None.

Returns:

  • pd.DataFrame: output table with sequences

function get_strand

python get_strand( genome, df1: DataFrame, col_start: str, col_end: str, col_chrom: str, col_strand: str, col_seq: str ) → DataFrame

Get strand by comparing the aligned and fetched sequence

Args:

  • genome: genome instance
  • df1 (pd.DataFrame): input table.
  • col_start (str): start
  • col_end (str): end
  • col_chrom (str): chrom
  • col_strand (str): strand
  • col_seq (str): sequences

Returns:

  • pd.DataFrame: output table

Notes:

used for tests.


function reverse_complement_multintseq

python reverse_complement_multintseq(seq: str, nt2complement: dict) → str

Reverse complement multi-nucleotide sequence

Args:

  • seq (str): sequence
  • nt2complement (dict): nucleotide to complement

Returns:

  • str: sequence
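A minimal sketch of reverse-complementing a sequence that may contain ambiguity codes is shown here. The mapping `NT2COMPLEMENT` follows the standard IUPAC complements and is an assumption; the dictionary that beditor actually passes as `nt2complement` may differ.

```python
# Assumed mapping: standard IUPAC nucleotide codes and their complements.
NT2COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "S": "S", "W": "W", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement_sketch(seq: str, nt2complement: dict = NT2COMPLEMENT) -> str:
    """Reverse the sequence, then complement each nucleotide code."""
    return "".join(nt2complement[nt] for nt in reversed(seq.upper()))


print(reverse_complement_sketch("ATGC"))
print(reverse_complement_sketch("NGG"))
```

Codes missing from the mapping raise a `KeyError` in this sketch, which makes unexpected characters visible early.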

function reverse_complement_multintseqreg

python reverse_complement_multintseqreg( seq: str, multint2regcomplement: dict, nt2complement: dict ) → str

Reverse complement multi-nucleotide regex patterns

Args:

  • seq (str): sequence
  • multint2regcomplement (dict): mapping.
  • nt2complement (dict): nucleotide to complement

Returns:

  • str: regex pattern

function hamming_distance

python hamming_distance(s1: str, s2: str) → int

Return the Hamming distance between equal-length sequences

Args:

  • s1 (str): sequence #1
  • s2 (str): sequence #2

Raises:

  • ValueError: Undefined for sequences of unequal length

Returns:

  • int: distance.
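The Hamming distance counts the positions at which two equal-length sequences differ. A minimal sketch consistent with the documented behaviour (the name `hamming_distance_sketch` is hypothetical, and this is not necessarily how beditor implements it):

```python
def hamming_distance_sketch(s1: str, s2: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("Undefined for sequences of unequal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


print(hamming_distance_sketch("GATTACA", "GACTATA"))
```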

function align

python align( q: str, s: str, test: bool = False, psm: float = 2, pmm: float = 0.5, pgo: float = -3, pge: float = -1 ) → str

Creates pairwise local alignment between sequences.

Args:

  • q (str): query
  • s (str): subject
  • test (bool, optional): test-mode. Defaults to False.

Returns:

  • str: alignment with symbols.

Notes:

REF: http://biopython.org/DIST/docs/api/Bio.pairwise2-module.html

The match parameters (code: description) are:

  • x: No parameters. Identical characters have score of 1, otherwise 0.
  • m: A match score is the score of identical chars, otherwise mismatch score.
  • d: A dictionary returns the score of any pair of characters.
  • c: A callback function returns scores.

The gap penalty parameters (code: description) are:

  • x: No gap penalties.
  • s: Same open and extend gap penalties for both sequences.
  • d: The sequences have different open and extend gap penalties.
  • c: A callback function returns the gap penalties.


function get_orep

python get_orep(seq: str) → int

Get the overrepresentation


function get_polyt_length

python get_polyt_length(s: str) → int

Counts the length of the longest polyT stretch (RNA pol3 terminator) in sequence

Args:

  • s (str): sequence in string format
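Finding the longest polyT stretch can be sketched with a regular expression. The function name `get_polyt_length_sketch` and the case-insensitive handling are assumptions for this example.

```python
import re


def get_polyt_length_sketch(s: str) -> int:
    """Return the length of the longest run of consecutive T's in `s`."""
    runs = re.findall(r"T+", s.upper())
    # default=0 covers sequences without any T.
    return max((len(run) for run in runs), default=0)


print(get_polyt_length_sketch("ACGTTTTGTT"))
```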


function get_annots_installed

python get_annots_installed() → DataFrame

Get a list of annotations installed.

Returns:

  • pd.DataFrame: output.

function get_annots

python get_annots( species_name: str = None, release: int = None, gtf_path: str = None, transcript_path: str = None, protein_path: str = None, reference_name: str = 'assembly', annotation_name: str = 'source', verbose: bool = False, **kws_Genome )

Get pyensembl annotation instance

Args:

  • species_name (str, optional): species name. Defaults to None.
  • release (int, optional): release number. Defaults to None.
  • gtf_path (str, optional): GTF path. Defaults to None.
  • transcript_path (str, optional): transcripts path. Defaults to None.
  • protein_path (str, optional): protein path. Defaults to None.
  • reference_name (str, optional): reference name. Defaults to 'assembly'.
  • annotation_name (str, optional): annotation name. Defaults to 'source'.
  • verbose (bool, optional): verbose. Defaults to False.

Returns: pyensembl annotation instance


function to_pid

python to_pid(annots, gid: str) → str

To protein ID

Args:

  • annots: pyensembl annotation instance
  • gid (str): gene ID

Returns:

  • str: protein ID

function to_one_based_coordinates

python to_one_based_coordinates(df: DataFrame) → DataFrame

To one based coordinates

Args:

  • df (pd.DataFrame): input table

Returns:

  • pd.DataFrame: output table.

module beditor.lib.viz

Visualizations.


function to_igv

python to_igv( cfg: dict = None, gtf_path: str = None, genome_path: str = None, output_dir_path: str = None, threads: int = 1, output_ext: str = None, force: bool = False ) → str

To IGV session file.

Args:

  • cfg (dict, optional): configuration of the run. Defaults to None.
  • gtf_path (str, optional): path to the gtf file. Defaults to None.
  • genome_path (str, optional): path to the genome file. Defaults to None.
  • output_dir_path (str, optional): path to the output directory. Defaults to None.
  • threads (int, optional): threads. Defaults to 1.
  • output_ext (str, optional): extension of the output. Defaults to None.
  • force (bool, optional): force. Defaults to False.

Returns:

  • str: path to the session file.

function get_nt_composition

python get_nt_composition(seqs: list) → DataFrame

Get nt composition.

Args:

  • seqs (list): list of sequences

Returns:

  • pd.DataFrame: table with the frequencies of the nucleotides.
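One plausible reading of "nucleotide composition" is the per-position frequency of each nucleotide across equal-length sequences. The sketch below follows that reading using only the standard library and returns a list of dictionaries instead of a `pd.DataFrame`; both the per-position interpretation and the name `nt_composition_sketch` are assumptions.

```python
from collections import Counter


def nt_composition_sketch(seqs: list) -> list:
    """Return one {nucleotide: frequency} dict per sequence position."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must have equal length")
    composition = []
    # zip(*seqs) iterates over alignment columns, one per position.
    for column in zip(*seqs):
        counts = Counter(column)
        composition.append({nt: counts.get(nt, 0) / len(seqs) for nt in "ACGT"})
    return composition


composition = nt_composition_sketch(["ACG", "ATG"])
for position, freqs in enumerate(composition):
    print(position, freqs)
```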

function plot_ntcompos

python plot_ntcompos( seqs: list, pam_pos: str, pam_len: int, window: list = None, ax: Axes = None, color_pam: str = 'lime', color_window: str = 'gold' ) → Axes

Plot nucleotide composition

Args:

  • seqs (list): list of sequences.
  • pam_pos (str): PAM position.
  • pam_len (int): PAM length.
  • window (list, optional): activity window bounds. Defaults to None.
  • ax (plt.Axes, optional): subplot. Defaults to None.
  • color_pam (str, optional): color of the PAM. Defaults to 'lime'.
  • color_window (str, optional): color of the window. Defaults to 'gold'.

Returns:

  • plt.Axes: subplot

function plot_ontarget

python plot_ontarget( guide_loc: str, pam_pos: str, pam_len: int, guidepam_seq: str, window: list = None, show_title: bool = False, figsize: list = [10, 2], verbose: bool = False, kws_sg: dict = {} ) → Axes

Plot the on-target site of an sgRNA.

Args:

  • guide_loc (str): sgRNA locus
  • pam_pos (str): PAM position
  • pam_len (int): PAM length
  • guidepam_seq (str): sgRNA and PAM sequence
  • window (list, optional): activity window bounds. Defaults to None.
  • show_title (bool, optional): show the title. Defaults to False.
  • figsize (list, optional): figure size. Defaults to [10,2].
  • verbose (bool, optional): verbose. Defaults to False.
  • kws_sg (dict, optional): keyword arguments to plot the sgRNA. Defaults to {}.

Returns:

  • plt.Axes: subplot

TODOs:

  1. convert to 1-based coordinates
  2. features from the GTF file


function get_plot_inputs

python get_plot_inputs(df2: DataFrame) → list

Get plot inputs.

Args:

  • df2 (pd.DataFrame): table.

Returns:

  • list: list of tables.

function plot_library_stats

python plot_library_stats( dfs: list, palette: dict = {True: 'b', False: 'lightgray'}, cutoffs: dict = None, not_be: bool = True, dbug: bool = False, figsize: list = [10, 2.5] ) → list

Plot library stats

Args:

  • dfs (list): list of tables.
  • palette (dict, optional): color palette. Defaults to {True:'b',False:'lightgray'}.
  • cutoffs (dict, optional): cutoffs to be applied. Defaults to None.
  • not_be (bool, optional): not a base editor. Defaults to True.
  • dbug (bool, optional): debug mode. Defaults to False.
  • figsize (list, optional): figure size. Defaults to [10,2.5].

Returns:

  • list: list of subplots.

module beditor.run

Command-line options


function validate_params

python validate_params(parameters: dict) → bool

Validate the parameters.

Args:

  • parameters (dict): parameters

Returns:

  • bool: whether the parameters are valid or not
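A minimal sketch of such a check, based only on the required parameters listed in the notes of the `cli` function (editor, mutations path and output directory path, or a configuration path). The real `validate_params` may apply additional or different rules; the name `validate_params_sketch` is hypothetical.

```python
def validate_params_sketch(parameters: dict) -> bool:
    """Check that either a config path or the three core inputs are set."""

    def is_set(key: str) -> bool:
        return parameters.get(key) is not None

    # A configuration file is assumed to be sufficient on its own.
    if is_set("config_path"):
        return True
    required = ("editor", "mutations_path", "output_dir_path")
    return all(is_set(key) for key in required)


print(validate_params_sketch({"config_path": "beditor_config.yml"}))
print(validate_params_sketch({"editor": "BE1"}))
```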

function cli

python cli( editor: str = None, mutations_path: str = None, output_dir_path: str = None, species: str = None, ensembl_release: int = None, genome_path: str = None, gtf_path: str = None, rna_path: str = None, prt_path: str = None, search_window: int = None, not_be: bool = False, config_path: str = None, wd_path: str = None, threads: int = 1, kernel_name: str = 'beditor', verbose='WARNING', igv_path_prefix=None, ext: str = None, force: bool = False, dbug: bool = False, skip=None, **kws )

beditor command-line (CLI)

Args:

  • editor (str, optional): base-editing method, available methods can be listed using command: 'beditor resources'. Defaults to None.
  • mutations_path (str, optional): path to the mutation file, the format of which is available at https://github.com/rraadd88/beditor/README.md#Input-format. Defaults to None.
  • output_dir_path (str, optional): path to the directory where the outputs should be saved. Defaults to None.
  • species (str, optional): species name. Defaults to None.
  • ensembl_release (int, optional): Ensembl release number. Defaults to None.
  • genome_path (str, optional): path to the genome file, which is not available on Ensembl. Defaults to None.
  • gtf_path (str, optional): path to the gene annotations file, which is not available on Ensembl. Defaults to None.
  • rna_path (str, optional): path to the transcript sequences file, which is not available on Ensembl. Defaults to None.
  • prt_path (str, optional): path to the protein sequences file, which is not available on Ensembl. Defaults to None.
  • search_window (int, optional): number of bases to search on either side of a target, if not specified, it is inferred by beditor. Defaults to None.
  • not_be (bool, optional): do not process as a base editor. Defaults to False.
  • config_path (str, optional): path to the configuration file. Defaults to None.
  • wd_path (str, optional): path to the working directory. Defaults to None.
  • threads (int, optional): number of threads. Defaults to 1.
  • kernel_name (str, optional): name of the jupyter kernel. Defaults to "beditor".
  • verbose (str, optional): verbose, logging levels: DEBUG > INFO > WARNING (default) > ERROR > CRITICAL. Defaults to "WARNING".
  • igv_path_prefix (type, optional): prefix to be added to the IGV url. Defaults to None.
  • ext (str, optional): file extensions of the output tables. Defaults to None.
  • force (bool, optional): overwrite the outputs if they exist. Defaults to False.
  • dbug (bool, optional): debug mode (developer). Defaults to False.
  • skip (type, optional): skip sections of the workflow (developer). Defaults to None.

Examples: beditor cli -c inputs/mutations/protein/positions.yml

Notes:

Required parameters for a run: editor, mutations_path and output_dir_path, or config_path.
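When the workflow is launched from a script, the command line can be assembled programmatically. The sketch below only builds the argument list, using the flags shown in the README usage example (`--editor`, `-m`, `-o`, `--species`, `--ensembl-release`); the helper name `build_beditor_command` is hypothetical and not part of beditor.

```python
import shlex


def build_beditor_command(
    editor: str,
    mutations_path: str,
    output_dir_path: str,
    species: str = None,
    ensembl_release: int = None,
) -> list:
    """Assemble the argument list for a `beditor cli` run."""
    cmd = [
        "beditor", "cli",
        "--editor", editor,
        "-m", mutations_path,
        "-o", output_dir_path,
    ]
    if species is not None:
        cmd += ["--species", species]
    if ensembl_release is not None:
        cmd += ["--ensembl-release", str(ensembl_release)]
    return cmd


cmd = build_beditor_command(
    "BE1",
    "path/to/mutations.tsv",
    "path/to/output_directory/",
    species="human",
    ensembl_release=110,
)
# Quote the arguments so the line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once beditor is installed. Passing a list instead of a single string avoids shell quoting problems with paths that contain spaces.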


function gui

python gui()


function resources

python resources()

Owner

  • Login: rraadd88
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'beditor: A Computational Workflow for Designing Libraries of sgRNAs for CRISPR-Mediated
  Base Editing'
message: If you use this software, please cite it using the metadata from this file.
type: software
authors:
- given-names: Rohan
  family-names: Dandage
  orcid: https://orcid.org/0000-0002-6421-2067
identifiers:
- type: doi
  value: 10.5281/zenodo.10648264
repository-code: https://github.com/rraadd88/beditor
version: v2.0.1
date-released: '2024-02-11'

GitHub Events

Total
  • Watch event: 3
  • Fork event: 1
Last Year
  • Watch event: 3
  • Fork event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 7
  • Total Committers: 1
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
rraadd88 r****e@g****m 7

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 17
  • Total pull requests: 5
  • Average time to close issues: 2 months
  • Average time to close pull requests: 5 months
  • Total issue authors: 11
  • Total pull request authors: 1
  • Average comments per issue: 2.88
  • Average comments per pull request: 0.4
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 5
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 9 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • bioinfo4321 (4)
  • herrroaa (3)
  • onebeingmay (2)
  • chengwsh (1)
  • yx-xiang (1)
  • leejimmy93 (1)
  • neko-ni (1)
  • ebrettmann (1)
  • enormandeau (1)
  • valkm2 (1)
  • murphycj (1)
Pull Request Authors
  • dependabot[bot] (7)
Top Labels
Issue Labels
help wanted (2) enhancement (2)
Pull Request Labels
dependencies (7)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 62 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 20
  • Total maintainers: 1
pypi.org: beditor

A computational workflow for designing libraries of guide RNAs for CRISPR base editing

  • Versions: 20
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 62 Last month
Rankings
Dependent packages count: 10.0%
Forks count: 15.3%
Stargazers count: 16.0%
Average: 17.4%
Dependent repos count: 21.7%
Downloads: 24.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

environment.yml pypi
  • PySimpleGUI *
  • datacache ==1.1.4
  • dna_features_viewer ==0.1.9
  • matplotlib ==2.2.2
  • pandas ==0.23.3
  • pysam ==0.14.1
  • requests ==2.19.1
  • scipy ==1.1.0
  • seaborn ==0.8.1
  • tqdm ==4.23.4
requirements.txt pypi
  • biopython ==1.71
  • datacache ==1.1.4
  • dna_features_viewer ==0.1.9
  • matplotlib ==2.2.2
  • numpy ==1.21.0
  • pandas ==0.23.3
  • pysam ==0.14.1
  • regex ==2018.7.11
  • requests ==2.20.0
  • scipy ==1.1.0
  • seaborn ==0.8.1
  • tqdm ==4.23.4
setup.py pypi
  • biopython ==1.71
  • datacache ==1.1.4
  • dna_features_viewer ==0.1.9
  • matplotlib ==2.2.2
  • numpy ==1.21.0
  • pandas *
  • pyensembl ==1.4.0
  • pysam ==0.14.1
  • pyyaml *
  • regex ==2018.7.11
  • requests ==2.20.0
  • scipy ==1.1.0
  • seaborn ==0.8.1
  • tqdm ==4.23.4