https://github.com/broadinstitute/genepy

Genepy is an open-source utilities package covering a range of useful functions for large-scale genomics data analysis in Python.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    3 of 6 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.0%) to scientific vocabulary

Keywords

data-science dna epigenomics gcp genomics google-sheets imaging plots rna-seq terra

Keywords from Contributors

labels
Last synced: 5 months ago

Repository

Basic Info
Statistics
  • Stars: 21
  • Watchers: 2
  • Forks: 5
  • Open Issues: 2
  • Releases: 0
Fork of jkobject/genepy
Created about 5 years ago · Last pushed almost 3 years ago
Metadata Files
Readme Changelog Contributing Funding License

README.md

genepy

what is genepy?

A set of awesome functions & tools for Computational Geneticists

Content

  • utils: where a bunch of helper functions and useful general scripts are stored
    • plots: a set of plotting tools based on matplotlib and bokeh to make volcano plots, CNV maps, etc.
    • helper: additional helper functions to save data, merge dataframes, etc.
  • terra: a set of functions that use dalmatian to interact with Terra, the GCP-powered genomics HPC platform.
  • sequencing: a set of functions to work with BED/BAM/FASTQ files.
  • rna: functions to work with RNA-seq (and related) data.
    • pyDESeq2: a Python integration of DESeq2 (the differential expression analyser) using rpy2
  • mutations: a set of functions to work with MAF files, VCF files, etc.
  • google: functions and packages linked to Google's APIs
    • google_sheet: functions to upload a dataframe as a Google Sheet
    • gcp: a set of functions to interact with Google Cloud Storage (relies on gsutil)
  • epigenetics: tools related to epigenomics
    • chipseq: functions to read, merge, and denoise ChIP-seq data.
    • plot: functions to plot ChIP-seq data.

Helper tools

Tools that you do not need to use directly, as genepy provides binding functions for them.

  • epigenetics/rose: an updated version of the ROSE algorithm (stored as a git submodule)
  • celllinemapping-master/python/celllinemapper: a set of functions to map cell line IDs to other cell line IDs based on an up-to-date Google spreadsheet.

Install

with pip

pip install broad-genepy

and then import what you need with `from genepy.utils/epigenetics/... import ...`

Please see the next step to get access to all bindings and tools.

dev mode

git clone https://github.com/BroadInstitute/genepy.git
pip install -e genepy

then you can import modules in Python with, e.g.:

```python
from genepy import terra
from genepy.utils import helper as h
from genepy.google import gcp
from genepy.utils import plot
from genepy.epigenetics import chipseq
```

installation: to get access to all bindings and tools

Install the following tools:

  • gcloud
  • firecloud-dalmatian
  • gsheets
  • htslib/samtools
  • bwa

just used once:

  • bowtie2

Some of these packages, such as gsheets, gcloud, and firecloud-dalmatian, will require you to create Google accounts, log in on your machine, or download OAuth files.
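To see which of the command-line tools listed above are already available, a quick check along these lines may help. This is only a sketch and is not part of genepy: the function name `find_missing_tools` is hypothetical, and the executable names are assumptions based on the binaries these tools usually install.

```python
import shutil

# Executable names are assumptions: the binaries these tools usually install.
REQUIRED_TOOLS = ["gcloud", "gsutil", "samtools", "bwa", "bowtie2"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All command-line tools found.")
```

Python packages such as firecloud-dalmatian and gsheets are not covered by this check, since they are imported rather than run as executables.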

Finally, you can install the R packages (GSEABase, erccdashboard, GSVA, DESeq2):

R -e 'if(!requireNamespace("BiocManager", quietly = TRUE)){install.packages("BiocManager")};BiocManager::install(c("GSEABase", "erccdashboard", "GSVA", "DESeq2"));'

data:

hg38 genome sizes: from https://github.com/igvteam/igv/blob/master/genomes/sizes/hg38.chrom.sizes
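A chrom.sizes file is a plain two-column, tab-separated table of chromosome name and length in base pairs. As a minimal sketch of how such a file could be loaded (the helper `read_chrom_sizes` is hypothetical and not a genepy function; the inline sample stands in for the downloaded file):

```python
import io


def read_chrom_sizes(handle):
    """Parse a chrom.sizes file into a {chromosome: length} dict."""
    sizes = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, length = line.split("\t")[:2]
        sizes[name] = int(length)
    return sizes


# Three hg38 entries, inlined here instead of reading the real file.
sample = io.StringIO("chr1\t248956422\nchr2\t242193529\nchrX\t156040895\n")
sizes = read_chrom_sizes(sample)
print(sizes["chr1"])  # 248956422
```

With the real file, the same function can be called on `open("hg38.chrom.sizes")`.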

About

Please do contribute; we do not have time to fix all issues or work on feature requests.

Jeremie Kalfon jkalfon@broadinstitute.org jkobject@gmail.com https://jkobject.com

Apache license 2.0.

Owner

  • Name: Broad Institute
  • Login: broadinstitute
  • Kind: organization
  • Location: Cambridge, MA

Broad Institute of MIT and Harvard

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 360
  • Total Committers: 6
  • Avg Commits per committer: 60.0
  • Development Distribution Score (DDS): 0.117
Top Committers
Name Email Commits
jkobject j****t@g****m 318
Javad Noorbakhsh j****k@b****g 16
Gwen Miller g****r@b****g 15
monikaperez m****z@g****m 5
Simone Zhang x****g@b****g 4
dependabot[bot] 4****]@u****m 2
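
The summary figures above follow from the per-committer counts. As a sketch, assuming the Development Distribution Score is defined as 1 minus the top committer's share of all commits (a common definition, but not one stated on this page), they can be reproduced like this:

```python
# Commit counts copied from the committer table above.
commits = {
    "jkobject": 318,
    "Javad Noorbakhsh": 16,
    "Gwen Miller": 15,
    "monikaperez": 5,
    "Simone Zhang": 4,
    "dependabot[bot]": 2,
}

total = sum(commits.values())
average = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print(total)          # 360
print(average)        # 60.0
print(round(dds, 3))  # 0.117
```

A score close to 0 means that a single committer accounts for most of the commits, as is the case here.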

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 21
  • Average time to close issues: N/A
  • Average time to close pull requests: 11 days
  • Total issue authors: 0
  • Total pull request authors: 7
  • Average comments per issue: 0
  • Average comments per pull request: 0.48
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • javadnoorb (8)
  • jkobject (7)
  • monikaperez (1)
  • dependabot[bot] (1)
  • qinqian (1)
  • XL-whitelabgx (1)
  • 5im1z (1)
Top Labels
Issue Labels
Pull Request Labels
enhancement (1) dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 51 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 11
  • Total maintainers: 1
pypi.org: broad-genepy

A useful module for any CompBio

  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 51 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 14.2%
Stargazers count: 14.5%
Downloads: 24.9%
Average: 26.2%
Dependent repos count: 67.3%
Maintainers (1)
Last synced: 6 months ago