https://github.com/kundajelab/histobpnet

Sequence to histone models

https://github.com/kundajelab/histobpnet

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.8%) to scientific vocabulary
Last synced: 4 months ago · JSON representation

Repository

Sequence to histone models

Basic Info
  • Host: GitHub
  • Owner: kundajelab
  • License: bsd-3-clause
  • Default Branch: main
  • Size: 1.4 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of kundajelab/chrombpnet-pytorch
Created 8 months ago · Last pushed 6 months ago

https://github.com/kundajelab/histobpnet/blob/main/

# [Pytorch Implementation] Bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants

- Pytorch implementation for [ChromBPNet](https://github.com/kundajelab/chrombpnet)
- Please refer to original [code](https://github.com/kundajelab/chrombpnet) and paper [ChromBPNet: Bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants](https://www.biorxiv.org/content/10.1101/2024.12.25.630221v1) by  Anusri Pampari*, Anna Shcherbina*, Anshul Kundaje. (*authors contributed equally)  
- This repo also refers to [bpnet-lite](https://github.com/jmschrei/bpnet-lite) and uses [tangermeme](https://github.com/jmschrei/tangermeme) for interpretation, two very useful repos by Jacob Schreiber


ChromBPNet

## Reproduce Official ChromBPNet performance ### Pearson correlation on counts prediction of peaks official chrombpnet (left) vs pytorch chrombpnet (right)

### Attribution score Here is the [genome browser](https://epigenomegateway.wustl.edu/browser2022/?blob=7Vhdc5s4FP0rGe2rDcaJvY7fupm0zXSbnYk92YedDiPQBVQLiQrhxPXkv_cKMMYOTt1u291p84bQuV_nXF0.1iQGqVK4pimQKUni0wnpkSWHuytpQC.pINM1yQ3VhkzH3vn52XjUIyBZtRqPTh96xGgaLnIy_WdNZOXnphCg0ZFZZXap66WgAaDDZltlhiuJluu9LXRaaLtEUAqGMmqoRc1tpJMdrwjl.QwEhAYwq4iKHHok4gL.Ct5XHuzCpveulSpe83wO91hWbWJwcaFkxGOMhF5pxtvLDwXo1aVkmeISjdYPD72mXA3RK5CwLRhJBSqlMtQW2Kp8i.yofbN5VPV7If4zGnp1B226p0ULboSKwe3p.RHM7IA7yGnt_yz8vH1xfennZU6.5wyOIMlanFQWuHGyrKw62OoC_iy03UAG1Lyl.WJnxpR3083dZp7sgrsmThvRcJQYk.VT113yj0kROHdFboQDrHCzIhA8dG1Srk7zhTd2guDJMdVO7P8wrbacBTy.Q4vWIdSqkMw3ujDJgWO4i1jNQiqsK1oYZddv6T2ZesPBIyJTjtU4.CSRkdKspHKBnuh7cBVduAL4PcaKN_xWpbsaYgnGfTMaD53gbo_m3Q7p7dJe1_b1hJcUNiEQ9687Ggm51FrpWgMb6wVjrbSqAwtsToMryQCZHPSIrRZbsOzUVBk4KQ3bByLTwHwVRTzkVPh0aRU9pHAXtkPmLtiu1t9IYKus2w7k29DPWj.pdZholQYZngw_GnnDz8n9GH5I8cfI7yd6tjJKh4n_EgM9i_5lontDS8fRolfwY0SvkD9oqtvovHxBycvr_UyeG.JQQzQDs8WYPVR.iE9n4.cJzbb9wVYyhw.t_jjOuqNdjjP8LiMDK8Ao0mgeOGW03ImUYH6Kb6vO5fXF7GYynrx89cZpTsPRfVPz80sMkvbMPb5hPmPWNViesviGTxF_J1DVGGWQL5wdv2IP4IT9mh7oNnuqB7osnnvgx_cAprghYw747VpljQ9jJHYGZn99W_4FLJ32vTr6n4D1sb85w68_.0LQIwGKJOAKUyKT0WjIwrNBf3A6CvueFw36k_Fg3A.j4dnZ7zCJzkP7cwfZTdQdl_E1XfKYonZkit.TWE2zc3vTVCjoShWWDhILFdT_IzPBDf6enPGPyDQmYWiACb8GykC_Bh4naNDcRm5m.NzImg0kdut185WubK9wW8ZvHl6FCRdMgyz_atYYdJeDaWD29fgAbIOhmT0b9cH7A2Pk5V.IUKWZkmBVriEgaSDgQqgckWXlVi2KL0dLvGHZeXiHaQd4NECXQuEqwXqEralWjlG9mCdgg1UuPgE-) to compare the profile prediction and attribution scores between official ChromBPNet and pytorch implementation with n_filters = 512 and 128

genome_browser

## Table of contents - [Installation](#installation) - [QuickStart](#quickstart) - [How-to-cite](#how-to-cite) ## Installation #### Install from pypi ``` pip install chrombpnet-pytorch ``` #### Install from source ``` pip install git+https://github.com/jsxlei/chrombpnet-pytorch.git ``` ## QuickStart ### Before training - Download the [genome](https://zenodo.org/records/12193595) or use your own genome data - Download the [K562 example ATAC data](https://zenodo.org/records/15713376) or use your own ATAC data ### Bias-factorized ChromBPNet training Please refer to [data_config](https://github.com/kundajelab/chrombpnet-pytorch/chrombpnet/data_config.py) to define your own dataset or pass them through command. if your contains: peaks.bed, negatives.bed, and unstranded.bw, and bias_scaled.h5 as well. ``` chrombpnet train --data_dir ``` Otherwise ``` chrombpnet train --peaks --negatives --bigwig --bias --adjust_bias ``` ### Predict with pretrained model in .h5 format or .cpkt or .pt ``` chrombpnet predict --checkpoint chrombpnet_wo_bias.h5/best_model.cpkt/chrombpnet_wo_bias.pt -o ``` ### Interpret by calculating attribution ``` chrombpnet interpret --checkpoint -o ``` #### Input Format - `--bigwig` - `--peaks` - `--negatives` #### Output Format The ouput directory will be populated as follows with fold_0 chromosome splits - ``` fold_0\ checkpoints\ best_model.cpkt last.cpkt chrombpnet_nobias.pt (pytorch i.e model to predict bias corrected accessibility profile) train.log predict.log evaluation\ eval\ all_regions.counts_pearsonr.png all_regions_jsd.profile_jsd.png peaks.counts_pearsonr.png peaks_jsd.profile_jsd.png regions.csv metrics.json interpret\ counts\ ``` ## How to Cite If you're using ChromBPNet in your work, please cite as follows: ``` @article {Pampari2024.12.25.630221, author = {Pampari, Anusri and Shcherbina, Anna and Kvon, Evgeny and Kosicki, Michael and Nair, Surag and Kundu, Soumya and Kathiria, Arwa S. and Risca, Viviana I. and Kuningas, Kristiina and Alasoo, Kaur and Greenleaf, William James and Pennacchio, Len A. and Kundaje, Anshul}, title = {ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants}, elocation-id = {2024.12.25.630221}, year = {2024}, doi = {10.1101/2024.12.25.630221}, publisher = {Cold Spring Harbor Laboratory}, URL = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221}, eprint = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221.full.pdf}, journal = {bioRxiv} } ```

Owner

  • Name: Kundaje Lab
  • Login: kundajelab
  • Kind: organization
  • Location: Stanford University

Compbio and machine learning code repositories from the Kundaje Lab at Stanford Genetics and Computer Science Depts.

GitHub Events

Total
  • Push event: 7
  • Create event: 1
Last Year
  • Push event: 7
  • Create event: 1