https://github.com/kundajelab/locusselect

Extraction of data embeddings from deep learning model layers; computation of embedding distances and visualization with UMAP/t-SNE

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    3 of 6 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.6%) to scientific vocabulary

Keywords from Contributors

deeplift guided-backpropagation integrated-gradients interpretability interpretable-deep-learning saliency-map sensitivity-analysis
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: kundajelab
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 48.8 MB
Statistics
  • Stars: 2
  • Watchers: 5
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created almost 7 years ago · Last pushed over 5 years ago

https://github.com/kundajelab/locusselect/blob/master/

# locusselect

* Extraction of data embeddings from deep learning model layers

* Computation of embedding distances between inputs

* Clustering and visualization of embeddings with UMAP/t-SNE

See `examples/example.sh` for usage examples.

```
compute_embeddings \
    --input_bed_file optimal_peak.narrowPeak.gz \
    --weights 13kb_context_3b_prediction_dnase.hdf5 \
    --json k562_dnase_profile_arch.json \
    --ref_fasta /mnt/data/annotations/by_release/hg38/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta \
    --center_on_summit \
    --flank 6500 \
    --output_npz_file k562_dnase_profile_embeddings_layer_-2.npz \
    --embedding_layer -2 \
    --threads 40
```
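
The embedding-distance step listed above can be illustrated with a few lines of NumPy. This is a minimal sketch, not locusselect's own implementation: the function name `pairwise_euclidean` and the toy array are hypothetical, and because the layout of the `.npz` file written by `compute_embeddings` is not described here, the example uses an in-memory array rather than loading that file.

```python
import numpy as np


def pairwise_euclidean(embeddings: np.ndarray) -> np.ndarray:
    """Return the (n, n) matrix of Euclidean distances between rows.

    Hypothetical helper for illustration; not part of locusselect.
    """
    # Use the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    sq_norms = np.sum(embeddings ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (embeddings @ embeddings.T)
    # Floating-point round-off can leave tiny negative values; clip them to zero
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.sqrt(sq_dists)


# Toy stand-in for real embeddings: 3 inputs, 2 dimensions each (assumed data)
toy = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = pairwise_euclidean(toy)
print(dist)
```

A distance matrix of this shape is the kind of input that clustering or 2-D projection methods such as UMAP or t-SNE can consume.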

Owner

  • Name: Kundaje Lab
  • Login: kundajelab
  • Kind: organization
  • Location: Stanford University

Compbio and machine learning code repositories from the Kundaje Lab at Stanford Genetics and Computer Science Depts.

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 74
  • Total Committers: 6
  • Avg Commits per committer: 12.333
  • Development Distribution Score (DDS): 0.432
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| anna shcherbina | `a****a@g****m` | 42 |
| Av Shrikumar | `a****r@g****m` | 22 |
| mhfzsharmin | `m****n@u****u` | 5 |
| soumyakundu | `s****u@s****u` | 3 |
| annashcherbina | `a****h@s****u` | 1 |
| mhfzsharmin | `m****n@g****m` | 1 |
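
The all-time summary figures can be reproduced from the per-committer counts listed above. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, and the variable names are illustrative.

```python
# Commit counts per committer, copied from the Top Committers list
commits = [42, 22, 5, 3, 1, 1]

total = sum(commits)
avg = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits)
dds = 1 - max(commits) / total

print(total, round(avg, 3), round(dds, 3))  # prints: 74 12.333 0.432
```

Under that assumption the result matches the reported values of 74 total commits, 12.333 commits per committer, and a DDS of 0.432.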
Packages

  • Total packages: 1
  • Total downloads:
    • pypi 11 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 3
  • Total maintainers: 1
pypi.org: locusselect

Compute deep learning embeddings for narrowPeak files; compute pairwise distance between embeddings and cluster with tSNE

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 11 Last month
Rankings
Dependent packages count: 10.0%
Dependent repos count: 21.7%
Average: 31.6%
Downloads: 63.1%
Maintainers (1)
Last synced: 7 months ago