Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.5%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: SiddharthaStoic
  • License: mit
  • Language: C++
  • Default Branch: main
  • Size: 13.4 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Adaptive ntHash

This project extends the ntHash library with an adaptive spaced seed hashing mechanism based on local sequence entropy. The implementation dynamically switches between sparse and dense spaced seed patterns during hashing to optimize performance, especially in regions of varying sequence complexity.

Key Features

  • Adaptive Hashing: Uses entropy thresholds to switch between sparse and dense seed configurations at runtime.
  • Optimized Rolling Entropy: Efficient entropy computation using a rolling window and fixed-point log₂ lookup.
  • Faster Than Standard ntHash: Consistently runs faster than standard ntHash across multiple runs of the benchmark described under Benchmark Results (1000 iterations over a 1 million base pair sequence).
  • Drop-in Integration: Fully compatible with existing ntHash codebase and Meson build system.
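As a rough illustration of the rolling entropy and threshold switching listed above, here is a minimal sketch. It is not taken from this repository's sources: the function names (`rolling_entropy_q16`, `choose_seed`), the Q16 fixed-point scale, the extra bin for non-ACGT characters, and the 0/1 seed mask strings are all assumptions made for the example. It precomputes a lookup table of `c * log2(window / c)` terms so that sliding the window by one base only touches two table entries.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative sketch only: every name here is hypothetical and is not taken
// from the adaptive-nthash sources.

constexpr std::uint64_t kQ16 = 1u << 16;  // fixed-point scale: 1.0 bit == 65536

// Map a nucleotide to a bin 0..3; any other character shares bin 4.
inline std::size_t base_bin(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default:            return 4;
    }
}

// Shannon entropy (bits, Q16 fixed point) of every length-`window` substring
// of `seq`, updated in O(1) per step instead of recounting each window.
inline std::vector<std::uint32_t> rolling_entropy_q16(const std::string& seq,
                                                      std::size_t window) {
    assert(window > 0);
    std::vector<std::uint32_t> out;
    if (seq.size() < window) return out;

    // Lookup table: term[c] = c * log2(window / c) in Q16, so that
    // H = (1 / window) * sum over bins of term[count of that bin].
    std::vector<std::uint64_t> term(window + 1, 0);
    for (std::size_t c = 1; c <= window; ++c) {
        const double t = static_cast<double>(c) *
                         std::log2(static_cast<double>(window) / static_cast<double>(c));
        term[c] = static_cast<std::uint64_t>(std::llround(t * static_cast<double>(kQ16)));
    }

    std::array<std::size_t, 5> count{};
    std::uint64_t sum = 0;  // running sum of term[count[b]] over all bins

    // Move one character into or out of the window and patch the running sum.
    auto adjust = [&](char ch, bool entering) {
        std::size_t& c = count[base_bin(ch)];
        sum -= term[c];
        c = entering ? c + 1 : c - 1;
        sum += term[c];
    };

    for (std::size_t i = 0; i < seq.size(); ++i) {
        if (i >= window) adjust(seq[i - window], false);  // drop the oldest base first
        adjust(seq[i], true);                             // then add the newest base
        if (i + 1 >= window) out.push_back(static_cast<std::uint32_t>(sum / window));
    }
    return out;
}

// Pick one of two caller-supplied seed masks by comparing entropy to a threshold.
// Which mask belongs on which side of the threshold is left to the caller.
inline std::string choose_seed(std::uint32_t entropy_q16,
                               std::uint32_t threshold_q16,
                               const std::string& below_seed,
                               const std::string& at_or_above_seed) {
    return entropy_q16 < threshold_q16 ? below_seed : at_or_above_seed;
}
```

In this sketch a threshold of 1.0 bit is written as `65536`. A caller would compute `rolling_entropy_q16(seq, window)` once and then call `choose_seed` per position with its own sparse and dense masks. The README does not say which pattern is used in low-entropy versus high-entropy regions, so the sketch leaves that choice open.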

Benchmark Results

The plot below shows runtime (in seconds) for 1000 iterations of hashing a 1 million base pair sequence using standard ntHash vs the adaptive version.

[Figure: Benchmark Comparison]
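A measurement of this kind can be sketched as a small timing harness. The code below is an assumption-laden illustration, not the repository's benchmark programs: `random_dna`, `standin_hash`, `BenchResult`, and `time_iterations` are hypothetical names, and `standin_hash` is a plain FNV-1a loop used only as a placeholder workload where the real ntHash or adaptive hashing loop would go.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper names; the stand-in hash below is NOT ntHash.

// Build a pseudo-random nucleotide sequence of the requested length.
inline std::string random_dna(std::size_t length, std::uint32_t seed) {
    static const char kBases[] = {'A', 'C', 'G', 'T'};
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, 3);
    std::string s(length, 'A');
    for (char& c : s) c = kBases[pick(rng)];
    return s;
}

// Placeholder workload: FNV-1a over the whole sequence.
// Replace with the hashing loop that is actually being measured.
inline std::uint64_t standin_hash(const std::string& seq) {
    std::uint64_t h = 14695981039346656037ull;  // FNV-1a 64-bit offset basis
    for (unsigned char c : seq) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV-1a 64-bit prime
    }
    return h;
}

struct BenchResult {
    double seconds;          // wall-clock time for all iterations
    std::uint64_t checksum;  // keeps the optimizer from discarding the work
};

// Run `fn` the given number of times and report the total elapsed time.
template <typename Fn>
BenchResult time_iterations(Fn&& fn, std::size_t iterations) {
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) checksum ^= fn() + i;
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

To mirror the setup described above, one would generate a sequence with `random_dna(1000000, seed)` and call `time_iterations` with 1000 iterations, once per hashing variant. Returning a checksum alongside the time is a design choice that makes it harder for the compiler to optimize the measured loop away.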

Build Instructions

This project uses the Meson build system.

```
git clone https://github.com/SiddharthaStoic/adaptive-nthash.git
cd adaptive-nthash
meson setup build --wipe
meson compile -C build
meson test -C build
```

Run Benchmarks

```
# Adaptive hash benchmark
./build/adaptive_benchmark.exe

# Standard ntHash benchmark
./build/normal_benchmark.exe
```

Contributing

This repository was developed as a high-performance extension to the ntHash project. Contributions that further optimize, test, or extend the adaptive entropy model are welcome.

License

This project is licensed under the MIT License.

Owner

  • Name: Siddhartha
  • Login: SiddharthaStoic
  • Kind: user

Student Developer at Jyothy Institute of Technology

Citation (CITATION.bib)

@article{10.1093/bioinformatics/btac564,
    author = {Kazemi, Parham and Wong, Johnathan and Nikolić, Vladimir and Mohamadi, Hamid and Warren, René L and Birol, Inanç},
    title = "{ntHash2: recursive spaced seed hashing for nucleotide sequences}",
    journal = {Bioinformatics},
    volume = {38},
    number = {20},
    pages = {4812-4813},
    year = {2022},
    month = {08},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btac564},
    url = {https://doi.org/10.1093/bioinformatics/btac564},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/38/20/4812/46535020/btac564.pdf},
}

@article{doi:10.1093/bioinformatics/btw397,
    author = {Mohamadi, Hamid and Chu, Justin and Vandervalk, Benjamin P. and Birol, Inanc},
    title = {ntHash: recursive nucleotide hashing},
    journal = {Bioinformatics},
    volume = {32},
    number = {22},
    pages = {3492},
    year = {2016},
    doi = {10.1093/bioinformatics/btw397},
    url = {http://dx.doi.org/10.1093/bioinformatics/btw397},
    eprint = {/oup/backfile/Content_public/Journal/bioinformatics/32/22/10.1093_bioinformatics_btw397/3/btw397.pdf}
}

GitHub Events

Total
  • Push event: 2
  • Create event: 6
Last Year
  • Push event: 2
  • Create event: 6