LMDiskANN.jl

LMDiskANN.jl: An Implementation of the Low Memory Disk Approximate Nearest Neighbors Search Algorithm - Published in JOSS (2025)

https://github.com/mantzaris/lmdiskann.jl

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

approximate-nearest-neighbor-search diskann julia julialang vector-database
Last synced: 6 months ago

Repository

Julia Implementation of Low Memory Disk ANN (LM-DiskANN)

Statistics
  • Stars: 6
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 2
Created 11 months ago · Last pushed 8 months ago

README.md

License: MIT

LMDiskANN.jl

Julia Implementation of Low Memory Disk ANN (LM-DiskANN)

LM-DiskANN is a lightweight library for approximate nearest‐neighbor (ANN) indexing on disk. It creates a graph structure over vector embeddings, storing them in memory‐mapped files to keep the in‐memory footprint low. It also supports optional LevelDB databases for user‐key ↔ numeric ID lookups, making it easy to associate each embedding with a custom string ID.
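To make the memory-mapping idea concrete, here is a minimal sketch using Julia's standard `Mmap` library. It is not LMDiskANN.jl's internal code: the file name `vectors.bin`, the one-column-per-vector layout, and the fixed `capacity` are assumptions chosen purely for illustration.

```julia
using Mmap

# Hypothetical layout: `dim` Float32 values per column, one column per vector.
dim, capacity = 5, 100
path = joinpath(mktempdir(), "vectors.bin")

# Create the backing file and map it as a dim × capacity matrix.
# Mmap grows the empty file to the requested size.
io = open(path, "w+")
vecs = Mmap.mmap(io, Matrix{Float32}, (dim, capacity))

# Store one vector in the first column, then flush it to disk.
v = rand(Float32, dim)
vecs[:, 1] = v
Mmap.sync!(vecs)
close(io)

# Reopen read-only and read the vector back; the OS pages data in on demand.
io2 = open(path, "r")
vecs2 = Mmap.mmap(io2, Matrix{Float32}, (dim, capacity))
readback = vecs2[:, 1]
close(io2)
```

Only the pages that are actually touched need to be resident in RAM, which is what keeps the in-memory footprint low for large collections.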

Key Features

  • Disk‐Resident: Vectors and adjacency lists are stored in memory‐mapped files, reducing RAM usage.
  • Graph‐Based Search: Uses a breadth‐first (BFS) expansion over the graph, controlled by the EF_SEARCH parameter, for approximate neighbor lookups.
  • Insert & Delete: Dynamically insert or remove embeddings without needing to rebuild the index.
  • User Keys: Link a string key (e.g. "image123") to your internal node ID; retrieve or delete by either integer ID or key.
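The graph search can be pictured with a toy example. The sketch below is not the package's implementation. It is a generic breadth-first expansion over an in-memory adjacency list, where a hypothetical `ef` argument plays the role the feature list attributes to EF_SEARCH by capping how many candidates are kept between rounds. The function name `beam_search`, the Euclidean distance, and the chain-shaped test graph are all illustrative assumptions.

```julia
using LinearAlgebra

# Toy bounded BFS over an adjacency-list graph; `ef` caps the candidate pool.
function beam_search(vectors::Matrix{Float32}, adj::Vector{Vector{Int}},
                     query::Vector{Float32}, entry::Int; ef::Int=8, topk::Int=3)
    dist(i) = norm(view(vectors, :, i) .- query)
    visited = Set([entry])
    pool = [(dist(entry), entry)]   # best candidates found so far
    frontier = [entry]              # nodes whose neighbors are not yet expanded
    while !isempty(frontier)
        next = Int[]
        for node in frontier, nb in adj[node]
            nb in visited && continue
            push!(visited, nb)
            push!(pool, (dist(nb), nb))
            push!(next, nb)
        end
        sort!(pool)                              # closest candidates first
        length(pool) > ef && resize!(pool, ef)   # keep at most `ef` candidates
        kept = Set(last.(pool))
        frontier = filter(in(kept), next)        # expand only the survivors
    end
    return last.(pool[1:min(topk, end)])
end

# Ten points on a line, each linked to its immediate neighbors.
vectors = Float32.(hcat([[i, 0] for i in 1:10]...))
adj = [filter(j -> 1 <= j <= 10, [i - 1, i + 1]) for i in 1:10]
query = Float32[7.2, 0]

nearest = beam_search(vectors, adj, query, 1; ef=8, topk=3)   # [7, 8, 6]
```

A larger `ef` keeps more candidates alive and so explores more of the graph, trading speed for accuracy. With a very small `ef` the search can stop before reaching the true neighbors.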

Quick Start Example

Install via: (@v1.6) pkg> add https://github.com/mantzaris/LMDiskANN.jl

```julia
using LMDiskANN
using Random

# create an index with dimension = 5
index = create_index("my_index", 5)

# insert a random vector
v1 = rand(Float32, 5)
(key1, id1) = ann_insert!(index, v1)  # returns (autoKey, 1)

# insert another vector with a custom string key
v2 = rand(Float32, 5)
(mykey, id2) = ann_insert!(index, v2; key="myvec")

# search for v1
results = search(index, v1, topk=3)
println("Results for v1 => ", results)

# retrieve embeddings
retrieved_v1 = get_embedding_from_id(index, id1)
retrieved_v2 = get_embedding_from_key(index, "myvec")

# delete by ID or key
ann_delete!(index, id1)
ann_delete!(index, "myvec")
```
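Because the search is approximate, it can be useful to compare its results against an exact linear scan on a small sample. The helper below is a standalone sketch and does not call LMDiskANN.jl. The names `exact_topk` and `recall` are hypothetical, Euclidean distance is an assumption, and it further assumes you have already extracted integer IDs from the search results and that those IDs match the column order of `data`.

```julia
using LinearAlgebra

# Exact top-k by linear scan; each column of `data` is one stored vector.
function exact_topk(data::Matrix{Float32}, query::Vector{Float32}, k::Int)
    dists = [norm(view(data, :, j) .- query) for j in 1:size(data, 2)]
    return collect(partialsortperm(dists, 1:min(k, length(dists))))
end

# Fraction of the exact neighbors that an approximate result recovered.
recall(approx_ids, exact_ids) =
    length(intersect(approx_ids, exact_ids)) / length(exact_ids)

data = Float32[0 1 2 3; 0 0 0 0]   # four 2-D points on a line
query = Float32[0.9, 0]

exact = exact_topk(data, query, 2)   # columns 2 and 1 are closest
score = recall([2, 3], exact)        # a made-up approximate answer
```

A recall close to 1.0 on a representative sample suggests the index parameters are adequate for that dataset.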

Citing This Work

  • Mantzaris, A. V., (2025). LMDiskANN.jl: An Implementation of the Low Memory Disk Approximate Nearest Neighbors Search Algorithm. Journal of Open Source Software, 10(110), 8199, https://doi.org/10.21105/joss.08199

  • @article{Mantzaris2025,
      doi = {10.21105/joss.08199},
      url = {https://doi.org/10.21105/joss.08199},
      year = {2025},
      publisher = {The Open Journal},
      volume = {10},
      number = {110},
      pages = {8199},
      author = {Alexander V. Mantzaris},
      title = {LMDiskANN.jl: An Implementation of the Low Memory Disk Approximate Nearest Neighbors Search Algorithm},
      journal = {Journal of Open Source Software}
    }

Original Paper Introducing the LM-DiskANN Algorithm

  • Pan, Y., Sun, J., and Yu, H. (2023). LM-DiskANN: Low Memory Footprint in Disk-Native Dynamic Graph-Based ANN Indexing. 2023 IEEE International Conference on Big Data (BigData). IEEE.

Owner

  • Name: a.v.mantzaris
  • Login: mantzaris
  • Kind: user
  • Location: USA

Excited about the future of technology. Happy to participate in shaping that future through theory and practice.

JOSS Publication

LMDiskANN.jl: An Implementation of the Low Memory Disk Approximate Nearest Neighbors Search Algorithm
Published
June 12, 2025
Volume 10, Issue 110, Page 8199
Authors
Alexander V. Mantzaris ORCID
Department of Statistics and Data Science, University of Central Florida (UCF), USA
Editor
Mehmet Hakan Satman ORCID
Tags
ANN Embeddings Search Vector DB

GitHub Events

Total
  • Create event: 4
  • Commit comment event: 8
  • Issues event: 8
  • Release event: 1
  • Watch event: 3
  • Delete event: 4
  • Issue comment event: 10
  • Push event: 55
  • Pull request event: 1
  • Fork event: 1
Last Year
  • Create event: 4
  • Commit comment event: 8
  • Issues event: 8
  • Release event: 1
  • Watch event: 3
  • Delete event: 4
  • Issue comment event: 10
  • Push event: 55
  • Pull request event: 1
  • Fork event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 46
  • Total Committers: 2
  • Avg Commits per committer: 23.0
  • Development Distribution Score (DDS): 0.043
Past Year
  • Commits: 46
  • Committers: 2
  • Avg Commits per committer: 23.0
  • Development Distribution Score (DDS): 0.043
Top Committers
Name Email Commits
mantzaris a****s@g****m 44
Mehmet Hakan Satman m****n@g****m 2

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 5
  • Total pull requests: 1
  • Average time to close issues: about 20 hours
  • Average time to close pull requests: about 7 hours
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 4.4
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 1
  • Average time to close issues: about 20 hours
  • Average time to close pull requests: about 7 hours
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 4.4
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • findmyway (4)
  • JuliaTagBot (1)
Pull Request Authors
  • jbytecode (2)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
juliahub.com: LMDiskANN

Julia Implementation of Low Memory Disk ANN (LM-DiskANN)

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 0 Total
Rankings
Dependent repos count: 8.3%
Average: 22.1%
Dependent packages count: 35.8%
Last synced: 6 months ago

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v4 composite
  • julia-actions/setup-julia v2 composite
.github/workflows/documentation.yml actions
  • actions/checkout v4 composite
  • julia-actions/cache v2 composite
  • julia-actions/setup-julia v2 composite