nmslib

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

https://github.com/nmslib/nmslib

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google, sciencedirect.com
  • Committers with academic emails
    2 of 49 committers (4.1%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.7%) to scientific vocabulary

Keywords

k-nn-graphs knn-search neighborhood-graphs non-metric vp-tree

Keywords from Contributors

distributed cryptocurrency transcriptomics parallelism gradient-boosting closember jdbc bioinformatics data-mining deep-neural-networks
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: nmslib
  • License: apache-2.0
  • Language: C++
  • Default Branch: master
  • Homepage:
  • Size: 94.6 MB
Statistics
  • Stars: 3,530
  • Watchers: 93
  • Forks: 461
  • Open Issues: 98
  • Releases: 19
Topics
k-nn-graphs knn-search neighborhood-graphs non-metric vp-tree
Created almost 13 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.md

Non-Metric Space Library (NMSLIB)

Important Notes

  • NMSLIB is generic yet fast; see the results of ANN benchmarks.
  • A standalone implementation of our fastest method, HNSW, also exists as a header-only library.
  • All the documentation (including using Python bindings and the query server, description of methods and spaces, building the library, etc.) can be found on this page.
  • For generic questions and inquiries, please use the Gitter chat; the GitHub issues page is for bugs and feature requests.

Objectives

Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The core library does not have any third-party dependencies. It has been gaining popularity recently. In particular, it has become a part of Amazon Elasticsearch Service.

The goal of the project is to create an effective and comprehensive toolkit for searching in generic and non-metric spaces. Even though the library contains a variety of metric-space access methods, our main focus is on generic and approximate search methods, in particular, on methods for non-metric spaces. NMSLIB is possibly the first library with a principled support for non-metric space searching.
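To make "non-metric" concrete, the following is a minimal sketch of exact (brute-force) k-NN search under KL-divergence, a distance that is not symmetric and therefore not a metric. It does not use NMSLIB; the helper names `kl_divergence` and `knn_brute_force` and the toy distributions are assumptions chosen for illustration only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def knn_brute_force(data, query, k, distance):
    """Return the indices of the k points with the smallest distance(x, query)."""
    scored = sorted((distance(x, query), i) for i, x in enumerate(data))
    return [i for _, i in scored[:k]]


# Three toy probability distributions (hypothetical data).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
r = [0.4, 0.4, 0.2]

# The two directions differ, so KL-divergence is not symmetric.
print(kl_divergence(p, r), kl_divergence(r, p))

# Exact 2-NN search still works: it only needs a distance function.
nearest = knn_brute_force([p, q, r], [0.6, 0.3, 0.1], k=2, distance=kl_divergence)
print(nearest)
```

A linear scan like this is always correct but costs one distance computation per data point. Index structures for metric spaces usually rely on symmetry and the triangle inequality to prune candidates, which is why distances such as KL-divergence call for the generic and approximate methods the library focuses on.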

NMSLIB is an extensible library, which means that it is possible to add new search methods and distance functions. NMSLIB can be used directly in C++ and Python (via Python bindings). In addition, it is also possible to build a query server, which can be used from Java (or other languages supported by Apache Thrift, version 0.12). Java has a native client, i.e., it works on many platforms without requiring a C++ library to be installed.
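As a rough illustration of the Python route, the sketch below builds an HNSW index over random vectors and queries it. The calls `nmslib.init`, `addDataPointBatch`, `createIndex` and `knnQuery` follow the Python bindings as commonly documented, but treat the parameter values (`{"post": 2}`, the `cosinesimil` space, the data sizes) as assumptions. The `knn` wrapper and the pure-Python fallback are hypothetical and exist only so the snippet still runs where the `nmslib` package is not installed.

```python
import math
import random

try:
    import nmslib            # PyPI package "nmslib"
    import numpy as np
except ImportError:          # keep the sketch runnable without the package
    nmslib = None


def cosine_distance(a, b):
    """1 - cosine similarity, used only by the fallback linear scan."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn(data, query, k):
    """Return ids of (approximately) the k nearest points to query."""
    if nmslib is not None:
        # Approximate search with an HNSW index in the cosine-similarity space.
        index = nmslib.init(method="hnsw", space="cosinesimil")
        index.addDataPointBatch(np.asarray(data, dtype=np.float32))
        index.createIndex({"post": 2}, print_progress=False)
        ids, _distances = index.knnQuery(np.asarray(query, dtype=np.float32), k=k)
        return [int(i) for i in ids]
    # Fallback (assumption, not part of NMSLIB): exact linear scan.
    ranked = sorted(range(len(data)), key=lambda i: cosine_distance(data[i], query))
    return ranked[:k]


random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(500)]
neighbor_ids = knn(data, data[0], k=5)
print(neighbor_ids)
```

Querying with a point that is already in the index is a quick sanity check: its own id should appear among the returned neighbors.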

Authors: Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak. With contributions from Ben Frederickson, Lawrence Cayton, Wei Dong, Avrelin Nikita, Dmitry Yashunin, Bob Poekert, @orgoro, @gregfriedland, Scott Gigante, Maxim Andreev, Daniel Lemire, Nathan Kurz, Alexander Ponomarenko.

Brief History

NMSLIB started as a personal project of Bilegsaikhan Naidan, who created the initial code base, the Python bindings, and participated in earlier evaluations. The most successful class of methods, neighborhood/proximity graphs, is represented by the Hierarchical Navigable Small World Graph (HNSW) due to Malkov and Yashunin (see the publications below). Other useful methods include a modification of the VP-tree due to Boytsov and Naidan (2013), a Neighborhood APProximation index (NAPP) proposed by Tellez et al. (2013) and improved by David Novak, as well as a vanilla uncompressed inverted file.

Credits and Citing

If you find this library useful, feel free to cite our SISAP paper [BibTex] as well as other papers listed at the end. One crucial contribution to cite is the fast Hierarchical Navigable Small World graph (HNSW) method [BibTex]. Please also check out the stand-alone HNSW implementation by Yury Malkov, which is released as a header-only HNSWLib library.

License

The code is released under the Apache License Version 2.0 http://www.apache.org/licenses/. Older versions of the library included additional components with different licenses (this does not apply to NMSLIB 2.x):

  • The LSHKIT, which was embedded in our library, is distributed under the GNU General Public License, see http://www.gnu.org/licenses/.
  • The k-NN graph construction algorithm NN-Descent due to Dong et al. 2011 (see the links below), which was also embedded in our library, seems to be covered by a free-to-use license, similar to Apache 2.
  • The FALCONN library's license is MIT.

Funding

Leonid Boytsov was supported by the Open Advancement of Question Answering Systems (OAQA) group and the following NSF grant #1618159: "Matching and Ranking via Proximity Graphs: Applications to Question Answering and Beyond". Bileg was supported by the iAd Center.

Related Publications

Most important related papers are listed below in reverse chronological order:

  • L. Boytsov, D. Novak, Y. Malkov, E. Nyberg (2016). Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search. In Proceedings of CIKM'16. [BibTex] We use a special branch of this library, plus the following Java code.
  • Malkov, Y.A., Yashunin, D.A. (2016). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. CoRR, abs/1603.09320. [BibTex]
  • Bilegsaikhan, N., Boytsov, L. (2015). Permutation Search Methods are Efficient, Yet Faster Search is Possible. PVLDB, 8(12):1618--1629. [BibTex]
  • Ponomarenko, A., Averlin, N., Bilegsaikhan, N., Boytsov, L. (2014). Comparative Analysis of Data Structures for Approximate Nearest Neighbor Search. [BibTex]
  • Malkov, Y., Ponomarenko, A., Logvinov, A., Krylov, V. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs. Information Systems, 45, 61-68. [BibTex]
  • Boytsov, L., Bilegsaikhan, N. (2013). Engineering Efficient and Effective Non-Metric Space Library. In Proceedings of the 6th International Conference on Similarity Search and Applications (SISAP 2013). [BibTex]
  • Boytsov, L., Bilegsaikhan, N. (2013). Learning to Prune in Metric and Non-Metric Spaces. In Advances in Neural Information Processing Systems 2013. [BibTex]
  • Tellez, Eric Sadit, Edgar Chávez, and Gonzalo Navarro (2013). Succinct nearest neighbor search. Information Systems 38.7: 1019-1030. [BibTex]
  • A. Ponomarenko, Y. Malkov, A. Logvinov, and V. Krylov (2011). Approximate nearest neighbor search small world approach. ICTA 2011.
  • Dong, Wei, Moses Charikar, and Kai Li (2011). Efficient k-nearest neighbor graph construction for generic similarity measures. Proceedings of the 20th International Conference on World Wide Web. ACM. [BibTex]
  • L. Cayton (2008). Fast nearest neighbor retrieval for Bregman divergences. Twenty-Fifth International Conference on Machine Learning (ICML). [BibTex]
  • Amato, Giuseppe, and Pasquale Savino (2008). Approximate similarity search in metric spaces using inverted files. [BibTex]
  • Gonzalez, Edgar Chavez, Karina Figueroa, and Gonzalo Navarro (2008). Effective proximity retrieval by ordering permutations. Pattern Analysis and Machine Intelligence, IEEE Transactions on 30.9: 1647-1658. [BibTex]

Owner

  • Name: nmslib
  • Login: nmslib
  • Kind: organization

GitHub Events

Total
  • Issues event: 5
  • Watch event: 127
  • Issue comment event: 24
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 2
  • Fork event: 13
Last Year
  • Issues event: 5
  • Watch event: 127
  • Issue comment event: 24
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 2
  • Fork event: 13

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 1,342
  • Total Committers: 49
  • Avg Commits per committer: 27.388
  • Development Distribution Score (DDS): 0.162
Past Year
  • Commits: 22
  • Committers: 2
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.045
Top Committers
Name Email Commits
searchivarius l****o@b****o 1,125
Ben Frederickson g****b@b****m 39
bileg y****u@e****m 24
John Mazanec j****e@a****m 20
Greg Friedland g****g@a****m 17
Leonid Boytsov l****o@L****e 13
searchivarius l****d@f****a 9
Yuri Malkov y****v@i****m 9
bileg b****i@g****m 8
Yury Malkov y****v@m****u 8
Leonid Boytsov l****o@L****l 5
Scott Gigante s****e@g****m 5
Simon Hewitt si@s****k 5
Maxim Andreev m****v@n****m 3
Joris Vankerschaver j****r@g****m 3
Andrey Goryachev a****v@g****m 3
Abdallah Moussawi a****i@a****m 3
Paul Wise p****s@d****g 2
Janakarajan Natarajan j****n@a****m 2
Huon Wilson H****n@d****u 2
David Novak n****d@g****m 2
Bob Poekert b****b@p****m 2
orgoro o****o@g****m 2
Will Sackfield s****d@s****m 2
jjjamie j****g@g****m 2
searchivarius l****d@t****f 2
Jamie Brunning j****b@i****l 2
bileg b****i@j****l 2
Ubuntu u****u@i****l 1
Ubuntu u****u@i****l 1
and 19 more...
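
The all-time committer figures above can be reproduced with a few lines of arithmetic. This is a sketch under one assumption: that the Development Distribution Score is defined as one minus the share of commits made by the single most active committer entry (here the 1,125-commit row). The page itself does not state the formula, and the function name is hypothetical.

```python
def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    return 1.0 - top_committer_commits / total_commits


total_commits = 1342      # "Total Commits" (all time)
total_committers = 49     # "Total Committers" (all time)
top_commits = 1125        # first row of the "Top Committers" table

avg_commits = total_commits / total_committers
dds = development_distribution_score(top_commits, total_commits)

print(round(avg_commits, 3))  # matches "Avg Commits per committer: 27.388"
print(round(dds, 3))          # matches "Development Distribution Score (DDS): 0.162"
```

Under this reading, a score near 0 means one committer entry accounts for almost all commits, and a score near 1 means the work is spread across many committers.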

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 107
  • Total pull requests: 23
  • Average time to close issues: 5 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 80
  • Total pull request authors: 19
  • Average comments per issue: 5.15
  • Average comments per pull request: 3.7
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 7
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 7
  • Pull request authors: 1
  • Average comments per issue: 0.86
  • Average comments per pull request: 1.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • abmitra84 (6)
  • celsofranssa (4)
  • xiankgx (3)
  • davisidarta (3)
  • PLNech (3)
  • searchivarius (3)
  • adtsvetkov (2)
  • lsorber (2)
  • sh1ng (2)
  • TamHHM (2)
  • ivan-marroquin (2)
  • zh-en520 (2)
  • MarcYin (2)
  • barracuda156 (2)
  • ebursztein (2)
Pull Request Authors
  • barracuda156 (4)
  • shatejas (2)
  • Prokuma (2)
  • jmazanec15 (2)
  • kamelCased (2)
  • janaknat (2)
  • pabs3 (2)
  • netj (1)
  • jvkersch (1)
  • JewlsIOB (1)
  • GuilhemN (1)
  • Cadovvl (1)
  • geofft (1)
  • mrgentlemanus (1)
  • luyuncheng (1)
Top Labels
Issue Labels
bug (7) feature request (3) enhancement (3) minor issue (1)
Pull Request Labels

Packages

  • Total packages: 8
  • Total downloads:
    • nuget 3,952 total
    • pypi 383,321 last-month
  • Total docker downloads: 1,799
  • Total dependent packages: 19
    (may contain duplicates)
  • Total dependent repositories: 213
    (may contain duplicates)
  • Total versions: 35
  • Total maintainers: 5
pypi.org: nmslib

Non-Metric Space Library (NMSLIB)

  • Versions: 15
  • Dependent Packages: 17
  • Dependent Repositories: 211
  • Downloads: 357,974 Last month
  • Docker Downloads: 1,799
Rankings
Dependent packages count: 0.6%
Downloads: 0.6%
Dependent repos count: 1.0%
Stargazers count: 1.3%
Average: 1.4%
Docker downloads count: 2.1%
Forks count: 2.6%
Maintainers (1)
Last synced: 10 months ago
proxy.golang.org: github.com/nmslib/nmslib
  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Forks count: 0.9%
Stargazers count: 1.1%
Average: 4.6%
Dependent packages count: 7.0%
Dependent repos count: 9.3%
Last synced: 10 months ago
nuget.org: nmslib.v141.static.x64

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Stargazers count: 0.4%
Forks count: 0.8%
Average: 8.4%
Dependent repos count: 12.7%
Dependent packages count: 19.5%
Last synced: 10 months ago
nuget.org: nmslib.vc141.static.x64

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 2,895 Total
Rankings
Stargazers count: 0.4%
Forks count: 0.8%
Dependent repos count: 12.7%
Average: 15.3%
Dependent packages count: 19.5%
Downloads: 43.1%
Maintainers (1)
Last synced: 10 months ago
nuget.org: nmslib.vc142.static.x64

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,057 Total
Rankings
Stargazers count: 0.4%
Forks count: 0.8%
Dependent repos count: 12.7%
Average: 15.4%
Dependent packages count: 19.5%
Downloads: 43.4%
Maintainers (1)
Last synced: 10 months ago
anaconda.org: nmslib

Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the project is to create an effective and comprehensive toolkit for searching in generic and non-metric spaces. Even though the library contains a variety of metric-space access methods, our main focus is on generic and approximate search methods, in particular, on methods for non-metric spaces. NMSLIB is possibly the first library with a principled support for non-metric space searching.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 2
Rankings
Stargazers count: 14.7%
Forks count: 16.2%
Average: 30.0%
Dependent packages count: 41.0%
Dependent repos count: 48.0%
Last synced: 10 months ago
pypi.org: nmslib-metabrainz

Non-Metric Space Library (NMSLIB)

  • Versions: 2
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 24,190 Last month
Rankings
Dependent packages count: 9.9%
Average: 37.5%
Dependent repos count: 65.2%
Maintainers (2)
Last synced: 10 months ago
pypi.org: fixed-install-nmslib

Non-Metric Space Library (NMSLIB)

  • Versions: 1
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 1,157 Last month
Rankings
Dependent packages count: 10.1%
Average: 38.5%
Dependent repos count: 66.8%
Maintainers (1)
Last synced: 10 months ago

Dependencies

query_server/java_client/pom.xml maven
  • commons-cli:commons-cli 1.2
  • org.apache.thrift:libthrift 0.11.0
  • org.slf4j:slf4j-api 1.7.12
python_bindings/dev-requirements.txt pypi
  • flake8 * development
  • numpy >=1.10.0 development
  • pip * development
  • psutil * development
  • pybind11 >=2.2.3 development
  • pytest * development
  • scipy * development
  • setuptools * development
  • six * development
  • twine * development
  • wheel * development
python_bindings/requirements.txt pypi
  • numpy >=1.10.0
  • pybind11 >=2.2.3
  • sphinx_rtd_theme *