nmslib
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
Science Score: 33.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org, scholar.google, sciencedirect.com
- ✓ Committers with academic emails: 2 of 49 committers (4.1%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.7%) to scientific vocabulary
Repository
Statistics
- Stars: 3,530
- Watchers: 93
- Forks: 461
- Open Issues: 98
- Releases: 19
Metadata Files
README.md
Non-Metric Space Library (NMSLIB)
Important Notes
- NMSLIB is generic yet fast; see the results of ANN benchmarks.
- A standalone implementation of our fastest method, HNSW, also exists as a header-only library.
- All the documentation (including using Python bindings and the query server, description of methods and spaces, building the library, etc.) can be found on this page.
- For generic questions and inquiries, please use the Gitter chat. The GitHub issues page is for bugs and feature requests.
Objectives
Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The core library does not have any third-party dependencies. The library has been gaining adoption; in particular, it has become a part of Amazon Elasticsearch Service.
The goal of the project is to create an effective and comprehensive toolkit for searching in generic and non-metric spaces. Even though the library contains a variety of metric-space access methods, our main focus is on generic and approximate search methods, in particular, on methods for non-metric spaces. NMSLIB is possibly the first library with a principled support for non-metric space searching.
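To make "generic and non-metric spaces" concrete, here is a minimal pure-Python sketch of exact k-NN search under the Kullback-Leibler divergence, which is neither symmetric nor bound by the triangle inequality. It does not use NMSLIB or its API; the function names `kl_divergence` and `brute_force_knn` and the toy data are illustrative assumptions, and a linear scan like this is only a baseline against which approximate methods are usually compared.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def kl_divergence(p: Vector, q: Vector) -> float:
    """Kullback-Leibler divergence KL(p || q) for strictly positive histograms.

    It is not symmetric and does not obey the triangle inequality, so it is
    a "distance" only in the generic, non-metric sense.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def brute_force_knn(
    data: List[Vector],
    query: Vector,
    k: int,
    dist: Callable[[Vector, Vector], float],
) -> List[Tuple[int, float]]:
    """Return the k closest (index, distance) pairs by scanning every point.

    The distance is evaluated as dist(data_point, query); for a non-symmetric
    function the argument order changes the answer.
    """
    scored = [(i, dist(x, query)) for i, x in enumerate(data)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy probability histograms (hypothetical data, each sums to 1).
data = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
query = [0.65, 0.25, 0.10]

neighbors = brute_force_knn(data, query, k=2, dist=kl_divergence)

# Swapping the arguments gives a different value, which is what makes
# the space non-metric.
asymmetric = kl_divergence(data[0], data[1]) != kl_divergence(data[1], data[0])
```

Because the distance is passed in as a plain callable, swapping in another (possibly non-metric) function requires no change to the search routine.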
NMSLIB is an extendible library, which means that it is possible to add new search methods and distance functions. NMSLIB can be used directly in C++ and Python (via Python bindings). In addition, it is also possible to build a query server, which can be used from Java (or other languages supported by Apache Thrift, version 0.12). Java has a native client, i.e., it works on many platforms without requiring a C++ library to be installed.
Authors: Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak. With contributions from Ben Frederickson, Lawrence Cayton, Wei Dong, Avrelin Nikita, Dmitry Yashunin, Bob Poekert, @orgoro, @gregfriedland, Scott Gigante, Maxim Andreev, Daniel Lemire, Nathan Kurz, Alexander Ponomarenko.
Brief History
NMSLIB started as a personal project of Bilegsaikhan Naidan, who created the initial code base, the Python bindings, and participated in earlier evaluations. The most successful class of methods, neighborhood/proximity graphs, is represented by the Hierarchical Navigable Small World Graph (HNSW) due to Malkov and Yashunin (see the publications below). Other useful methods include a modification of the VP-tree due to Boytsov and Naidan (2013), a Neighborhood APProximation index (NAPP) proposed by Tellez et al. (2013) and improved by David Novak, as well as a vanilla uncompressed inverted file.
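The idea behind neighborhood/proximity graph methods can be illustrated with a minimal sketch: link each point to a few close neighbors, then answer a query by greedily walking toward it. This is not HNSW and not NMSLIB code. It assumes a single-layer graph built by brute force, a single entry point, and squared Euclidean distance, and all names (`build_knn_graph`, `greedy_search`) are hypothetical.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def l2_squared(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; another distance function could be swapped in."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(data: List[Vector], m: int) -> Dict[int, List[int]]:
    """Link every point to its m nearest neighbors by brute force (toy construction)."""
    graph: Dict[int, List[int]] = {}
    for i, x in enumerate(data):
        others = [j for j in range(len(data)) if j != i]
        others.sort(key=lambda j: l2_squared(x, data[j]))
        graph[i] = others[:m]
    return graph


def greedy_search(
    data: List[Vector],
    graph: Dict[int, List[int]],
    query: Vector,
    entry: int,
) -> int:
    """Walk the graph, always moving to the neighbor closest to the query.

    Stops at a local minimum: a node none of whose neighbors is closer.
    The result is approximate; it need not be the true nearest neighbor.
    """
    current = entry
    current_dist = l2_squared(data[current], query)
    while True:
        best, best_dist = current, current_dist
        for j in graph[current]:
            d = l2_squared(data[j], query)
            if d < best_dist:
                best, best_dist = j, d
        if best == current:
            return current
        current, current_dist = best, best_dist


# Hypothetical toy data: 200 random points in the unit square.
rng = random.Random(0)
data = [[rng.random(), rng.random()] for _ in range(200)]
graph = build_knn_graph(data, m=8)
query = [0.5, 0.5]
found = greedy_search(data, graph, query, entry=0)
```

The walk terminates because the distance to the query strictly decreases at every step. Practical graph indexes add refinements such as multiple layers and candidate queues to reduce the chance of stopping at a poor local minimum.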
Credits and Citing
If you find this library useful, feel free to cite our SISAP paper [BibTex] as well as other papers listed at the end. One crucial contribution to cite is the fast Hierarchical Navigable Small World graph (HNSW) method [BibTex]. Please also check out the stand-alone HNSW implementation by Yury Malkov, which is released as a header-only HNSWLib library.
License
The code is released under the Apache License Version 2.0 http://www.apache.org/licenses/. Older versions of the library included additional components, which have different licenses (this does not apply to NMSLIB 2.x):
* The LSHKIT, which was embedded in the library, is distributed under the GNU General Public License, see http://www.gnu.org/licenses/.
* The k-NN graph construction algorithm NN-Descent due to Dong et al. 2011 (see the links below), which was also embedded in the library, seems to be covered by a free-to-use license, similar to Apache 2.
* The FALCONN library's license is MIT.
Funding
Leonid Boytsov was supported by the Open Advancement of Question Answering Systems (OAQA) group and the following NSF grant #1618159: "Matching and Ranking via Proximity Graphs: Applications to Question Answering and Beyond". Bileg was supported by the iAd Center.
Related Publications
The most important related papers are listed below in reverse chronological order:
* L. Boytsov, D. Novak, Y. Malkov, E. Nyberg (2016). Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search. In Proceedings of CIKM'16. [BibTex] We use a special branch of this library, plus the following Java code.
* Malkov, Y.A., Yashunin, D.A. (2016). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. CoRR, abs/1603.09320. [BibTex]
* Bilegsaikhan, N., Boytsov, L., 2015. Permutation Search Methods are Efficient, Yet Faster Search is Possible. PVLDB, 8(12):1618-1629. [BibTex]
* Ponomarenko, A., Avrelin, N., Bilegsaikhan, N., Boytsov, L., 2014. Comparative Analysis of Data Structures for Approximate Nearest Neighbor Search. [BibTex]
* Malkov, Y., Ponomarenko, A., Logvinov, A., & Krylov, V., 2014. Approximate nearest neighbor algorithm based on navigable small world graphs. Information Systems, 45, 61-68. [BibTex]
* Boytsov, L., Bilegsaikhan, N., 2013. Engineering Efficient and Effective Non-Metric Space Library. In Proceedings of the 6th International Conference on Similarity Search and Applications (SISAP 2013). [BibTex]
* Boytsov, L., Bilegsaikhan, N., 2013. Learning to Prune in Metric and Non-Metric Spaces. In Advances in Neural Information Processing Systems 2013. [BibTex]
* Tellez, Eric Sadit, Edgar Chávez, and Gonzalo Navarro. Succinct nearest neighbor search. Information Systems 38.7 (2013): 1019-1030. [BibTex]
* A. Ponomarenko, Y. Malkov, A. Logvinov, and V. Krylov, 2011. Approximate nearest neighbor search small world approach. ICTA 2011.
* Dong, Wei, Moses Charikar, and Kai Li. 2011. Efficient k-nearest neighbor graph construction for generic similarity measures. Proceedings of the 20th international conference on World wide web. ACM, 2011. [BibTex]
* L. Cayton, 2008. Fast nearest neighbor retrieval for bregman divergences. Twenty-Fifth International Conference on Machine Learning (ICML). [BibTex]
* Amato, Giuseppe, and Pasquale Savino. 2008. Approximate similarity search in metric spaces using inverted files. [BibTex]
* Gonzalez, Edgar Chavez, Karina Figueroa, and Gonzalo Navarro. Effective proximity retrieval by ordering permutations. Pattern Analysis and Machine Intelligence, IEEE Transactions on 30.9 (2008): 1647-1658. [BibTex]
Owner
- Name: nmslib
- Login: nmslib
- Kind: organization
- Repositories: 2
- Profile: https://github.com/nmslib
GitHub Events
Total
- Issues event: 5
- Watch event: 127
- Issue comment event: 24
- Pull request review event: 3
- Pull request review comment event: 3
- Pull request event: 2
- Fork event: 13
Last Year
- Issues event: 5
- Watch event: 127
- Issue comment event: 24
- Pull request review event: 3
- Pull request review comment event: 3
- Pull request event: 2
- Fork event: 13
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| searchivarius | l****o@b****o | 1,125 |
| Ben Frederickson | g****b@b****m | 39 |
| bileg | y****u@e****m | 24 |
| John Mazanec | j****e@a****m | 20 |
| Greg Friedland | g****g@a****m | 17 |
| Leonid Boytsov | l****o@L****e | 13 |
| searchivarius | l****d@f****a | 9 |
| Yuri Malkov | y****v@i****m | 9 |
| bileg | b****i@g****m | 8 |
| Yury Malkov | y****v@m****u | 8 |
| Leonid Boytsov | l****o@L****l | 5 |
| Scott Gigante | s****e@g****m | 5 |
| Simon Hewitt | si@s****k | 5 |
| Maxim Andreev | m****v@n****m | 3 |
| Joris Vankerschaver | j****r@g****m | 3 |
| Andrey Goryachev | a****v@g****m | 3 |
| Abdallah Moussawi | a****i@a****m | 3 |
| Paul Wise | p****s@d****g | 2 |
| Janakarajan Natarajan | j****n@a****m | 2 |
| Huon Wilson | H****n@d****u | 2 |
| David Novak | n****d@g****m | 2 |
| Bob Poekert | b****b@p****m | 2 |
| orgoro | o****o@g****m | 2 |
| Will Sackfield | s****d@s****m | 2 |
| jjjamie | j****g@g****m | 2 |
| searchivarius | l****d@t****f | 2 |
| Jamie Brunning | j****b@i****l | 2 |
| bileg | b****i@j****l | 2 |
| Ubuntu | u****u@i****l | 1 |
| Ubuntu | u****u@i****l | 1 |
| and 19 more... | ||
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 107
- Total pull requests: 23
- Average time to close issues: 5 months
- Average time to close pull requests: 2 months
- Total issue authors: 80
- Total pull request authors: 19
- Average comments per issue: 5.15
- Average comments per pull request: 3.7
- Merged pull requests: 13
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 7
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Issue authors: 7
- Pull request authors: 1
- Average comments per issue: 0.86
- Average comments per pull request: 1.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- abmitra84 (6)
- celsofranssa (4)
- xiankgx (3)
- davisidarta (3)
- PLNech (3)
- searchivarius (3)
- adtsvetkov (2)
- lsorber (2)
- sh1ng (2)
- TamHHM (2)
- ivan-marroquin (2)
- zh-en520 (2)
- MarcYin (2)
- barracuda156 (2)
- ebursztein (2)
Pull Request Authors
- barracuda156 (4)
- shatejas (2)
- Prokuma (2)
- jmazanec15 (2)
- kamelCased (2)
- janaknat (2)
- pabs3 (2)
- netj (1)
- jvkersch (1)
- JewlsIOB (1)
- GuilhemN (1)
- Cadovvl (1)
- geofft (1)
- mrgentlemanus (1)
- luyuncheng (1)
Packages
- Total packages: 8
- Total downloads:
  - nuget 3,952 total
  - pypi 383,321 last-month
- Total docker downloads: 1,799
- Total dependent packages: 19 (may contain duplicates)
- Total dependent repositories: 213 (may contain duplicates)
- Total versions: 35
- Total maintainers: 5
pypi.org: nmslib
Non-Metric Space Library (NMSLIB)
- Homepage: https://github.com/nmslib/nmslib
- Documentation: https://nmslib.readthedocs.io/
- License: apache-2.0
- Latest release: 2.1.1 (published over 5 years ago)
Rankings
Maintainers (1)
proxy.golang.org: github.com/nmslib/nmslib
- Documentation: https://pkg.go.dev/github.com/nmslib/nmslib#section-documentation
- License: apache-2.0
- Latest release: v2.1.1+incompatible (published over 5 years ago)
Rankings
nuget.org: nmslib.v141.static.x64
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
- Homepage: https://github.com/nmslib/nmslib
- License: Apache-2.0
- Latest release: 1.7.3.6
Rankings
nuget.org: nmslib.vc141.static.x64
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
- Homepage: https://github.com/nmslib/nmslib
- License: Apache-2.0
- Latest release: 1.7.3.6 (published about 7 years ago)
Rankings
Maintainers (1)
nuget.org: nmslib.vc142.static.x64
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
- Homepage: https://github.com/nmslib/nmslib
- License: Apache-2.0
- Latest release: 1.7.3.6 (published about 7 years ago)
Rankings
Maintainers (1)
anaconda.org: nmslib
Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the project is to create an effective and comprehensive toolkit for searching in generic and non-metric spaces. Even though the library contains a variety of metric-space access methods, our main focus is on generic and approximate search methods, in particular, on methods for non-metric spaces. NMSLIB is possibly the first library with a principled support for non-metric space searching.
- Homepage: https://github.com/nmslib/nmslib
- License: Apache-2.0
- Latest release: 2.1.1 (published about 3 years ago)
Rankings
pypi.org: nmslib-metabrainz
Non-Metric Space Library (NMSLIB)
- Homepage: https://github.com/nmslib/nmslib
- Documentation: https://nmslib-metabrainz.readthedocs.io/
- License: apache-2.0
- Latest release: 2.1.3 (published about 2 years ago)
Rankings
Maintainers (2)
pypi.org: fixed-install-nmslib
Non-Metric Space Library (NMSLIB)
- Homepage: https://github.com/nmslib/nmslib
- Documentation: https://fixed-install-nmslib.readthedocs.io/
- License: apache-2.0
- Latest release: 2.1.2 (published over 2 years ago)
Rankings
Maintainers (1)
Dependencies
- commons-cli:commons-cli 1.2
- org.apache.thrift:libthrift 0.11.0
- org.slf4j:slf4j-api 1.7.12
- flake8 * development
- numpy >=1.10.0 development
- pip * development
- psutil * development
- pybind11 >=2.2.3 development
- pytest * development
- scipy * development
- setuptools * development
- six * development
- twine * development
- wheel * development
- numpy >=1.10.0
- pybind11 >=2.2.3
- sphinx_rtd_theme *