vearch

Distributed vector search for AI-native applications

https://github.com/vearch/vearch

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 70 committers (1.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

ai-native ai-native-database cloud-native document-retrieval embeddings hybrid-search rag retrieval-augmented-generation vector-database vector-search vectors

Keywords from Contributors

mesh interactive serializer packaging network-simulation shellcodes hacking autograding observability optim
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: vearch
  • License: apache-2.0
  • Language: Go
  • Default Branch: master
  • Homepage: https://vearch.github.io
  • Size: 36.1 MB
Statistics
  • Stars: 2,217
  • Watchers: 76
  • Forks: 349
  • Open Issues: 167
  • Releases: 28
Topics
ai-native ai-native-database cloud-native document-retrieval embeddings hybrid-search rag retrieval-augmented-generation vector-database vector-search vectors
Created almost 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Governance Roadmap

README.md


Overview

Vearch is a cloud-native distributed vector database for efficient similarity search of embedding vectors in your AI applications.

Key features

  • Hybrid search: Combines vector search with scalar filtering.

  • Performance: Fast vector retrieval that searches across millions of objects in milliseconds.

  • Scalability & Reliability: Replication and elastic scale-out.
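To make the hybrid search idea concrete, here is a minimal in-memory sketch in Python. It is not Vearch's implementation or API: the `hybrid_search` function, the `DOCS` sample data, and the `category` field are all hypothetical. It only illustrates the general pattern of applying a scalar filter and then ranking the surviving documents by vector similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_vec, predicate, limit=2):
    """Apply the scalar filter first, then rank survivors by similarity."""
    candidates = [d for d in docs if predicate(d)]
    ranked = sorted(
        candidates,
        key=lambda d: cosine(d["vector"], query_vec),
        reverse=True,
    )
    return [d["id"] for d in ranked[:limit]]


# Hypothetical sample documents: one scalar field plus one embedding each.
DOCS = [
    {"id": "a", "category": "shoes", "vector": [1.0, 0.0]},
    {"id": "b", "category": "shoes", "vector": [0.6, 0.8]},
    {"id": "c", "category": "bags", "vector": [1.0, 0.1]},
    {"id": "d", "category": "shoes", "vector": [0.0, 1.0]},
]

result = hybrid_search(DOCS, [1.0, 0.0], lambda d: d["category"] == "shoes")
print(result)
```

Document `c` is close to the query vector but is excluded by the scalar filter. A production engine would use an approximate index rather than this brute-force scan.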

Documentation

RESTful APIs

OpenAPIs

SDK

| SDK | Description |
|-----|-------------|
| Python SDK | Python client for Vearch |
| Go SDK | Go client for Vearch |
| Java SDK | Java client for Vearch |
| Rust SDK | Rust client for Vearch |

Use Cases

Use Vearch as a Memory Backend

Vearch integrates with popular AI frameworks:

| Framework | Integration |
|-----------|-------------|
| Langchain | Use Vearch as vector store in Langchain |
| LlamaIndex | Integrate with LlamaIndex for knowledge bases |
| Langchaingo | Go implementation of Langchain with Vearch support |
| LangChain4j | Java implementation with Vearch integration |

Real-world Demos

  • VisualSearch: Vearch can serve as the foundation of a complete visual search system that indexes billions of images. This setup also requires the image retrieval plugin, which performs object detection and feature extraction.

Quick start

Kubernetes Deployment

```
# Via Helm Repository
$ helm repo add vearch https://vearch.github.io/vearch-helm
$ helm repo update && helm install my-release vearch/vearch

# Or from Local Charts
$ git clone https://github.com/vearch/vearch-helm.git && cd vearch-helm
$ helm install my-release ./charts -f ./charts/values.yaml
```

Docker Compose Deployment

```
# Standalone Mode
$ cd cloud && cp ../config/config.toml .
$ docker-compose --profile standalone up -d

# Cluster Mode
$ cd cloud && cp ../config/config_cluster.toml .
$ docker-compose --profile cluster up -d
```

Other Deployment Methods

  • DeployByDocker: Deploy Vearch by Docker
  • SourceCompileDeployment: Compile Vearch from source code

Components

Vearch Architecture


Master: Manages schemas, cluster-level metadata, and resource coordination.

Router: Exposes the RESTful API (upsert, delete, search, and query), routes requests, and merges results.

PartitionServer (PS): Hosts document partitions with Raft-based replication. Gamma, the core vector search engine, is built on Faiss and stores, indexes, and retrieves vectors and scalars.
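The result-merging role of the Router can be sketched as follows. This is an illustrative Python example, not Vearch's actual code: the `merge_partition_hits` function and the sample scores are hypothetical, and it assumes each partition returns its hits already sorted by descending score.

```python
import heapq
from itertools import islice


def merge_partition_hits(partition_hits, limit):
    """Merge per-partition hit lists into one global top-k list.

    Each input list must already be sorted by descending score.
    """
    merged = heapq.merge(*partition_hits, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical (score, document id) pairs returned by three partitions.
partition_hits = [
    [(0.97, "doc-3"), (0.80, "doc-9")],
    [(0.91, "doc-5"), (0.42, "doc-1")],
    [(0.88, "doc-7")],
]

top = merge_partition_hits(partition_hits, limit=3)
print(top)
```

Because the merge is lazy, only as many hits as the requested limit are pulled from the partition lists.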

Technical Reference

Academic Citation

When using Vearch in academic or research projects, please cite our paper:

    @misc{li2019design,
      title={The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform},
      author={Jie Li and Haifeng Liu and Chuanghua Gui and Jianyu Chen and Zhenyun Ni and Ning Wang},
      year={2019},
      eprint={1908.07389},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
    }

Community Support

Connect With Us

Connect with the Vearch community through multiple channels:

  • GitHub Issues: Report bugs or request features on our issues page
  • Email Discussion: For public discussion or questions, contact us at vearch-maintainers@groups.io
  • Slack Channel: Join our community on Slack for real-time discussions

Contribution

We welcome contributions from the community! Check our contribution guidelines to get started.

License

Vearch is licensed under the Apache License, Version 2.0.

For complete licensing details, please see LICENSE and NOTICE in our repository.


© 2019 Vearch Contributors. All Rights Reserved.

Owner

  • Name: vector search infrastructure for AI applications
  • Login: vearch
  • Kind: organization

GitHub Events

Total
  • Create event: 14
  • Release event: 5
  • Issues event: 16
  • Watch event: 150
  • Delete event: 12
  • Issue comment event: 25
  • Push event: 96
  • Gollum event: 2
  • Pull request review comment event: 3
  • Pull request review event: 6
  • Pull request event: 45
  • Fork event: 24
Last Year
  • Create event: 14
  • Release event: 5
  • Issues event: 16
  • Watch event: 150
  • Delete event: 12
  • Issue comment event: 25
  • Push event: 96
  • Gollum event: 2
  • Pull request review comment event: 3
  • Pull request review event: 6
  • Pull request event: 45
  • Fork event: 24

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 993
  • Total Committers: 70
  • Avg Commits per committer: 14.186
  • Development Distribution Score (DDS): 0.657
Past Year
  • Commits: 265
  • Committers: 9
  • Avg Commits per committer: 29.444
  • Development Distribution Score (DDS): 0.509
Top Committers
Name Email Commits
wxingda v****a@q****m 341
zcdb z****6@o****m 287
ljeagle 7****3@q****m 64
bjliuqiang l****o@j****m 33
ansj a****n@1****m 24
sunjian126 s****1@j****m 22
syhao 3****8@q****m 19
gDreamcatcher 8****1@q****m 19
kevintony001 3****1 16
root r****t@A****L 14
Haifeng Liu b****u 14
dependabot[bot] 4****] 12
nemo 1****4@q****m 10
Xiong LIU l****t@g****m 10
zhanghexian 1****3@q****m 7
yinpengfei7 y****3@j****m 7
Patrick Ge 5****0 7
qiutianme f****9@1****m 6
guoyande 5****a 4
unknown g****e@3****l 4
stuartjing s****g@s****m 3
qiang.zhou y****1@q****m 3
nizyun n****n@1****m 3
Martin7-1 y****e@g****m 3
root r****t@A****L 3
zhanchao1 z****3@j****m 3
zhanghexian1 z****1@j****m 3
“ljeagle” “****i@y****” 3
xqk 3****8@q****m 2
yanwr1 y****1@j****m 2
and 40 more...
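
The all-time figures above can be reproduced with a short calculation. This sketch assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is not stated on this page, but it is consistent with the listed numbers. The function name is hypothetical.

```python
def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    return 1 - top_committer_commits / total_commits


# All-time figures from the listing: 993 commits, 70 committers,
# and 341 commits by the top committer (wxingda).
all_time_dds = development_distribution_score(341, 993)
avg_commits = 993 / 70

print(round(all_time_dds, 3))  # 0.657
print(round(avg_commits, 3))   # 14.186
```

A score near 0 would mean one committer wrote almost everything. A score near 1 would mean the work is spread across many committers.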

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 148
  • Total pull requests: 164
  • Average time to close issues: 7 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 71
  • Total pull request authors: 31
  • Average comments per issue: 3.67
  • Average comments per pull request: 0.14
  • Merged pull requests: 121
  • Bot issues: 0
  • Bot pull requests: 38
Past Year
  • Issues: 11
  • Pull requests: 53
  • Average time to close issues: about 22 hours
  • Average time to close pull requests: 9 days
  • Issue authors: 7
  • Pull request authors: 7
  • Average comments per issue: 0.91
  • Average comments per pull request: 0.23
  • Merged pull requests: 26
  • Bot issues: 0
  • Bot pull requests: 17
Top Authors
Issue Authors
  • guotf520 (16)
  • hanqiushi (9)
  • xueqizhang121 (8)
  • chloefresh (7)
  • bladehliu (6)
  • lqhandsome (5)
  • Girll (4)
  • rrjia (4)
  • CodinSheep (4)
  • xincrazy (3)
  • caixuzhong1 (3)
  • zs420803498 (3)
  • qingfengshiran (2)
  • zcdb (2)
  • cyxu11 (2)
Pull Request Authors
  • dependabot[bot] (38)
  • zcdb (26)
  • wxingda (20)
  • zhanghexian (14)
  • cococo2000 (12)
  • yanwr1 (11)
  • Martin7-1 (4)
  • skayi (2)
  • xueboSmile (2)
  • testwill (2)
  • 908080 (2)
  • mrc1119 (2)
  • finlay-liu (2)
  • liule-pi (2)
  • qiuqiu-lovely (2)
Top Labels
Issue Labels
Feature request (3) Api (2) Install (1) Vector index (1) Bug (1) help wanted (1) Storage (1) Docker (1) Discussion (1)
Pull Request Labels
dependencies (40) go (37) size:M (13) size:S (6) size:L (4) size:XS (4) size:XL (2) Documentation (1) Feature request (1) java (1)

Packages

  • Total packages: 8
  • Total downloads:
    • pypi 457 last-month
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 54
  • Total maintainers: 2
proxy.golang.org: github.com/vearch/vearch/v3
  • Versions: 25
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.9%
Average: 7.4%
Dependent repos count: 7.8%
Last synced: 6 months ago
proxy.golang.org: github.com/vearch/vearch

Copyright 2019 The Vearch Authors. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 7.0%
Average: 8.2%
Dependent repos count: 9.3%
Last synced: 6 months ago
pypi.org: vearch

A library for efficient similarity search and storage of deep learning vectors.

  • Versions: 13
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Downloads: 208 Last month
Rankings
Stargazers count: 1.7%
Forks count: 3.0%
Dependent packages count: 7.4%
Downloads: 7.5%
Average: 8.4%
Dependent repos count: 22.3%
Maintainers (2)
Last synced: 6 months ago
proxy.golang.org: github.com/vearch/vearch/sdk/go/v3
  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.4%
Average: 9.0%
Dependent repos count: 9.5%
Last synced: 7 months ago
proxy.golang.org: github.com/vearch/vearch/tools/backup
  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.5%
Average: 9.0%
Dependent repos count: 9.6%
Last synced: 7 months ago
proxy.golang.org: github.com/vearch/vearch/sdk/go
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.3%
Average: 9.9%
Dependent repos count: 10.5%
Last synced: 7 months ago
pypi.org: vearch-cluster

A library for efficient similarity search and storage of deep learning vectors.

  • Versions: 2
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 164 Last month
Rankings
Stargazers count: 1.7%
Forks count: 3.0%
Dependent packages count: 10.1%
Average: 20.7%
Downloads: 21.5%
Dependent repos count: 67.1%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pyvearch

A library for efficient similarity search and storage of deep learning vectors.

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 85 Last month
Rankings
Dependent packages count: 10.9%
Average: 36.1%
Dependent repos count: 61.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

go.mod go
  • github.com/BurntSushi/toml v0.3.1
  • github.com/HdrHistogram/hdrhistogram-go v1.1.2
  • github.com/StackExchange/wmi v1.2.1
  • github.com/apache/thrift v0.12.1-0.20190702001503-1a2dee60b438
  • github.com/caio/go-tdigest v3.1.0+incompatible
  • github.com/cespare/xxhash/v2 v2.1.2
  • github.com/codahale/hdrhistogram v0.9.0
  • github.com/coreos/go-systemd v0.0.0-20191104093116-d3cd4ed1dbcf
  • github.com/dgrijalva/jwt-go v3.2.1-0.20190620180102-5e25c22bd5d6+incompatible
  • github.com/dustin/go-humanize v1.0.0
  • github.com/dustin/gojson v0.0.0-20160307161227-2e71ec9dd5ad
  • github.com/fatih/color v1.7.1-0.20181010231311-3f9d52f7176a
  • github.com/fsnotify/fsnotify v1.4.9
  • github.com/ghodss/yaml v1.0.1-0.20190212211648-25d852aebe32
  • github.com/gin-gonic/gin v1.7.0
  • github.com/gogo/protobuf v1.3.2
  • github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
  • github.com/golang/mock v1.5.0
  • github.com/golang/protobuf v1.5.2
  • github.com/google/btree v1.0.1
  • github.com/google/flatbuffers v1.11.1-0.20191218192354-ce3a1c43a288
  • github.com/google/uuid v1.3.0
  • github.com/gorilla/mux v1.6.3-0.20180903154305-9e1f5955c0d2
  • github.com/gorilla/websocket v1.4.2
  • github.com/grpc-ecosystem/go-grpc-middleware v1.3.0
  • github.com/grpc-ecosystem/grpc-gateway v1.16.0
  • github.com/hashicorp/consul/api v1.1.0
  • github.com/jasonlvhit/gocron v0.0.0-20190121134850-6771d4b492ba
  • github.com/jonboulle/clockwork v0.2.2
  • github.com/json-iterator/go v1.1.12
  • github.com/juju/ratelimit v1.0.1
  • github.com/julienschmidt/httprouter v1.3.0
  • github.com/kavu/go_reuseport v1.4.1-0.20181221084137-1f6171f327ed
  • github.com/leesper/go_rng v0.0.0-20190531154944-a612b043e353
  • github.com/miekg/dns v1.1.25
  • github.com/mitchellh/mapstructure v1.4.1
  • github.com/mmcloughlin/geohash v0.0.0-20181009053802-f7f2bcae3294
  • github.com/opentracing/opentracing-go v1.1.0
  • github.com/patrickmn/go-cache v2.1.1-0.20180815053127-5633e0862627+incompatible
  • github.com/pkg/errors v0.9.1
  • github.com/pkg/sftp v1.10.1
  • github.com/prometheus/client_golang v1.11.0
  • github.com/prometheus/common v0.32.1
  • github.com/prometheus/procfs v0.7.3
  • github.com/rs/cors v1.6.1-0.20190613161432-33ffc0734c60
  • github.com/shirou/gopsutil v2.17.13-0.20180927124308-a11c78ba2c13+incompatible
  • github.com/shirou/w32 v0.0.0-20160930032740-bb4de0191aa4
  • github.com/shopspring/decimal v1.3.1
  • github.com/sirupsen/logrus v1.8.1
  • github.com/smallnest/pool v0.0.0-20170926025334-4f76a6d6402e
  • github.com/smallnest/rpcx v1.4.2-0.20190627094758-28d08d166104
  • github.com/soheilhy/cmux v0.1.5
  • github.com/spaolacci/murmur3 v1.1.0
  • github.com/spf13/cast v1.3.1
  • github.com/spf13/pflag v1.0.5
  • github.com/tiglabs/raft v0.0.0-20200304095606-b25a44ad8b33
  • github.com/tmc/grpc-websocket-proxy v0.0.0-20201229170055-e5319fda7802
  • github.com/uber/jaeger-client-go v2.30.0+incompatible
  • github.com/uber/jaeger-lib v2.4.1+incompatible
  • github.com/valyala/fastjson v1.1.1
  • github.com/vmihailenco/msgpack v4.0.4+incompatible
  • go.etcd.io/bbolt v1.3.6
  • go.etcd.io/etcd v0.5.0-alpha.5.0.20190801225801-f1c7fd3d53b0
  • go.opencensus.io v0.23.0
  • go.uber.org/atomic v1.9.0
  • go.uber.org/multierr v1.7.0
  • go.uber.org/zap v1.19.1
  • golang.org/x/crypto v0.0.0-20210921155107-089bfa567519
  • golang.org/x/exp v0.0.0-20200224162631-6cc2880d07d6
  • golang.org/x/lint v0.0.0-20210508222113-6edffad5e616
  • golang.org/x/net v0.0.0-20211101193420-4a448f8816b3
  • golang.org/x/sys v0.0.0-20211101204403-39c9dd37992c
  • golang.org/x/text v0.3.7
  • golang.org/x/time v0.0.0-20210723032227-1f47c861a9ac
  • golang.org/x/time=>golang.org/x/time v0.0.0-20190308202827-9d24e82272b4
  • gonum.org/v1/gonum v0.9.3
  • google.golang.org/appengine v1.6.7
  • google.golang.org/genproto v0.0.0-20211101144312-62acf1d99145
  • google.golang.org/grpc v1.41.0
  • google.golang.org/grpc=>google.golang.org/grpc v1.2.1-0.20180928173848-b48e364c83c8
  • gotest.tools v2.1.1-0.20181001141646-317cc193f525+incompatible
  • sigs.k8s.io/yaml v1.3.0
go.sum go
  • 763 dependencies