private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

https://github.com/zylon-ai/private-gpt

Keywords from Contributors

agents gemini anthropic multi-agents vector-database langchain fine-tuning application llamaindex rag

Last synced: 9 months ago · JSON representation ·

Repository

Interact with your documents using the power of GPT, 100% privately, no data leaks

Basic Info

Host: GitHub
Owner: zylon-ai
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://privategpt.dev
Size: 2.65 MB

Statistics

Stars: 56,524
Watchers: 477
Forks: 7,574
Open Issues: 278
Releases: 10

Created about 3 years ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog License Citation

PrivateGPT

Gradio UI

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.

[!TIP] If you are looking for an enterprise-ready, fully private AI workspace check out Zylon's website or request a demo. Crafted by the team behind PrivateGPT, Zylon is a best-in-class AI collaborative workspace that can be easily deployed on-premise (data center, bare metal...) or in your private cloud (AWS, GCP, Azure...).

The project provides an API offering all the primitives required to build private, context-aware AI applications. It follows and extends the OpenAI API standard, and supports both normal and streaming responses.

The API is divided into two logical blocks:

High-level API, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation: - Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation and storage. - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering and the response generation.

Low-level API, which allows advanced users to implement their own complex pipelines: - Embeddings generation: based on a piece of text. - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.

In addition to this, a working Gradio UI client is provided to test the API, together with a set of useful tools such as bulk model download script, ingestion script, documents folder watch, etc.

🎞️ Overview

[!WARNING] This README is not updated as frequently as the documentation. Please check it out for the latest updates!

Motivation behind PrivateGPT

Generative AI is a game changer for our society, but adoption in companies of all sizes and data-sensitive domains like healthcare or legal is limited by a clear concern: privacy. Not being able to ensure that your data is fully under your control when using third-party AI tools is a risk those industries cannot take.

Primordial version

The first version of PrivateGPT was launched in May 2023 as a novel approach to address the privacy concerns by using LLMs in a complete offline way.

That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming nowadays; thus a simpler and more educational implementation to understand the basic concepts required to build a fully local -and therefore, private- chatGPT-like tool.

If you want to keep experimenting with it, we have saved it in the primordial branch of the project.

It is strongly recommended to do a clean clone and install of this new version of PrivateGPT if you come from the previous, primordial version.

Present and Future of PrivateGPT

PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines and other low-level building blocks. We want to make it easier for any developer to build AI applications and experiences, as well as provide a suitable extensive architecture for the community to keep contributing.

Stay tuned to our releases to check out all the new features and changes included.

📄 Documentation

Full documentation on installation, dependencies, configuration, running the server, deployment options, ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/

🧩 Architecture

Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its primitives. * The API is built using FastAPI and follows OpenAI's API scheme. * The RAG pipeline is based on LlamaIndex.

The design of PrivateGPT allows to easily extend and adapt both the API and the RAG implementation. Some key architectural decisions are: * Dependency Injection, decoupling the different components and layers. * Usage of LlamaIndex abstractions such as LLM, BaseEmbedding or VectorStore, making it immediate to change the actual implementations of those abstractions. * Simplicity, adding as few layers and new abstractions as possible. * Ready to use, providing a full implementation of the API and RAG pipeline.

Main building blocks: * APIs are defined in private_gpt:server:<api>. Each package contains an <api>_router.py (FastAPI layer) and an <api>_service.py (the service implementation). Each Service uses LlamaIndex base abstractions instead of specific implementations, decoupling the actual implementation from its usage. * Components are placed in private_gpt:components:<component>. Each Component is in charge of providing actual implementations to the base abstractions used in the Services - for example LLMComponent is in charge of providing an actual implementation of an LLM (for example LlamaCPP or OpenAI).

💡 Contributing

Contributions are welcomed! To ensure code quality we have enabled several format and typing checks, just run make check before committing to make sure your code is ok. Remember to test your code! You'll find a tests folder with helpers, and you can run tests using make test command.

Don't know what to contribute? Here is the public Project Board with several ideas.

Head over to Discord

contributors channel and ask for write permissions on that GitHub project.

💬 Community

Join the conversation around PrivateGPT on our: - Twitter (aka X) - Discord

📖 Citation

If you use PrivateGPT in a paper, check out the Citation file for the correct citation.
You can also use the "Cite this repository" button in this repo to get the citation in different formats.

Here are a couple of examples:

BibTeX

bibtex @software{Zylon_PrivateGPT_2023, author = {Zylon by PrivateGPT}, license = {Apache-2.0}, month = may, title = {{PrivateGPT}}, url = {https://github.com/zylon-ai/private-gpt}, year = {2023} }

APA

Zylon by PrivateGPT (2023). PrivateGPT [Computer software]. https://github.com/zylon-ai/private-gpt

🤗 Partners & Supporters

PrivateGPT is actively supported by the teams behind: * Qdrant, providing the default vector database * Fern, providing Documentation and SDKs * LlamaIndex, providing the base RAG framework and abstractions

This project has been strongly influenced and supported by other amazing projects like LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.

Owner

Name: Zylon
Login: zylon-ai
Kind: organization
Email: hello@zylon.ai

Website: https://zylon.ai/
Twitter: ZylonPrivateGPT
Repositories: 1
Profile: https://github.com/zylon-ai

The AI collaborator for every workplace

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: PrivateGPT
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - name: Zylon by PrivateGPT
    address: hello@zylon.ai
    website: 'https://www.zylon.ai/'
repository-code: 'https://github.com/zylon-ai/private-gpt'
license: Apache-2.0
date-released: '2023-05-02'

Committers

Last synced: about 1 year ago

All Time

Total Commits: 285
Total Committers: 94
Avg Commits per committer: 3.032
Development Distribution Score (DDS): 0.733

Past Year

Commits: 52
Committers: 19
Avg Commits per committer: 2.737
Development Distribution Score (DDS): 0.442

Top Committers

Name	Email	Commits
Iván Martínez	i**t@g**m	76
Javier Martinez	j**8@g**m	29
lopagela	l**m@o**r	16
Pablo Orgaz	p**c@g**m	15
github-actions[bot]	4****]	10
Brett England	b**t@d**m	6
R-Y-M-R	3****R	5
icsy7867	w**3@g**m	5
impulsivus	o**6@i**r	4
jiangzhuo	j**7@g**m	4
alxspiker	a**r@g**m	4
MDW	m****d	4
Andrea Pinto	a**o@g**m	4
Fabio Rossini Sluzala	f**s@g**m	3
Federico Grandi	f**0@g**m	3
Marco Repetto	1****x	3
ivan-ontruck	i**n@o**m	3
Francisco García Sierra	f**a@g**m	2
Fran García	f**1@g**m	2
Gianni Acquisto	3****o	2
Ikko Eltociear Ashimine	e**r@g**m	2
Daniel Gallego Vico	d**o@g**m	2
3ly-13	1****3	2
Jiang Sheng	d****i	2
Liam Dowd	1****d	2
Sorin Neacsu	s****n	2
abhiruka	a****a	2
milescattini	m**i@g**m	2
Ravi	r**d@y**m	2
CognitiveTech	c**q@y**m	2
and 64 more...

Committer Domains (Top 20 + Academic)

robert-hirsch.de: 1 mail.ru: 1 retr0.blog: 1 lederle24.at: 1 hey.com: 1 renninger.dev: 1 danielmcdonald.dev: 1 live.nl: 1 it8.ru: 1 mtulio.eng.br: 1 yandex.ru: 1 sssihl.edu.in: 1 ontruck.com: 1 itu.edu.tr: 1 dbzoo.com: 1 orange.fr: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 177
Total pull requests: 102
Average time to close issues: 4 months
Average time to close pull requests: about 1 month
Total issue authors: 163
Total pull request authors: 63
Average comments per issue: 4.85
Average comments per pull request: 1.81
Merged pull requests: 40
Bot issues: 0
Bot pull requests: 5

Past Year

Issues: 55
Pull requests: 21
Average time to close issues: 6 days
Average time to close pull requests: 3 days
Issue authors: 51
Pull request authors: 16
Average comments per issue: 1.35
Average comments per pull request: 1.1
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

Rastih271980 (5)
rexzhang2023 (5)
spsach (4)
Siraj-Bhai (4)
anamariaUIC (3)
rmuhawieh (3)
ashunaveed (3)
gulilker (3)
kj4274 (2)
K-J-VV (2)
estkae (2)
beaconai (2)
fred-gb (2)
mmeiste (2)
exander77 (2)

Pull Request Authors

jaluma (47)
github-actions[bot] (12)
kasunami (5)
andreakiro (5)
R-Y-M-R (4)
itsliamdowd (4)
imartinez (3)
basicbloke (3)
qdm12 (3)
raul836 (2)
BBC-Esq (2)
slale-91 (2)
patrickhwood (2)
akshayjalluri6 (2)
zhygallo (2)

Top Labels

Issue Labels

question (41) bug (40) primordial (39) enhancement (7) needs confirmation (4) stale (4) documentation (3) invalid (1) good first issue (1) duplicate (1)

Pull Request Labels

primordial (9) autorelease: pending (9) autorelease: tagged (3) Docker (3) stale (3) enhancement (2) help wanted (1) question (1)

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 10

proxy.golang.org: github.com/zylon-ai/private-gpt

Documentation: https://pkg.go.dev/github.com/zylon-ai/private-gpt#section-documentation
License: apache-2.0
Latest release: v0.6.2
published almost 2 years ago

Versions: 10
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.6%

Average: 5.8%

Dependent repos count: 5.9%

Last synced: 10 months ago

private-gpt

Science Score: 54.0%

Keywords from Contributors

Repository

Basic Info

Statistics

Metadata Files

README.md

PrivateGPT

🎞️ Overview

Motivation behind PrivateGPT

Primordial version

Present and Future of PrivateGPT

📄 Documentation

🧩 Architecture

💡 Contributing

contributors channel and ask for write permissions on that GitHub project.

💬 Community

📖 Citation

BibTeX

APA

🤗 Partners & Supporters

Owner

Citation (CITATION.cff)

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

proxy.golang.org: github.com/zylon-ai/private-gpt

Rankings