legalgpt-mtp

https://github.com/yatharthsameer/legalgpt-mtp

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
○
DOI references
○
Academic publication links
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (15.6%) to scientific vocabulary

Last synced: 10 months ago · JSON representation ·

Repository

Basic Info

Host: GitHub
Owner: yatharthsameer
License: apache-2.0
Language: Python
Default Branch: main
Size: 2.73 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog License Citation

🔒 PrivateGPT 📑

Gradio UI

LegalGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.

The API is divided into two logical blocks:

High-level API, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation: - Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation and storage. - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering and the response generation.

Low-level API, which allows advanced users to implement their own complex pipelines: - Embeddings generation: based on a piece of text. - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.

In addition to this, a working Gradio UI client is provided to test the API, together with a set of useful tools such as bulk model download script, ingestion script, documents folder watch, etc.

🎞️ Overview

[!WARNING] This README is not updated as frequently as the documentation. Please check it out for the latest updates!

Motivation behind PrivateGPT

Generative AI is a game changer for our society, but adoption in companies of all sizes and data-sensitive domains like healthcare or legal is limited by a clear concern: privacy. Not being able to ensure that your data is fully under your control when using third-party AI tools is a risk those industries cannot take.

Primordial version

The first version of LegalGPT was launched in May 2023 as a novel approach to address the privacy concerns by using LLMs in a complete offline way.

That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed for thousands of local-focused generative AI projects, was the foundation of what LegalGPT is becoming nowadays; thus a simpler and more educational implementation to understand the basic concepts required to build a fully local -and therefore, private- chatGPT-like tool.

If you want to keep experimenting with it, we have saved it in the primordial branch of the project.

It is strongly recommended to do a clean clone and install of this new version of LegalGPT if you come from the previous, primordial version.

Present and Future of LegalGPT

LegalGPT is now evolving towards becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines and other low-level building blocks. We want to make it easier for any developer to build AI applications and experiences, as well as provide a suitable extensive architecture for the community to keep contributing.

Stay tuned to our releases to check out all the new features and changes included.

📄 Documentation

Full documentation on installation, dependencies, configuration, running the server, deployment options, ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/

🧩 Architecture

Conceptually, LegalGPT is an API that wraps a RAG pipeline and exposes its primitives. * The API is built using FastAPI and follows OpenAI's API scheme. * The RAG pipeline is based on LlamaIndex.

The design of LegalGPT allows to easily extend and adapt both the API and the RAG implementation. Some key architectural decisions are: * Dependency Injection, decoupling the different components and layers. * Usage of LlamaIndex abstractions such as LLM, BaseEmbedding or VectorStore, making it immediate to change the actual implementations of those abstractions. * Simplicity, adding as few layers and new abstractions as possible. * Ready to use, providing a full implementation of the API and RAG pipeline.

Owner

Name: YATHARTH SAMEER
Login: yatharthsameer
Kind: user
Company: IIT KHARAGPUR

Repositories: 4
Profile: https://github.com/yatharthsameer

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: PrivateGPT
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - name: Zylon by PrivateGPT
    address: hello@zylon.ai
    website: 'https://www.zylon.ai/'
repository-code: 'https://github.com/zylon-ai/private-gpt'
license: Apache-2.0
date-released: '2023-05-02'

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science