gptindexfeb15
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: alexisneuhaus
- License: mit
- Language: Python
- Default Branch: main
- Size: 4.76 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
🗂️ ️GPT Index
GPT Index is a project consisting of a set of data structures designed to make it easier to use large external knowledge bases with LLMs.
PyPi: https://pypi.org/project/gpt-index/.
Documentation: https://gpt-index.readthedocs.io/en/latest/.
Twitter: https://twitter.com/gpt_index.
Discord: https://discord.gg/dGcwcsnxhU.
LlamaHub (community library of data loaders): https://llamahub.ai
🚀 Overview
NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!
Context
- LLMs are a phenomenal piece of technology for knowledge generation and reasoning.
- A big limitation of LLMs is context size (e.g. Davinci's limit is 4096 tokens: large, but not infinite).
- The ability to feed "knowledge" to LLMs is restricted to this limited prompt size and model weights.
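To make the context-size limit concrete, here is a minimal sketch of splitting a long document into pieces that each fit a fixed budget. It is not GPT Index's implementation: the helper name `split_into_chunks` is hypothetical, and whitespace-separated words stand in for real model tokens (a real tokenizer would count differently).

```python
from typing import List


def split_into_chunks(text: str, max_tokens: int = 4096) -> List[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Assumption: one word approximates one token. Real tokenizers differ.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


# A 10,000-word document does not fit in a single 4096-word budget.
document = "word " * 10000
chunks = split_into_chunks(document, max_tokens=4096)
print(len(chunks))                        # 3
print([len(c.split()) for c in chunks])   # [4096, 4096, 1808]
```

Each chunk can then be sent to the model separately, which is the basic problem an index structure has to organize.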
Proposed Solution
At its core, GPT Index contains a toolkit of index data structures designed to easily connect LLMs with your external data. GPT Index helps to provide the following advantages:
- Remove concerns over prompt size limitations.
- Abstract common usage patterns to reduce boilerplate code in your LLM app.
- Provide data connectors to your common data sources (Google Docs, Slack, etc.).
- Provide cost transparency + tools that reduce cost while increasing performance.

Each data structure offers distinct use cases and a variety of customizable parameters. These indices can then be queried in a general purpose manner, in order to achieve any task that you would typically achieve with an LLM:
- Question-Answering
- Summarization
- Text Generation (Stories, TODOs, emails, etc.)
- and more!
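As a rough illustration of the idea of querying an index rather than pasting all data into one prompt, the sketch below retrieves the most relevant chunk for a question and builds a short prompt from it. Everything here is an assumption made for illustration: `ToyIndex`, `bag_of_words`, and `cosine` are hypothetical names, and word-count vectors replace the learned embeddings a real vector index would use. It does not reflect how GPT Index is implemented.

```python
import math
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyIndex:
    """Hypothetical index: stores chunks and retrieves the closest ones."""

    def __init__(self, chunks: List[str]) -> None:
        self.chunks = chunks
        self.vectors = [bag_of_words(c) for c in chunks]

    def retrieve(self, question: str, top_k: int = 1) -> List[str]:
        q = bag_of_words(question)
        ranked = sorted(
            range(len(self.chunks)),
            key=lambda i: cosine(q, self.vectors[i]),
            reverse=True,
        )
        return [self.chunks[i] for i in ranked[:top_k]]

    def build_prompt(self, question: str, top_k: int = 1) -> str:
        # Only the retrieved chunks enter the prompt, keeping it small.
        context = "\n".join(self.retrieve(question, top_k))
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


toy_index = ToyIndex([
    "The billing service retries failed payments three times.",
    "Slack messages are archived after ninety days.",
    "Google Docs exports are stored as plain text.",
])
question = "How long are Slack messages kept?"
prompt = toy_index.build_prompt(question)
print(prompt)
```

The prompt contains only the Slack chunk, so its size depends on `top_k` rather than on the size of the whole knowledge base.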
💡 Contributing
Interested in contributing? See our Contribution Guide for more details.
📄 Documentation
Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.
Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!
💻 Example Usage
pip install gpt-index
Examples are in the examples folder. Indices are in the indices folder (see list of indices below).
To build a simple vector store index:

```python
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'

from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader
# Load every file in the local 'data' directory as a document.
documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex(documents)
```
To save to and load from disk:

```python
# save to disk
index.save_to_disk('index.json')
# load from disk
index = GPTSimpleVectorIndex.load_from_disk('index.json')
```
To query:

```python
index.query("<question_text>?")
```
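Since a saved index can be reloaded from disk, one common pattern is to build it only when no saved copy exists. The sketch below shows that pattern generically. `load_or_build` and the `*_stub` functions are hypothetical names introduced here, and a plain JSON dictionary stands in for a real index so the example runs without the package or an API key.

```python
import json
import os
import tempfile
from typing import Callable, List, TypeVar

T = TypeVar("T")


def load_or_build(
    path: str,
    build: Callable[[], T],
    load: Callable[[str], T],
    save: Callable[[T, str], None],
) -> T:
    """Return the object saved at path, building and saving it if absent."""
    if os.path.exists(path):
        return load(path)
    obj = build()
    save(obj, path)
    return obj


build_calls: List[str] = []  # records how often the expensive build runs


def build_stub() -> dict:
    build_calls.append("built")
    return {"chunks": ["hello world"]}


def save_stub(obj: dict, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f)


def load_stub(path: str) -> dict:
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    first = load_or_build(path, build_stub, load_stub, save_stub)   # builds
    second = load_or_build(path, build_stub, load_stub, save_stub)  # loads

print(first == second, len(build_calls))  # True 1
```

With GPT Index, the three callables would presumably wrap the calls shown above, for example `lambda: GPTSimpleVectorIndex(documents)`, `GPTSimpleVectorIndex.load_from_disk`, and `lambda idx, p: idx.save_to_disk(p)`. This wiring is an assumption and is untested here.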
🔧 Dependencies
The main third-party package requirements are tiktoken, openai, and langchain.
All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.
📖 Citation
Reference to cite if you use GPT Index in a paper:
@software{Liu_GPT_Index_2022,
author = {Liu, Jerry},
doi = {10.5281/zenodo.1234},
month = {11},
title = {{GPT Index}},
url = {https://github.com/jerryjliu/gpt_index},
year = {2022}
}
Owner
- Login: alexisneuhaus
- Kind: user
- Repositories: 6
- Profile: https://github.com/alexisneuhaus
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Liu"
  given-names: "Jerry"
  orcid: "https://orcid.org/0000-0002-6694-3517"
title: "GPT Index"
doi: 10.5281/zenodo.1234
date-released: 2022-11-1
url: "https://github.com/jerryjliu/gpt_index"
Dependencies
- discord.py *
- google-api-python-client *
- google-auth-httplib2 *
- google-auth-oauthlib *
- pymongo *
- slack_sdk *
- wikipedia *
- docutils <0.17
- myst-parser *
- sphinx >=4.3.0
- sphinx_rtd_theme >=0.5.1
- pandas *
- dataclasses_json *
- langchain *
- nltk *
- numpy *
- openai >=0.26.4
- pandas *
- tenacity <8.2.0
- tiktoken *
- transformers *
- black ==22.12.0
- flake8 ==6.0.0
- flake8-docstrings ==1.6.0
- ipython ==8.10.0
- isort ==5.11.4
- mypy ==0.991
- pylint ==2.15.10
- pytest ==7.2.1
- pytest-dotenv ==0.5.2
- rake_nltk ==1.0.6
- types-requests ==2.28.11.8
- types-setuptools ==67.1.0.0
- dataclasses_json *
- langchain *
- nltk *
- numpy *
- openai >=0.26.4
- pandas *
- tenacity <8.2.0
- transformers *