Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: RyanLCh
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 42.2 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

🗂️ LlamaIndex 🦙

LlamaIndex (GPT Index) is a data framework for your LLM application.

PyPI:

  • LlamaIndex: https://pypi.org/project/llama-index/
  • GPT Index (duplicate): https://pypi.org/project/gpt-index/

Documentation: https://gpt-index.readthedocs.io/.

Twitter: https://twitter.com/llama_index.

Discord: https://discord.gg/dGcwcsnxhU.

Ecosystem

  • LlamaHub (community library of data loaders): https://llamahub.ai
  • LlamaLab (cutting-edge AGI projects using LlamaIndex): https://github.com/run-llama/llama-lab

🚀 Overview

NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!

Context

  • LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
  • How do we best augment LLMs with our own private data?

We need a comprehensive toolkit to help perform this data augmentation for LLMs.

Proposed Solution

That's where LlamaIndex comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:

  • Offers data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
  • Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs.
  • Provides an advanced retrieval/query interface over your data: Feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
  • Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, anything else).

LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules), to fit their needs.

💡 Contributing

Interested in contributing? See our Contribution Guide for more details.

📄 Documentation

Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.

Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!

💻 Example Usage

pip install llama-index

Examples are in the examples folder. Indices are in the indices folder (see list of indices below).

To build a simple vector store index:

```python
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'

from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
```

To query:

```python
query_engine = index.as_query_engine()
query_engine.query("<question_text>?")
```

By default, data is stored in-memory. To persist to disk (under ./storage):

```python
index.storage_context.persist()
```

To reload from disk:

```python
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir='./storage')

# load index
index = load_index_from_storage(storage_context)
```
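The build, persist, and reload steps above are often combined into a single "load if present, otherwise build" routine. The following is a minimal sketch of that pattern, not part of LlamaIndex itself: `load_or_build` and `get_index` are hypothetical helper names, and passing `persist_dir` to `persist()` is an assumption about the installed version of the library (the README only shows the no-argument call).

```python
import os
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(persist_dir: str, build: Callable[[], T], load: Callable[[str], T]) -> T:
    """Return load(persist_dir) if the directory holds files, else build()."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    return build()


def get_index(data_dir: str = "data", persist_dir: str = "./storage"):
    """Hypothetical wiring of the helper to the calls shown in this README."""
    # Imported lazily so load_or_build stays usable without llama_index installed.
    from llama_index import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    def build():
        documents = SimpleDirectoryReader(data_dir).load_data()
        index = VectorStoreIndex.from_documents(documents)
        # Assumption: persist() accepts a persist_dir keyword in this version.
        index.storage_context.persist(persist_dir=persist_dir)
        return index

    def load(path: str):
        storage_context = StorageContext.from_defaults(persist_dir=path)
        return load_index_from_storage(storage_context)

    return load_or_build(persist_dir, build, load)
```

Calling `get_index()` on a first run would embed the documents (which requires the OpenAI key set earlier) and write them to `./storage`; later runs would reuse the persisted index instead of re-embedding.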

🔧 Dependencies

The main third-party package requirements are tiktoken, openai, and langchain.

All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.
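To confirm which of the main requirements named above are present in the current environment, a small standard-library check can help. This is only a sketch: `installed_versions` and `MAIN_REQUIREMENTS` are hypothetical names introduced here, and the package list is taken from the sentence above rather than read from setup.py.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Package names as listed in this README (not parsed from setup.py).
MAIN_REQUIREMENTS = ("tiktoken", "openai", "langchain")


def installed_versions(packages: Iterable[str] = MAIN_REQUIREMENTS) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```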

📖 Citation

Reference to cite if you use LlamaIndex in a paper:

@software{Liu_LlamaIndex_2022,
  author = {Liu, Jerry},
  doi = {10.5281/zenodo.1234},
  month = {11},
  title = {{LlamaIndex}},
  url = {https://github.com/jerryjliu/llama_index},
  year = {2022}
}

Owner

  • Name: Ryan La
  • Login: RyanLCh
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Liu"
  given-names: "Jerry"
  orcid: "https://orcid.org/0000-0002-6694-3517"
title: "LlamaIndex"
doi: 10.5281/zenodo.1234
date-released: 2022-11-1
url: "https://github.com/jerryjliu/llama_index"

Dependencies

.github/workflows/build_package.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/dev_docs.yml actions
  • actions/checkout v2 composite
  • cpina/github-action-push-to-another-repository main composite
.github/workflows/lint.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/publish_release.yml actions
  • actions/checkout v2 composite
  • actions/create-release v1 composite
  • actions/setup-python v2 composite
  • actions/upload-release-asset v1 composite
  • pypa/gh-action-pypi-publish master composite
.github/workflows/publish_release_gpt_index.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish master composite
.github/workflows/unit_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
data_requirements.txt pypi
  • boto3 *
  • discord.py *
  • google-api-python-client *
  • google-auth-httplib2 *
  • google-auth-oauthlib *
  • jsonpath-ng *
  • moto *
  • pymongo *
  • slack_sdk *
  • vellum-ai ==0.0.15
  • wikipedia *
docs/requirements.txt pypi
  • docutils <0.17
  • furo >=2023.3.27
  • m2r2 *
  • myst-nb *
  • myst-parser *
  • sphinx >=4.3.0
  • sphinx-autobuild *
  • sphinx_rtd_theme *
requirements.txt pypi
  • black ==22.12.0
  • ipython ==8.10.0
  • mypy ==0.991
  • pre-commit ==3.2.0
  • pylint ==2.15.10
  • pytest ==7.2.1
  • pytest-dotenv ==0.5.2
  • rake_nltk ==1.0.6
  • ruff ==0.0.259
  • types-redis ==4.5.5.0
  • types-requests ==2.28.11.8
  • types-setuptools ==67.1.0.0
setup.py pypi
  • dataclasses_json *
  • fsspec >=2023.5.0
  • langchain >=0.0.154
  • numpy *
  • openai >=0.26.4
  • pandas *
  • sqlalchemy >=2.0.15
  • tenacity >=8.2.0,<9.0.0
  • typing-inspect ==0.8.0
  • typing_extensions ==4.5.0
  • urllib3 <2