Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    7 of 334 committers (2.1%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.0%) to scientific vocabulary

Keywords from Contributors

agents application multi-agents fine-tuning llamaindex rag vector-database mlops data-profilers datacleaner
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: seanoliver
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 60.2 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

🗂️ LlamaIndex 🦙

LlamaIndex (GPT Index) is a data framework for your LLM application.

PyPI:

  • LlamaIndex: https://pypi.org/project/llama-index/.
  • GPT Index (duplicate): https://pypi.org/project/gpt-index/.

LlamaIndex.TS (Typescript/Javascript): https://github.com/run-llama/LlamaIndexTS.

Documentation: https://gpt-index.readthedocs.io/.

Twitter: https://twitter.com/llama_index.

Discord: https://discord.gg/dGcwcsnxhU.

Ecosystem

  • LlamaHub (community library of data loaders): https://llamahub.ai
  • LlamaLab (cutting-edge AGI projects using LlamaIndex): https://github.com/run-llama/llama-lab

🚀 Overview

NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!

Context

  • LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
  • How do we best augment LLMs with our own private data?

We need a comprehensive toolkit to help perform this data augmentation for LLMs.

Proposed Solution

That's where LlamaIndex comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:

  • Offers data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
  • Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs.
  • Provides an advanced retrieval/query interface over your data: Feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
  • Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, anything else).
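
To make the ingest, structure, and retrieve steps above concrete, here is a minimal, dependency-free sketch of the idea. It is not LlamaIndex's implementation: `ToyVectorIndex`, `embed`, `cosine`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a stand-in assumption for a real embedding model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words 'embedding' (an assumption, not a real model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index: stores each document with its vector."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def retrieve(self, query, top_k=1):
        """Return the top_k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]


def build_prompt(query, context_chunks):
    """Combine retrieved context with the question into one LLM prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


documents = [
    "Invoices are due within 30 days of the billing date.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]
index = ToyVectorIndex(documents)
top = index.retrieve("When are invoices due?", top_k=1)
prompt = build_prompt("When are invoices due?", top)
print(prompt)
```

The resulting prompt is what would be sent to an LLM; a real framework replaces the toy pieces with document loaders, learned embeddings, and a vector store.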

LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules), to fit their needs.

💡 Contributing

Interested in contributing? See our Contribution Guide for more details.

📄 Documentation

Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.

Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!

💻 Example Usage

pip install llama-index

Examples are in the examples folder. Indices are in the indices folder (see list of indices below).

To build a simple vector store index:

```python
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'

from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
```

To query:

```python
query_engine = index.as_query_engine()
query_engine.query("<question_text>?")
```

By default, data is stored in-memory. To persist to disk (under ./storage):

```python
index.storage_context.persist()
```

To reload from disk:

```python
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir='./storage')
# load index
index = load_index_from_storage(storage_context)
```
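
The persist and reload steps above can be combined into a "load if present, otherwise build" helper. This is a sketch, not part of the library: `has_persisted_index` and `load_or_build_index` are hypothetical names, and the top-level `llama_index` import path is the one this README's snippets use, so other releases may need a different one.

```python
import os


def has_persisted_index(persist_dir="./storage"):
    """Return True if persist_dir exists and contains at least one entry."""
    return os.path.isdir(persist_dir) and len(os.listdir(persist_dir)) > 0


def load_or_build_index(data_dir="data", persist_dir="./storage"):
    """Reload a persisted index if one exists; otherwise build and persist it."""
    # Imported lazily so the helper above works without llama-index installed.
    # Assumption: the top-level `llama_index` import path used in this README.
    from llama_index import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if has_persisted_index(persist_dir):
        # Rebuild the storage context from disk and load the saved index.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    # No saved index yet: ingest the documents, build, then persist.
    documents = SimpleDirectoryReader(data_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

Calling `load_or_build_index()` on a first run would read `data/` and write `./storage`; later runs would skip re-embedding and load from disk instead.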

🔧 Dependencies

The main third-party package requirements are tiktoken, openai, and langchain.

All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.

📖 Citation

Reference to cite if you use LlamaIndex in a paper:

@software{Liu_LlamaIndex_2022,
  author = {Liu, Jerry},
  doi = {10.5281/zenodo.1234},
  month = {11},
  title = {{LlamaIndex}},
  url = {https://github.com/jerryjliu/llama_index},
  year = {2022}
}

Owner

  • Name: Sean Oliver
  • Login: seanoliver
  • Kind: user
  • Location: San Francisco, CA
  • Company: @gamma-app

Software engineer at @gamma-app. Obsessed with apps, AI, PKM, productivity, and parenting.

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 1,592
  • Total Committers: 334
  • Avg Commits per committer: 4.766
  • Development Distribution Score (DDS): 0.599
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jerry Liu | j****8@g****m | 638 |
| Logan | l****h@l****m | 195 |
| Simon Suo | s****o@g****m | 169 |
| Ravi Theja | r****1@g****m | 20 |
| hongyishi | s****8@g****m | 19 |
| Sourabh Desai | s****i@g****m | 19 |
| jon-chuang | 9****g | 17 |
| Jesse Zhang | j****g@g****m | 13 |
| Wey Gu | w****u@g****m | 11 |
| yisding | y****g@g****m | 9 |
| Jerry Liu | j****y@r****m | 9 |
| tilleul | f****n@g****m | 7 |
| Jithin James | j****7@g****m | 7 |
| Guy Korland | g****d@g****m | 7 |
| Bruno Bornsztein | b****n@g****m | 6 |
| Kacper Łukawski | k****i | 6 |
| Filip Haltmayer | 8****t | 5 |
| Emanuel Ferreira | c****s@g****m | 5 |
| Adam Hofmann | a****8@g****m | 4 |
| Chris Maddox | t****7@g****m | 4 |
| Ikko Eltociear Ashimine | e****r@g****m | 4 |
| Mikko | M****i | 4 |
| Nicolas | n****9@g****m | 4 |
| Noble Varghese | n****6@g****m | 4 |
| Ryan Chan | r****n@t****k | 4 |
| junying1 | j****1@g****m | 4 |
| Doc Emmett Brown | t****h | 4 |
| Kaiser Pister | p****k@g****m | 3 |
| Mourad | 1****q | 3 |
| Piaoyang Cui | b****e@g****m | 3 |
and 304 more...
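
The all-time figures above can be reproduced from the listed counts. The sketch below assumes the Development Distribution Score is defined as 1 minus the top committer's share of all commits. The page does not state that definition, so it is an assumption, though it does reproduce the reported value.

```python
# Counts copied from the All Time and Top Committers listings above.
total_commits = 1592
total_committers = 334
top_committer_commits = 638  # Jerry Liu's largest single entry

# Average commits per committer.
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(f"Avg commits per committer: {avg_commits:.3f}")  # 4.766
print(f"DDS: {dds:.3f}")  # 0.599
```

A DDS near 0 would mean one person wrote nearly every commit; here about 40% of commits come from the top committer.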

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

.github/workflows/build_package.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/dev_docs.yml actions
  • actions/checkout v2 composite
  • cpina/github-action-push-to-another-repository main composite
.github/workflows/lint.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/publish_release.yml actions
  • actions/checkout v2 composite
  • actions/create-release v1 composite
  • actions/setup-python v2 composite
  • actions/upload-release-asset v1 composite
  • pypa/gh-action-pypi-publish master composite
.github/workflows/publish_release_gpt_index.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish master composite
.github/workflows/unit_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
data_requirements.txt pypi
  • boto3 *
  • discord.py *
  • google-api-python-client *
  • google-auth-httplib2 *
  • google-auth-oauthlib *
  • jsonpath-ng *
  • moto *
  • pymongo *
  • slack_sdk *
  • vellum-ai ==0.0.15
  • wikipedia *
docs/requirements.txt pypi
  • autodoc_pydantic *
  • docutils <0.17
  • furo >=2023.3.27
  • m2r2 *
  • myst-nb *
  • myst-parser *
  • pydantic <2.0.0
  • sphinx >=4.3.0
  • sphinx-autobuild *
  • sphinx_rtd_theme *
pyproject.toml pypi
requirements.txt pypi
  • black ==23.7.0
  • ipython ==8.10.0
  • mypy ==0.991
  • pre-commit ==3.2.0
  • pylint ==2.15.10
  • pytest ==7.2.1
  • pytest-asyncio ==0.21.0
  • pytest-dotenv ==0.5.2
  • rake_nltk ==1.0.6
  • ruff ==0.0.285
  • types-redis ==4.5.5.0
  • types-requests ==2.28.11.8
  • types-setuptools ==67.1.0.0
setup.py pypi
  • beautifulsoup4 *
  • dataclasses_json *
  • fsspec >=2023.5.0
  • langchain >=0.0.293
  • nest_asyncio *
  • nltk *
  • numpy *
  • openai >=0.26.4
  • pandas *
  • sqlalchemy >=2.0.15
  • tenacity >=8.2.0,<9.0.0
  • tiktoken *
  • typing-inspect >=0.8.0
  • typing_extensions >=4.5.0
  • urllib3 <2