llama_index

LlamaIndex is the leading framework for building LLM-powered agents over your data.

https://github.com/run-llama/llama_index

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
○ Academic publication links
✓ Committers with academic emails: 23 of 1498 committers (1.5%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (4.1%) to scientific vocabulary

Keywords

agents application data fine-tuning framework llamaindex llm multi-agents rag vector-database

Keywords from Contributors

jax autopep8 codeformatter pre-commit-hook yapf cryptocurrencies formatter python39 python313 python312

Scientific Fields

Engineering Computer Science - 40% confidence

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: run-llama
License: mit
Language: Python
Default Branch: main
Homepage: https://docs.llamaindex.ai
Size: 315 MB

Statistics

Stars: 43,943
Watchers: 263
Forks: 6,323
Open Issues: 277
Releases: 469

Topics

agents application data fine-tuning framework llamaindex llm multi-agents rag vector-database

Created over 3 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation Security

LlamaIndex

LlamaIndex (GPT Index) is a data framework for your LLM application. Building with LlamaIndex typically involves working with LlamaIndex core and a chosen set of integrations (or plugins). There are two ways to start building with LlamaIndex in Python:

Starter: llama-index. A starter Python package that includes core LlamaIndex as well as a selection of integrations.
Customized: llama-index-core. Install core LlamaIndex and add the integration packages from LlamaHub that your application requires. There are over 300 LlamaIndex integration packages that work with core, allowing you to build with your preferred LLM, embedding, and vector store providers.

The LlamaIndex Python library is namespaced so that import statements containing core refer to the core package, while import statements without core refer to an integration package.

```python
# typical pattern
from llama_index.core.xxx import ClassABC  # core submodule xxx
from llama_index.xxx.yyy import (
    SubclassABC,
)  # integration yyy for submodule xxx

# concrete example
from llama_index.core.llms import LLM
from llama_index.llms.openai import OpenAI
```
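The namespacing rule above can be applied mechanically. The following is a minimal sketch and not part of LlamaIndex: `classify_import` is a hypothetical helper that inspects a dotted module path and reports whether it points at the core package or at an integration package. It only looks at the name and imports nothing.

```python
def classify_import(module_path: str) -> str:
    """Classify a dotted llama_index module path as 'core' or 'integration'.

    Hypothetical helper for illustration; it applies only the naming rule
    described above and does not import any module.
    """
    parts = module_path.split(".")
    if parts[0] != "llama_index":
        raise ValueError(f"not a llama_index module path: {module_path!r}")
    if len(parts) < 2:
        raise ValueError("expected at least one submodule after 'llama_index'")
    return "core" if parts[1] == "core" else "integration"


print(classify_import("llama_index.core.llms"))    # core
print(classify_import("llama_index.llms.openai"))  # integration
```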

Important Links

LlamaIndex.TS (Typescript/Javascript)

Documentation

X (formerly Twitter)

Discord

Ecosystem

LlamaHub (community library of data loaders)
LlamaLab (cutting-edge AGI projects using LlamaIndex)

Overview

NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!

Context

LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
How do we best augment LLMs with our own private data?

We need a comprehensive toolkit to help perform this data augmentation for LLMs.

Proposed Solution

That's where LlamaIndex comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:

Offers data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.).
Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs.
Provides an advanced retrieval/query interface over your data: Feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, or anything else).

LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules), to fit their needs.
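To make the index-then-retrieve idea concrete, here is a toy sketch that needs no dependencies. It is not how LlamaIndex is implemented: `ToyVectorIndex`, the bag-of-words `embed` function, and the sample documents are all illustrative assumptions. A real index would use a learned embedding model and pass the retrieved text to an LLM as context.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """Stores (text, vector) pairs and returns the most similar texts."""

    def __init__(self, documents: list[str]) -> None:
        self._entries = [(doc, embed(doc)) for doc in documents]

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]


index = ToyVectorIndex(
    [
        "invoices are stored in the billing database",
        "the office is closed on public holidays",
    ]
)
context = index.retrieve("where are invoices stored", top_k=1)
print(context)
```

The retrieved strings play the role of the "retrieved context" mentioned in the list above; the knowledge-augmented output would come from sending them, together with the question, to an LLM.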

Contributing

Interested in contributing? Contributions to LlamaIndex core, as well as new integrations that build on the core, are accepted and highly encouraged! See our Contribution Guide for more details.

New integrations should meaningfully integrate with existing LlamaIndex framework components. At the discretion of LlamaIndex maintainers, some integrations may be declined.

Documentation

Full documentation can be found at https://docs.llamaindex.ai

Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!

Example Usage

```sh
# custom selection of integrations to work with core
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-llms-replicate
pip install llama-index-embeddings-huggingface
```
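In the commands above, each integration's package name mirrors its import path: llama-index-llms-openai is imported as llama_index.llms.openai. The sketch below encodes that mapping. `pip_package_for` is a hypothetical helper, and the rule is an assumption inferred from the examples in this README, so individual packages may deviate from it.

```python
def pip_package_for(module_path: str) -> str:
    """Guess the pip package name for a llama_index module path.

    Assumption: 'llama_index' becomes 'llama-index' and the next two path
    components (<kind>.<provider>) are appended with hyphens, as in the
    install commands above.
    """
    parts = module_path.split(".")
    if parts[0] != "llama_index" or len(parts) < 2:
        raise ValueError(f"not a llama_index module path: {module_path!r}")
    if parts[1] == "core":
        return "llama-index-core"
    return "-".join(["llama-index", *parts[1:3]]).replace("_", "-")


print(pip_package_for("llama_index.llms.replicate"))
print(pip_package_for("llama_index.embeddings.huggingface"))
```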

Examples are in the docs/examples folder. Indices are in the indices folder (see list of indices below).

To build a simple vector store index using OpenAI:

```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# load every file in the directory, then embed and index the documents
documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)
```
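The example above writes the API key into the script. A common alternative is to export the key in the shell and fail early if it is missing. The sketch below uses only the standard library; `require_env` is a hypothetical helper, not a LlamaIndex function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before building the index")
    return value


# Call this before constructing the index, for example:
# require_env("OPENAI_API_KEY")
```

Failing before any documents are loaded avoids discovering a missing key only after the reader has already processed the data directory.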

To build a simple vector store index using non-OpenAI LLMs, e.g. Llama 2 hosted on Replicate, where you can easily create a free trial API token:

```python
import os

os.environ["REPLICATE_API_TOKEN"] = "YOUR_REPLICATE_API_TOKEN"

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.replicate import Replicate
from transformers import AutoTokenizer

# set the LLM
llama2_7b_chat = "meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e"
Settings.llm = Replicate(
    model=llama2_7b_chat,
    temperature=0.01,
    additional_kwargs={"top_p": 1, "max_new_tokens": 300},
)

# set tokenizer to match LLM
Settings.tokenizer = AutoTokenizer.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf"
)

# set the embed model
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(
    documents,
)
```

To query:

```python
query_engine = index.as_query_engine()
query_engine.query("YOUR_QUESTION")
```
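The object returned by query can be printed directly to see the answer. To also show which chunks supported that answer, a sketch like the following may help. It assumes the response exposes a `source_nodes` list whose items carry a `score` and a `node` with a `get_content()` method; check the documentation for your installed version before relying on these names. `format_sources` itself is a hypothetical helper.

```python
def format_sources(response, max_chars: int = 80) -> list[str]:
    """Summarize the retrieved chunks attached to a query response.

    Assumes `response.source_nodes` is a list of objects with a `score`
    attribute and a `node.get_content()` method. Returns an empty list if
    the response has no `source_nodes` attribute.
    """
    lines = []
    for i, item in enumerate(getattr(response, "source_nodes", []), start=1):
        text = item.node.get_content().strip().replace("\n", " ")
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"[{i}] score={score} {text[:max_chars]}")
    return lines


# Usage with the query engine from above:
# response = query_engine.query("YOUR_QUESTION")
# print(response)
# print("\n".join(format_sources(response)))
```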

By default, data is stored in-memory. To persist to disk (under ./storage):

```python
index.storage_context.persist()
```

To reload from disk:

```python
from llama_index.core import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="./storage")

# load index
index = load_index_from_storage(storage_context)
```
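The persist and reload steps are often combined into a "load if present, otherwise build" routine. The sketch below shows only that control flow, with the LlamaIndex calls passed in as callables so the logic can be read and tested on its own. `load_or_build` and its parameter names are hypothetical.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: str,
    load: Callable[[str], T],
    build: Callable[[], T],
    persist: Callable[[T, str], None],
) -> T:
    """Load an index from persist_dir if it contains files, else build and persist it."""
    path = Path(persist_dir)
    if path.is_dir() and any(path.iterdir()):
        return load(persist_dir)
    index = build()
    persist(index, persist_dir)
    return index


# With LlamaIndex, the callables could wrap the snippets above, e.g.:
# load    = lambda d: load_index_from_storage(StorageContext.from_defaults(persist_dir=d))
# build   = lambda: VectorStoreIndex.from_documents(documents)
# persist = lambda idx, d: idx.storage_context.persist(persist_dir=d)
```

Building an index re-embeds every document, so reusing a persisted copy when one exists can save both time and API calls.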

Dependencies

We use poetry as the package manager for all Python packages. As a result, the dependencies of each Python package can be found by referencing the pyproject.toml file in each of the package's folders.

```bash
cd <desired-package-folder>
pip install poetry
poetry install --with dev
```

Citation

Reference to cite if you use LlamaIndex in a paper:

@software{Liu_LlamaIndex_2022,
  author = {Liu, Jerry},
  doi = {10.5281/zenodo.1234},
  month = {11},
  title = {{LlamaIndex}},
  url = {https://github.com/jerryjliu/llama_index},
  year = {2022}
}

Owner

Name: LlamaIndex
Login: run-llama
Kind: organization

Repositories: 29
Profile: https://github.com/run-llama

Committers

Last synced: 10 months ago

All Time

Total Commits: 6,390
Total Committers: 1,498
Avg Commits per committer: 4.266
Development Distribution Score (DDS): 0.843

Past Year

Commits: 2,270
Committers: 719
Avg Commits per committer: 3.157
Development Distribution Score (DDS): 0.789

Top Committers

Name	Email	Commits
Logan	l**h@l**m	1,001
Jerry Liu	j**8@g**m	919
dependabot[bot]	4****]	404
Simon Suo	s**o@g**m	210
Andrei Fajardo	9****i	169
Haotian Zhang	s**g@g**m	135
Massimiliano Pippi	m**i@g**m	125
Ravi Theja	r**1@g**m	104
Laurie Voss	g**b@s**m	72
James Braza	j**a@g**m	69
Sourabh Desai	s**i@g**m	51
Emanuel Ferreira	c**s@g**m	39
Javier Torres	j**s@g**m	30
Matthew Farrellee	m**t@c**u	29
Tomaz Bratanic	b**z@g**m	25
Nick Fiacco	n**o@g**m	24
Ethan Yang	e**g@i**m	22
Ofer Mendelevitch	o**d@g**m	22
Huu Le (Lee)	3****j	20
Guodong	s**g@1**m	20
yisding	y**g@g**m	19
hongyishi	s**8@g**m	19
Roger Yang	8****g	19
Jael Gu	m**u@z**m	18
jon-chuang	9****g	17
Shorthills AI	1****I	16
Rendy Febry	r**y@e**m	16
Anoop Sharma	a**7@g**m	16
Aaron Jimenez	a**v@g**m	16
Jordan Parker	j**6@g**m	15
and 1,468 more...
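The per-committer averages above are total commits divided by total committers. The Development Distribution Score is not defined on this page; assuming it is one minus the top committer's share of all commits, the all-time figures can be reproduced from the numbers listed, as this short sketch shows.

```python
# Figures copied from the "All Time" block and the Top Committers table.
total_commits = 6_390
total_committers = 1_498
top_committer_commits = 1_001  # Logan, first row of the table

avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(f"{avg_commits:.3f}")  # 4.266
print(f"{dds:.3f}")          # 0.843
```

Under this definition, a score near 1 means commits are spread across many people, while a score near 0 means a single committer dominates.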

Committer Domains (Top 20 + Academic)

qq.com: 14, 163.com: 9, intel.com: 6, microsoft.com: 6, alibaba-inc.com: 3, elastic.co: 3, pm.me: 3, oracle.com: 3, foxmail.com: 3, zilliz.com: 3, umich.edu: 3, redis.com: 2, yandex.ru: 2, amazon.com: 2, me.com: 2, vesoft.com: 2, naver.com: 2, getboon.ai: 2, turing.ac.uk: 2, mongodb.com: 2, nyu.edu: 1, etu.univ-grenoble-alpes.fr: 1, asu.edu: 1, brown.edu: 1, g.ucla.edu: 1, imperial.ac.uk: 1, cs.wisc.edu: 1, ischool.berkeley.edu: 1, alum.mit.edu: 1, uwaterloo.ca: 1, osu.edu: 1, stu.pku.edu.cn: 1, berkeley.edu: 1, indstate.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 4,387
Total pull requests: 6,265
Average time to close issues: about 2 months
Average time to close pull requests: 6 days
Total issue authors: 2,470
Total pull request authors: 1,353
Average comments per issue: 2.54
Average comments per pull request: 0.73
Merged pull requests: 4,924
Bot issues: 4
Bot pull requests: 410

Past Year

Issues: 1,618
Pull requests: 3,251
Average time to close issues: 22 days
Average time to close pull requests: 2 days
Issue authors: 998
Pull request authors: 634
Average comments per issue: 1.72
Average comments per pull request: 0.71
Merged pull requests: 2,581
Bot issues: 0
Bot pull requests: 38

Top Authors

Issue Authors

nerdai (43)
logan-markewich (43)
justinzyw (34)
brycecf (34)
mirallm (30)
gich2009 (29)
Prem-Nitin (28)
strawgate (28)
JINO-ROHIT (27)
DataNoob0723 (22)
912100012 (21)
mw19930312 (20)
LikhithRishi (19)
tituslhy (19)
RakeshReddyKondeti (17)

Pull Request Authors

logan-markewich (1,103)
dependabot[bot] (410)
masci (237)
jerryjliu (222)
nerdai (175)
AstraBert (97)
seldo (76)
ravi03071991 (72)
hatianzhang (58)
sourabhdesai (53)
EmanuelCampos (53)
nightosong (36)
mattf (32)
Javtor (32)
tomasonjo (31)

Top Labels

Issue Labels

triage (2,427) bug (1,832) question (1,615) enhancement (633) stale (427) documentation (70) P1 (54) P0 (39) v0.10.X (33) docs (24) P2 (22) lgtm (21) p2 (18) size:XS (14) size:S (10) request contribution board (9) topic:workflows (8) size:M (4) size:L (3) vector store (3) dependencies (3) good first issue (3) size:XXL (3) azure (2) discord (2) LlamaParse (2) contributions wanted (1) topic:ollama (1) topic:vector stores (1) index (1)

Pull Request Labels

lgtm (3,374) size:XS (1,892) size:L (1,011) size:S (932) size:M (880) dependencies (410) size:XL (378) size:XXL (238) bug (39) triage (37) python (35) question (33) topic:workflows (17) docs (14) enhancement (13) documentation (9) P1 (8) github_actions (7) P0 (6) stale (3) P2 (2) topic:vector stores (1) toipic:storage (1) package:llama-index-readers-confluence (1) package:llama-index-embeddings-nvidia (1) package:llama-index-vector-stores-weaviate (1) package:llama-index-embeddings-vertex (1) package:llama-index-llms-nvidia (1) package:llama-index-postprocessor-dashscope-rerank (1) topic:CI (1)

Packages

Total packages: 12
Total downloads:
- pypi 7,084,371 last-month
Total docker downloads: 637,410

Total dependent packages: 742
(may contain duplicates)
Total dependent repositories: 1,464
(may contain duplicates)
Total versions: 1,087
Total maintainers: 4
Total advisories: 15

pypi.org: llama-index

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.12.10
published about 1 year ago

Versions: 468
Dependent Packages: 153
Dependent Repositories: 1,464
Downloads: 2,538,631 Last month
Docker Downloads: 637,410

Rankings

Stargazers count: 0.1%

Forks count: 0.2%

Dependent packages count: 0.2%

Dependent repos count: 0.3%

Downloads: 0.3%

Average: 0.7%

Docker downloads count: 2.9%

Maintainers (2)

jerryjliu simonsdsuo

Advisories (8)

Last synced: about 1 year ago

proxy.golang.org: github.com/run-llama/llama_index

Documentation: https://pkg.go.dev/github.com/run-llama/llama_index#section-documentation
License: mit
Latest release: v0.13.3
published 6 months ago

Versions: 350
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 9.4%

Average: 10.0%

Dependent repos count: 10.6%

Last synced: 6 months ago

pypi.org: llama-index-retrievers-superlinked

LlamaIndex retriever integration for Superlinked

Homepage: https://github.com/run-llama/llama_index
Documentation: https://llama-index-retrievers-superlinked.readthedocs.io/
License: MIT
Latest release: 0.1.1
published 6 months ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 141 Last month

Rankings

Stargazers count: 0.2%

Forks count: 0.2%

Dependent packages count: 8.7%

Average: 14.5%

Dependent repos count: 48.9%

Maintainers (1)

jerryjliu

Last synced: 6 months ago

pypi.org: lindex-patch

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 1.0.23
published 9 months ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 42 Last month

Rankings

Dependent packages count: 9.1%

Average: 30.2%

Dependent repos count: 51.2%

Maintainers (1)

amithkk

Last synced: 6 months ago

pypi.org: llama-index-retrievers-vectorize

llama-index retrievers Vectorize.io integration

Documentation: https://llama-index-retrievers-vectorize.readthedocs.io/
License: MIT License
Latest release: 0.2.0
published 7 months ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 75 Last month

Rankings

Dependent packages count: 9.1%

Average: 30.3%

Dependent repos count: 51.5%

Maintainers (1)

jerryjliu

Last synced: 6 months ago

pypi.org: llama-index-test-starter

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.10.0
published about 2 years ago

Versions: 14
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 28 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.6%

Dependent repos count: 65.3%

Maintainers (1)

jerryjliu

Last synced: 6 months ago

pypi.org: llama-index-bundle

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.0.1
published about 2 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 17 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.7%

Dependent repos count: 65.4%

Maintainers (1)

jerryjliu

Last synced: 6 months ago

pypi.org: flying-delta

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.1.0
published about 2 years ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 18 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.7%

Dependent repos count: 65.4%

Maintainers (1)

nerdai

Last synced: 6 months ago

pypi.org: llama-index-legacy

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.9.48
published about 2 years ago

Versions: 9
Dependent Packages: 10
Dependent Repositories: 0
Downloads: 339,812 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.7%

Dependent repos count: 65.4%

Maintainers (1)

jerryjliu

Last synced: 6 months ago

pypi.org: llama-index-core

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: mit
Latest release: 0.13.4
published 6 months ago

Versions: 231
Dependent Packages: 518
Dependent Repositories: 0
Downloads: 4,205,568 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.7%

Dependent repos count: 65.4%

Maintainers (1)

jerryjliu

Advisories (7)

Last synced: 6 months ago

pypi.org: flying-delta-legacy

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.9.42
published about 2 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 19 Last month

Rankings

Dependent packages count: 9.9%

Average: 37.7%

Dependent repos count: 65.5%

Maintainers (1)

nerdai

Last synced: 6 months ago

pypi.org: flying-delta-core

Interface between LLMs and your data

Homepage: https://llamaindex.ai
Documentation: https://docs.llamaindex.ai/en/stable/
License: MIT
Latest release: 0.9.40
published about 2 years ago

Versions: 6
Dependent Packages: 61
Dependent Repositories: 0
Downloads: 20 Last month

Rankings

Dependent packages count: 10.0%

Average: 38.0%

Dependent repos count: 66.0%

Maintainers (1)

nerdai

Last synced: 6 months ago

Dependencies

.github/workflows/build_package.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

.github/workflows/codeql.yml actions

actions/checkout v3 composite
github/codeql-action/analyze v2 composite
github/codeql-action/autobuild v2 composite
github/codeql-action/init v2 composite

.github/workflows/dev_docs.yml actions

actions/checkout v2 composite
cpina/github-action-push-to-another-repository main composite

.github/workflows/lint.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

.github/workflows/publish_release.yml actions

actions/checkout v2 composite
actions/create-release v1 composite
actions/setup-python v2 composite
actions/upload-release-asset v1 composite
pypa/gh-action-pypi-publish master composite

.github/workflows/publish_release_gpt_index.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
pypa/gh-action-pypi-publish master composite

.github/workflows/unit_test.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

data_requirements.txt pypi

boto3 *
discord.py *
google-api-python-client *
google-auth-httplib2 *
google-auth-oauthlib *
jsonpath-ng *
moto *
pymongo *
slack_sdk *
vellum-ai ==0.0.15
wikipedia *

docs/requirements.txt pypi

autodoc_pydantic *
docutils <0.17
furo >=2023.3.27
m2r2 *
myst-nb *
myst-parser *
pydantic <2.0.0
sphinx >=4.3.0
sphinx-autobuild *
sphinx_rtd_theme *

pyproject.toml pypi

requirements.txt pypi

black ==23.7.0
ipython ==8.10.0
mypy ==0.991
pre-commit ==3.2.0
pylint ==2.15.10
pytest ==7.2.1
pytest-asyncio ==0.21.0
pytest-dotenv ==0.5.2
rake_nltk ==1.0.6
ruff ==0.0.285
types-redis ==4.5.5.0
types-requests ==2.28.11.8
types-setuptools ==67.1.0.0

setup.py pypi

beautifulsoup4 *
dataclasses_json *
fsspec >=2023.5.0
langchain >=0.0.293
nest_asyncio *
nltk *
numpy *
openai >=0.26.4
pandas *
sqlalchemy >=2.0.15
tenacity >=8.2.0,<9.0.0
tiktoken *
typing-inspect >=0.8.0
typing_extensions >=4.5.0
urllib3 <2
