llama_index
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ✓ Committers with academic emails: 7 of 334 committers (2.1%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: seanoliver
- License: mit
- Language: Python
- Default Branch: main
- Size: 60.2 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
🗂️ LlamaIndex 🦙
LlamaIndex (GPT Index) is a data framework for your LLM application.
PyPI:
- LlamaIndex: https://pypi.org/project/llama-index/.
- GPT Index (duplicate): https://pypi.org/project/gpt-index/.
LlamaIndex.TS (Typescript/Javascript): https://github.com/run-llama/LlamaIndexTS.
Documentation: https://gpt-index.readthedocs.io/.
Twitter: https://twitter.com/llama_index.
Discord: https://discord.gg/dGcwcsnxhU.
Ecosystem
- LlamaHub (community library of data loaders): https://llamahub.ai
- LlamaLab (cutting-edge AGI projects using LlamaIndex): https://github.com/run-llama/llama-lab
🚀 Overview
NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!
Context
- LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
- How do we best augment LLMs with our own private data?
We need a comprehensive toolkit to help perform this data augmentation for LLMs.
Proposed Solution
That's where LlamaIndex comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:
- Offers data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
- Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs.
- Provides an advanced retrieval/query interface over your data: Feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
- Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, anything else).
LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules), to fit their needs.
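The ingest, structure, and retrieve steps listed above can be illustrated with a toy sketch in plain Python. This is not LlamaIndex's implementation: `embed`, `ToyVectorIndex`, and `build_prompt` are hypothetical names, and the word-count "embedding" is a stand-in for a real embedding model. It only shows the general shape of retrieving context for a query and folding it into an LLM prompt.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index that stores (text, vector) pairs."""

    def __init__(self, documents: List[str]) -> None:
        self.entries: List[Tuple[str, Counter]] = [(d, embed(d)) for d in documents]

    def retrieve(self, query: str, top_k: int = 1) -> List[str]:
        """Return the top_k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


def build_prompt(query: str, context: List[str]) -> str:
    """Fold the retrieved context into a prompt an LLM could answer from."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


docs = [
    "The billing service retries failed payments three times.",
    "Office plants are watered every Monday morning.",
]
question = "How many times are failed payments retried?"
toy_index = ToyVectorIndex(docs)
context = toy_index.retrieve(question, top_k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real application the library's own connectors, indices, and query engines would take the place of each toy piece.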
💡 Contributing
Interested in contributing? See our Contribution Guide for more details.
📄 Documentation
Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.
Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!
💻 Example Usage
pip install llama-index
Examples are in the examples folder. Indices are in the indices folder (see list of indices below).
To build a simple vector store index:

```python
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'

from llama_index import VectorStoreIndex, SimpleDirectoryReader

# load every file under ./data, then build the index from the documents
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
```
To query:
```python
query_engine = index.as_query_engine()
query_engine.query("<question_text>?")
```
By default, data is stored in-memory.
To persist to disk (under ./storage):
```python
index.storage_context.persist()
```
To reload from disk:

```python
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir='./storage')
# load index
index = load_index_from_storage(storage_context)
```
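A common way to combine the persist and reload steps is to load the index when a persisted copy exists and build it otherwise. The sketch below shows that pattern only. `load_or_build` is a hypothetical helper, not part of LlamaIndex, and the stand-in callables let it run without the library or an API key.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(persist_dir: str, build: Callable[[], T], load: Callable[[str], T]) -> T:
    """Return load(persist_dir) if the directory holds files, else build()."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    return build()


# Demonstration with stand-in callables (no LlamaIndex required).
with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: nothing persisted yet, so the build path runs.
    first = load_or_build(tmp, build=lambda: "built", load=lambda d: "loaded")
    # Simulate a persisted index by writing a placeholder file.
    with open(os.path.join(tmp, "placeholder.json"), "w") as fh:
        fh.write("{}")
    # Directory now has content, so the load path runs.
    second = load_or_build(tmp, build=lambda: "built", load=lambda d: "loaded")

print(first, second)
```

With the calls shown in this README, `build` would wrap `VectorStoreIndex.from_documents(documents)` followed by `index.storage_context.persist()`, and `load` would wrap `load_index_from_storage(StorageContext.from_defaults(persist_dir=d))`.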
🔧 Dependencies
The main third-party package requirements are tiktoken, openai, and langchain.
All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.
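To check whether the main third-party packages are present in the current environment, a small standard-library sketch such as the following may help. `installed_versions` is a hypothetical helper name, and the package list simply mirrors the three packages named above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["tiktoken", "openai", "langchain"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```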
📖 Citation
Reference to cite if you use LlamaIndex in a paper:
@software{Liu_LlamaIndex_2022,
author = {Liu, Jerry},
doi = {10.5281/zenodo.1234},
month = {11},
title = {{LlamaIndex}},
url = {https://github.com/jerryjliu/llama_index},
year = {2022}
}
Owner
- Name: Sean Oliver
- Login: seanoliver
- Kind: user
- Location: San Francisco, CA
- Company: @gamma-app
- Website: https://seanoliver.dev/
- Twitter: SeanOliver
- Repositories: 76
- Profile: https://github.com/seanoliver
Software engineer at @gamma-app. Obsessed with apps, AI, PKM, productivity, and parenting.
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jerry Liu | j****8@g****m | 638 |
| Logan | l****h@l****m | 195 |
| Simon Suo | s****o@g****m | 169 |
| Ravi Theja | r****1@g****m | 20 |
| hongyishi | s****8@g****m | 19 |
| Sourabh Desai | s****i@g****m | 19 |
| jon-chuang | 9****g | 17 |
| Jesse Zhang | j****g@g****m | 13 |
| Wey Gu | w****u@g****m | 11 |
| yisding | y****g@g****m | 9 |
| Jerry Liu | j****y@r****m | 9 |
| tilleul | f****n@g****m | 7 |
| Jithin James | j****7@g****m | 7 |
| Guy Korland | g****d@g****m | 7 |
| Bruno Bornsztein | b****n@g****m | 6 |
| Kacper Łukawski | k****i | 6 |
| Filip Haltmayer | 8****t | 5 |
| Emanuel Ferreira | c****s@g****m | 5 |
| Adam Hofmann | a****8@g****m | 4 |
| Chris Maddox | t****7@g****m | 4 |
| Ikko Eltociear Ashimine | e****r@g****m | 4 |
| Mikko | M****i | 4 |
| Nicolas | n****9@g****m | 4 |
| Noble Varghese | n****6@g****m | 4 |
| Ryan Chan | r****n@t****k | 4 |
| junying1 | j****1@g****m | 4 |
| Doc Emmett Brown | t****h | 4 |
| Kaiser Pister | p****k@g****m | 3 |
| Mourad | 1****q | 3 |
| Piaoyang Cui | b****e@g****m | 3 |
| and 304 more... | | |
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- actions/checkout v2 composite
- cpina/github-action-push-to-another-repository main composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v2 composite
- actions/create-release v1 composite
- actions/setup-python v2 composite
- actions/upload-release-asset v1 composite
- pypa/gh-action-pypi-publish master composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- pypa/gh-action-pypi-publish master composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- boto3 *
- discord.py *
- google-api-python-client *
- google-auth-httplib2 *
- google-auth-oauthlib *
- jsonpath-ng *
- moto *
- pymongo *
- slack_sdk *
- vellum-ai ==0.0.15
- wikipedia *
- autodoc_pydantic *
- docutils <0.17
- furo >=2023.3.27
- m2r2 *
- myst-nb *
- myst-parser *
- pydantic <2.0.0
- sphinx >=4.3.0
- sphinx-autobuild *
- sphinx_rtd_theme *
- black ==23.7.0
- ipython ==8.10.0
- mypy ==0.991
- pre-commit ==3.2.0
- pylint ==2.15.10
- pytest ==7.2.1
- pytest-asyncio ==0.21.0
- pytest-dotenv ==0.5.2
- rake_nltk ==1.0.6
- ruff ==0.0.285
- types-redis ==4.5.5.0
- types-requests ==2.28.11.8
- types-setuptools ==67.1.0.0
- beautifulsoup4 *
- dataclasses_json *
- fsspec >=2023.5.0
- langchain >=0.0.293
- nest_asyncio *
- nltk *
- numpy *
- openai >=0.26.4
- pandas *
- sqlalchemy >=2.0.15
- tenacity >=8.2.0,<9.0.0
- tiktoken *
- typing-inspect >=0.8.0
- typing_extensions >=4.5.0
- urllib3 <2