https://github.com/atlarge-research/on-simulating-llm-ecosystems-under-inference
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (6.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: atlarge-research
- Default Branch: main
- Size: 7.02 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
On Simulating LLM Ecosystems under Inference
Abstract
Large Language Models (LLMs) are widely used in our increasingly digitalized society, but they raise sustainability, performance, and financial concerns, especially as inference workloads grow. To improve LLM Ecosystems, we envision simulators and simulation-based digital twins becoming primary decision-making tools. LLM Ecosystems combine many heterogeneous components, which makes simulation a non-trivial yet critical operation. The simulation challenge is exacerbated by the absence of a comprehensive reference architecture of LLM Ecosystems; lacking such a conceptual model can be costly. Without a reference architecture, even the most experienced stakeholders may resort to trial and error when researching, engineering, or maintaining LLM Ecosystems. In this work, we make a three-fold contribution to the scientific community. First, we synthesize, propose, and validate a reference architecture (RA) of LLM Ecosystems under inference. Second, adhering to the reference architecture, we design Kavier, the first simulation instrument able to predict the performance, sustainability, and efficiency of LLM Ecosystems under inference, through discrete-event, KV-cache-aware simulation. Third, through experiments with a prototype and real-world traces, we (i) measure the accuracy of Kavier and its performance in massive-scale simulations, (ii) conduct a multi-tier analysis of LLM Ecosystems under various caching policies, and (iii) compare the performance, sustainability, and efficiency of different KV-caching policies.
This repository
This repository is the home of Kavier, the first scientific instrument for predicting performance, sustainability, and efficiency of LLM ecosystems under inference, through discrete-event, cache-aware simulation.
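To illustrate what "discrete-event, cache-aware simulation" can mean in practice, here is a minimal Python sketch. It is not Kavier's implementation or API. The function name `simulate`, the request tuple layout, the single-server FIFO model, the LRU eviction policy, and the per-token timing constants are all assumptions made for illustration only.

```python
import heapq
from collections import OrderedDict


def simulate(requests, cache_capacity,
             prefill_s_per_token=0.001, decode_s_per_token=0.01):
    """Toy single-server, cache-aware simulation (hypothetical model).

    requests: iterable of (arrival_s, prefix_id, prompt_tokens, output_tokens).
    Returns (latencies in processing order, number of cache hits).
    """
    # Event queue ordered by arrival time; the sequence number breaks ties.
    events = [(arrival, seq, prefix_id, prompt, output)
              for seq, (arrival, prefix_id, prompt, output)
              in enumerate(requests)]
    heapq.heapify(events)

    cache = OrderedDict()   # prefix_id -> True, oldest entry first (LRU)
    server_free_at = 0.0    # simulated time at which the server becomes idle
    latencies, hits = [], 0

    while events:
        arrival, _, prefix_id, prompt, output = heapq.heappop(events)
        start = max(arrival, server_free_at)  # wait if the server is busy

        if prefix_id in cache:
            # Assumption: a cached prefix lets the request skip prefill.
            hits += 1
            cache.move_to_end(prefix_id)
            prefill = 0.0
        else:
            prefill = prompt * prefill_s_per_token
            cache[prefix_id] = True
            if len(cache) > cache_capacity:
                cache.popitem(last=False)  # evict least recently used

        finish = start + prefill + output * decode_s_per_token
        server_free_at = finish
        latencies.append(finish - arrival)

    return latencies, hits
```

Calling `simulate(trace, cache_capacity=0)` and `simulate(trace, cache_capacity=1)` on the same trace gives a rough feel for how a caching policy changes latency in this toy model. A real instrument would also model batching, multiple accelerators, and energy use.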
Owner
- Name: @Large Research
- Login: atlarge-research
- Kind: organization
- Email: info@atlarge-research.com
- Website: http://atlarge-research.com/
- Twitter: LargeResearch
- Repositories: 24
- Profile: https://github.com/atlarge-research
Massivizing Computer Systems
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1