https://github.com/atlarge-research/on-simulating-llm-ecosystems-under-inference

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.7%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: atlarge-research
  • Default Branch: main
  • Size: 7.02 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 11 months ago
Metadata Files
Readme

README.md

On Simulating LLM Ecosystems under Inference

Abstract

Large Language Models (LLMs) are widely used in our increasingly digitalized society, but they raise sustainability, performance, and financial concerns, especially as inference workloads grow. To improve LLM Ecosystems, we envision simulators and simulation-based digital twins becoming primary decision-making tools. LLM Ecosystems combine many heterogeneous components, which makes simulation non-trivial yet critical. The challenge is exacerbated by the absence of a comprehensive reference architecture of LLM Ecosystems, and the lack of such a conceptual model can be costly: without one, even the most experienced stakeholders may resort to trial and error when researching, engineering, or maintaining LLM Ecosystems. In this work, we make a three-fold contribution. First, we synthesize, propose, and validate a reference architecture (RA) of LLM Ecosystems under inference. Second, adhering to this reference architecture, we design Kavier, the first simulation instrument able to predict the performance, sustainability, and efficiency of LLM Ecosystems under inference, through discrete-event, KV-cache-aware simulation. Third, through experiments with a prototype and real-world traces, (i) we measure the accuracy of Kavier and its performance in massive-scale simulations, (ii) we conduct a multi-tier analysis of LLM Ecosystems under various caching policies, and (iii) we compare the performance, sustainability, and efficiency of different KV-caching policies.

This repository

This repository is the home of Kavier, the first scientific instrument for predicting the performance, sustainability, and efficiency of LLM Ecosystems under inference through discrete-event, cache-aware simulation.

  1. Thesis (PDF)
  2. Kavier
  3. LLM Trace Archive
  4. Tracer
  5. Reproducibility Capsule
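
To make the phrase "discrete-event, cache-aware simulation" more concrete, the following is a minimal Python sketch of the general idea. It is not Kavier's implementation or API. The `Request` fields, the `simulate` function, the single FIFO server, and the per-token cost constants are all hypothetical and chosen only for illustration. The sketch assumes that a request whose prompt prefix is already in the KV cache skips its prefill phase entirely.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    arrival: float       # arrival time in seconds
    prefix_id: str       # hypothetical key for a shared prompt prefix
    prompt_tokens: int   # tokens processed during prefill
    output_tokens: int   # tokens generated during decode


def simulate(requests, kv_cache=True, prefill_cost=0.001, decode_cost=0.01):
    """Toy single-server FIFO discrete-event loop (illustrative only).

    Returns request latencies in completion order. The costs are assumed
    seconds per token, not measured values.
    """
    # Event tuples: (time, tie-breaker, kind, request). The unique
    # tie-breaker keeps the heap from ever comparing Request objects.
    events = [(r.arrival, seq, "arrival", r) for seq, r in enumerate(requests)]
    heapq.heapify(events)
    seq = len(events)
    waiting = deque()
    cached_prefixes = set()
    server_busy = False
    latencies = []

    def start(now, req):
        nonlocal seq
        hit = kv_cache and req.prefix_id in cached_prefixes
        # Assumption: a cache hit removes the whole prefill cost.
        prefill = 0.0 if hit else req.prompt_tokens * prefill_cost
        cached_prefixes.add(req.prefix_id)
        service = prefill + req.output_tokens * decode_cost
        heapq.heappush(events, (now + service, seq, "finish", req))
        seq += 1

    while events:
        now, _, kind, req = heapq.heappop(events)
        if kind == "arrival":
            if server_busy:
                waiting.append(req)
            else:
                server_busy = True
                start(now, req)
        else:  # "finish"
            latencies.append(now - req.arrival)
            if waiting:
                start(now, waiting.popleft())
            else:
                server_busy = False
    return latencies


if __name__ == "__main__":
    trace = [Request(0.0, "sys", 1000, 10), Request(0.0, "sys", 1000, 10)]
    print("with KV cache:   ", simulate(trace, kv_cache=True))
    print("without KV cache:", simulate(trace, kv_cache=False))
```

In this toy trace both requests share a prefix, so the second request avoids prefill when the cache is enabled and completes sooner than it does without caching. A realistic simulator would also need to model batching, cache capacity and eviction, and energy use, none of which this sketch attempts.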

Owner

  • Name: @Large Research
  • Login: atlarge-research
  • Kind: organization
  • Email: info@atlarge-research.com

Massivizing Computer Systems

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1