Recent Releases of medusa-llm
medusa-llm - Medusa-v0.1
Medusa is an easy-to-use framework that democratizes acceleration techniques for LLM generation. Medusa-v0.1 uses several extra lightweight decoding heads and eliminates the need for a separate draft model.
- Jupyter Notebook
Published by harveyp123 over 2 years ago
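
To illustrate the idea in the release description (extra decoding heads instead of a draft model), here is a minimal, self-contained sketch in plain Python. It is not the medusa-llm API: `toy_base_model`, `make_toy_heads`, `greedy_decode`, and `multi_head_decode` are hypothetical names, the "model" is a toy arithmetic rule, and verification is emulated by re-querying the base rule one token at a time, whereas a real implementation would check all guesses in a batched forward pass. The sketch only shows the accept-or-reject logic: extra heads guess tokens further ahead, and a guess is kept only if the base model would have produced it anyway.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
NextFn = Callable[[Sequence[Token]], Token]


def toy_base_model(context: Sequence[Token]) -> Token:
    """Toy stand-in for an LLM: the next token depends on the last two tokens."""
    return (context[-1] * 3 + context[-2] + 1) % 7


def make_toy_heads(num_heads: int) -> List[NextFn]:
    """Build hypothetical extra heads; head k guesses the token k+1 steps ahead.

    Each head is deliberately wrong whenever the last context token is 0,
    to emulate heads that are cheaper but less accurate than the base model.
    """
    heads: List[NextFn] = []
    for k in range(1, num_heads + 1):
        def head(context: Sequence[Token], k: int = k) -> Token:
            ctx = list(context)
            for _ in range(k + 1):
                ctx.append(toy_base_model(ctx))
            guess = ctx[-1]
            return (guess + 1) % 7 if context[-1] == 0 else guess
        heads.append(head)
    return heads


def greedy_decode(base: NextFn, prompt: Sequence[Token], n: int) -> List[Token]:
    """Reference decoder: one new token per step."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(base(tokens))
    return tokens


def multi_head_decode(
    base: NextFn, heads: Sequence[NextFn], prompt: Sequence[Token], n: int
) -> Tuple[List[Token], int]:
    """Decode n tokens, accepting head guesses that the base model confirms.

    Returns the token list and the number of decoding steps taken.
    """
    tokens = list(prompt)
    steps = 0
    while len(tokens) - len(prompt) < n:
        steps += 1
        accepted = [base(tokens)]              # the base prediction is always kept
        guesses = [h(tokens) for h in heads]   # speculative tokens further ahead
        for guess in guesses:
            # Keep a guess only if the base model agrees given the accepted prefix.
            if base(tokens + accepted) == guess:
                accepted.append(guess)
            else:
                break
        tokens.extend(accepted)
    return tokens[: len(prompt) + n], steps


prompt = [1, 2]
reference = greedy_decode(toy_base_model, prompt, 12)
output, steps = multi_head_decode(toy_base_model, make_toy_heads(2), prompt, 12)
print(output == reference, steps)
```

Because every accepted token is one the base rule would have produced, the output matches plain greedy decoding; the only difference is that a step can emit more than one token when the heads guess correctly.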