https://github.com/chrisdonahue/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.2%) to scientific vocabulary
Last synced: 9 months ago
Repository
Basic Info
- Host: GitHub
- Owner: chrisdonahue
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://llm.mlc.ai/docs
- Size: 14.7 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of mlc-ai/mlc-llm
Created over 2 years ago · Last pushed over 2 years ago
https://github.com/chrisdonahue/mlc-llm/blob/main/
[discord-url]: https://discord.gg/9Xpy2HGBuD

# MLC LLM

[Documentation](https://llm.mlc.ai/docs) | [Blog](https://blog.mlc.ai/) | [Discord][discord-url]

**M**achine **L**earning **C**ompilation for **L**arge **L**anguage **M**odels (MLC LLM) is a high-performance universal deployment solution that allows native deployment of any large language model, with native APIs and compiler acceleration. The mission of this project is to enable everyone to develop, optimize and deploy AI models natively on everyone's devices with ML compilation techniques.

**Universal deployment.** MLC LLM supports the following platforms and hardware:

| | AMD GPU | NVIDIA GPU | Apple GPU | Intel GPU |
|---|---|---|---|---|
| Linux / Win | Vulkan, ROCm | Vulkan, CUDA | N/A | Vulkan |
| macOS | Metal (dGPU) | N/A | Metal | Metal (iGPU) |
| Web Browser | WebGPU and WASM | |||
| iOS / iPadOS | Metal on Apple A-series GPU | |||
| Android | OpenCL on Adreno GPU | OpenCL on Mali GPU | ||
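
The desktop rows of the platform table can be read as a simple lookup from an operating system and GPU vendor to the listed backends. The following is a minimal sketch of that reading only: the dictionary `SUPPORTED_BACKENDS` and the helper `backends_for` are hypothetical names invented for illustration and are not part of MLC LLM's API. The browser and mobile rows are left out because they are not broken down by the same GPU columns.

```python
from typing import Dict, List, Tuple

# Illustrative data copied from the Linux / Win and macOS rows of the table.
# The structure and names here are assumptions for this sketch, not MLC LLM code.
SUPPORTED_BACKENDS: Dict[Tuple[str, str], List[str]] = {
    ("Linux / Win", "AMD GPU"): ["Vulkan", "ROCm"],
    ("Linux / Win", "NVIDIA GPU"): ["Vulkan", "CUDA"],
    ("Linux / Win", "Intel GPU"): ["Vulkan"],
    ("macOS", "AMD GPU"): ["Metal"],    # listed as Metal (dGPU)
    ("macOS", "Apple GPU"): ["Metal"],
    ("macOS", "Intel GPU"): ["Metal"],  # listed as Metal (iGPU)
}


def backends_for(platform: str, gpu: str) -> List[str]:
    """Return the backends the table lists for a platform/GPU pair.

    An empty list corresponds to an N/A cell (or a pair not in the table).
    """
    return list(SUPPORTED_BACKENDS.get((platform, gpu), []))


if __name__ == "__main__":
    print(backends_for("Linux / Win", "NVIDIA GPU"))
    print(backends_for("macOS", "NVIDIA GPU"))
```

Returning an empty list for N/A cells keeps the caller's logic simple: a truthiness check is enough to decide whether any backend is listed for that combination.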

Supported model architectures and their prebuilt model variants:

| Architecture | Prebuilt Model Variants |
|---|---|
| Llama | Llama-2, Code Llama, Vicuna, WizardLM, WizardMath, OpenOrca Platypus2, FlagAlpha Llama-2 Chinese, georgesung Llama-2 Uncensored |
| GPT-NeoX | RedPajama |
| GPT-J | |
| RWKV | RWKV-raven |
| MiniGPT | |
| GPTBigCode | WizardCoder |
| ChatGLM | |
| StableLM | |

References

```bibtex
@inproceedings{tensorir,
    author = {Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi},
    title = {TensorIR: An Abstraction for Automatic Tensorized Program Optimization},
    year = {2023},
    isbn = {9781450399166},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3575693.3576933},
    doi = {10.1145/3575693.3576933},
    booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
    pages = {804--817},
    numpages = {14},
    keywords = {Tensor Computation, Machine Learning Compiler, Deep Neural Network},
    location = {Vancouver, BC, Canada},
    series = {ASPLOS 2023}
}

@inproceedings{metaschedule,
    author = {Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, Masahiro and Yu, Cody Hao and Chen, Tianqi},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
    pages = {35783--35796},
    publisher = {Curran Associates, Inc.},
    title = {Tensor Program Optimization with Probabilistic Programs},
    url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf},
    volume = {35},
    year = {2022}
}

@inproceedings{tvm,
    author = {Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy},
    title = {{TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning},
    booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
    year = {2018},
    isbn = {978-1-939133-08-3},
    address = {Carlsbad, CA},
    pages = {578--594},
    url = {https://www.usenix.org/conference/osdi18/presentation/chen},
    publisher = {USENIX Association},
    month = oct,
}
```

Owner
- Name: Chris Donahue
- Login: chrisdonahue
- Kind: user
- Location: Pittsburgh
- Website: chrisdonahue.com
- Twitter: chrisdonahuey
- Repositories: 55
- Profile: https://github.com/chrisdonahue
Assistant professor @ CMU CSD. Part-time research scientist at Google Magenta. Machine learning for music, and creative interaction.
GitHub Events
Total
- Watch event: 2
Last Year
- Watch event: 2