LangFair
LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases - Published in JOSS (2025)
lazyllm-llamafactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
txtai
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
datastew
Python library for intelligent data stewardship using Large Language Model (LLM) embeddings
farm-haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
learn_prompting
Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our discord for the largest Prompt Engineering learning community
tree-of-thought-prompting
Using Tree-of-Thought Prompting to boost ChatGPT's reasoning
chronos-forecasting
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
symbolicai
A neurosymbolic perspective on LLMs
mllmcelltype
🏆 #1 Multi-LLM consensus framework | 550+ stars | 95% accuracy | 10+ LLM providers | Leading cell annotation tool
llms-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
eval-suite
[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
https://github.com/modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
chinese-llama-alpaca-2
Phase 2 of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models
multilspy
multilspy is an LSP client library in Python intended to be used to build applications around language servers.
libre-chat
🦙 Free and Open Source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline capable and easy to set up.
surprisal
A unified interface for computing surprisal (log probabilities) from language models! Supports neural, symbolic, and black-box API models.
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
llm-sandbox
Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
@llm-tools/embedjs
A NodeJS RAG framework to easily work with LLMs and embeddings
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
monitors4codegen
Code and Data artifact for NeurIPS 2023 paper - "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multilspy` is an LSP client library in Python intended to be used to build applications around language servers.
https://github.com/alan-turing-institute/robots-in-disguise
Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.
q-galore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
py-alpaca-eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
chinese-llama-alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
openqa-eval
ACL 2023: Evaluating Open-Domain Question Answering in the Era of Large Language Models
xfinder
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
odin-slides
This is an advanced Python tool that empowers you to effortlessly draft customizable PowerPoint slides using the Generative Pre-trained Transformer (GPT) of your choice. Leveraging the capabilities of Large Language Models (LLMs), odin-slides enables you to turn the lengthiest Word documents into well-organized presentations.
stormtrooper
Zero/few shot learning components for scikit-learn pipelines with LLMs and transformers.
climsight
A next-generation climate information system that uses large language models (LLMs) alongside high-resolution climate model data, scientific literature, and diverse databases to deliver accurate, localized, and context-aware climate assessments.
odin-tabs
The Odin Tabs extension is a browser extension that allows you to navigate through your browser tabs using speech recognition and the Large Language Model (LLM) of your choice.
gemgpt
Explore the power of the Gemma model with GemGPT, a project leveraging AI for innovative solutions. Join us in shaping the future of AI!
https://github.com/bytedance/salmonn
SALMONN family: A suite of advanced multi-modal LLMs
upgini
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
napolab
The Natural Portuguese Language Benchmark (Napolab). Stay up to date with the latest advancements in Portuguese language models and their performance across carefully curated Portuguese language tasks.
https://github.com/bowang-lab/bioreason
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
https://github.com/bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://github.com/amazon-science/auto-cot
Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)
cellama
Cell type annotation with local Large Language Models (LLMs) - Ensuring privacy and speed with extensive customized reports
https://github.com/aisuko/notebooks
Implementations of different ML tasks on the Kaggle platform with GPUs.
https://github.com/bigscience-workshop/xmtf
Crosslingual Generalization through Multitask Finetuning
https://github.com/amazon-science/mezo_svrg
Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"
https://github.com/altunenes/calcarine
Desktop VLM: Real-time FastVLM analysis of video & textures with live compute shaders
https://github.com/astorfi/llm-alignment-project
A comprehensive template for aligning large language models (LLMs) using Reinforcement Learning from Human Feedback (RLHF), transfer learning, and more. Build your own customizable LLM alignment solution with ease.
https://github.com/bigscience-workshop/data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
https://github.com/astorfi/large-scale-ai-blueprint
A comprehensive guide designed to empower readers with advanced strategies and practical insights for developing, optimizing, and deploying scalable AI models in real-world applications.
textgraphs
TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph
urban-worm
Urban-Worm is a Python library that integrates remote sensing imagery, street view data, and multimodal models to assess environments and urban units
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
phenomics-assistant
LLM retrieval augmented generation agent for the Monarch Knowledge graph.
longmem
Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
dpl
[NeurIPS 2023] Multi-fidelity hyperparameter optimization with deep power laws that achieves state-of-the-art results across diverse benchmarks.
simplyretrieve
Lightweight chat AI platform featuring custom knowledge, open-source LLMs, prompt-engineering, retrieval analysis. Highly customizable. For Retrieval-Centric & Retrieval-Augmented Generation.
flashrag
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
awesome-llms
🤓 A collection of AWESOME structured summaries of Large Language Models (LLMs)
folktexts
Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data!
argumentation-mining-transformers
Argumentation Mining Transformers Module (AMTM) implementation.
https://github.com/csinva/iprompt
Finding semantically meaningful and accurate prompts.
https://github.com/csinva/clinical-rule-survey
Analyzing clinical decision instruments through the lens of data and large language models.
https://github.com/bigcode-project/selfcodealign
[NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation
repilot
Repilot, a patch generation tool introduced in the ESEC/FSE'23 paper "Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair"
icsfsurvey
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
https://github.com/chen-yang-liu/promptcc
PyTorch implementation of 'A Decoupling Paradigm With Prompt Learning for Remote Sensing Image Change Captioning'
llms-from-scratch
Build your own Large Language Model from scratch with this code repository. Learn the ins and outs of LLMs like GPT. 🚀💻
llms4subjects
The official SemEval 2025 Task 5 - LLMs4Subjects - Shared Task Dataset repository
https://github.com/ammarlodhi255/self-healing-llm-pipeline
This repo contains the code written primarily in Golang for a self-healing large language model (LLM) pipeline that iteratively corrects errors in its own generated code.
https://github.com/amazon-science/dstc12-controllable-conversational-theme-detection
Data & code for DSTC12, Controllable Conversational Theme Detection track
awesome-attention-heads
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
https://github.com/bytedance/shot2story
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
llms4subjects
The official GermEval 2025 Task - LLMs4Subjects - Shared Task Dataset Repository