Scientific Software
Updated 10 months ago

Turftopic — Peer-reviewed • Rank 13.8 • Science 98%

Turftopic: Topic Modelling with Contextual Representations from Sentence Transformers - Published in JOSS (2025)

Mathematics
Scientific Software · Peer-reviewed
Scientific Software
Updated 10 months ago

UralicNLP — Peer-reviewed • Rank 12.8 • Science 98%

UralicNLP: An NLP Library for Uralic Languages - Published in JOSS (2019)

Scientific Software
Updated 10 months ago

LangFair — Peer-reviewed • Rank 14.9 • Science 95%

LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases - Published in JOSS (2025)

Scientific Software
Updated 10 months ago

EcoLogits — Peer-reviewed • Rank 8.1 • Science 98%

EcoLogits: Evaluating the Environmental Impacts of Generative AI - Published in JOSS (2025)

Scientific Software
Updated 10 months ago

ollamar — Peer-reviewed • Rank 12.7 • Science 93%

ollamar: An R package for running large language models - Published in JOSS (2025)

Scientific Software · Peer-reviewed
Updated 10 months ago

trafilatura • Rank 26.3 • Science 77%

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Updated 10 months ago

transformers • Rank 38.7 • Science 64%

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Scientific Software
Updated 10 months ago

MM-PoE — Peer-reviewed • Rank 5.8 • Science 93%

MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models - Published in JOSS (2025)

Artificial Intelligence and Machine Learning
Scientific Software · Peer-reviewed
Updated 10 months ago

datasets • Rank 34.4 • Science 64%

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Updated 10 months ago

biochatter • Rank 15.0 • Science 77%

Backend library for conversational AI in biomedicine

Updated 10 months ago

embed • Rank 20.7 • Science 67%

Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP and ColPali

Updated 10 months ago

litgpt • Rank 23.6 • Science 64%

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Updated 10 months ago

lida • Rank 19.4 • Science 67%

Automatic Generation of Visualizations and Infographics using Large Language Models

Updated 10 months ago

spezillm • Rank 8.0 • Science 77%

Large Language Model (LLM) module for the Spezi Ecosystem

Updated 10 months ago

dolma • Rank 19.6 • Science 64%

Data and tools for generating and inspecting OLMo pre-training data.

Updated 10 months ago

llama_index • Rank 33.9 • Science 49%

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Updated 10 months ago

farm-haystack • Rank 28.7 • Science 54%

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Updated 10 months ago

letta_api • Rank 18.4 • Science 64%

Letta is the platform for building stateful agents: open AI with advanced memory that can learn and self-improve over time.

Updated 10 months ago

bentoml • Rank 26.1 • Science 54%

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Updated 10 months ago

cntext • Rank 12.3 • Science 67%

Text analysis package supporting multiple methods, including word count, readability, document similarity, sentiment analysis, Word2Vec/GloVe, and Large Language Models (LLMs).

Updated 10 months ago

tiny_qa_benchmark_pp • Rank 2.1 • Science 77%

Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing & regression-catching.

Updated 10 months ago

pandas-ai • Rank 24.5 • Science 54%

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

Updated 10 months ago

medusa-llm • Rank 13.6 • Science 64%

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Updated 10 months ago

tensorzero • Rank 23.4 • Science 54%

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Updated 10 months ago

learn_prompting • Rank 13.3 • Science 64%

Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our discord for the largest Prompt Engineering learning community

Updated 10 months ago

mlora-cli • Rank 11.3 • Science 64%

An Efficient "Factory" to Build Multiple LoRA Adapters

Updated 10 months ago

rdocdump • Rank 8.1 • Science 67%

rdocdump: Dump ‘R’ Package Source, Documentation, and Vignettes into One File

Updated 10 months ago

langchain • Rank 39.0 • Science 36%

🦜🔗 Build context-aware reasoning applications 🦜🔗

Updated 10 months ago

tree-of-thought-prompting • Rank 8.0 • Science 67%

Using Tree-of-Thought Prompting to boost ChatGPT's reasoning

Updated 10 months ago

hallucination-leaderboard • Rank 10.8 • Science 64%

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

Updated 10 months ago

vulntrain • Rank 7.6 • Science 67%

A tool to generate datasets and models based on vulnerability descriptions from @Vulnerability-Lookup.

Updated 10 months ago

mosec • Rank 20.4 • Science 54%

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Updated 10 months ago

chemlift • Rank 4.0 • Science 67%

Language-interfaced fine-tuning for chemistry

Updated 10 months ago

alignment-handbook • Rank 16.9 • Science 54%

Robust recipes to align language models with human and AI preferences

Updated 10 months ago

json_repair • Rank 26.6 • Science 44%

A python module to repair invalid JSON from LLMs

Updated 10 months ago

sesgos_llm • Rank 1.1 • Science 67%

How do LLMs “get it wrong”?

Updated 10 months ago

optimate • Rank 12.8 • Science 54%

A collection of libraries to optimise AI model performance

Updated 10 months ago

agentlego • Rank 12.6 • Science 54%

Enhance LLM agents with rich tool APIs

Updated 10 months ago

openllm • Rank 22.1 • Science 44%

Run any open-source LLM, such as DeepSeek and Llama, as an OpenAI-compatible API endpoint in the cloud.

Updated 10 months ago

bugbug • Rank 11.6 • Science 54%

Platform for Machine Learning projects on Software Engineering

Updated 10 months ago

banks • Rank 21.3 • Science 44%

LLM prompt language based on Jinja. Banks provides tools and functions to build prompt text and chat messages from generic blueprints. It allows attaching metadata to prompts to ease their management, and versioning is a first-class citizen. Banks provides ways to store prompts on disk along with their metadata.

Updated 10 months ago

beeai-framework • Rank 21.3 • Science 44%

Build production-ready AI agents in both Python and Typescript.

Updated 10 months ago

chinese-llama-alpaca-2 • Rank 11.0 • Science 54%

Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project), plus 64K long-context models

Updated 10 months ago

medicalgpt • Rank 10.7 • Science 54%

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Updated 10 months ago

oss-fuzz-gen • Rank 10.6 • Science 54%

LLM powered fuzzing via OSS-Fuzz.

Updated 10 months ago

llmebench • Rank 10.5 • Science 54%

Benchmarking Large Language Models

Updated 10 months ago

libre-chat • Rank 10.3 • Science 54%

🦙 Free and Open Source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline capable and easy to set up.

Updated 10 months ago

sparql-llm • Rank 10.3 • Science 54%

🦜✨ Chat system and reusable components to improve LLM capabilities when generating SPARQL queries

Updated 10 months ago

llm-lct-sequencing • Rank 2.2 • Science 62%

AI Semantic Insights: LLM Toolkit for Analysing Educational Practices and Knowledge Building.

Updated 10 months ago

magicoder • Rank 9.0 • Science 54%

[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct

Updated 10 months ago

basic-memory • Rank 18.8 • Science 44%

AI conversations that actually remember. Never re-explain your project to Claude again. Local-first, integrates with Obsidian. Join our Discord: https://discord.gg/tyvKNccgqN

Updated 10 months ago

deeplake • Rank 25.6 • Science 36%

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Updated 10 months ago

chinese-mixtral • Rank 7.1 • Science 54%

Chinese Mixtral mixture-of-experts (MoE) LLMs

Updated 10 months ago

monitors4codegen • Rank 7.0 • Science 54%

Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multispy` is an LSP client library in Python intended to be used to build applications around language servers.

Updated 10 months ago

https://github.com/khoj-ai/khoj • Rank 24.1 • Science 36%

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Updated 9 months ago

https://github.com/deepset-ai/haystack-core-integrations • Rank 22.9 • Science 36%

Additional packages (components, document stores and the likes) to extend the capabilities of Haystack

Updated 10 months ago

https://github.com/nixtla/nixtla • Rank 22.6 • Science 36%

TimeGPT-1: production ready pre-trained Time Series Foundation Model for forecasting and anomaly detection. Generative pretrained transformer for time series trained on over 100B data points. It's capable of accurately predicting various domains such as retail, electricity, finance, and IoT with just a few lines of code 🚀.

Updated 10 months ago

rl-llm-calibration-test • Rank 4.5 • Science 54%

An attempt to replicate parts of the paper "Language models (mostly) know what they know" on open datasets and models.

Updated 10 months ago

tap_llm_course • Rank 1.4 • Science 57%

Materials for the TAP (Tendencias en Aprendizaje Profundo, "Trends in Deep Learning") course in the Master's in Robotics and Artificial Intelligence at the Universidad de León

Updated 10 months ago

chinese-llama-alpaca • Rank 12.2 • Science 46%

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Updated 10 months ago

grammarflow • Rank 5.5 • Science 52%

Powering Agent Chains by Constraining LLM Outputs

Updated 10 months ago

lotr • Rank 3.4 • Science 54%

Low Tensor Rank adaptation of large language models

Updated 10 months ago

canada-labour-research-assistant • Rank 3.3 • Science 54%

The Canada Labour Research Assistant (CLaRA) is a privacy-first LLM-powered RAG AI assistant proposing Easily Verifiable Direct Quotations (EVDQ) to mitigate hallucinations in answering questions about Canadian labour laws, standards, and regulations. It works entirely offline and locally, guaranteeing the confidentiality of your conversations.

Updated 10 months ago

grimoire • Rank 5.9 • Science 51%

Grimoire is All You Need for Enhancing Large Language Models

Updated 10 months ago

langchain-rdf • Rank 2.2 • Science 54%

🦉 Utilities to improve LLM capabilities when working with SPARQL endpoints and RDF knowledge graphs, compatible with LangChain

Updated 10 months ago

nutcracker • Rank 1.9 • Science 54%

Large Model Evaluation Experiments

Updated 10 months ago

agentica • Rank 11.6 • Science 44%

Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!

Updated 10 months ago

https://github.com/mobiusml/hqq • Rank 19.5 • Science 36%

Official implementation of Half-Quadratic Quantization (HQQ)