Scientific Software
Updated 6 months ago

Turftopic — Peer-reviewed • Rank 13.8 • Science 98%

Turftopic: Topic Modelling with Contextual Representations from Sentence Transformers - Published in JOSS (2025)

Mathematics
Scientific Software · Peer-reviewed
Scientific Software
Updated 6 months ago

UralicNLP — Peer-reviewed • Rank 12.8 • Science 98%

UralicNLP: An NLP Library for Uralic Languages - Published in JOSS (2019)

Scientific Software
Updated 6 months ago

LangFair — Peer-reviewed • Rank 14.9 • Science 95%

LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases - Published in JOSS (2025)

Scientific Software
Updated 6 months ago

EcoLogits — Peer-reviewed • Rank 8.1 • Science 98%

EcoLogits: Evaluating the Environmental Impacts of Generative AI - Published in JOSS (2025)

Scientific Software
Updated 6 months ago

ollamar — Peer-reviewed • Rank 12.7 • Science 93%

ollamar: An R package for running large language models - Published in JOSS (2025)

Scientific Software · Peer-reviewed
Updated 6 months ago

trafilatura • Rank 26.3 • Science 77%

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Updated 6 months ago

transformers • Rank 38.7 • Science 64%

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Scientific Software
Updated 6 months ago

MM-PoE — Peer-reviewed • Rank 5.8 • Science 93%

MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models - Published in JOSS (2025)

Artificial Intelligence and Machine Learning
Scientific Software · Peer-reviewed
Updated 6 months ago

datasets • Rank 34.4 • Science 64%

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Updated 6 months ago

embed • Rank 20.7 • Science 67%

Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP, and ColPali

Updated 6 months ago

litgpt • Rank 23.6 • Science 64%

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Updated 6 months ago

lida • Rank 19.4 • Science 67%

Automatic Generation of Visualizations and Infographics using Large Language Models

Updated 6 months ago

spezillm • Rank 8.0 • Science 77%

Large Language Model (LLM) module for the Spezi Ecosystem

Updated 6 months ago

dolma • Rank 19.6 • Science 64%

Data and tools for generating and inspecting OLMo pre-training data.

Updated 6 months ago

llama_index • Rank 33.9 • Science 49%

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Updated 6 months ago

farm-haystack • Rank 28.7 • Science 54%

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Updated 6 months ago

letta_api • Rank 18.4 • Science 64%

Letta is the platform for building stateful agents: open AI with advanced memory that can learn and self-improve over time.

Updated 6 months ago

bentoml • Rank 26.1 • Science 54%

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Updated 6 months ago

cntext • Rank 12.3 • Science 67%

Text analysis package supporting multiple methods, including word count, readability, document similarity, sentiment analysis, Word2Vec/GloVe, and Large Language Models (LLMs).

Updated 6 months ago

tiny_qa_benchmark_pp • Rank 2.1 • Science 77%

Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing & regression-catching.

Updated 6 months ago

pandas-ai • Rank 24.5 • Science 54%

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

Updated 6 months ago

medusa-llm • Rank 13.6 • Science 64%

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Updated 6 months ago

tensorzero • Rank 23.4 • Science 54%

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Updated 6 months ago

learn_prompting • Rank 13.3 • Science 64%

Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our Discord for the largest Prompt Engineering learning community

Updated 6 months ago

mlora-cli • Rank 11.3 • Science 64%

An Efficient "Factory" to Build Multiple LoRA Adapters

Updated 6 months ago

rdocdump • Rank 8.1 • Science 67%

rdocdump: Dump ‘R’ Package Source, Documentation, and Vignettes into One File

Updated 6 months ago

langchain • Rank 39.0 • Science 36%

🦜🔗 Build context-aware reasoning applications 🦜🔗

Updated 6 months ago

tree-of-thought-prompting • Rank 8.0 • Science 67%

Using Tree-of-Thought Prompting to boost ChatGPT's reasoning

Updated 6 months ago

hallucination-leaderboard • Rank 10.8 • Science 64%

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

Updated 6 months ago

vulntrain • Rank 7.6 • Science 67%

A tool to generate datasets and models based on vulnerability descriptions from @Vulnerability-Lookup.

Updated 6 months ago

mosec • Rank 20.4 • Science 54%

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Updated 6 months ago

chemlift • Rank 4.0 • Science 67%

Language-interfaced fine-tuning for chemistry

Updated 6 months ago

alignment-handbook • Rank 16.9 • Science 54%

Robust recipes to align language models with human and AI preferences

Updated 6 months ago

json_repair • Rank 26.6 • Science 44%

A python module to repair invalid JSON from LLMs

Updated 6 months ago

sesgos_llm • Rank 1.1 • Science 67%

How do LLMs "get it wrong"?

Updated 6 months ago

optimate • Rank 12.8 • Science 54%

A collection of libraries to optimise AI model performance

Updated 6 months ago

agentlego • Rank 12.6 • Science 54%

Enhance LLM agents with rich tool APIs

Updated 6 months ago

openllm • Rank 22.1 • Science 44%

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

Updated 6 months ago

bugbug • Rank 11.6 • Science 54%

Platform for Machine Learning projects on Software Engineering

Updated 6 months ago

banks • Rank 21.3 • Science 44%

LLM prompt language based on Jinja. Banks provides tools and functions to build prompt text and chat messages from generic blueprints. It allows attaching metadata to prompts to ease their management, and versioning is a first-class citizen. Banks also provides ways to store prompts on disk along with their metadata.

Updated 6 months ago

beeai-framework • Rank 21.3 • Science 44%

Build production-ready AI agents in both Python and TypeScript.

Updated 6 months ago

chinese-llama-alpaca-2 • Rank 11.0 • Science 54%

Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, plus 64K ultra-long-context models

Updated 6 months ago

medicalgpt • Rank 10.7 • Science 54%

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Updated 6 months ago

oss-fuzz-gen • Rank 10.6 • Science 54%

LLM powered fuzzing via OSS-Fuzz.

Updated 6 months ago

llmebench • Rank 10.5 • Science 54%

Benchmarking Large Language Models

Updated 6 months ago

libre-chat • Rank 10.3 • Science 54%

🦙 Free and Open Source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline capable and easy to set up.

Updated 6 months ago

sparql-llm • Rank 10.3 • Science 54%

🦜✨ Chat system and reusable components to improve LLM capabilities when generating SPARQL queries

Updated 6 months ago

llm-lct-sequencing • Rank 2.2 • Science 62%

AI Semantic Insights: LLM Toolkit for Analysing Educational Practices and Knowledge Building.

Updated 6 months ago

magicoder • Rank 9.0 • Science 54%

[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct

Updated 6 months ago

basic-memory • Rank 18.8 • Science 44%

AI conversations that actually remember. Never re-explain your project to Claude again. Local-first, integrates with Obsidian. Join our Discord: https://discord.gg/tyvKNccgqN

Updated 6 months ago

deeplake • Rank 25.6 • Science 36%

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Updated 6 months ago

chinese-mixtral • Rank 7.1 • Science 54%

Chinese Mixtral mixture-of-experts (MoE) LLMs

Updated 6 months ago

monitors4codegen • Rank 7.0 • Science 54%

Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multispy` is an LSP client library in Python intended for building applications around language servers.

Updated 6 months ago

https://github.com/khoj-ai/khoj • Rank 24.1 • Science 36%

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Updated 5 months ago

https://github.com/deepset-ai/haystack-core-integrations • Rank 22.9 • Science 36%

Additional packages (components, document stores, and the like) to extend the capabilities of Haystack

Updated 6 months ago

https://github.com/nixtla/nixtla • Rank 22.6 • Science 36%

TimeGPT-1: production-ready pre-trained time series foundation model for forecasting and anomaly detection. A generative pretrained transformer for time series, trained on over 100B data points. It can accurately forecast across domains such as retail, electricity, finance, and IoT with just a few lines of code 🚀.

Updated 6 months ago

rl-llm-calibration-test • Rank 4.5 • Science 54%

An attempt to replicate parts of the paper "Language models (mostly) know what they know" on open datasets and models.

Updated 6 months ago

tap_llm_course • Rank 1.4 • Science 57%

Materials for the TAP (Tendencias en Aprendizaje Profundo, "Trends in Deep Learning") course of the Master's in Robotics and Artificial Intelligence at the Universidad de León

Updated 6 months ago

chinese-llama-alpaca • Rank 12.2 • Science 46%

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Updated 6 months ago

grammarflow • Rank 5.5 • Science 52%

Powering Agent Chains by Constraining LLM Outputs

Updated 6 months ago

lotr • Rank 3.4 • Science 54%

Low Tensor Rank adaptation of large language models

Updated 6 months ago

canada-labour-research-assistant • Rank 3.3 • Science 54%

The Canada Labour Research Assistant (CLaRA) is a privacy-first LLM-powered RAG AI assistant proposing Easily Verifiable Direct Quotations (EVDQ) to mitigate hallucinations in answering questions about Canadian labour laws, standards, and regulations. It works entirely offline and locally, guaranteeing the confidentiality of your conversations.

Updated 6 months ago

grimoire • Rank 5.9 • Science 51%

Grimoire is All You Need for Enhancing Large Language Models

Updated 6 months ago

langchain-rdf • Rank 2.2 • Science 54%

🦉 Utilities to improve LLM capabilities when working with SPARQL endpoints and RDF knowledge graphs, compatible with LangChain

Updated 6 months ago

nutcracker • Rank 1.9 • Science 54%

Large Model Evaluation Experiments

Updated 6 months ago

agentica • Rank 11.6 • Science 44%

Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!

Updated 6 months ago

https://github.com/mobiusml/hqq • Rank 19.5 • Science 36%

Official implementation of Half-Quadratic Quantization (HQQ)