Multi-view-AE
Multi-view-AE: A Python package for multi-view autoencoder models - Published in JOSS (2023)
MM-PoE
MM-PoE: Multiple Choice Reasoning via Process of Elimination using Multi-Modal Models - Published in JOSS (2025)
IncrementalInference
Clique recycling non-Gaussian (multi-modal) factor graph solver; also see Caesar.jl.
Caesar
Robust robotic localization and mapping, together with NavAbility(TM). Reach out to info@wherewhen.ai for help.
deepke
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
https://github.com/modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
https://github.com/bytedance/salmonn
SALMONN family: A suite of advanced multi-modal LLMs
farmvibes-ai
FarmVibes.AI: Multi-Modal GeoSpatial ML Models for Agriculture and Sustainability
https://github.com/kyegomez/aoa-torch
Implementation of Attention on Attention in Zeta
https://github.com/brains-on-code/codersmuse
CodersMUSE is a prototype implementation to explore multi-modal data of program-comprehension experiments.
https://github.com/924973292/idea
【CVPR2025】IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification
https://github.com/ammarlodhi255/metadata-augmented-neural-networks-for-wild-animal-classification
This repository contains the implementation code for the paper "Metadata Augmented Neural Networks For Wild Animal Classification": https://www.sciencedirect.com/science/article/pii/S1574954124003479.
https://github.com/amazon-science/contrastive_emc2
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
https://github.com/amazon-science/fmcore
Running GenAI models at every scale, on every modality
https://github.com/924973292/mambapro
【AAAI2025】MambaPro: Multi-Modal Object Re-Identification with Mamba Aggregation and Synergistic Prompt
https://github.com/924973292/demo
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
https://github.com/924973292/top-reid
【AAAI2024】TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation
https://github.com/924973292/editor
【CVPR2024】Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
https://github.com/biodt/bfm-model
Multi-modal Foundation Model for Biodiversity dynamics forecasting
https://github.com/brainlesion/preprocessing
Preprocessing tools for multi-modal 3D brain imaging
https://github.com/openbmb/minicpm-v
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and Video Understanding on Your Phone
https://github.com/awslabs/rhubarb
A Python framework for multi-modal document understanding with Amazon Bedrock