Updated 9 months ago
https://github.com/ai4bharat/indicinstruct
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
Updated 9 months ago
https://github.com/ai4bharat/setu
Setu is a comprehensive pipeline for cleaning, filtering, and deduplicating data from diverse sources, including web, PDF, and speech. Built on Apache Spark, it comprises four stages: document preparation, document cleaning and analysis, flagging and filtering, and deduplication.
Updated 9 months ago
https://github.com/ai4co/parco
PARCO: Parallel AutoRegressive Combinatorial Optimization
Updated 9 months ago
https://github.com/ai4co/unsupervised-co-ucom2
[ICML'24] Tackling Prevalent Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More
Updated 9 months ago
https://github.com/ai4co/rl
A modular, primitive-first, Python-first PyTorch library for Reinforcement Learning.
Updated 9 months ago
https://github.com/ai4er-cdt/earthquake-predictability
Codebase for the 2023 GTC Project on Earthquake Predictability