https://github.com/betswish/cross-lingual-consistency
Easy-to-use framework for evaluating the cross-lingual consistency of factual knowledge in language models (supports LLaMA, BLOOM, mT5, RoBERTa, etc.). Paper: https://aclanthology.org/2023.emnlp-main.658/
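The gist of the evaluation can be illustrated with a plain fill-mask probe: query the same fact in two languages and compare the ranked predictions. The sketch below uses the generic Hugging Face pipeline with bert-base-multilingual-cased and a crude overlap score; it does not reproduce the repository's own prompts or its RankC metric.

```python
# Illustrative cross-lingual consistency probe (not the repo's API):
# ask the same factual question in two languages and compare the
# model's top predictions.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# The same factual query expressed in English and French.
prompts = {
    "en": "The capital of France is [MASK].",
    "fr": "La capitale de la France est [MASK].",
}

predictions = {}
for lang, prompt in prompts.items():
    # Keep the top-5 candidate tokens for each language.
    predictions[lang] = [p["token_str"] for p in fill(prompt, top_k=5)]
    print(lang, predictions[lang])

# A crude consistency signal: overlap between the two ranked lists.
print("shared candidates:", set(predictions["en"]) & set(predictions["fr"]))
```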
efficient-task-transfer
Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021
https://github.com/asyml/texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
https://github.com/deepset-ai/farm
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
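A minimal question-answering inference sketch following the usage pattern from the project's README (FARM has since been superseded by Haystack, so treat the exact API as version-dependent):

```python
# Extractive QA with FARM's Inferencer, per the README usage pattern.
from farm.infer import Inferencer

nlp = Inferencer.load("deepset/roberta-base-squad2",
                      task_type="question_answering")

# FARM expects a list of dicts pairing questions with a context text.
qa_input = [{
    "questions": ["Who discovered penicillin?"],
    "text": "Penicillin was discovered in 1928 by Alexander Fleming.",
}]

result = nlp.inference_from_dicts(dicts=qa_input)
print(result)  # predicted answer spans with scores
```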
zabantu-beta
ZaBantu is a fleet of lightweight masked language models for Southern Bantu languages.
llms-from-scratch
Build your own Large Language Model from scratch with this code repository. Learn the ins and outs of LLMs like GPT. 🚀💻
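For a flavor of what such a from-scratch build involves, here is a single-head causal self-attention module in PyTorch, one of the core components a GPT-style model is assembled from (an illustrative sketch, not code from the repository):

```python
# Minimal single-head causal self-attention, the kind of building block
# a build-an-LLM-from-scratch walkthrough implements before assembling
# full transformer blocks.
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.out = nn.Linear(d_model, d_model)
        self.d_model = d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / self.d_model ** 0.5
        # Causal mask: position i may only attend to positions <= i.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return self.out(weights @ v)

attn = CausalSelfAttention(d_model=64)
print(attn(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```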
https://github.com/cyberagentailab/japanese-nli-model
This repository provides the code for a Japanese NLI model, a fine-tuned masked language model.
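Once such a model is published on the Hugging Face Hub, inference reduces to the standard text-classification pipeline. The model identifier below is a placeholder, not the repository's actual checkpoint name:

```python
# Premise/hypothesis classification with a fine-tuned NLI model.
from transformers import pipeline

nli = pipeline("text-classification",
               model="your-org/japanese-nli-model")  # hypothetical model name

# NLI input is a sentence pair; the pipeline accepts it as a dict
# with "text" (premise) and "text_pair" (hypothesis).
result = nli({"text": "猫がソファで寝ている。",     # "A cat is sleeping on the sofa."
              "text_pair": "動物が眠っている。"})    # "An animal is sleeping."
print(result)  # e.g. {'label': 'entailment', 'score': ...}
```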
roberta-legal-portuguese
Resources related to the paper "RoBERTaLexPT: A Legal RoBERTa Model pretrained with deduplication for Portuguese".
https://github.com/cosmaadrian/nli-stress-test
Official repository for the EMNLP 2024 paper "How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics"
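The paper characterizes NLI examples via their training dynamics. The sketch below shows the standard confidence/variability statistics in the style of dataset cartography (Swayamdipta et al., 2020), computed from per-epoch gold-label probabilities; it illustrates the general technique, not this paper's exact procedure, and the probabilities are stand-ins for values you would log during fine-tuning:

```python
# Characterizing examples by training dynamics: given the model's
# probability of the gold label for each example at each epoch,
# compute per-example confidence (mean) and variability (std).
import numpy as np

# gold_probs[i, e] = P(gold label of example i) after epoch e.
gold_probs = np.array([
    [0.90, 0.95, 0.97],  # consistently right -> "easy"
    [0.20, 0.50, 0.80],  # unstable across epochs -> "ambiguous"
    [0.10, 0.15, 0.10],  # consistently wrong -> "hard"
])

confidence = gold_probs.mean(axis=1)
variability = gold_probs.std(axis=1)

for i, (c, v) in enumerate(zip(confidence, variability)):
    print(f"example {i}: confidence={c:.2f} variability={v:.2f}")
```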