multilang-probe
A solution to detect languages and type characters in a multilingual setting.
turkish-question-generation
Automated question generation and question answering from Turkish texts using text-to-text transformers
https://github.com/douglasneuroinformatics/opendatacapture
An electronic data capture platform for administering remote and in-person clinical instruments
scribesalad
A collection of YouTube video transcripts: podcasts (Joe Rogan Experience, Tim Ferriss, Jocko podcast, ...), lectures (YaleCourses, MIT lectures, ...). A big transcript salad spanning history, geography, science, politics, filmmaking, and more.
https://github.com/bigscience-workshop/data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
allophant
A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.
universalpython
Write Python in any human language. UniversalPython is a transpiler which makes it possible to write Python code in different human languages like Urdu, German, Czech, and more. The code is translated to Python.
https://github.com/bkader/ci-gettext-hook
This hook enables the use of php_gettext with the CodeIgniter framework
https://github.com/ai4bharat/indicinstruct
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
https://github.com/asreview/asreview-multilingual-feature-extractor
A model extension for ASReview. ASReview multilingual feature extractor is a feature extractor based on distiluse-base-multilingual-cased-v1.