Shekar: A Python Toolkit for Persian Natural Language Processing
Published in JOSS (2025)
simplemma
Simple multilingual lemmatizer for Python, especially useful when speed and efficiency matter
xsnts-analyzer
Application that scrapes Twitter/X data, performs Polish-language text normalization and lemmatization, builds topic models with MALLET, and runs sentiment analysis for downstream analytics. Created as a master's thesis project.
https://github.com/chartes/deucalion-model-af
Pie model for the lemmatization of Old French
https://github.com/acdh-oeaw/tokeneditor
TokenEditor is a web application for manual annotation of text, or for manual review of automatic annotations. Although primarily aimed at reviewing PoS tags and lemmas, it is fully customizable and can support any annotation level.
pyrrha
A language-independent post-correction app for POS-tagging and lemmatization
pet-project-nlp
Natural language processing pet project. It includes web scraping, lemmatization, stemming, and working with related words (hyponyms, hypernyms, meronyms, holonyms). This code gathers all data from selected pages of the Suspilne (Суспільне) website. The data is then cleaned and processed for later analysis.