Augmenty
Augmenty: A Python Library for Structured Text Augmentation - Published in JOSS (2024)
hanlp
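Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing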
catalyst
Accelerated deep learning R&D
tasksource
Dataset collection and preprocessing framework for NLP extreme multitask learning
annif
Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
torchdistill
A coding-free framework built on PyTorch for reproducible deep learning studies. PyTorch Ecosystem. 🏆26 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs and configurations are available to ensure reproducibility and support benchmarking.
knowprompt
[WWW 2022] KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction
https://github.com/google-research/retvec
RETVec is an efficient, multilingual, and adversarially-robust text vectorizer.
classy-classification
This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
pytextclassifier
pytextclassifier is a toolkit for text classification. It implements classification models including LR, Xgboost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.
obsei
Obsei is a low-code, AI-powered automation tool. It can be used in various business flows such as social listening, AI-based alerting, brand image analysis, comparative studies, and more.
aniemore
Emotion recognition from audio and text files (Russian language only)
https://github.com/brucewlee/lingfeat
[EMNLP 2021] LingFeat - A Comprehensive Linguistic Features Extraction ToolKit for Readability Assessment
spacy-wrap
spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Huggingface in your spaCy pipeline.
https://github.com/chenghaomou/pytorch-pqrnn
Implementation of pQRNN in PyTorch
https://github.com/cahya-wirawan/text-classification
Text Classification engine using several algorithms in machine learning
https://github.com/bayer-group/xtars-naacl2022
Zero/few-shot learning for classification with very large label sets and long-tailed distribution of labels in data points
https://github.com/bagustris/isst_2019
Repository for text emotion recognition submitted to ISST 2019
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
band
BAND: BERT Application aNd Deployment, a simple and efficient BERT model training and deployment framework.
https://github.com/cair/textunderstandingtsetlinmachine
Using the Tsetlin Machine to learn human-interpretable rules for high-accuracy text categorization with medical applications
https://github.com/autodistill/autodistill-setfit
Train a SetFit model for use in text classification.
https://github.com/capjamesg/textannotate
An annotation and review tool for text classification model training.
https://github.com/csinva/iprompt
Finding semantically meaningful and accurate prompts.
hackathon-leaderboard
Automated Leaderboard System for Hackathon Evaluation Using Large Language Models
https://github.com/ben-aaron188/snlp
2-day course on Statistical Natural Language Processing in R (foundational level)
porkcnn
A Small Project for Pork Barrel Legislation Classification Using Convolutional Neural Networks (Lour's Pork Barrel Classifier, 羅老師肉桶法案分類器) 🍖🐖🥩🐷
https://github.com/chaoscodes/untl
EMNLP'2022: Unsupervised Non-transferable Text Classification
azimuth
Helping AI practitioners better understand their datasets and models in text classification. From ServiceNow.
banglasenti-dataset-prep
BanglaSenti Dataset Preparation: Bangla Sentiment Analysis CSV Dataset for NLP & Machine Learning
https://github.com/bgonzalezbustamante/textclass-benchmark
TextClass Benchmark Leaderboards