contextualspellcheck
✔️ Contextual word checker for better suggestions (not actively maintained)
uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
code-bert-score
CodeBERTScore: an automatic metric for code generation, based on BERTScore
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
pytextclassifier
pytextclassifier is a toolkit for text classification. It implements classification models including LR, XGBoost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.
detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unitary.ai.
transformer-srl
Reimplementation of a BERT-based model (Shi et al., 2019), currently the state-of-the-art for English SRL. This model also implements predicate disambiguation.
transformers-tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
nerpy
🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit supporting models such as BertSoftmax and BertSpan, ready to use out of the box.
hugsvision
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
https://github.com/cedrickchee/awesome-transformer-nlp
A curated list of NLP resources focused on Transformer networks, attention mechanism, GPT, BERT, ChatGPT, LLMs, and transfer learning.
efficient-task-transfer
Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021
quickai
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
https://github.com/asyml/texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
https://github.com/cvi-szu/linly
Chinese-LLaMA 1 & 2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets
https://github.com/beomi/transformers-language-modeling
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3
https://github.com/deepset-ai/farm
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
https://github.com/bytedance/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
https://github.com/explosion/spacy-transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvements: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
band
BAND: BERT Application aNd Deployment, a simple and efficient BERT model training and deployment framework.
https://github.com/bytedance/bytetransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
https://github.com/ai-forever/ner-bert
BERT-NER (ner-bert) with Google BERT https://github.com/google-research.
https://github.com/aryashah2k/nlp-data-augmentation
Implementing 5 Different Approaches To Augmenting Data For Natural Language Processing Tasks.
https://github.com/amazon-science/bold
Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper
https://github.com/amazon-science/transformers-data-augmentation
Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper
tsdae
Transformer-based Denoising AutoEncoder for unsupervised pre-training of Sentence Transformers.
https://github.com/awslabs/mlm-scoring
Python library & examples for Masked Language Model Scoring (ACL 2020)
word-embeddings-repository-for-turkish
Code for "A Comprehensive Analysis of Static Word Embeddings for Turkish". Expert Systems with Applications 2024.
https://github.com/johnsnowlabs/johnsnowlabs
Gateway into the John Snow Labs Ecosystem
https://github.com/cyberagentailab/japanese-nli-model
This repository provides the code for a Japanese NLI model, a fine-tuned masked language model.
https://github.com/cabralpinto/wildfire-heat-map-generation
Wildfire Heat Map Generation with Twitter and BERT
contextual-spell-checker-for-bangla
Automatic Context-Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance
llms-from-scratch
Build your own Large Language Model from scratch with this code repository. Learn the ins and outs of LLMs like GPT. 🚀💻
https://github.com/atharvapathak/twitter_sentiment_analysis_project
Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.
partial-embedding-matrix-adaptation
Vocabulary-level memory efficiency for language model fine-tuning.
https://github.com/alcantarar/biomchbert
Repository for BiomchBERT, the neural network classifying papers for the weekly Biomch-L Literature Update
banglasenti-dataset-prep
BanglaSenti Dataset Preparation: Bangla Sentiment Analysis CSV Dataset for NLP & Machine Learning
https://github.com/dimits-ts/large-text-nlp-survey
A survey paper exploring the use of state-of-the-art deep neural network architectures in NLP problems featuring very large documents.
https://github.com/cedrickchee/bert-pytorch
Google AI 2018 BERT PyTorch implementation
automated-identification-of-security-relevant-configuration-settings-using-nlp
This repository is part of the paper "Automated Identification of Security-Relevant Configuration Settings Using NLP" accepted at the Industry Showcase track at the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE). https://conf.researchr.org/track/ase-2022/ase-2022-industry-showcase.
FMAT
😷 The Fill-Mask Association Test (FMAT): Measuring Propositions in Natural Language.
https://github.com/cluebbers/nlp_deeplearning_spring2023
Implementing and fine-tuning BERT for sentiment analysis, paraphrase detection, and semantic textual similarity tasks. Includes code, data, and detailed results.