Updated 6 months ago

uform • Rank 15.6 • Science 64%

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Updated 6 months ago

textgen • Rank 12.8 • Science 64%

TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, SongNet, UDA, and so on. Supports training and prediction for these models and works out of the box.

Updated 6 months ago

code-bert-score • Rank 15.6 • Science 54%

CodeBERTScore: an automatic metric for code generation, based on BERTScore

Updated 6 months ago

tokenizers • Rank 14.0 • Science 54%

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Updated 6 months ago

pytextclassifier • Rank 12.2 • Science 54%

pytextclassifier is a toolkit for text classification. It implements classification models including LR, XGBoost, TextCNN, FastText, TextRNN, and BERT, and works out of the box.

Updated 6 months ago

detoxify • Rank 21.0 • Science 44%

Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unitary.ai.

Updated 6 months ago

transformer-srl • Rank 10.4 • Science 54%

Reimplementation of a BERT-based model (Shi et al., 2019), currently the state-of-the-art for English SRL. This model also implements predicate disambiguation.

Updated 6 months ago

bangla-bert • Rank 4.4 • Science 54%

Bangla-Bert is a pretrained BERT model for the Bengali language.

Updated 6 months ago

transformers-tutorials • Rank 11.4 • Science 46%

This repository contains demos I made with the Transformers library by HuggingFace.

Updated 6 months ago

nerpy • Rank 9.0 • Science 44%

🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit that supports models such as BertSoftmax and BertSpan and works out of the box.

Updated 6 months ago

efficient-task-transfer • Rank 3.6 • Science 41%

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Updated 6 months ago

quickai • Rank 10.5 • Science 26%

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

Updated 5 months ago

https://github.com/asyml/texar-pytorch • Rank 14.5 • Science 20%

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Updated 5 months ago

https://github.com/cvi-szu/linly • Rank 9.4 • Science 23%

Chinese-LLaMA 1 & 2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets.

Updated 4 months ago

https://github.com/deepset-ai/farm • Rank 19.1 • Science 10%

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

Updated 5 months ago

https://github.com/compnet/tibert • Rank 0.0 • Science 26%

End-to-End BERT-Based Coreference System

Updated 5 months ago

gpl • Rank 11.9 • Science 10%

Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577

Updated 6 months ago

band • Rank 8.7 • Science 13%

BAND: BERT Application aNd Deployment, a simple and efficient BERT model training and deployment framework.

Updated 5 months ago

https://github.com/bytedance/bytetransformer • Rank 6.9 • Science 13%

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

Updated 5 months ago

text-sim • Rank 7.9 • Science 10%

Text similarity (matching) computation, providing baselines, training, inference, metric analysis... The code includes both TensorFlow and PyTorch versions.

Updated 5 months ago

https://github.com/amazon-science/bold • Science 13%

Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper

Updated 5 months ago

https://github.com/amazon-science/transformers-data-augmentation • Science 26%

Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper

Updated 6 months ago

medkit-lib • Science 44%

Toolkit for a learning health system

Updated 6 months ago

tsdae • Science 54%

Transformer-based Denoising AutoEncoder for unsupervised pre-training of Sentence Transformers.

Updated 5 months ago

https://github.com/awslabs/mlm-scoring • Science 10%

Python library & examples for Masked Language Model Scoring (ACL 2020)

Updated 6 months ago

openai-clip • Science 67%

Simple implementation of OpenAI CLIP model in PyTorch.

Updated 6 months ago

word-embeddings-repository-for-turkish • Science 49%

Code for "A Comprehensive Analysis of Static Word Embeddings for Turkish". Expert Systems with Applications 2024.

Updated 5 months ago

https://github.com/cyberagentailab/japanese-nli-model • Science 10%

This repository provides the code for a Japanese NLI model, a fine-tuned masked language model.

Updated 6 months ago

contextual-spell-checker-for-bangla • Science 26%

Automatic Context-Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance

Updated 6 months ago

llms-from-scratch • Science 26%

Build your own Large Language Model from scratch with this code repository. Learn the ins and outs of LLMs like GPT. 🚀💻

Updated 5 months ago

https://github.com/atharvapathak/twitter_sentiment_analysis_project • Science 23%

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

Updated 6 months ago

partial-embedding-matrix-adaptation • Science 41%

Vocabulary-level memory efficiency for language model fine-tuning.

Updated 5 months ago

https://github.com/alcantarar/biomchbert • Science 10%

Repository for BiomchBERT, the neural network classifying papers for the weekly Biomch-L Literature Update

Updated 6 months ago

banglasenti-dataset-prep • Science 44%

BanglaSenti Dataset Preparation: Bangla Sentiment Analysis CSV Dataset for NLP & Machine Learning

Updated 4 months ago

https://github.com/dimits-ts/large-text-nlp-survey • Science 10%

A survey paper exploring the use of state-of-the-art deep neural network architectures in NLP problems featuring very large documents.

Updated 5 months ago

https://github.com/cedrickchee/bert-pytorch • Science 10%

PyTorch implementation of Google AI's 2018 BERT

Updated 6 months ago

automated-identification-of-security-relevant-configuration-settings-using-nlp • Science 52%

This repository is part of the paper "Automated Identification of Security-Relevant Configuration Settings Using NLP" accepted at the Industry Showcase track at the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE). https://conf.researchr.org/track/ase-2022/ase-2022-industry-showcase.

Updated 6 months ago

nusabert • Science 54%

NusaBERT: Teaching IndoBERT to be multilingual and multicultural!

Updated 5 months ago

https://github.com/cluebbers/nlp_deeplearning_spring2023 • Science 20%

Implementing and fine-tuning BERT for sentiment analysis, paraphrase detection, and semantic textual similarity tasks. Includes code, data, and detailed results.