Turftopic
Turftopic: Topic Modelling with Contextual Representations from Sentence Transformers - Published in JOSS (2025)
Speakerbox
Speakerbox: Few-Shot Learning for Speaker Identification with Transformers - Published in JOSS (2023)
Jury
Jury: A Comprehensive Evaluation Toolkit - Published in JOSS (2024)
pactus
pactus: A Python framework for trajectory classification - Published in JOSS (2023)
lazyllm-llamafactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
decimer
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.
txtai
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
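A minimal semantic-search sketch, assuming txtai's documented Embeddings interface; the model path, example texts, and query are illustrative.

```python
# pip install txtai
from txtai.embeddings import Embeddings

# Build an in-memory semantic index (config follows txtai's documented form)
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
data = [
    "US tops 5 million confirmed virus cases",
    "Canada's last fully intact ice shelf has suddenly collapsed",
]
embeddings.index([(uid, text, None) for uid, text in enumerate(data)])

# Returns a list of (id, score) tuples for the best semantic match
print(embeddings.search("health crisis", 1))
```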
farm-haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
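A minimal adapter-tuning sketch, assuming the library's documented AutoAdapterModel interface; the adapter name "sst2" and the "seq_bn" bottleneck config are illustrative choices.

```python
# pip install adapters
from adapters import AutoAdapterModel

# Load a base model with adapter support
model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# Add a bottleneck adapter and a matching head, then train only the adapter
model.add_adapter("sst2", config="seq_bn")
model.add_classification_head("sst2", num_labels=2)
model.train_adapter("sst2")  # freezes the base model, unfreezes the adapter
```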
rwkv-lm
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
learn_prompting
Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our Discord for the largest Prompt Engineering learning community
chronos-forecasting
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
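A minimal forecasting sketch, assuming the chronos-forecasting package's documented pipeline interface; the context values and horizon are illustrative.

```python
# pip install chronos-forecasting
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small", device_map="cpu", torch_dtype=torch.float32
)

context = torch.tensor([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])  # toy history
# Probabilistic output: sample paths of shape [num_series, num_samples, horizon]
forecast = pipeline.predict(context, prediction_length=4)
low, median, high = torch.quantile(
    forecast[0], torch.tensor([0.1, 0.5, 0.9]), dim=0
)
```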
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
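The "2 lines" refers to constructing the explainer and calling it on text; a minimal sketch below, with an illustrative SST-2 checkpoint (any sequence-classification model should work).

```python
# pip install transformers-interpret
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# The advertised two lines: build the explainer, then call it on text
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = cls_explainer("This movie was a complete waste of time.")
print(word_attributions)  # (token, attribution score) pairs
```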
mlm-bias
Measuring Biases in Masked Language Models for PyTorch Transformers. Support for multiple social biases and evaluation measures.
alignment-handbook
Robust recipes to align language models with human and AI preferences
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
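A minimal usage sketch, assuming the library's documented Tokenizer.from_pretrained loader; the checkpoint and sentence are illustrative.

```python
# pip install tokenizers
from tokenizers import Tokenizer

# Load a pretrained tokenizer definition from the Hugging Face Hub
tokenizer = Tokenizer.from_pretrained("bert-base-uncased")

encoding = tokenizer.encode("Hello, world!")
print(encoding.tokens)  # ['[CLS]', 'hello', ',', 'world', '!', '[SEP]']
print(encoding.ids)     # corresponding vocabulary ids
```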
wiki-entity-summarization-preprocessor
Converts raw Wikidata and Wikipedia files to filterable formats, with a focus on marking Wikidata entries as summaries based on their Wikipedia abstracts.
knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
maestro
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
transformer-srl
Reimplementation of a BERT-based model (Shi et al., 2019), currently the state of the art for English SRL. The model also implements predicate disambiguation.
optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
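A minimal ONNX Runtime sketch, assuming Optimum's documented ORTModel classes; export=True converts the checkpoint on the fly, and the model name is illustrative.

```python
# pip install optimum[onnxruntime]
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

name = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(name, export=True)
tokenizer = AutoTokenizer.from_pretrained(name)

# The optimized model drops into the familiar transformers pipeline API
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(clf("Accelerated inference with the same pipeline interface."))
```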
linear-relational
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
turkish-question-generation
Automated question generation and question answering from Turkish texts using text-to-text transformers
https://github.com/alan-turing-institute/robots-in-disguise
Information and materials for the Alan Turing Institute's "robots-in-disguise" reading group on fundamental AI research.
transformers-tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
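A minimal transcription sketch, assuming HuggingSound's documented SpeechRecognitionModel interface; the checkpoint and audio file names are illustrative.

```python
# pip install huggingsound
from huggingsound import SpeechRecognitionModel

model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-english")
audio_paths = ["sample1.wav", "sample2.wav"]  # illustrative local files

# Each result is a dict containing a "transcription" key (plus timestamps)
transcriptions = model.transcribe(audio_paths)
print(transcriptions[0]["transcription"])
```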
nerpy
🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit supporting models such as BertSoftmax and BertSpan, ready to use out of the box.
https://github.com/betswish/cross-lingual-consistency
Easy-to-use framework for evaluating the cross-lingual consistency of factual knowledge (supports LLaMA, BLOOM, mT5, RoBERTa, etc.). Paper: https://aclanthology.org/2023.emnlp-main.658/
hugsvision
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
stormtrooper
Zero/few shot learning components for scikit-learn pipelines with LLMs and transformers.
marqo-fashionclip
State-of-the-art CLIP/SigLIP embedding models fine-tuned for the fashion domain. +57% increase in evaluation metrics vs. FashionCLIP 2.0.
https://github.com/google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
logfire-callback
A callback for logging training events from Hugging Face's Transformers to Logfire 🤗
https://github.com/superduper-io/superduper
Superduper: End-to-end framework for building custom AI applications and agents.
efficient-task-transfer
Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021
napolab
The Natural Portuguese Language Benchmark (Napolab). Stay up to date with the latest advancements in Portuguese language models and their performance across carefully curated Portuguese language tasks.
https://github.com/biomedsciai/biomed-multi-omic
Build foundation models on RNA or DNA data
astronet
Efficient Deep Learning for Real-time Classification of Astronomical Transients and Multivariate Time-series
spacy-wrap
spaCy-wrap is a wrapper library for including fine-tuned transformers from Hugging Face in your spaCy pipeline, allowing you to use existing fine-tuned models within your spaCy workflow.
https://github.com/chris-santiago/met
Reproducing the MET framework with PyTorch
https://github.com/bioinfomachinelearning/deepinteract
A geometric deep learning framework (Geometric Transformers) for predicting protein interface contacts. (ICLR 2022)
https://github.com/aisuko/notebooks
Implementations of different ML tasks on the Kaggle platform with GPUs.
https://github.com/beomi/transformers-language-modeling
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3
https://github.com/dadananjesha/ai-text-humanizer-app
Transform AI-generated text into formal, human-like, academic writing with ease, while avoiding AI detectors!
https://github.com/compvis/geometry-free-view-synthesis
Is a geometric model required to synthesize novel views from a single image?
https://github.com/beomi/bitnet-transformers
0️⃣1️⃣🤗 BitNet-Transformers: a Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama (2) architecture
https://github.com/rindow/rindow-neuralnetworks
A neural network library for machine learning in PHP
https://github.com/bramvanroy/lt3-2019-transformer-trainer
Transformer trainer for a variety of classification problems, used in-house at LT3 for different research topics.
https://github.com/buaadreamer/dlpy
Programming Language for Deep Learning in Python
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
fluence
A deep learning library based on PyTorch, focused on low-resource language research and robustness
https://github.com/daniel-furman/sft-demos
Lightweight demos for fine-tuning LLMs. Powered by 🤗 transformers and open-source datasets.
https://github.com/beomi/gemma-easylm
Train Gemma on TPU/GPU! (Codebase for training the Gemma-Ko series)
https://github.com/cedrickchee/pytorch-pretrained-bert
PyTorch version of Google AI's BERT model with script to load Google's pre-trained models
https://github.com/chirayu-tripathi/paper-implementations
My implementation of Machine Learning and Deep Learning papers from scratch.
https://github.com/cosbidev/naim
Official implementation for the paper "Not Another Imputation Method: A Transformer-based Model for Missing Values in Tabular Datasets"
balena
BALanced Execution through Natural Activation: a human-computer interaction methodology for running code.
https://github.com/cyberagentailab/japanese-nli-model
This repository provides the code for a Japanese NLI model, a fine-tuned masked language model.
transformers-tf-finetune
Scripts to fine-tune Hugging Face Transformers models with TensorFlow 2
llm-confidentiality
Whispers in the Machine: Confidentiality in Agentic Systems
https://github.com/dc-research/tempo
The official code for "TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting" (ICLR 2024). TEMPO is one of the first open-source time series foundation models for forecasting (v1.0).
https://github.com/ai4co/routefinder
[ICML'24 FM-Wild Oral] RouteFinder: Towards Foundation Models for Vehicle Routing Problems
bitcoin-trader-ml
Automated 24/7 bitcoin trader for Coinbase using Transformer Neural Networks
master-thesis
One Bit at a Time: Impact of Quantisation on Neural Machine Translation
tsdae
Transformer-based denoising autoencoder for unsupervised pre-training of Sentence Transformers.
indic-syntax-evaluation
Vyākarana: A Colorless Green Benchmark for Syntactic Evaluation in Indic Languages
https://github.com/chris-santiago/tsfeast
A collection of Scikit-Learn compatible time series transformers and tools.
speechbrain
A PyTorch-based Speech Toolkit
https://github.com/cluebbers/nlp_deeplearning_spring2023
Implementing and fine-tuning BERT for sentiment analysis, paraphrase detection, and semantic textual similarity tasks. Includes code, data, and detailed results.
dpo-rlhf-paraphrase-types
Enhancing paraphrase-type generation using Direct Preference Optimization (DPO) and Reinforcement Learning from Human Feedback (RLHF), with large-scale HPC support. This project aligns model outputs to human-ranked data for robust, safety-focused NLP.
partial-embedding-matrix-adaptation
Vocabulary-level memory efficiency for language model fine-tuning.
pangoling
An R package for estimating the log-probabilities of words in a given context using transformer models.
linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
https://github.com/alan-turing-institute/prompto
An open source library for asynchronous querying of LLM endpoints