TextDescriptives
TextDescriptives: A Python package for calculating a large variety of metrics from text - Published in JOSS (2023)
Augmenty
Augmenty: A Python Library for Structured Text Augmentation - Published in JOSS (2024)
TRUNAJOD
TRUNAJOD: A text complexity library to enhance natural language processing - Published in JOSS (2021)
Mordecai
Mordecai: Full Text Geoparsing and Event Geocoding - Published in JOSS (2017)
edsnlp
Modular, fast NLP framework, compatible with PyTorch and spaCy, offering tailored support for French clinical notes.
pytextrank
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
contextualspellcheck
✔️ Contextual word checker for better suggestions (not actively maintained)
negativas
negativas, a tool to assist in searching for and classifying sentential negations in Brazilian Portuguese.
classy-classification
This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Hugging Face.
thinc
🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
asent
Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.
concise-concepts
This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
crosslingual-coreference
A multilingual approach to AllenNLP coreference resolution, along with a wrapper for spaCy.
astred
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. Useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
https://github.com/brucewlee/lingfeat
[EMNLP 2021] LingFeat - A Comprehensive Linguistic Features Extraction ToolKit for Readability Assessment
spacy-wrap
spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Hugging Face in your spaCy pipeline.
https://github.com/bramvanroy/spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
https://github.com/brucewlee/lftk
[BEA @ ACL 2023] General-purpose tool for linguistic features extraction; Tested on readability assessment, essay scoring, fake news detection, hate speech detection, etc.
https://github.com/explosion/spacy-stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
https://github.com/centrefordigitalhumanities/textminer
A script to detect named entities and store them in an Elasticsearch annotated_text field
https://github.com/explosion/spacy-transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
https://github.com/argilla-io/spacy-wordnet
spacy-wordnet creates annotations that make it easy to use WordNet and WordNet Domains through the NLTK WordNet interface
epitator
EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and EIDR Connect.
https://github.com/ccoreilly/spacy-catala
spaCy NLP model for the Catalan language
https://github.com/dcavar/antisemitismdatathon2020
This is project material for the Antisemitism Datathon and Hackathon 2020 at Indiana University
https://github.com/ccoreilly/spacy-catala-generator
Training and dataset used for the Catalan spaCy model
https://github.com/aryashah2k/sasbitathon-winningsolution
First-place solution for the SAS | GIM Bitathon, an annual data science hackathon organized by SAS and Goa Institute of Management. The dataset is a subset of the consumer complaints database provided by www.consumerfinance.gov
https://github.com/acdh-oeaw/acdh-prodigy-utils
Custom loaders for spaCy's Prodigy
charles-burney-digital
Digital preparation, enrichment, and geovisualization of a travel account by the music historian Charles Burney, using Transkribus, spaCy NER, and Nodegoat
bachelor_thesis_project
System for Training-based Expansion of Tools for Proper Name Mentions Recognition Based on Active Learning
https://github.com/atharvapathak/twitter_sentiment_analysis_project
Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.
spacy-models
German spaCy models trained on German UD-HDT and on a collection of gold and silver standard NER corpora
https://github.com/centre-for-humanities-computing/conspiracies
A Python package for discovering and examining conspiracies using NLP.