mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
https://github.com/google-research/tapas
End-to-end neural table-text understanding models.
kex
Kex is a Python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.
loksabha-questions
Questions asked in the Lok Sabha - collection and analysis of trends. Creating the dataset from scratch.
https://github.com/csinva/gpt-paper-title-generator
Generating paper titles (and more!) with GPT trained on data scraped from arXiv.
neurox
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
https://github.com/argosopentech/metaltranslate
Customizable machine translation in C++
https://github.com/bentoml/transformers-nlp-service
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more
bundestags-mine
Analysing the German Bundestag by means of Natural Language Processing with the Bundestags-Mine.
https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind
An image captioning system that predicts and speaks aloud a caption for an image taken by a visually impaired person.
https://github.com/aarhus-psychiatry-research/care-ml
CARE-ML: Predicting the use of restraint on psychiatric inpatients using EHRs and ML. Developed by sarakolding and signekb for their Master's Thesis.
annodash
[JAMIA Open] A clinical terminology annotation dashboard created using Plotly Dash & the MIMIC-IV database.
trump-speech-analysis
Statistical patterns in political rhetoric: The quantitative analysis of Donald Trump's 2024 election campaign speeches
https://github.com/boeing/sdr-hazards-classification
Collaboration work between FAA and Boeing on identifying safety hazards in Service Difficulty Reports (SDR)
pet-project-nlp
Natural language processing pet project. It includes web scraping, lemmatizing, stemming, and working with related words (hyponyms, hypernyms, meronyms, holonyms). This code gathers all data from chosen pages of the Suspilne (Суспільне) website. The data is then processed for future analysis.
https://github.com/chartes/deucalion-model-af
Pie model for the lemmatization of Old French
clinical_trial_risk
Clinical Trial Risk Tool for analysing clinical trial protocols using natural language processing and assessing the risk of ending uninformatively, developed by Fast Data Science