unsupervised-multilingual-text-detoxifier
https://github.com/bda-kts/unsupervised-multilingual-text-detoxifier
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: BDA-KTS
- License: mit
- Language: Python
- Default Branch: main
- Size: 242 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
mDetoxifier Multilingual unsupervised text detoxifier
Description
This method leverages advanced natural language processing techniques to automatically identify and neutralize toxic language in text, regardless of the language used. By combining language detection, toxic word identification, linguistic pattern analysis, and state-of-the-art mask prediction models, mDetoxifier transforms harmful content into neutral text while maintaining the original intent and meaning. The approach is fully unsupervised, making it adaptable to new languages and domains without the need for extensive labeled datasets. This makes it a powerful tool for researchers, developers, and organizations seeking to promote healthier online communication across diverse linguistic communities.
Keywords
text-detoxification, toxicity, mask-prediction, sentence-similarity, sequence-to-sequence models
Relevant research questions that could be addressed with the help of this method
- Presence of offensive language in tweets on social media and how to detoxify them (S. Poria, E. Cambria, D. Hazarika, P. Vij, A deeper look into sarcastic tweets using deep convolutional neural networks, arXiv preprint arXiv:1610.08815 (2016))
- Detecting and neutralising hate speech on various social media platforms (P. Liu, J. Guberman, L. Hemphill, A. Culotta, Forecasting the presence and intensity of hostility on Instagram using linguistic and social features, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 12, 2018)
- Toxicity mitigation for low-resource languages (P. Liu, J. Guberman, L. Hemphill, A. Culotta, Forecasting the presence and intensity of hostility on Instagram using linguistic and social features, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 12, 2018)
Social Science Usecase
Mary is a researcher who wants to investigate hate in online and traditional media. She wants to study its impact on individuals, audiences, and communities, and to find neutralised alternatives to toxic texts. She has a large collection of toxic inputs in many different languages from different websites, but wants to gather them all in one place and search those pertaining to guns. She uses the search box to find methods related to toxicity. The search functionality of the MH shows her a list of related methods and tutorials. She then uses the mDetoxifier Multilingual unsupervised text detoxifier to neutralise the texts in various languages and carry out her research.
Repository Structure
mdetox.py - The main file to run the project
Environment Setup
This program requires Python 3.x to run.
How to Use
Call the method in the following way:

maskedsentences = masksimilarwords(hi, hindi_sentences, selected_tokens)

where `hi` is the list of toxic words in Hindi and `hindi_sentences` is a list of toxic sentences in Hindi. The same method can be applied to any language by replacing the toxic words and sentences with those of the desired language.
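The actual implementation lives in `mdetox.py`; purely as an illustration, a `masksimilarwords`-style step might mask fuzzy matches of lexicon words like the sketch below. The function body, similarity threshold, and mask token here are assumptions, not the repository's code (it uses stdlib `difflib` in place of the repository's hashing-based matcher):

```python
from difflib import SequenceMatcher

MASK = "<mask>"  # placeholder; the real pipeline uses XLM-RoBERTa's mask token

def mask_similar_words(toxic_words, sentences, threshold=0.8):
    """Replace words similar to any toxic lexicon entry with a mask token.

    Hypothetical sketch: the repository's masksimilarwords() may differ.
    """
    masked = []
    for sentence in sentences:
        out = []
        for word in sentence.split():
            is_toxic = any(
                SequenceMatcher(None, word.lower(), t.lower()).ratio() >= threshold
                for t in toxic_words
            )
            out.append(MASK if is_toxic else word)
        masked.append(" ".join(out))
    return masked

masked = mask_similar_words(["idiot"], ["you are an idiot friend"])
# masked == ["you are an <mask> friend"]
```

Fuzzy matching (rather than exact lookup) is what lets the lexicon catch spelling variants of the same curse word.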
Digital Behavioral data
Sample Input
A list of toxic sentences. This can be anything; for example, a sample list of toxic texts for multiple languages can be found here: https://huggingface.co/datasets/textdetox/multilingualparadetoxtest
Sample Output
A list of detoxified sentences, along with their corresponding toxic originals and the detected language
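As a purely illustrative example of the input and output shapes described above (the records and field names below are made up, not the tool's actual schema):

```python
# Illustrative only: made-up records showing the described data shapes.
toxic_inputs = [
    "Eres un completo idiota.",   # Spanish
    "You are such a moron.",      # English
]

detox_outputs = [
    {"toxic": "Eres un completo idiota.",
     "detox": "Eres una persona normal.",
     "lang": "es"},
    {"toxic": "You are such a moron.",
     "detox": "You are such a person.",
     "lang": "en"},
]

# Every output record keeps its toxic original and the detected language.
assert len(detox_outputs) == len(toxic_inputs)
```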
mdetox pipeline
- Language Detection Module: The first step a toxic text passes through is a language detection module. We used the Python langdetect library for this purpose.
- Toxic Words Identification and Masking: To identify toxic words in the sentences, we adopted a combination of hashing-based techniques and the log-odds ratio. As a starting point, we utilized a list of toxic lexicons. We employed a hashing-based sequence-matching mechanism to identify words similar to these lexicons beyond a certain threshold. The identified toxic words were then removed from the sentences and replaced with masks.
- Mask Placement with Linguistic Patterns: Languages follow certain grammatical paradigms or linguistic rules that aid in constructing sentences. By observing these rules, we were able to better place the masks in sentences.
- Mask Prediction: After identifying and masking toxic words and applying linguistic rules, we were left with sentences containing masked toxic words. To handle these, we used the XLM-RoBERTa large model. Using this model, we predicted the top three probable replacements for each mask and generated candidate sentences accordingly.
- Sentence Similarity: From the resulting candidate sentences, we chose the one with the lowest distance score, i.e., the candidate closest to the input toxic sentence, as our selected output sentence.
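The final selection step can be sketched as follows. This is a simplified stand-in: it uses stdlib `difflib` similarity instead of the pipeline's sentence-similarity score (so "lowest distance" becomes "highest ratio"), and a hand-written candidate list instead of real XLM-RoBERTa mask predictions:

```python
from difflib import SequenceMatcher

def pick_closest(original, candidates):
    """Return the candidate most similar to the original sentence.

    Stand-in for the pipeline's sentence-similarity step, using
    difflib's ratio as a rough similarity measure.
    """
    return max(candidates,
               key=lambda c: SequenceMatcher(None, original, c).ratio())

original = "you are such a moron"
# In the real pipeline these would be the top-3 mask predictions
# from XLM-RoBERTa large; here they are made up for illustration.
candidates = [
    "you are such a person",
    "you are such a good friend",
    "you are wonderful",
]
best = pick_closest(original, candidates)
# best == "you are such a person"
```

Choosing the candidate closest to the input is what preserves the original intent while only the toxic span changes.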
Limitation
The method needs a list of toxic lexicons (curse words in specific languages). A list of toxic lexicons for nine languages (English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian, Russian, and Amharic) is provided here: https://huggingface.co/datasets/textdetox/multilingualtoxiclexicon. Users can add or edit entries as needed.
Contact
Susmita.Gangopadhyay@gesis.org
Publication
- HybridDetox: Combining Supervised and Unsupervised Methods for Effective Multilingual Text Detoxification (Susmita Gangopadhyay, M. Taimoor Khan and Hajira Jabeen). In review for the PAN CLEF Multilingual Text Detoxification Challenge.
Owner
- Name: BDA-KTS
- Login: BDA-KTS
- Kind: organization
- Repositories: 1
- Profile: https://github.com/BDA-KTS
Citation (citation.cff)
cff-version: 1.2.0
title: mDetoxifier Multilingual unsupervised text detoxifier
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Susmita
family-names: Gangopadhyay
orcid: '0009-0009-1520-9070'
affiliation: 'GESIS , Leibniz Institute for the Social Sciences'
abstract: >-
mDetoxifier is a multilingual unsupervised text detoxifier that removes toxic content from text data. It is designed to work with multiple languages and can be used in various applications where text toxicity needs to be mitigated.
keywords:
- text detoxification
- multilingual
- unsupervised learning
- natural language processing
- toxicity removal
license: MIT
GitHub Events
Total
- Issues event: 1
- Issue comment event: 1
- Push event: 1
- Pull request event: 1
- Fork event: 1
Last Year
- Issues event: 1
- Issue comment event: 1
- Push event: 1
- Pull request event: 1
- Fork event: 1
Dependencies
- PyYAML ==6.0.2
- aiohappyeyeballs ==2.6.1
- aiohttp ==3.12.13
- aiosignal ==1.3.2
- attrs ==25.3.0
- certifi ==2025.6.15
- charset-normalizer ==3.4.2
- click ==8.2.1
- datasets ==3.6.0
- dill ==0.3.8
- filelock ==3.18.0
- frozenlist ==1.7.0
- fsspec ==2025.3.0
- hf-xet ==1.1.5
- huggingface-hub ==0.33.1
- idna ==3.10
- joblib ==1.5.1
- langdetect ==1.0.9
- multidict ==6.6.2
- multiprocess ==0.70.16
- nltk ==3.9.1
- numpy ==2.3.1
- packaging ==25.0
- pandas ==2.3.0
- propcache ==0.3.2
- pyarrow ==20.0.0
- python-dateutil ==2.9.0.post0
- pytz ==2025.2
- regex ==2024.11.6
- requests ==2.32.4
- safetensors ==0.5.3
- six ==1.17.0
- tokenizers ==0.21.2
- tqdm ==4.67.1
- transformers ==4.53.0
- typing_extensions ==4.14.0
- tzdata ==2025.2
- urllib3 ==2.5.0
- xxhash ==3.5.0
- yarl ==1.20.1