https://github.com/dimits-ts/llm_moderation_research

Synthetic dialogue creation. Experiments exploring the role of LLMs in the moderation of deliberation systems.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary

Keywords

llm moderation natural-language-processing research synthetic-dataset-generation

Last synced: 4 months ago

Repository

Basic Info

Host: GitHub
Owner: dimits-ts
Language: Jupyter Notebook
Default Branch: master
Homepage:
Size: 32.8 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 1
Releases: 0

Topics

llm moderation natural-language-processing research synthetic-dataset-generation

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme

Mitigating Polarization in Online Discussions Through Adaptive Moderation Techniques

This repository houses the code, documentation, paper, and supplementary materials for my thesis, conducted as a collaboration between the AUEB MSc in Data Science and Archimedes/Athena RC.

The subject of this thesis is the development of a framework capable of generating synthetic discussions between LLM user-agents and LLM moderators/facilitators, as well as the automated annotation of those conversations by LLM annotator-agents with different socio-demographic backgrounds.

Apart from the framework itself, we include the experiments and analysis presented in the paper, as well as the produced synthetic conversation datasets.

Abstract

Online discussion moderation/facilitation is crucial for discussions to flourish and for preventing polarization and toxicity, which nowadays seem omnipresent. However, because it relies heavily on humans, this moderation/facilitation proves costly, time-consuming and non-scalable, which has led many to turn to LLMs for discourse facilitation. In this thesis, we explore the use of LLMs as pseudo-users in online discussions, as a cost-efficient, realistic and scalable way of substituting initial LLM facilitation experiments, which would ordinarily necessitate costly human involvement. Furthermore, we show that including socio-demographic backgrounds in our LLM users leads to more realistic discussions. We explore the use of LLM annotators to estimate discussion quality, using a new statistical test to gauge annotator polarization, and prove that using socio-demographic backgrounds in LLM annotators does not meaningfully affect their judgments. Finally, we release a synthetic-discussion creation and annotation framework, three synthetic datasets resulting from our experiments, as well as subsequent analysis and findings from these datasets.

Concepts

The subject of this thesis is the development of a framework in which many LLM user-agents simulate online discussions. We prime the LLM user-agents to lower the quality of the conversation by any means, while concurrently instructing the LLM moderator/facilitator to keep the conversation quality as high as possible.

[Figure: Thesis research goal]
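The adversarial setup described above can be illustrated with a minimal turn-taking loop. This is only a sketch under stated assumptions, not the framework's actual code: the `Agent` class, the `simulate` function, the prompts, and the `echo_model` stub (which stands in for a real LLM call) are all hypothetical names introduced for illustration. The turn order, in which the moderator responds after every user comment, is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Transcript = List[Tuple[str, str]]  # (speaker name, message) pairs
Model = Callable[[str, Transcript], str]


def echo_model(instructions: str, transcript: Transcript) -> str:
    """Hypothetical stand-in for an LLM call; a real run would query a model."""
    return f"(turn {len(transcript) + 1}) reply following: {instructions}"


@dataclass
class Agent:
    name: str
    instructions: str  # system prompt that primes the agent's behaviour
    model: Model = echo_model

    def speak(self, transcript: Transcript) -> str:
        return self.model(self.instructions, transcript)


def simulate(users: List[Agent], moderator: Optional[Agent], rounds: int) -> Transcript:
    """Let each user post in turn; the moderator (if any) replies after each post."""
    transcript: Transcript = []
    for _ in range(rounds):
        for user in users:
            transcript.append((user.name, user.speak(transcript)))
            if moderator is not None:
                transcript.append((moderator.name, moderator.speak(transcript)))
    return transcript


# Users are primed to degrade the conversation, the moderator to protect it.
users = [
    Agent("user_a", "lower the quality of the conversation"),
    Agent("user_b", "lower the quality of the conversation"),
]
moderator = Agent("moderator", "keep the conversation quality high")
log = simulate(users, moderator, rounds=2)
```

Passing `moderator=None` yields a discussion without moderation, which is one way a moderator-presence comparison could be set up.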

Our framework further incorporates automated LLM-based annotation of these synthetic discussions, allowing for an inexpensive comparison of the effects of various factors such as moderator strategy, moderator presence, and LLM user prompts. Ordinarily, using LLMs for annotation presents two distinct issues: the model's inherent biases, and the question of how representative its annotations are compared with those that humans would make. While the latter concern can only be conclusively addressed by a correlation study, we attempt to address the former by using annotators with different socio-demographic backgrounds (SDBs). This also allows us to assess whether and how different LLM personalities influence the annotation process.

[Figure: Thesis annotation procedure]
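The annotation step can be pictured as every annotator persona rating every comment, followed by a per-comment summary of how much the ratings diverge. The sketch below is illustrative only: `stub_rater`, `annotate`, `summarize`, the 1 to 5 toxicity scale, and the example SDBs are hypothetical, and the standard deviation used here is a simple placeholder for disagreement, not the statistical test introduced in the thesis.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List

Rater = Callable[[str, str], int]  # (SDB prompt, comment) -> score


def stub_rater(sdb: str, comment: str) -> int:
    """Hypothetical stand-in for an LLM annotator.

    Returns a toxicity score from 1 (benign) to 5 (toxic). This stub ignores
    the SDB entirely; a real annotator would be prompted with it.
    """
    return min(5, 1 + comment.count("!"))


def annotate(comments: List[str], sdbs: List[str], rater: Rater = stub_rater) -> Dict[str, List[int]]:
    """Collect one score per SDB for each comment (comments assumed unique)."""
    return {comment: [rater(sdb, comment) for sdb in sdbs] for comment in comments}


def summarize(annotations: Dict[str, List[int]]) -> Dict[str, Dict[str, float]]:
    """Mean score and spread across annotators, per comment."""
    return {
        comment: {"mean": mean(scores), "spread": pstdev(scores)}
        for comment, scores in annotations.items()
    }


comments = ["I see your point.", "Calm down!!"]
sdbs = ["retired teacher, 67", "student, 21", "nurse, 40"]
annotations = annotate(comments, sdbs)
summary = summarize(annotations)
```

A spread near zero for every comment would indicate that the personas agree with one another; larger spreads would flag comments on which annotator background matters.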

Requirements & Usage

Refer to src/README.md for usage instructions and software requirements.

Structure

src/: Code, input/output data, results, data analysis
paper/: Source code and PDF for the thesis
presentations/: Presentations concerning various aspects of this research

Owner

Name: Dimitris Tsirmpas
Login: dimits-ts
Kind: user

Repositories: 1
Profile: https://github.com/dimits-ts

I like playing around with data and building stuff.

GitHub Events

Total

Delete event: 2
Push event: 21
Pull request event: 3
Create event: 2

Last Year

Delete event: 2
Push event: 21
Pull request event: 3
Create event: 2

Committers

Last synced: over 1 year ago

All Time

Total Commits: 231
Total Committers: 2
Avg Commits per committer: 115.5
Development Distribution Score (DDS): 0.017

Past Year

Commits: 231
Committers: 2
Avg Commits per committer: 115.5
Development Distribution Score (DDS): 0.017

Top Committers

Name	Email	Commits
Dimitris Tsirmpas	t**m@g**m	227
Dimitris Tsirmpas	7****s	4
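
The committer figures above can be reproduced from the per-committer commit counts. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption, though it is consistent with the values listed here. The function name is hypothetical.

```python
from typing import Sequence


def development_distribution_score(commit_counts: Sequence[int]) -> float:
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0  # no commits, so there is nothing to distribute
    return 1 - max(commit_counts) / total


counts = [227, 4]  # commits per committer, from the table above
dds = development_distribution_score(counts)
average = sum(counts) / len(counts)

print(round(dds, 3))  # 0.017
print(average)        # 115.5
```

A score close to 0 means that nearly all commits come from a single committer, as is the case for this repository.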

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 7
Average time to close issues: N/A
Average time to close pull requests: less than a minute
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 6
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 7
Average time to close issues: N/A
Average time to close pull requests: less than a minute
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 6
Bot issues: 0
Bot pull requests: 0
