algorithmic_bias_in_echo_chamber_formation

Computational Social Science Project: "Algorithmic Bias in Echo Chamber Formation".

https://github.com/inphyt/algorithmic_bias_in_echo_chamber_formation

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary

Keywords

agent-based-modeling algorithmic-bias community-detection computational-social-science echo-chamber election-analysis multilayer-graphs opinion-dynamics social-media social-network social-network-analysis twitter twitter-api

Repository

Statistics
  • Stars: 11
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created almost 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

Computational Social Science Project

Algorithmic Bias in Echo Chamber Formation

License: MIT Docs: Slides Docs: Report DOI

Goal

The research question is how recommendation algorithms affect echo chamber formation on Twitter. To minimize biases, we observe the same pool of users debating the same topic (U.S. elections) before and after the introduction of the recommendation algorithm in 2014. Using a complete dataset of 2012 tweets about the 2012 U.S. elections, we extract a sample of users to initialize, calibrate and validate an agent-based model (ABM) that describes a free Twitter. We are monitoring the follow, retweet, favorite, mention (including replies) and hashtag activity of the same pool of users to gather the data needed to model the echo chamber dynamics of today's algorithm-biased Twitter and to compare them with the predictions of the validated algorithm-free model. The work so far is organized as follows.

Definitions

Recommendation algorithm

A recommendation algorithm is an algorithm that takes as input all the tweets produced at a given model iteration and outputs the feed of each user for the next iteration.

Parametric algorithm

The parametric algorithm is a recommendation algorithm that we implement explicitly and that takes the agents' full activity into account through parameters to be set or fitted. It is said to be free if it builds each user's feed by looking only at who her friends are and in what chronological order they tweeted. Twitter is called free if it is endowed with a free parametric recommendation algorithm.
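As an illustration of the definition of a free algorithm, here is a minimal Python sketch (not the project's Julia implementation). The `timestamp` field, the `free_feeds` name and the newest-first ordering are assumptions made for the example; the README only states that a tweet carries the author id and her opinion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    author_id: int
    opinion: float   # author's opinion at the time of writing
    timestamp: int   # hypothetical field, used only to order the feed


def free_feeds(tweets, friends):
    """Build each user's feed from her friends' tweets only, newest first.

    tweets  -- iterable of Tweet objects produced at the current iteration
    friends -- dict mapping a user id to the set of user ids she follows
    """
    feeds = {}
    for user, followed in friends.items():
        visible = [t for t in tweets if t.author_id in followed]
        # Assumption: "chronological order" is rendered as newest first.
        feeds[user] = sorted(visible, key=lambda t: t.timestamp, reverse=True)
    return feeds


tweets = [Tweet(1, 0.8, 10), Tweet(2, -0.5, 12), Tweet(3, 0.1, 11)]
friends = {1: {2, 3}, 2: {1}, 3: set()}
feeds = free_feeds(tweets, friends)
```

No opinion, popularity or engagement signal enters the ranking here, which is what makes the algorithm free in the sense defined above.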

Echo chamber

An echo chamber on a graph is a subgraph defined by two conditions:

  1. It is recognized as a cluster by a given clustering algorithm.
  2. It exceeds a given opinion homogeneity threshold, for a chosen measure of opinion homogeneity on a subgraph.
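The two conditions can be sketched in Python as a filter over the output of any clustering algorithm. The homogeneity measure used here (one minus the population standard deviation of opinions) and the threshold value are assumptions chosen for illustration; the README fixes neither.

```python
from statistics import pstdev


def homogeneity(opinions):
    """Hypothetical measure: 1 minus the population std dev of opinions.

    With opinions in [-1, +1] the result lies in [0, 1]; 1 means that all
    members share exactly the same opinion.
    """
    return 1.0 - pstdev(opinions)


def echo_chambers(clusters, opinion, threshold):
    """Keep the clusters whose opinion homogeneity exceeds `threshold`.

    clusters -- list of node sets returned by a clustering algorithm
    opinion  -- dict mapping a node to its opinion in [-1, +1]
    """
    return [c for c in clusters
            if homogeneity([opinion[u] for u in c]) > threshold]


clusters = [{1, 2, 3}, {4, 5}]   # assumed output of a clustering step
opinion = {1: 0.9, 2: 0.8, 3: 0.85, 4: -0.9, 5: 0.9}
chambers = echo_chambers(clusters, opinion, threshold=0.8)
```

In this toy input the second cluster is well connected but split between opposite opinions, so it is a cluster without being an echo chamber.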

Data

2012 collection

Tweets selection & initial processing

We collected all 2012 tweet objects, filtered out the irrelevant fields and kept only English tweets.

Users selection

We performed a two-step selection over the remaining pool of tweets:

  • hashtag-based user mining
  • subscription-date-based user selection (i.e., we kept only those still-existing users whose subscription date matches the one recorded in the 2012 data).
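A minimal Python sketch of the two-step selection follows. The record layout (`user`, `hashtags`), the function names and the sample hashtag are hypothetical; the README does not list the hashtags or fields that were actually used.

```python
def mine_users_by_hashtag(tweets, hashtags):
    """Step 1: users who used at least one of the target hashtags."""
    targets = {h.lower() for h in hashtags}
    return {t["user"] for t in tweets
            if targets & {h.lower() for h in t["hashtags"]}}


def keep_unchanged_accounts(candidates, created_2012, created_now):
    """Step 2: users that still exist with the same subscription date."""
    return {u for u in candidates
            if u in created_now and created_now[u] == created_2012.get(u)}


# Toy data: hypothetical users, hashtags and dates.
tweets_2012 = [
    {"user": "alice", "hashtags": ["Election2012"]},
    {"user": "bob", "hashtags": ["election2012", "vote"]},
    {"user": "carol", "hashtags": ["cats"]},
]
created_2012 = {"alice": "2010-03-01", "bob": "2011-07-15", "carol": "2009-01-20"}
created_now = {"alice": "2010-03-01", "bob": "2015-02-02"}

candidates = mine_users_by_hashtag(tweets_2012, ["election2012"])
pool = keep_unchanged_accounts(candidates, created_2012, created_now)
```

Comparing subscription dates guards against a handle that was deleted and later registered again by a different person.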

2020 collection

Multithread interaction network scraping

We scraped the timeline-aggregated interaction (retweet, mention) network of the selected users and identified all the users who took part in it (i.e., those connected to at least one other user).

This way we obtained a pool of $\sim 10,000$ users who both took part in the 2012 U.S. election debate and still exist today, allowing us to compare echo chamber formation before and after the introduction of the recommendation algorithm.
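The aggregation and the "connected to at least one other user" criterion can be sketched in Python as follows. The `(source, target, kind)` event format and both function names are assumptions; the scraper's real output format is not described here.

```python
from collections import Counter


def aggregate_interactions(events):
    """Collapse (source, target, kind) events into weighted directed edges."""
    weights = Counter()
    for source, target, kind in events:
        if source != target:   # a self-interaction links to no other user
            weights[(source, target, kind)] += 1
    return weights


def participants(weights):
    """Users that appear as an endpoint of at least one edge."""
    return {u for (source, target, _kind) in weights for u in (source, target)}


events = [
    ("a", "b", "retweet"),
    ("a", "b", "retweet"),
    ("c", "c", "mention"),   # only interacts with herself: not a participant
    ("b", "d", "mention"),
]
weights = aggregate_interactions(events)
users = participants(weights)
```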

Multithread monitoring strategy deployment

A detailed temporal activity monitoring pipeline has been launched over these $\sim 10,000$ users. It lets us attach a timestamp to events (such as follow activity) that would otherwise carry no time indication or ordering if retrieved through Twitter's historical API. We need this temporal information to later model Twitter dynamics and then reproduce them with a parametric algorithm.

Agent-Based Model

Computational Framework

The framework we used is Agents.jl, an intuitive yet powerful Julia library for agent-based modeling. A key ingredient of Agents.jl models is the base space, for which we adopted an ad-hoc extension of the Graphs.jl model.

Overview

The ambient space of the model is a 4-layer multiplex graph whose layers are the follow, retweet, favorite and mention (including replies) networks.

Each node is occupied by an agent that represents a Twitter user. Agents are initialized with parameters drawn from the data: activity rates and the agent's opinion on the Democrat/Republican debate, encoded as a real number $o \in [-1,+1]$.

The tweet is an object composed of the author id and her opinion at the time of writing.
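To make the data structures concrete, here is a minimal Python sketch of a 4-layer multiplex with one agent per node. It is not the Agents.jl/Graphs.jl extension used in the project; the `Agent` and `Multiplex` classes and their methods are hypothetical, and activity rates are left out for brevity.

```python
from dataclasses import dataclass, field

LAYERS = ("follow", "retweet", "favorite", "mention")


@dataclass
class Agent:
    id: int
    opinion: float   # position in the Democrat/Republican debate

    def __post_init__(self):
        if not -1.0 <= self.opinion <= 1.0:
            raise ValueError("opinion must lie in [-1, +1]")


@dataclass
class Multiplex:
    """Directed multiplex: one shared node set, one edge set per layer."""
    agents: dict = field(default_factory=dict)
    edges: dict = field(default_factory=lambda: {layer: set() for layer in LAYERS})

    def add_agent(self, agent):
        self.agents[agent.id] = agent

    def add_edge(self, layer, source, target):
        if source not in self.agents or target not in self.agents:
            raise KeyError("both endpoints must be agents of the model")
        self.edges[layer].add((source, target))   # unknown layer -> KeyError

    def successors(self, layer, node):
        return {t for (s, t) in self.edges[layer] if s == node}


graph = Multiplex()
for uid, o in [(1, 0.7), (2, -0.3), (3, 0.1)]:
    graph.add_agent(Agent(uid, o))
graph.add_edge("follow", 1, 2)
graph.add_edge("retweet", 1, 3)
```

Every layer shares the same node set, which is what distinguishes a multiplex from a generic multilayer graph.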

At each iteration of the model dynamics, each agent reads the tweets selected and ordered for her by the recommendation algorithm (albeit topological), updates her opinion through an opinion dynamics model and decides whom to follow, unfollow, retweet, unretweet, etc.
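The README does not specify which opinion dynamics model is used. As a purely illustrative stand-in, the Python sketch below applies a bounded-confidence update to the opinions read in the feed; the function name and the parameters `epsilon` (confidence bound) and `mu` (convergence rate) are assumptions, not values taken from the project.

```python
def update_opinion(own, feed_opinions, epsilon=0.5, mu=0.2):
    """Bounded-confidence stand-in for the opinion dynamics step.

    The agent reads the feed in order and moves a fraction `mu` of the way
    toward every opinion that is within `epsilon` of her current one;
    opinions further away are ignored.
    """
    for other in feed_opinions:
        if abs(other - own) <= epsilon:
            own += mu * (other - own)
    return max(-1.0, min(1.0, own))   # keep the opinion inside [-1, +1]


# The agent starts neutral; 0.4 is close enough to matter, -0.9 is not.
new_opinion = update_opinion(0.0, [0.4, -0.9])
```

Because the update depends on the order and content of the feed, changing the recommendation algorithm changes the opinion trajectory even when the follow network is identical.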

Initialization, calibration & validation

A time step is selected so that it encompasses a statistically significant portion of the 2012 data. The distributions extracted from the first temporal slice are used to initialize the model. The remaining slices, except for the last one, are used to calibrate the model (parameters such as changes in activity rates). The last slice will be used for validation.
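The slicing scheme can be sketched in Python as follows. The `(timestamp, payload)` event format, the `split_slices` name and the choice to skip empty windows are assumptions made for the example.

```python
def split_slices(events, step):
    """Split timestamped events into initialization, calibration and validation.

    events -- iterable of (timestamp, payload) pairs
    step   -- width of a temporal slice, in the same unit as the timestamps
    Returns (first slice, list of middle slices, last slice).
    Windows that contain no event are skipped.
    """
    ordered = sorted(events, key=lambda e: e[0])
    if not ordered:
        raise ValueError("no events to split")
    start = ordered[0][0]
    slices = {}
    for timestamp, payload in ordered:
        slices.setdefault((timestamp - start) // step, []).append(payload)
    chunks = [slices[k] for k in sorted(slices)]
    if len(chunks) < 3:
        raise ValueError("need at least three non-empty slices")
    return chunks[0], chunks[1:-1], chunks[-1]


events = [(0, "a"), (5, "b"), (12, "c"), (25, "d"), (31, "e")]
init, calibration, validation = split_slices(events, step=10)
```

Keeping the last slice out of calibration means the validation compares the model against data it has never been fitted on.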

Recommendation Algorithm

2020 Twitter dynamics

From the data scraped by the monitor, we intend to train and validate a model that performs link prediction on the multiplex graph of our ABM. We then aim to fit the parametric algorithm to the dynamics predicted by the link prediction model on the multiplex graph. The fitted parametric algorithm will output the predicted feeds of all users. This in turn will let us build a temporal directed tweet network whose nodes are users, with an edge from user $i$ to user $j$ iff user $j$ read a tweet by user $i$. This network encodes the actual information flow on Twitter, which is where the concept of echo chambers makes sense.
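The construction of the tweet network from predicted feeds can be sketched in Python as below. The sketch assumes that a feed is a list of author ids per reader and that every tweet in a feed counts as read; both are simplifications for illustration, and `tweet_network` is a hypothetical name.

```python
def tweet_network(feeds_by_step):
    """Build temporal directed edges (step, author, reader).

    feeds_by_step -- list indexed by model iteration; each item maps a
                     reader id to the list of author ids in her feed
    An edge author -> reader at a given step means that the reader read a
    tweet by that author during that iteration.
    """
    edges = set()
    for step, feeds in enumerate(feeds_by_step):
        for reader, authors in feeds.items():
            for author in authors:
                if author != reader:
                    edges.add((step, author, reader))
    return edges


feeds_by_step = [
    {"j": ["i", "k"], "i": []},   # iteration 0
    {"j": ["i"], "i": ["j"]},     # iteration 1
]
edges = tweet_network(feeds_by_step)
```

The edge direction follows the information flow (from author to reader), which is the opposite of the follow relation.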

Owner

  • Name: Interdisciplinary Physics Team (InPhyT)
  • Login: InPhyT
  • Kind: organization
  • Email: inphyt@gmail.com
  • Location: Turin, Italy

Complex Systems Modelling Group: Computational Social Science, Epidemiology and Neuroscience.

Citation (CITATION.bib)

@software{Monticone_Moroni_CSS_2023,
         abstract     = {Computational Social Science Project: Algorithmic Bias in Echo Chamber Formation.},
         author       = {Monticone, Pietro and Moroni, Claudio},
         doi          = {},
         institution  = {University of Turin (UniTO)},
         keywords     = {Computational Social Science, Election Data, Opinion Inference, API, Data Science, Data Mining, Scraper, Social Media, Twitter, Twitter API, Graph Algorithms, Social Network, Social Network Analysis, Opinion Mining, Data Visualization, Network Visualization, Network Analysis, Echo Chamber, Polarization, Radicalization, Algorithmic Bias, Graph Statistics},
         license      = {MIT},
         organization = {Interdisciplinary Physics Team (InPhyT)},
         title        = {Algorithmic Bias in Echo Chamber Formation},
         url          = {https://github.com/InPhyT/Algorithmic_Bias_in_Echo_Chamber_Formation},
         year         = {2023}
         }

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0