youtube-workshop-gesis-2022

Materials for the 2022 GESIS Training workshop "Automatic Sampling and Analysis of YouTube Comments"

https://github.com/jobreu/youtube-workshop-gesis-2022

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.6%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: jobreu
  • Language: HTML
  • Default Branch: main
  • Size: 54.9 MB
Statistics
  • Stars: 10
  • Watchers: 2
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Created about 4 years ago · Last pushed about 4 years ago
Metadata Files
  • Readme
  • Citation

README.md

Workshop "Automatic Sampling and Analysis of YouTube Comments", GESIS 2022

Materials for the 2022 GESIS Training workshop "Automatic Sampling and Analysis of YouTube Comments"

Johannes Breuer (johannes.breuer@gesis.org, @MattEagle09); Julian Kohne (Julian.Kohne@gesis.org, @JuuuuKoooo); M. Rohangis Mohseni (Rohangis.Mohseni@tu-ilmenau.de, @romohseni)

Please link to the workshop GitHub repository


Workshop description

YouTube is the largest and most popular video platform on the internet. The producers and users of YouTube content generate huge amounts of data. These data are also of interest to researchers (in the social sciences as well as other disciplines) for studying different aspects of online media use and communication. Accessing and working with these data, however, can be challenging. In this workshop, we will first discuss the potential of YouTube data for research in the social sciences, and then introduce participants to different tools and methods for sampling and analyzing data from YouTube. We will then demonstrate and compare several tools for collecting YouTube data. Our focus for the main part of the workshop will be on using the tuber package for R to collect data via the YouTube API, and on wrangling and analyzing the data in R (using various packages). Regarding the type of data, we will focus on user comments but will also (briefly) look into other YouTube data, such as video statistics and subtitles. For the comments, we will show how to clean/process them in R, how to deal with emojis, and how to do some basic forms of automated text analysis (e.g., word frequencies, sentiment analysis). While we believe that YouTube data have great potential for research in the social sciences (and other disciplines), we will also discuss the unique challenges and limitations of using these data.
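
As a rough illustration of the comment-processing steps named above (cleaning, separating emojis, counting word frequencies), here is a minimal sketch. It is written in Python rather than with the R packages taught in the workshop, and it is not taken from the workshop materials: the example comments, the `split_comment` helper, and the emoji character ranges are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical example comments; real ones would be collected via the YouTube API.
comments = [
    "Great video!!! 😀😀",
    "great explanation, thanks 👍",
    "Not great... too long",
]

# Rough emoji matcher covering common Unicode emoji blocks.
# This is an approximation, not a complete definition of "emoji".
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def split_comment(text):
    """Return (lowercased word tokens, emojis) for a single comment."""
    emojis = EMOJI.findall(text)
    # Replace emojis with spaces so they do not glue neighbouring words together.
    without_emojis = EMOJI.sub(" ", text)
    words = re.findall(r"[a-z']+", without_emojis.lower())
    return words, emojis


word_counts = Counter()
emoji_counts = Counter()
for comment in comments:
    words, emojis = split_comment(comment)
    word_counts.update(words)
    emoji_counts.update(emojis)

print(word_counts.most_common(3))
print(emoji_counts.most_common())
```

Keeping emojis in a separate counter, rather than discarding them, reflects the idea that they can carry meaning of their own in user comments. A real analysis would likely also need stop-word removal and a more complete emoji definition.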

Target group

The workshop is aimed at people who are interested in using YouTube data for their research.

Learning objectives

Participants will learn how they can use YouTube data for their research. They will get to know tools and methods for collecting YouTube data. By the end of the workshop, participants should be able to...

  • automatically collect YouTube data
  • process/clean them
  • do some basic (exploratory) analyses of user comments

Prerequisites

Participants should have at least some basic knowledge of R and, ideally, also the tidyverse. Basic R knowledge can, for example, be acquired through the swirl course "R Programming" (see https://swirlstats.com/) or the RStudio Primer "Programming basics", both of which are available for free. There are also many brief online introductions to the tidyverse, such as this blog post by Dominic Royé or this workshop by Olivier Gimenez.

For the exercises as well as for "coding along" with the slides, access to the YouTube API is required. Information on this can be found in the slides on the YouTube API Setup.
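
To give a sense of what API access involves, the sketch below builds a request URL for the `commentThreads` endpoint of the YouTube Data API v3. It is an illustration in Python, not the workshop's R/tuber workflow, and it makes no network call. It assumes access via an API key; the setup described in the workshop slides may use different credentials. The helper name `comment_threads_url` and the placeholders `VIDEO_ID` and `YOUR_API_KEY` are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint of the YouTube Data API v3 for top-level comment threads.
API_BASE = "https://www.googleapis.com/youtube/v3/commentThreads"


def comment_threads_url(video_id, api_key, page_token=None, max_results=100):
    """Build the request URL for one page of comment threads of a video."""
    params = {
        "part": "snippet",
        "videoId": video_id,
        "maxResults": max_results,  # the API allows at most 100 per page
        "textFormat": "plainText",
        "key": api_key,
    }
    # Later pages are requested with the token returned by the previous page.
    if page_token:
        params["pageToken"] = page_token
    return f"{API_BASE}?{urlencode(params)}"


# Placeholder values; substitute a real video ID and your own key.
url = comment_threads_url("VIDEO_ID", "YOUR_API_KEY")
print(url)
```

Because results are paginated, collecting all comments for a video means repeating the request with each returned page token until none is left. Each request also counts against the API quota of the project the key belongs to.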

Timetable

Day 1

| Time          | Topic                                   |
| ------------- | --------------------------------------- |
| 10:00 - 11:00 | Introduction                            |
| 11:00 - 11:30 | Break                                   |
| 11:30 - 12:30 | The YouTube API                         |
| 12:30 - 13:30 | Lunch Break                             |
| 13:30 - 15:00 | Collecting data with tuber for R        |
| 15:00 - 15:30 | Break                                   |
| 15:30 - 17:00 | Processing and cleaning user comments   |

Day 2

| Time          | Topic                                   |
| ------------- | --------------------------------------- |
| 09:00 - 10:30 | Basic text analysis of user comments    |
| 10:30 - 11:00 | Break                                   |
| 11:00 - 12:00 | Sentiment analysis of user comments     |
| 12:00 - 13:00 | Lunch Break                             |
| 13:00 - 14:00 | Excursus: Retrieving video subtitles    |
| 14:00 - 14:30 | Break                                   |
| 14:30 - 16:00 | Practice session, questions, & outlook  |

Materials

Slides

A1 Introduction

A2 The YouTube API

A3 Collecting data with tuber

A4 Processing and cleaning user comments

B1 Basic text analysis

B2 Sentiment analysis of user comments

B3 Excursus: Retrieving video subtitles

B4 Outlook, Recap, Practice

Exercises

A2 YouTube API exercises

A3 tuber data collection exercises

A4 Processing and cleaning user comments exercises

B1 Basic text analysis exercises

B2 Sentiment analysis of user comments exercises

Solutions

A2 YouTube API exercise solutions

A3 tuber data collection exercise solutions

A4 Processing and cleaning user comments exercise solutions

B1 Basic text analysis exercise solutions

B2 Sentiment analysis of user comments exercise solutions

Owner

  • Name: Johannes Breuer
  • Login: jobreu
  • Kind: user
  • Location: Cologne, Germany
  • Company: GESIS - Leibniz Institute for the Social Sciences

Senior researcher at GESIS - Leibniz Institute for the Social Sciences and @CAIS-Research

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use these materials, please cite them as follows."
authors:
- family-names: "Breuer"
  given-names: "Johannes"
  orcid: "https://orcid.org/0000-0001-5906-7873"
- family-names: "Kohne"
  given-names: "Julian"
- family-names: "Mohseni"
  given-names: "M. Rohangis"
  orcid: "https://orcid.org/0000-0001-7686-8322"
title: "Workshop Automatic Sampling and Analysis of YouTube Comments"
date-released: 2022-02-21
url: "https://github.com/jobreu/youtube-workshop-gesis-2022"
license: CC-BY-4.0

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0