fourplebsapi

R implementation for the 4plebs.org API

https://github.com/buehlk/fourplebsapi

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.0%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: buehlk
  • License: other
  • Language: R
  • Default Branch: main
  • Size: 1.41 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 1
  • Releases: 1
Created almost 4 years ago · Last pushed over 1 year ago

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# fouRplebsAPI


[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable) 
[![R-CMD-check](https://github.com/buehlk/fouRplebsAPI/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/buehlk/fouRplebsAPI/actions/workflows/R-CMD-check.yaml)


[![DOI](https://zenodo.org/badge/499033736.svg)](https://zenodo.org/badge/latestdoi/499033736)




The R package fouRplebsAPI enables researchers to query the 4chan database archived by [4plebs.org](https://www.4plebs.org/). 
This database is the largest ongoing archive of the ever-disappearing posts on the imageboard 4chan. With this package researchers can use the detailed search functionalities offered by 4plebs and retrieve structured data of the communication on 4chan.

The package is based on the [4plebs API documentation](https://4plebs.tech/foolfuuka/). 

## Citation

If fouRplebsAPI is helpful for your research, please cite as:

Buehling, K. (2022). fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API (Version 0.9.0). https://doi.org/10.5281/zenodo.6637440



## Installation

You can install fouRplebsAPI from GitHub with:

``` r
# install.packages("devtools")
devtools::install_github("buehlk/fouRplebsAPI")
```

The 4chan boards currently covered by 4plebs are:

```{r echo = FALSE, results = 'asis', warning=FALSE}
library(knitr)
kable(data.frame(
  "Board name" = c("Politically Incorrect", "High Resolution", "Traditional Games", "Television & Film", "Paranormal", "Sh*t 4chan Says", "Auto", "Advice", "Travel", "Flash", "Sports", "My Little Politics", "Mecha & Auto"),
  "Abbreviation" = c("pol", "hr", "tg", "tv", "x", "s4s", "o", "adv", "trv", "f", "sp", "mlpol", "mo"),
  check.names = FALSE  # keep the space in the "Board name" column header
))

```
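Because every query takes a board abbreviation, it can help to validate abbreviations before sending a request. The following is a minimal sketch: `boards_4plebs` and `check_boards()` are hypothetical names introduced here for illustration and are not part of fouRplebsAPI. The abbreviations are copied from the table above.

``` r
# Board abbreviations and names, copied from the table above.
boards_4plebs <- c(
  pol = "Politically Incorrect", hr = "High Resolution",
  tg = "Traditional Games", tv = "Television & Film",
  x = "Paranormal", s4s = "Sh*t 4chan Says", o = "Auto",
  adv = "Advice", trv = "Travel", f = "Flash", sp = "Sports",
  mlpol = "My Little Politics", mo = "Mecha & Auto"
)

# Hypothetical helper (not part of fouRplebsAPI):
# stop with an error if any abbreviation is not in the table.
check_boards <- function(boards) {
  unknown <- setdiff(boards, names(boards_4plebs))
  if (length(unknown) > 0) {
    stop("Boards not covered by 4plebs: ", paste(unknown, collapse = ", "))
  }
  invisible(boards)
}

check_boards(c("trv", "adv"))  # known boards, returns its input invisibly
```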


## Search the 4chan archive

While this package includes several functions that allow researchers to query and inspect specific 4chan posts (get_4chan_post) or threads (get_4chan_thread), researchers wanting to collect data from the 4plebs archive will probably be interested in collecting a larger amount of data.

The *first* way of collecting data is by collecting the latest threads in a given board. If you are interested in the 20 latest threads from the "Advice" board (excluding the comments accompanying the opening post), one way of querying the data is:

```{r example1}
library(fouRplebsAPI)

recentAdv <- get_4chan_board_range(board = "adv", page_start = 1, page_stop = 2, latest_comments = FALSE)

str(recentAdv, vec.len = 1, nchar.max = 60)
```

The output is described in the function documentation. In principle, this function can be used to scrape large ranges of the archive, although the API's rate limit slows down the querying process.
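To illustrate what the rate limit means for larger collections, here is a minimal sketch of a throttled loop. It assumes a limit of 5 requests per minute (the figure this README gives for the search functions), hence the default pause of 12 seconds. `fetch_pages_slowly()` and `fake_fetch()` are hypothetical names and not part of fouRplebsAPI, and the package functions may already pause between requests internally.

``` r
# Hypothetical helper (not part of fouRplebsAPI): call `fetch` once per page
# and wait `pause` seconds between consecutive calls.
fetch_pages_slowly <- function(pages, fetch, pause = 12) {
  results <- vector("list", length(pages))
  for (i in seq_along(pages)) {
    if (i > 1) Sys.sleep(pause)  # no waiting before the first request
    results[[i]] <- fetch(pages[i])
  }
  results
}

# Stand-in for a real request so the sketch runs without network access.
fake_fetch <- function(page) paste("page", page)

pages <- fetch_pages_slowly(1:3, fake_fetch, pause = 0)
length(pages)
```

With the package loaded, `fetch` could be a wrapper such as `function(p) get_4chan_board_range(board = "adv", page_start = p, page_stop = p, latest_comments = FALSE)`.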

A *second* way of collecting 4chan data with this package is the search function. 4plebs allows for a very detailed search with many search filters. I will show only simple examples of the data that can be collected with fouRplebsAPI.

The example I show here is rather cheerful, because I would like to avoid the more controversial topics for which 4chan, and especially the /pol/ board, is notorious. Researchers, for instance those interested in the political communication of actors with contentious ideologies, will find it easy to adapt this example. 
But this one is about vacation.

Let's find the communication in the "Travel" board that discusses Mallorca, Spain.

First, to get an impression of the search results, one can inspect a snippet of the 25 most recent posts containing the search term "mallorca".

```{r example2}
mallorca_snippet <- search_4chan_snippet(boards = "trv", start_date = "2021-01-01", end_date = "2022-12-31", text = "mallorca")

str(mallorca_snippet, vec.len = 1, nchar.max = 60)
```
Note that the function search_4chan_snippet() also prints the total number of search results and the estimated time to retrieve them with search_4chan(). This estimate is based on the API limit of 5 requests per minute.
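The arithmetic behind such an estimate can be sketched as follows. This is not the package's own code: `estimate_minutes()` is a hypothetical helper, and the sketch assumes that each request returns 25 posts (the snippet size mentioned above) and that 5 requests are allowed per minute.

``` r
# Hypothetical helper (not part of fouRplebsAPI).
# Assumptions: 25 posts per request, 5 requests per minute.
estimate_minutes <- function(total_found, per_request = 25, per_minute = 5) {
  requests <- ceiling(total_found / per_request)  # partial pages need a full request
  requests / per_minute
}

estimate_minutes(1000)  # 40 requests at 5 per minute, i.e. 8 minutes
```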

Users interested only in the number of results can retrieve it by setting the parameter result_type to "results_num". This makes it possible to compare the number of posts mentioning Mallorca between different time periods, for example the two years before the pandemic vs. its first two years:

```{r example3}
mallorca_pre <- search_4chan_snippet(boards = "trv", start_date = "2018-01-01", end_date = "2019-12-31", text = "mallorca", result_type = "results_num")
mallorca_post <- search_4chan_snippet(boards = "trv", start_date = "2020-01-01", end_date = "2021-12-31", text = "mallorca", result_type = "results_num")

data.frame("Years" = c("2018 & 2019", "2020 & 2021"),
       "Total results" = c(mallorca_pre["total_found"], mallorca_post["total_found"])
       )
```
It seems that the island was mentioned more often while people tended to stay home.

Researchers interested in gathering more data than just a snippet of the posts can use the function search_4chan(). Staying with the example of posts mentioning Mallorca over time, one might ask whether the image of Mallorca has changed during the pandemic. Apart from simply getting *all* the posts mentioning a search term, it is, for example, possible to filter for posts containing image data:

```{r example4}
mallorca_pre_pics <- search_4chan(boards = "trv", start_date = "2018-01-01", end_date = "2019-12-31", text = "mallorca", show_only = "image")
mallorca_post_pics <- search_4chan(boards = "trv", start_date = "2020-01-01", end_date = "2021-12-31", text = "mallorca", show_only = "image")

head(mallorca_post_pics$media_link)
```
The column media_link provides the downloadable image links of the posts retrieved.
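One possible next step is saving the linked images locally. The sketch below only prepares the download plan and leaves the actual download commented out, since it needs network access. It assumes that media_link is a character vector of URLs that may contain missing values. `plan_downloads()`, the folder name, and the example links are all made up for illustration.

``` r
# Hypothetical helper (not part of fouRplebsAPI): pair each usable link
# with a local file path inside `dir`.
plan_downloads <- function(links, dir = "mallorca_images") {
  links <- unique(links[!is.na(links) & nzchar(links)])  # drop NA and empty links
  data.frame(url = links,
             dest = file.path(dir, basename(links)),
             stringsAsFactors = FALSE)
}

# Made-up links for illustration only.
example_links <- c("https://example.org/a/1.jpg", NA, "https://example.org/b/2.png")
plan <- plan_downloads(example_links)
plan

# Not run (needs network access):
# plan <- plan_downloads(mallorca_post_pics$media_link)
# dir.create("mallorca_images", showWarnings = FALSE)
# for (i in seq_len(nrow(plan))) {
#   download.file(plan$url[i], plan$dest[i], mode = "wb")
# }
```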


Owner

  • Login: buehlk
  • Kind: user

Citation (CITATION.cff)

# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.2.2
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------

cff-version: 1.2.0
message: 'To cite package "fouRplebsAPI" in publications use:'
type: software
license: MIT
title: 'fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API'
version: 0.9.0
doi: 10.5281/zenodo.6637440
abstract: This package allows queries to the 4plebs.org API. Researchers can thereby
  access the largest public 4chan archive, called 4plebs.org. Visual and text data
  posted in 4chan can be searched within and across 4chan boards and can also be
  accessed with detailed queries and stored in a structured manner.
authors:
- family-names: Buehling
  given-names: Kilian
  email: kilian.buehling@fu-berlin.de
  orcid: 0000-0002-5244-7547
preferred-citation:
  type: manual
  title: 'fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API'
  authors:
  - family-names: Buehling
    given-names: Kilian
    email: kilian.buehling@fu-berlin.de
    orcid: 0000-0002-5244-7547
  doi: 10.5281/zenodo.6637440
  url: https://www.doi.org/10.5281/zenodo.6637440
  version: 0.9.0
  abstract: This package allows queries to the 4plebs.org API. Researchers can thereby
    access the largest public 4chan archive, called 4plebs.org. Visual and text data
    posted in 4chan can be searched within and across 4chan boards and can also be
    accessed with detailed queries and stored in a structured manner.
  repository-code: https://github.com/buehlk/fouRplebsAPI
  url: https://github.com/buehlk/fouRplebsAPI
  contact:
  - family-names: Buehling
    given-names: Kilian
    email: kilian.buehling@fu-berlin.de
    orcid: 0000-0002-5244-7547
  license: MIT
  year: '2022'
repository-code: https://github.com/buehlk/fouRplebsAPI
url: https://github.com/buehlk/fouRplebsAPI
contact:
- family-names: Buehling
  given-names: Kilian
  email: kilian.buehling@fu-berlin.de
  orcid: 0000-0002-5244-7547
references:
- type: software
  title: httr
  abstract: 'httr: Tools for Working with URLs and HTTP'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=httr
- type: software
  title: jsonlite
  abstract: 'jsonlite: A Simple and Robust JSON Parser and Generator for R'
  notes: Imports
  authors:
  - family-names: Ooms
    given-names: Jeroen
    email: jeroen@berkeley.edu
    orcid: https://orcid.org/0000-0002-4035-0289
  year: '2022'
  url: https://CRAN.R-project.org/package=jsonlite
- type: software
  title: purrr
  abstract: 'purrr: Functional Programming Tools'
  notes: Imports
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@rstudio.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=purrr
- type: software
  title: stringr
  abstract: 'stringr: Simple, Consistent Wrappers for Common String Operations'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=stringr
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  year: '2022'
  url: https://CRAN.R-project.org/package=dplyr
- type: software
  title: covr
  abstract: 'covr: Test Coverage for Packages'
  notes: Suggests
  authors:
  - family-names: Hester
    given-names: Jim
    email: james.f.hester@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=covr
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2022'
  url: https://CRAN.R-project.org/package=knitr
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@rstudio.com
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@rstudio.com
  - family-names: Luraschi
    given-names: Javier
    email: javier@rstudio.com
  - family-names: Ushey
    given-names: Kevin
    email: kevin@rstudio.com
  - family-names: Atkins
    given-names: Aron
    email: aron@rstudio.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  - family-names: Cheng
    given-names: Joe
    email: joe@rstudio.com
  - family-names: Chang
    given-names: Winston
    email: winston@rstudio.com
  - family-names: Iannone
    given-names: Richard
    email: rich@rstudio.com
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2022'
  url: https://CRAN.R-project.org/package=rmarkdown
- type: software
  title: roxyglobals
  abstract: 'roxyglobals: Roxgen2 Global Variable Declarations'
  notes: Suggests
  authors:
  - family-names: North
    given-names: Anthony
    email: anthony.jl.north@gmail.com
  year: '2022'
  url: https://github.com/anthonynorth/roxyglobals
  version: '>= 0.2.1'
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=testthat
  version: '>= 3.0.0'

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "fouRplebsAPI",
  "description": "This package allows queries to the 4plebs.org API. Researchers can thereby access the largest public 4chan archive, called 4plebs.org. Visual and text data posted in 4chan can be searched within and across 4chan channels and can also be accessed with detailed queries and stored in a structured manner.",
  "name": "fouRplebsAPI: R implementation for the 4plebs.org API",
  "codeRepository": "https://github.com/buehlk/fouRplebsAPI",
  "issueTracker": "https://github.com/buehlk/fouRplebsAPI/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.1.0",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.1.1 (2021-08-10)",
  "author": [
    {
      "@type": "Person",
      "givenName": "Kilian",
      "familyName": "Buehling",
      "email": "kilian.buehling@fu-berlin.de"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Kilian",
      "familyName": "Buehling",
      "email": "kilian.buehling@fu-berlin.de"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "covr",
      "name": "covr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=covr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "roxyglobals",
      "name": "roxyglobals",
      "version": ">= 0.2.1",
      "sameAs": "https://github.com/anthonynorth/roxyglobals"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "httr",
      "name": "httr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=httr"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "jsonlite",
      "name": "jsonlite",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=jsonlite"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "purrr",
      "name": "purrr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=purrr"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "stringr",
      "name": "stringr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=stringr"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    "SystemRequirements": null
  },
  "fileSize": "76.136KB",
  "readme": "https://github.com/buehlk/fouRplebsAPI/blob/main/README.md",
  "developmentStatus": "https://lifecycle.r-lib.org/articles/stages.html#stable"
}
