Repository
R implementation for the 4plebs.org API
Basic Info
- Host: GitHub
- Owner: buehlk
- License: other
- Language: R
- Default Branch: main
- Size: 1.41 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 1
Created almost 4 years ago
· Last pushed over 1 year ago
Metadata Files
Readme
Contributing
License
Citation
Codemeta
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# fouRplebsAPI
[Lifecycle: stable](https://lifecycle.r-lib.org/articles/stages.html#stable)
[R-CMD-check](https://github.com/buehlk/fouRplebsAPI/actions/workflows/R-CMD-check.yaml)
[DOI](https://zenodo.org/badge/latestdoi/499033736)
The R package fouRplebsAPI enables researchers to query the 4chan database archived by [4plebs.org](https://www.4plebs.org/).
This database is the largest ongoing archive of the ever-disappearing posts on the imageboard 4chan. With this package researchers can use the detailed search functionalities offered by 4plebs and retrieve structured data of the communication on 4chan.
The package is based on the [4plebs API documentation](https://4plebs.tech/foolfuuka/).
## Citation
If fouRplebsAPI is helpful for your research, please cite as:
Buehling, K. (2022). fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API (Version 0.9.0). https://doi.org/10.5281/zenodo.6637440
## Installation
You can install fouRplebsAPI from GitHub with:
``` r
# install.packages("devtools")
devtools::install_github("buehlk/fouRplebsAPI")
```
The 4chan boards currently covered by 4plebs are:
```{r echo = FALSE, results = 'asis', warning=FALSE}
library(knitr)
kable(data.frame(
  "Board name" = c("Politically Incorrect", "High Resolution", "Traditional Games", "Television & Film", "Paranormal", "Sh*t 4chan Says", "Auto", "Advice", "Travel", "Flash", "Sports", "My Little Politics", "Mecha & Auto"),
  "Abbreviation" = c("pol", "hr", "tg", "tv", "x", "s4s", "o", "adv", "trv", "f", "sp", "mlpol", "mo")
))
```
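Because every query takes a board abbreviation, it can help to check the abbreviation before sending a request. The following is a minimal sketch in base R; `covered_boards` and `check_boards()` are hypothetical names used for illustration and are not part of fouRplebsAPI. The abbreviations are copied from the table above.

``` r
# Board abbreviations copied from the table above.
covered_boards <- c("pol", "hr", "tg", "tv", "x", "s4s", "o",
                    "adv", "trv", "f", "sp", "mlpol", "mo")

# Hypothetical helper (not part of fouRplebsAPI):
# strip slashes, lower-case the input, and stop on unknown boards.
check_boards <- function(boards) {
  cleaned <- tolower(gsub("/", "", boards, fixed = TRUE))
  unknown <- setdiff(cleaned, covered_boards)
  if (length(unknown) > 0) {
    stop("Not archived by 4plebs: ", paste(unknown, collapse = ", "))
  }
  cleaned
}

check_boards(c("/trv/", "ADV"))
#> [1] "trv" "adv"
```

The cleaned values could then be passed on as the `board` or `boards` argument of the query functions shown below.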
## Search the 4chan archive
While this package includes several functions that allow researchers to query and inspect specific 4chan posts (get_4chan_post) or threads (get_4chan_thread), researchers who want to collect data from the 4plebs archive will probably be interested in retrieving larger amounts of data.
The *first* way of collecting data is to retrieve the latest threads in a given board. Suppose you are interested in the 20 latest threads from the "Advice" board (excluding the comments accompanying the opening post). One way of querying the data is:
```{r example1}
library(fouRplebsAPI)
recentAdv <- get_4chan_board_range(board = "adv", page_start = 1, page_stop = 2, latest_comments = FALSE)
str(recentAdv, vec.len = 1, nchar.max = 60)
```
The output is described in the function documentation. In theory, vast ranges of the archive could be scraped with this function, although the API rate limit slows down the querying process.
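For longer collections, one option is to split the page range into smaller chunks and save each chunk to disk, so that an interrupted run does not lose everything. The sketch below assumes only the `get_4chan_board_range()` arguments shown above; `split_page_range()`, `collect_board_in_chunks()`, the chunk size, and the output directory are hypothetical choices for illustration, not part of the package.

``` r
# Hypothetical helper (not part of fouRplebsAPI): cut a page range into chunks.
split_page_range <- function(page_start, page_stop, chunk_size = 10) {
  stopifnot(page_start >= 1, page_stop >= page_start, chunk_size >= 1)
  starts <- seq(page_start, page_stop, by = chunk_size)
  data.frame(
    page_start = starts,
    page_stop = pmin(starts + chunk_size - 1, page_stop)
  )
}

# Hypothetical wrapper: fetch each chunk and save it before moving on.
collect_board_in_chunks <- function(board, chunks, out_dir = "board_chunks") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_len(nrow(chunks))) {
    out_file <- file.path(
      out_dir,
      sprintf("%s_%d_%d.rds", board,
              as.integer(chunks$page_start[i]),
              as.integer(chunks$page_stop[i]))
    )
    if (file.exists(out_file)) next  # already collected in an earlier run
    part <- fouRplebsAPI::get_4chan_board_range(
      board = board,
      page_start = chunks$page_start[i],
      page_stop = chunks$page_stop[i],
      latest_comments = FALSE
    )
    saveRDS(part, out_file)
  }
  invisible(out_dir)
}

# Pages 1 to 25 become the chunks 1-10, 11-20 and 21-25.
chunks <- split_page_range(1, 25, chunk_size = 10)

# Not run here, because it queries the API:
# collect_board_in_chunks("adv", chunks)
```

Skipping files that already exist means the loop can simply be restarted after a failure.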
A *second* way of collecting 4chan data with this package is the search function. 4plebs allows for a very detailed search with many search filters. I will show only simple examples of the data that can be collected with fouRplebsAPI.
The example I show here is rather cheerful, because I would like to avoid the more controversial topics for which 4chan, especially the /pol/ board, is notorious. Researchers, for instance those interested in the political communication of actors with contentious ideologies, will find it easy to adapt this example.
But this one is about vacation.
Let's find the communication in the "Travel" board that discusses Mallorca, Spain.
To get a first impression of the search results, one can inspect a snippet of the 25 most recent posts containing the search term "mallorca".
```{r example2}
mallorca_snippet <- search_4chan_snippet(boards = "trv", start_date = "2021-01-01", end_date = "2022-12-31", text = "mallorca")
str(mallorca_snippet, vec.len = 1, nchar.max = 60)
```
Note that the function search_4chan_snippet() also prints the total number of search results and the estimated time needed to retrieve them with search_4chan(). This estimate is based on the API limit of 5 requests per minute.
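The arithmetic behind such an estimate can be sketched as follows. This is an illustration, not the package's own calculation: `estimate_minutes()` is a hypothetical helper, and the value of 25 results per request is an assumption taken from the snippet size mentioned above. Only the limit of 5 requests per minute comes from this README.

``` r
# Hypothetical helper (not part of fouRplebsAPI).
# per_request = 25 is an assumption based on the snippet size.
estimate_minutes <- function(total_found, per_request = 25, requests_per_minute = 5) {
  requests <- ceiling(total_found / per_request)  # number of API calls needed
  requests / requests_per_minute                  # minutes at the rate limit
}

# 1000 results need 40 requests, which take 8 minutes at 5 requests per minute.
estimate_minutes(1000)
#> [1] 8
```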
Users interested only in the number of results can retrieve it by setting the parameter result_type to "results_num". This makes it possible to compare the number of posts mentioning Mallorca across time periods, for example before vs. after the onset of the pandemic:
```{r example3}
mallorca_pre <- search_4chan_snippet(boards = "trv", start_date = "2018-01-01", end_date = "2019-12-31", text = "mallorca", result_type = "results_num")
mallorca_post <- search_4chan_snippet(boards = "trv", start_date = "2020-01-01", end_date = "2021-12-31", text = "mallorca", result_type = "results_num")
data.frame("Years" = c("2018 & 2019", "2020 & 2021"),
"Total results" = c(mallorca_pre["total_found"], mallorca_post["total_found"])
)
```
It seems that this island has been mentioned more as people tended to stay home.
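To put a number on such a comparison, the two totals can be turned into a relative change. The sketch below uses a hypothetical helper, `relative_change()`, and made-up counts purely to show the arithmetic; they are not results from the archive.

``` r
# Hypothetical helper (not part of fouRplebsAPI):
# percentage change from an earlier count to a later count.
relative_change <- function(before, after) {
  stopifnot(before > 0)
  (after - before) / before * 100
}

# Made-up counts for illustration only: 40 posts before, 50 posts after.
relative_change(before = 40, after = 50)
#> [1] 25
```

Raw counts also depend on how active the board was in each period, so a relative change of this kind is only a first indication.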
Researchers interested in gathering more data than just a snippet of the posts can use the function search_4chan(). Staying with the example of posts mentioning Mallorca over time, one might ask whether the image of Mallorca has changed during the pandemic. Apart from simply getting *all* the posts mentioning a search term, it is, for example, possible to filter for posts containing image data:
```{r example4}
mallorca_pre_pics <- search_4chan(boards = "trv", start_date = "2018-01-01", end_date = "2019-12-31", text = "mallorca", show_only = "image")
mallorca_post_pics <- search_4chan(boards = "trv", start_date = "2020-01-01", end_date = "2021-12-31", text = "mallorca", show_only = "image")
head(mallorca_post_pics$media_link)
```
The column media_link provides the downloadable image links of the posts retrieved.
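One possible way to fetch the linked images is sketched below using base R only. `plan_downloads()` and `download_images()` are hypothetical helpers, the directory name and the pause between downloads are arbitrary choices, and the example URLs are placeholders rather than real 4plebs links.

``` r
# Hypothetical helper (not part of fouRplebsAPI):
# map image URLs to local file paths, dropping missing and duplicate links.
plan_downloads <- function(media_links, dest_dir = "mallorca_images") {
  links <- unique(media_links[!is.na(media_links) & nzchar(media_links)])
  data.frame(
    url = links,
    destfile = file.path(dest_dir, basename(links)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: download every planned file that is not on disk yet.
download_images <- function(plan, pause_seconds = 1) {
  for (dir in unique(dirname(plan$destfile))) {
    dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  }
  for (i in seq_len(nrow(plan))) {
    if (!file.exists(plan$destfile[i])) {
      utils::download.file(plan$url[i], plan$destfile[i], mode = "wb", quiet = TRUE)
      Sys.sleep(pause_seconds)  # arbitrary pause between downloads
    }
  }
  invisible(plan)
}

# Placeholder URLs for illustration only.
plan <- plan_downloads(c("https://example.org/a/123.jpg", NA,
                         "https://example.org/a/123.jpg"))
plan$destfile
#> [1] "mallorca_images/123.jpg"

# Not run here, because it downloads files:
# download_images(plan_downloads(mallorca_post_pics$media_link))
```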
Owner
- Login: buehlk
- Kind: user
- Repositories: 1
- Profile: https://github.com/buehlk
Citation (CITATION.cff)
# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.2.2
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
cff-version: 1.2.0
message: 'To cite package "fouRplebsAPI" in publications use:'
type: software
license: MIT
title: 'fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API'
version: 0.9.0
doi: 10.5281/zenodo.6637440
abstract: This package allows queries to the 4plebs.org API. Researchers can thereby
  access the largest public 4chan archive, called 4plebs.org. Visual and text data
  posted in 4chan can be searched within and across 4chan boards and can also be
  accessed with detailed queries and stored in a structured manner.
authors:
- family-names: Buehling
  given-names: Kilian
  email: kilian.buehling@fu-berlin.de
  orcid: 0000-0002-5244-7547
preferred-citation:
  type: manual
  title: 'fouRplebsAPI: R package for accessing 4chan posts via the 4plebs.org API'
  authors:
  - family-names: Buehling
    given-names: Kilian
    email: kilian.buehling@fu-berlin.de
    orcid: 0000-0002-5244-7547
  doi: 10.5281/zenodo.6637440
  url: https://www.doi.org/10.5281/zenodo.6637440
  version: 0.9.0
  abstract: This package allows queries to the 4plebs.org API. Researchers can thereby
    access the largest public 4chan archive, called 4plebs.org. Visual and text data
    posted in 4chan can be searched within and across 4chan boards and can also be
    accessed with detailed queries and stored in a structured manner.
  repository-code: https://github.com/buehlk/fouRplebsAPI
  url: https://github.com/buehlk/fouRplebsAPI
  contact:
  - family-names: Buehling
    given-names: Kilian
    email: kilian.buehling@fu-berlin.de
    orcid: 0000-0002-5244-7547
  license: MIT
  year: '2022'
repository-code: https://github.com/buehlk/fouRplebsAPI
url: https://github.com/buehlk/fouRplebsAPI
contact:
- family-names: Buehling
  given-names: Kilian
  email: kilian.buehling@fu-berlin.de
  orcid: 0000-0002-5244-7547
references:
- type: software
  title: httr
  abstract: 'httr: Tools for Working with URLs and HTTP'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=httr
- type: software
  title: jsonlite
  abstract: 'jsonlite: A Simple and Robust JSON Parser and Generator for R'
  notes: Imports
  authors:
  - family-names: Ooms
    given-names: Jeroen
    email: jeroen@berkeley.edu
    orcid: https://orcid.org/0000-0002-4035-0289
  year: '2022'
  url: https://CRAN.R-project.org/package=jsonlite
- type: software
  title: purrr
  abstract: 'purrr: Functional Programming Tools'
  notes: Imports
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@rstudio.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=purrr
- type: software
  title: stringr
  abstract: 'stringr: Simple, Consistent Wrappers for Common String Operations'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=stringr
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Imports
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  year: '2022'
  url: https://CRAN.R-project.org/package=dplyr
- type: software
  title: covr
  abstract: 'covr: Test Coverage for Packages'
  notes: Suggests
  authors:
  - family-names: Hester
    given-names: Jim
    email: james.f.hester@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=covr
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2022'
  url: https://CRAN.R-project.org/package=knitr
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@rstudio.com
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@rstudio.com
  - family-names: Luraschi
    given-names: Javier
    email: javier@rstudio.com
  - family-names: Ushey
    given-names: Kevin
    email: kevin@rstudio.com
  - family-names: Atkins
    given-names: Aron
    email: aron@rstudio.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  - family-names: Cheng
    given-names: Joe
    email: joe@rstudio.com
  - family-names: Chang
    given-names: Winston
    email: winston@rstudio.com
  - family-names: Iannone
    given-names: Richard
    email: rich@rstudio.com
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2022'
  url: https://CRAN.R-project.org/package=rmarkdown
- type: software
  title: roxyglobals
  abstract: 'roxyglobals: Roxgen2 Global Variable Declarations'
  notes: Suggests
  authors:
  - family-names: North
    given-names: Anthony
    email: anthony.jl.north@gmail.com
  year: '2022'
  url: https://github.com/anthonynorth/roxyglobals
  version: '>= 0.2.1'
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=testthat
  version: '>= 3.0.0'
CodeMeta (codemeta.json)
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"identifier": "fouRplebsAPI",
"description": "This package allows queries to the 4plebs.org API. Researchers can thereby access the largest public 4chan archive, called 4plebs.org. Visual and text data posted in 4chan can be searched whithin and across 4chan channels and can also be accessed with detailed queries and stored in a structured manner.",
"name": "fouRplebsAPI: R implementation for the 4plebs.org API",
"codeRepository": "https://github.com/buehlk/fouRplebsAPI",
"issueTracker": "https://github.com/buehlk/fouRplebsAPI/issues",
"license": "https://spdx.org/licenses/MIT",
"version": "0.1.0",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
"url": "https://r-project.org"
},
"runtimePlatform": "R version 4.1.1 (2021-08-10)",
"author": [
{
"@type": "Organization",
"name": "person"
},
{
"@type": "Person",
"givenName": [
"email",
"="
],
"familyName": "kilian.buehling@fu-berlin.de)"
}
],
"maintainer": [
{
"@type": "Person",
"givenName": [
"The",
"package"
],
"familyName": "maintainer",
"email": "yourself@somewhere.net"
}
],
"softwareSuggestions": [
{
"@type": "SoftwareApplication",
"identifier": "covr",
"name": "covr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=covr"
},
{
"@type": "SoftwareApplication",
"identifier": "roxyglobals",
"name": "roxyglobals",
"version": ">= 0.2.1",
"sameAs": "https://github.com/anthonynorth/roxyglobals"
},
{
"@type": "SoftwareApplication",
"identifier": "testthat",
"name": "testthat",
"version": ">= 3.0.0",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=testthat"
}
],
"softwareRequirements": {
"1": {
"@type": "SoftwareApplication",
"identifier": "httr",
"name": "httr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=httr"
},
"2": {
"@type": "SoftwareApplication",
"identifier": "jsonlite",
"name": "jsonlite",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=jsonlite"
},
"3": {
"@type": "SoftwareApplication",
"identifier": "purrr",
"name": "purrr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=purrr"
},
"4": {
"@type": "SoftwareApplication",
"identifier": "stringr",
"name": "stringr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=stringr"
},
"5": {
"@type": "SoftwareApplication",
"identifier": "dplyr",
"name": "dplyr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=dplyr"
},
"SystemRequirements": null
},
"fileSize": "76.136KB",
"readme": "https://github.com/buehlk/fouRplebsAPI/blob/main/README.md",
"developmentStatus": "https://lifecycle.r-lib.org/articles/stages.html#stable"
}