Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary
Last synced: 6 months ago

Repository

Avisblatt

Basic Info
  • Host: GitHub
  • Owner: Avisblatt
  • Language: R
  • Default Branch: main
  • Size: 30.5 MB
Statistics
  • Stars: 4
  • Watchers: 4
  • Forks: 1
  • Open Issues: 17
  • Releases: 0
Created over 6 years ago · Last pushed over 1 year ago
Metadata Files
Readme Citation

README.md

{avisblatt} R package - Read & Process Data from Printed Markets - The Basel Avisblatt (1729-1845)

“Printed Markets” is a Swiss research project (see also avisblatt.ch) on a new form of marketplace that emerged in Europe during the seventeenth and, for the most part, in the eighteenth century: the printed advertising market of the so-called “Intelligenzblätter” (intelligencers). Using the example of the Basel Avisblatt (published 1729-1844/45), “Printed Markets” employs digital history and data science methods to systematically open up an extensive serial source, to shine a light on the socioeconomic transformations of the “Sattelzeit”: The Avisblatt reflects myriads of ways to organise economic exchange, to interlink persons of complementary interests, to spin the socioeconomic web of a town in transition, from early modernity to the industrial age.

“Printed Markets” is a project of the Department of History at the University of Basel. It was financed by the Swiss National Science Foundation SNSF (grant 182156, 2018-2023) and led by Prof. Dr. Susanna Burghartz.

Getting Started

The Avisblatt project makes over 116 years of classified ads accessible in a machine-friendly form. The project does not stop at scanned images: it provides well over 100,000 ads as recognized text (HTR, using the software Transkribus) and enriches them with a wide range of meta information using our R package. This additional information categorizes ads, adds details obtained through digital text processing methods, and records relations to other ads.

Installation from GitHub

Assuming you have a working installation of R and an IDE such as RStudio or Visual Studio Code, use install_github from the {remotes} R package (also re-exported by {devtools}) to install the latest development version of the {avisblatt} R package.

remotes::install_github("Avisblatt/avisblatt")

Note: The installation above only needs to be performed once (and again for package updates). The R commands under 'Reading Data' below, which load the library, are needed every time you use the package.
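As a minimal sketch, the one-time installation could be guarded in a script so that it only runs when the package is missing. The helper names `needs_install` and `install_avisblatt_if_missing` are hypothetical and not part of {avisblatt}; `requireNamespace()` and `install.packages()` ship with R, and `remotes::install_github()` is the call shown above.

```r
# Hypothetical helpers (not part of {avisblatt}): install only when missing.

# TRUE if `pkg` cannot be loaded from the current library paths.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Installs {remotes} and {avisblatt} only if they are not available yet.
# Returns TRUE (invisibly) if an installation was triggered, FALSE otherwise.
install_avisblatt_if_missing <- function() {
  if (!needs_install("avisblatt")) {
    return(invisible(FALSE))
  }
  if (needs_install("remotes")) {
    install.packages("remotes")
  }
  remotes::install_github("Avisblatt/avisblatt")
  invisible(TRUE)
}

# Call once per machine (needs network access):
# install_avisblatt_if_missing()
```

The function is only defined here, not called, so sourcing the script never triggers a download by accident.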

Reading Data (Single Years)

The Avisblatt project provides not only the {avisblatt} R package but also the avisdata data repository. The data repository contains cleaned data that went through the HTR process, HTR corrections and enrichment with meta information. While the package allows you to reproduce all of these steps, the most common use case is to read the latest release of cleaned data from the avisdata repository. First, clone the data repository to your local machine using your favorite method, e.g., from a terminal:

git clone https://github.com/Avisblatt/avisdata.git

or simply download a zip file from GitHub if you are not familiar with git.

In your R session, load the {avisblatt} R package and read a collection into your session.

```r
library(avisblatt)
col <- read_collection("../avisdata/collections/yearly_1729")
```

The col object is a collection class that contains a meta slot for data description and a corpus slot for the ad text itself.
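Before reading a collection it can help to check that the local data directory exists and to normalize the path, since the merging example below expects a path without a trailing '/'. The following is a minimal sketch in base R; `check_data_path` is a hypothetical helper, not a function of {avisblatt}, and the temporary directory merely stands in for a cloned data repository.

```r
# Hypothetical helper (not part of {avisblatt}): normalize and check a data path.
check_data_path <- function(path) {
  path <- sub("/+$", "", path)  # drop any trailing '/'
  if (!dir.exists(path)) {
    stop("Data directory not found: ", path,
         " (clone or download the data repository first)")
  }
  path
}

# Demonstration on a temporary directory standing in for the cloned repository
demo_dir <- file.path(tempdir(), "collections")
dir.create(demo_dir, showWarnings = FALSE)
clean <- check_data_path(paste0(demo_dir, "/"))
```

The returned value could then be passed as the `path` argument in the examples that follow.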

Reading Data & Merging Collections

```r
library(avisblatt)
library(data.table)

ay <- 1730:1740

# path to data without trailing '/'
m <- gather_yearly_collections(ay, path = "../avisdata/collections")

# m is a collection that contains all data from 1730-1740
table(year(m$meta$date))
#> 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740
#> 3083 3500 3613 3440 4098 4183 4010 4095 4015 3968 3767
```
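The `table(year(...))` call above counts ads per calendar year. As a minimal, self-contained sketch of the same idea without the data repository, the toy data frame below stands in for the `meta` slot. The column name `date` follows the example above; the dates themselves are made up for illustration.

```r
# Toy stand-in for m$meta: one row per ad, with made-up publication dates
toy_meta <- data.frame(
  date = as.Date(c("1730-01-05", "1730-06-12", "1731-02-01",
                   "1731-02-08", "1731-11-30", "1732-03-14"))
)

# Extract the year from each date (base R equivalent of data.table::year())
ad_year <- as.integer(format(toy_meta$date, "%Y"))

# Count ads per year
ads_per_year <- table(ad_year)
print(ads_per_year)
```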

Data Analysis

This section shows a simple data analysis and visualization example.

```r
library(avisblatt)
library(ggplot2)

# select a few years: start, middle, end of sample
ay <- c(1730, 1800, 1840)
yc <- gather_yearly_collections(ay, path = "../avisdata/collections")
total_count_by_m <- count_records_by_date(coll = yc, level = "month", colnames = "N")

# compare ad count during the course of a year
gg <- ggplot(total_count_by_m)
gg +
  geom_line(aes(x = as.factor(month), y = N, group = year, col = as.factor(year))) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(
    axis.title.x = element_blank(),
    panel.grid.major.x = element_blank()
  ) +
  labs(col = "Year") +
  ggtitle("Number of ads over the course of a year")
```

The resulting plot compares the total number of ads over the course of three different years. Although the volume increased substantially over time, the seasonal pattern within a year looks relatively similar.
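One way to compare seasonal patterns independently of volume is to convert monthly counts into shares of each year's total. The following is a minimal sketch with made-up counts for four months; the column names `year`, `month` and `N` mirror the plot code above but are an assumption about the table's layout, and the numbers are purely illustrative.

```r
# Made-up monthly ad counts for two years (illustrative numbers only)
toy_counts <- data.frame(
  year  = rep(c(1730, 1840), each = 4),
  month = rep(1:4, times = 2),
  N     = c(10, 20, 30, 40,      # low-volume year
            100, 200, 300, 400)  # high-volume year, same seasonal shape
)

# Share of each month in its year's total: removes the volume effect
toy_counts$share <- ave(toy_counts$N, toy_counts$year,
                        FUN = function(x) x / sum(x))

# Reshape to one column per year for a side-by-side comparison
shares <- tapply(toy_counts$share,
                 list(toy_counts$month, toy_counts$year),
                 sum)
print(round(shares, 2))
```

In this toy case both columns are identical, which is what a purely proportional increase in volume would look like; with real data the shares would differ to the extent that the seasonal pattern changed.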


Owner

  • Name: Project "Printed Markets: The Basel Avisblatt, 1729-1845"
  • Login: Avisblatt
  • Kind: organization

Citation (CITATION.cff)

# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.4.1
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "avisblatt" in publications use:'
type: software
license: GPL-2.0-only
title: 'avisblatt: Process Avisblatt Data'
version: '0.3'
abstract: Processing ads level data from the Basler Avisblatt.
authors:
- family-names: Bannert
  given-names: Matthias
  email: matthias.bannert@gmail.com
- family-names: Engel
  given-names: Alexander
- family-names: Reimann
  given-names: Anna
- family-names: Serif
  given-names: Ina
repository-code: https://github.com/mbannert/avisblatt
url: https://github.com/mbannert/avisblatt
date-released: '2020-06-03'
contact:
- family-names: Bannert
  given-names: Matthias
  email: matthias.bannert@gmail.com
references:
- type: software
  title: 'R: A Language and Environment for Statistical Computing'
  notes: Depends
  url: https://www.R-project.org/
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2023'
  institution:
    name: R Foundation for Statistical Computing
  version: '>= 3.5'
- type: software
  title: jsonlite
  abstract: 'jsonlite: A Simple and Robust JSON Parser and Generator for R'
  notes: Imports
  url: https://arxiv.org/abs/1403.2805
  repository: https://CRAN.R-project.org/package=jsonlite
  authors:
  - family-names: Ooms
    given-names: Jeroen
    email: jeroen@berkeley.edu
    orcid: https://orcid.org/0000-0002-4035-0289
  year: '2023'
  version: '>= 1.1'
- type: software
  title: R6
  abstract: 'R6: Encapsulated Classes with Reference Semantics'
  notes: Imports
  url: https://r6.r-lib.org
  repository: https://CRAN.R-project.org/package=R6
  authors:
  - family-names: Chang
    given-names: Winston
    email: winston@stdout.org
  year: '2023'
- type: software
  title: data.table
  abstract: 'data.table: Extension of `data.frame`'
  notes: Imports
  url: https://r-datatable.com
  repository: https://CRAN.R-project.org/package=data.table
  authors:
  - family-names: Dowle
    given-names: Matt
    email: mattjdowle@gmail.com
  - family-names: Srinivasan
    given-names: Arun
    email: asrini@pm.me
  year: '2023'
- type: software
  title: quanteda
  abstract: 'quanteda: Quantitative Analysis of Textual Data'
  notes: Imports
  url: https://quanteda.io
  repository: https://CRAN.R-project.org/package=quanteda
  authors:
  - family-names: Benoit
    given-names: Kenneth
    email: kbenoit@lse.ac.uk
    orcid: https://orcid.org/0000-0002-0797-564X
  - family-names: Watanabe
    given-names: Kohei
    email: watanabe.kohei@gmail.com
    orcid: https://orcid.org/0000-0001-6519-5265
  - family-names: Wang
    given-names: Haiyan
    email: whyinsa@yahoo.com
    orcid: https://orcid.org/0000-0003-4992-4311
  - family-names: Nulty
    given-names: Paul
    email: paul.nulty@gmail.com
    orcid: https://orcid.org/0000-0002-7214-4666
  - family-names: Obeng
    given-names: Adam
    email: quanteda@binaryeagle.com
    orcid: https://orcid.org/0000-0002-2906-4775
  - family-names: Müller
    given-names: Stefan
    email: mullers@tcd.ie
    orcid: https://orcid.org/0000-0002-6315-4125
  - family-names: Matsuo
    given-names: Akitaka
    email: a.matsuo@lse.ac.uk
    orcid: https://orcid.org/0000-0002-3323-6330
  - family-names: Lowe
    given-names: William
    email: wlowe@princeton.edu
    orcid: https://orcid.org/0000-0002-1549-6163
  year: '2023'
  version: '>= 2.0'
- type: software
  title: quanteda.textstats
  abstract: 'quanteda.textstats: Textual Statistics for the Quantitative Analysis
    of Textual Data'
  notes: Imports
  url: https://quanteda.io
  repository: https://CRAN.R-project.org/package=quanteda.textstats
  authors:
  - family-names: Benoit
    given-names: Kenneth
    email: kbenoit@lse.ac.uk
    orcid: https://orcid.org/0000-0002-0797-564X
  - family-names: Watanabe
    given-names: Kohei
    email: watanabe.kohei@gmail.com
    orcid: https://orcid.org/0000-0001-6519-5265
  - family-names: Wang
    given-names: Haiyan
    email: whyinsa@yahoo.com
    orcid: https://orcid.org/0000-0003-4992-4311
  - family-names: Lua
    given-names: Jiong Wei
    email: J.W.Lua@lse.ac.uk
  - family-names: Kuha
    given-names: Jouni
    email: j.kuha@lse.ac.uk
    orcid: https://orcid.org/0000-0002-1156-8465
  year: '2023'
- type: software
  title: textcat
  abstract: 'textcat: N-Gram Based Text Categorization'
  notes: Imports
  repository: https://CRAN.R-project.org/package=textcat
  authors:
  - family-names: Hornik
    given-names: Kurt
    email: Kurt.Hornik@R-project.org
    orcid: https://orcid.org/0000-0003-4198-9911
  - family-names: Rauch
    given-names: Johannes
  - family-names: Buchta
    given-names: Christian
    email: christian.buchta@wu.ac.at
  - family-names: Feinerer
    given-names: Ingo
    email: feinerer@logic.at
  year: '2023'
- type: software
  title: textutils
  abstract: 'textutils: Utilities for Handling Strings and Text'
  notes: Imports
  url: http://enricoschumann.net/R/packages/textutils/
  repository: https://CRAN.R-project.org/package=textutils
  authors:
  - family-names: Schumann
    given-names: Enrico
    email: es@enricoschumann.net
    orcid: https://orcid.org/0000-0001-7601-6576
  year: '2023'
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  url: https://yihui.org/knitr/
  repository: https://CRAN.R-project.org/package=knitr
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2023'
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  url: https://testthat.r-lib.org
  repository: https://CRAN.R-project.org/package=testthat
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2023'

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 69
  • Total pull requests: 31
  • Average time to close issues: 6 months
  • Average time to close pull requests: 10 days
  • Total issue authors: 6
  • Total pull request authors: 4
  • Average comments per issue: 2.04
  • Average comments per pull request: 0.61
  • Merged pull requests: 29
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 3 days
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.33
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mbannert (41)
  • aengel17 (14)
  • annareimann (10)
  • HomoCodens (2)
  • wissen-ist-acht (1)
  • LarsDIK (1)
Pull Request Authors
  • aengel17 (11)
  • annareimann (8)
  • wissen-ist-acht (7)
  • mbannert (5)
Top Labels
Issue Labels
enhancement (7) bug (3) question (3) discussion (2) help wanted (2) wontfix (1)
Pull Request Labels

Dependencies

DESCRIPTION cran
  • R >= 4.1 depends
  • R6 * imports
  • XML * imports
  • data.table * imports
  • dplyr * imports
  • jsonlite >= 1.1 imports
  • magick * imports
  • quanteda >= 2.0 imports
  • quanteda.textplots * imports
  • quanteda.textstats * imports
  • stringi * imports
  • textcat * imports
  • textutils * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat * suggests