partypositions-wikitags
Estimation of party positions from Wikipedia tags (see Herrmann/Döring 2021)
Repository
Basic Info
- Host: GitHub
- Owner: hdigital
- License: mit
- Language: HTML
- Default Branch: main
- Size: 52 MB
Statistics
- Stars: 10
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
Party positions from Wikipedia classifications
Herrmann, Michael, and Holger Döring. 2023. “Party Positions from Wikipedia Classifications of Party Ideology.” Political Analysis 31(1): 22–41. — doi: 10.1017/pan.2021.28
Holger Döring, and Michael Herrmann. [YEAR] “Party Positions from Wikipedia Tags.” — doi: 10.5281/zenodo.7043510
- Holger Döring — holger.doering@gesis.org
- Michael Herrmann — michael.herrmann@uni-konstanz.de
Results
- party positions and tags in party-position-tags.csv
- tag positions in tag-position.csv
- visualization of parties by country and tags
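The result files are plain CSV, so they can be inspected with pandas. The following is a minimal sketch: the column names in the in-memory sample (`party`, `position`, `tags`) are hypothetical stand-ins, not the actual columns of party-position-tags.csv.

```python
import io

import pandas as pd


def load_results(source):
    """Read one of the result CSV files (path or file-like object)."""
    return pd.read_csv(source)


# Hypothetical sample standing in for party-position-tags.csv;
# the real column names may differ.
sample = io.StringIO(
    "party,position,tags\n"
    "A,-1.2,socialism\n"
    "B,0.8,conservatism\n"
)
parties = load_results(sample)

# Listing the columns first is a safe way to learn the real layout.
print(parties.columns.tolist())
print(len(parties))
```

For the real data, pass the path of the CSV file instead of the in-memory sample.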
Install
Running all scripts requires R, Python and Stan.
We use Docker as a replication environment. It includes R, RStudio, Python, Stan and all packages (see Dockerfile).
```sh
docker-compose up -d  # start container in detached mode
docker-compose down   # shut down container
```
http://localhost:8787/ — RStudio in a browser with all dependencies
Project structure
Note — This project uses the RStudio project workflow (0-wp-data.Rproj). All R scripts use the project root as the base path, and all file paths are relative to it.
- z-run-all.R — stepwise execution of all scripts (R and Python)
- data-files-docs.csv — documentation of all datasets (path, type, description)
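Because the dataset documentation lists a path for every file, it can double as a quick completeness check. The sketch below assumes that the CSV header uses the literal column name `path` and that paths are relative to the project root. `missing_files` is a hypothetical helper, not part of the repository.

```python
import csv
import io
import tempfile
from pathlib import Path


def missing_files(docs_csv, root):
    """Return documented paths that do not exist below the project root."""
    reader = csv.DictReader(docs_csv)
    return [row["path"] for row in reader if not (Path(root) / row["path"]).exists()]


# Demo with a temporary folder standing in for the project root.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "present.csv").write_text("x\n1\n")
    docs = io.StringIO(
        "path,type,description\n"
        "present.csv,data,file that exists\n"
        "absent.csv,data,file that is missing\n"
    )
    missing = missing_files(docs, root)

print(missing)
```

In the project itself, the same function could be called with an open handle on data-files-docs.csv and `"."` as the root.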
Folders
- 01-data-sources
- 01-partyfacts — Party Facts data
- 02-wikipedia — Wikipedia data and infobox tags
- 03-party-positions — party position data for validation (CHES, DALP, Manifesto, WVS)
- 02-data-preparation — create datasets for analysis
- 03-estimation — estimation of models and post-estimation
- 04-data-final — datasets with party and tags positions (only M2)
- 05-validation — validation of party positions (only M2)
- 06-figures-tables — visualization of results (only M2)
Tag harmonization
A dataset of Wikipedia tags is created in 02-data-preparation/01-wp-infobox.R.
- applies some minor harmonization of category names
- keeps only categories that are used at least twice
The dataset used for the analysis is created in 02-data-preparation/02-wp-data.R.
- keeps only the most frequent tags — see the parameter in the script
- creates a dataset in wide format with tags as variable names
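The two preparation steps can be illustrated in Python. This is a sketch only: the repository does this work in R, and the column names (`party`, `tag`), the helper `tags_wide`, and the `min_parties` threshold are assumptions made for the example. It drops rarely used tags and then pivots the party-tag pairs into a wide 0/1 table with one column per tag.

```python
import pandas as pd


def tags_wide(tags_long, min_parties=2):
    """Pivot long party-tag pairs into a wide table with one column per tag."""
    # Count each party-tag pair only once.
    pairs = tags_long.drop_duplicates(["party", "tag"])
    # Keep tags used by at least `min_parties` parties.
    counts = pairs["tag"].value_counts()
    keep = counts[counts >= min_parties].index
    pairs = pairs[pairs["tag"].isin(keep)]
    # Rows are parties, columns are tags, cells are 0 or 1.
    return pd.crosstab(pairs["party"], pairs["tag"])


# Hypothetical input in long format.
tags_long = pd.DataFrame({
    "party": ["A", "A", "B", "B", "C"],
    "tag": ["socialism", "green", "socialism", "liberalism", "liberalism"],
})
wide = tags_wide(tags_long)
print(wide)
```

Note that a party whose tags are all filtered out disappears from the wide table in this sketch. Whether the original scripts keep such parties is not stated here.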
Estimation
Model 2 (and Model 1) can be estimated in 03-estimation.
We use only Model 2 for post-estimation and for the subsequent preparation of the final data, figures and tables.
Party positions
We include party position data for validation — see 01-data-sources/03-party-positions/
- Chapel Hill Expert Survey (CHES) – trend file 1999–2019
- Democratic Accountability and Linkages Project (DALP) expert survey (Kitschelt 2013)
- Manifesto Project (MP) – left-right (rile) scores
- World Values Survey (WVS) — voters left-right self-placement, Wave 6, 2010–2014
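One common way to compare estimated positions with such a benchmark is to merge the two sources on a party identifier and compute a correlation. The sketch below assumes that approach; it is not necessarily what the validation scripts do. The column names (`party_id`, `position`, `benchmark_score`) and the helper `validation_correlation` are hypothetical.

```python
import pandas as pd


def validation_correlation(estimates, benchmark, key="party_id"):
    """Pearson correlation between estimated and benchmark positions."""
    # The inner join keeps only parties present in both sources.
    merged = estimates.merge(benchmark, on=key, how="inner")
    return merged["position"].corr(merged["benchmark_score"])


# Hypothetical data: party 4 exists only in the benchmark and is dropped.
estimates = pd.DataFrame({
    "party_id": [1, 2, 3],
    "position": [-1.0, 0.0, 1.0],
})
benchmark = pd.DataFrame({
    "party_id": [1, 2, 3, 4],
    "benchmark_score": [2.0, 5.0, 8.0, 9.0],
})
r = validation_correlation(estimates, benchmark)
print(round(r, 3))
```

The benchmark scores in the example are an exact linear function of the estimates, so the correlation is at its maximum. Real data will of course show weaker agreement.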
Changes
Differences between the revised code and the paper-based code used in the replication material:
Herrmann, Michael, and Holger Döring. 2021. “Replication Data for: Party Positions from Wikipedia Classifications of Party Ideology.” — doi: 10.7910/DVN/1JHZIU
Data
- new (revised) main final dataset — 04-descriptives/party-tags-positions.csv
- removed historical and faction tag sections
Code
- Stan statistical computing platform used for estimation (JAGS deprecated)
- new folder structure with index numbers
- fewer R package dependencies
- focus on Model 2 (Model 1 estimation only)
- removed tables and figures only relevant for paper
- revised documentation of all scripts

License
MIT — Copyright (c) 2022 Holger Döring and Michael Herrmann
Owner
- Login: hdigital
- Kind: user
- Repositories: 3
- Profile: https://github.com/hdigital
GitHub Events
Total
- Release event: 1
- Watch event: 1
- Push event: 1
Last Year
- Release event: 1
- Watch event: 1
- Push event: 1
Dependencies
- pandas ==1.
- requests *
- wikitextparser *
- wptools *
- certifi ==2020.12.5
- charset-normalizer ==2.0.12
- html2text ==2020.1.16
- idna ==3.3
- lxml ==4.6.2
- numpy ==1.20.1
- pandas ==1.2.3
- pycurl ==7.43.0.6
- python-dateutil ==2.8.1
- pytz ==2021.1
- regex ==2020.11.13
- requests ==2.28.0
- six ==1.15.0
- urllib3 ==1.26.9
- wcwidth ==0.2.5
- wikitextparser ==0.47.3
- wptools ==0.4.17
- rocker/tidyverse 4.1.3 build