demo-regularisation
Regularisation and Cross-Validation of Determinants of Egalitarian Democracy: Demonstration for R
Repository
Basic Info
- Host: GitHub
- Owner: bgonzalezbustamante
- License: cc-by-4.0
- Language: R
- Default Branch: main
- Homepage: https://doi.org/10.5281/zenodo.5708892
- Size: 533 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 5
- Releases: 3
Overview
This repository contains a demonstration for R of regularisation, cross-validation, and shrinkage methods. These techniques are applied to determinants of egalitarian democracy for illustrative purposes only. The results should not be interpreted substantively, since that would require a causal identification strategy together with a number of controls, adjustments to standard errors, and robustness checks. The repository also contains a merged, sliced data set from V-Dem and the World Bank (N = 27,013), with country-year observations for 202 countries from 1789 to 2019.
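For readers new to these methods, the textbook penalised least-squares objectives are written out here. The notation is our own and is not taken from the repository: $y_i$ is the outcome, $x_i$ the vector of $p$ predictors, $\lambda \ge 0$ the penalty strength, and $\alpha \in [0, 1]$ the elastic-net mixing parameter as used by the "glmnet" package for Gaussian models.

```latex
\[
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta_0, \beta}
  \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2
  + \lambda \sum_{j=1}^{p} \beta_j^2
\]
\[
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta_0, \beta}
  \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
\[
\hat{\beta}^{\text{glmnet}} = \arg\min_{\beta_0, \beta}
  \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2
  + \lambda \left[ \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2
  + \alpha \lVert \beta \rVert_1 \right]
\]
```

Setting $\alpha = 0$ in the last expression gives a ridge-type penalty and $\alpha = 1$ a lasso-type penalty. Cross-validation is the usual way to choose $\lambda$.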
Chunks of this code for extraction and handling V-Dem and World Bank data were used in the following article (download BibTeX file):
- González-Bustamante, B. (2021). Early Government Responses to COVID-19 in South America. World Development, 137, 105180. DOI: 10.1016/j.worlddev.2020.105180
Further details and other applications are available in this GitHub repository and this OSF project (DOI: 10.17605/OSF.IO/6FM7X).
Metadata and Preservation
This code is stored with version control on a GitHub repository. Furthermore, a Digital Object Identifier is provided by Zenodo. The structure of the repository is detailed below.
demo-regularisation
|-- .gitignore
|-- CHANGELOG.md
|-- demo-regularisation.Rproj
|-- CITATION.cff
|-- CODE_OF_CONDUCT.md
|-- LICENSE.md
|-- README.md
|-- STATUS.md
|-- code
    |-- stage_1_data_cleaning.R
    |-- stage_2_regularisation.R
|-- data
    |-- raw
        |-- vdem_wb.csv
    |-- tidy
        |-- vdem_wb.csv
|-- demo
    |-- regularisation_demo.md
    |-- regularisation_demo.Rmd
    |-- regularisation_demo_files
        |-- figure-gfm
            |-- lasso-1.png
            |-- lasso-2.png
            |-- ols-1.png
            |-- ridge-1.png
            |-- ridge-2.png
|-- results
    |-- tables
        |-- table_01.html
        |-- table_01.tex
        |-- table_02.html
        |-- table_02.tex
|-- refs
    |-- BIB-World-Development.bib
10 directories and 24 files.
In addition, this README file in Markdown (MD) format provides specific information to ensure the replicability of the code.
Storage
Access to the GitHub repository is controlled with Two-Factor Authentication (2FA) using two physical USB security devices (Bastián González-Bustamante, ORCID iD 0000-0003-1510-6820). Both USB keys issue one-time passwords for cryptographic authentication via FIDO2 and U2F.
Getting Started
Software
We use R version 4.1.0 (2021-05-18) -- "Camp Pontanezen".
Required R libraries are: "broom", "caret", "coefplot", "glmnet", "tidyverse", "stargazer", and "wbstats".
We recommend that users run replication code and scripts from the root directory using the R project "demo-regularisation.Rproj".
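Before running the scripts, it can help to confirm that the libraries listed above are installed. The following is a minimal sketch of such a check, not part of the repository's code. The variable names are our own, and the installation line is left commented out so that nothing is installed without intent.

```r
# Packages named in this README; the check itself is only an illustration.
required <- c("broom", "caret", "coefplot", "glmnet",
              "tidyverse", "stargazer", "wbstats")

# Find the packages that cannot be located in the current library paths.
available     <- vapply(required, requireNamespace,
                        logical(1), quietly = TRUE)
not_installed <- required[!available]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to install from CRAN
} else {
  message("All required packages are available.")
}
```

Note that `requireNamespace()` loads a package's namespace without attaching it, so the check does not alter the search path.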
Replication Instructions
Folder "code" contains the R scripts.
Folder "demo" contains a demonstration in RMD and MD formats (regularisation_demo.md), and "results/tables" includes all tables as HTML and TeX files.
These files will be overwritten if you reproduce the steps described below.
Stage 1. Run R script "stage_1_data_cleaning.R" from the "code" folder. This script splits V-Dem data[^1] and merges them with World Bank indicators[^2] on GDP growth and inflation. Then, a significantly smaller and more manageable data set is saved in CSV UTF-8 format (1.57 MB) in this repository.
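The slice-and-merge step in Stage 1 can be illustrated with a small sketch. It uses toy data frames with invented values and base R only, and it is not the code in "stage_1_data_cleaning.R". The left join is an assumption about the join type, and real country names would probably need harmonising between the two sources first.

```r
# Toy stand-in for the V-Dem extract (values are invented for illustration).
vdem <- data.frame(
  country_name = c("Chile", "Chile", "Peru", "Peru"),
  year         = c(2018, 2019, 2018, 2019),
  v2x_egaldem  = c(0.70, 0.68, 0.55, 0.56),
  v2x_corr     = c(0.10, 0.11, 0.60, 0.58)
)

# Toy stand-in for the World Bank indicators (also invented).
wb <- data.frame(
  country   = c("Chile", "Chile", "Peru"),
  year      = c(2018, 2019, 2018),
  inflation = c(2.4, 2.6, 1.3),
  gdp       = c(3.7, 0.9, 4.0)
)

# Keep a slice of V-Dem columns and rename them as in the codebook.
vdem_slice <- data.frame(
  country    = vdem$country_name,
  year       = vdem$year,
  egal_dem   = vdem$v2x_egaldem,
  corruption = vdem$v2x_corr
)

# Left join on country-year: every V-Dem row is kept, and World Bank
# values are NA where no match exists (an assumed join type).
merged <- merge(vdem_slice, wb, by = c("country", "year"), all.x = TRUE)
merged <- merged[order(merged$country, merged$year), ]

# write.csv(merged, "data/tidy/vdem_wb.csv", row.names = FALSE)
print(merged)
```

In this toy example the Peru 2019 row has no World Bank match, so its `inflation` and `gdp` values are `NA` after the merge.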
Stage 2. Run R script "stage_2_regularisation.R" from the "code" folder. This script contains the demonstration for R. Alternatively, it is possible to run "regularisation_demo.Rmd" from "demo", in which case the files in the "demo/regularisation_demo_files" subfolder will be overwritten.
It is possible to run the code from the second stage onward to check the methods directly. Considering the volume of V-Dem data (182 MB), running the first script takes some time.
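To give a feel for what the second stage demonstrates, here is a minimal base-R sketch of ridge regression with 10-fold cross-validation. It is not the author's script, which relies on the libraries listed under Software. The data are simulated, the column names merely mirror the codebook, and `ridge_fit()` and `cv_mse()` are hypothetical helpers written for this example.

```r
set.seed(2022)

# Simulated stand-in data; names follow the codebook, values are synthetic.
n <- 200
sim <- data.frame(
  corruption = runif(n),
  military   = runif(n),
  free_exp   = runif(n),
  fed_uni    = runif(n)
)
sim$egal_dem <- 0.5 - 0.3 * sim$corruption + 0.3 * sim$free_exp +
  rnorm(n, sd = 0.05)

# For brevity, predictors are standardised and the outcome is centred
# once on the full sample, so no intercept is estimated.
x <- scale(as.matrix(sim[, c("corruption", "military", "free_exp", "fed_uni")]))
y <- sim$egal_dem - mean(sim$egal_dem)

# Closed-form ridge estimate: (X'X + lambda * I)^(-1) X'y.
ridge_fit <- function(x, y, lambda) {
  solve(crossprod(x) + lambda * diag(ncol(x)), crossprod(x, y))
}

# k-fold cross-validated mean squared error for one value of lambda.
cv_mse <- function(x, y, lambda, folds) {
  errs <- sapply(sort(unique(folds)), function(k) {
    test <- folds == k
    beta <- ridge_fit(x[!test, , drop = FALSE], y[!test], lambda)
    mean((y[test] - x[test, , drop = FALSE] %*% beta)^2)
  })
  mean(errs)
}

folds   <- sample(rep(1:10, length.out = n))
lambdas <- 10^seq(-2, 3, length.out = 30)
mse     <- sapply(lambdas, cv_mse, x = x, y = y, folds = folds)

best_lambda <- lambdas[which.min(mse)]
beta_ols    <- ridge_fit(x, y, 0)
beta_big    <- ridge_fit(x, y, max(lambdas))

# Larger penalties shrink the coefficient vector towards zero.
print(best_lambda)
print(cbind(ols = drop(beta_ols), heavy_penalty = drop(beta_big)))
```

With "glmnet" installed, a comparable cross-validated ridge fit could be obtained with `cv.glmnet(x, y, alpha = 0)`, and a lasso fit with `alpha = 1`. The penalty scale differs from the closed form above, so the selected values of lambda are not directly comparable.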
Codebook
The file "vdem_wb.csv" in "data/tidy" subfolder is the merged, sliced data set from V-Dem and World Bank (N = 27,013). This set contains country-year observations from 1789 to 2019 of 202 countries.[^3]
country (country_name in V-Dem). Country name.
year. Year variable.
egal_dem (v2x_egaldem in V-Dem). The egalitarian democracy index considers freedoms protected across all social groups, resources distributed equally across all social groups, and equal access to power. It also takes into account the level of electoral democracy.
corruption (v2x_corr in V-Dem). The political corruption index measures how pervasive political corruption is and considers measures of six distinct types of corruption from different areas of the political field, distinguishing between executive, legislative, and judicial corruption.
military (v2x_ex_military in V-Dem). The military dimension index measures whether the chief executive's power base is determined by the military, based on appointments made through coups or rebellions and on whether the military can remove them.
free_exp (v2x_freexp in V-Dem). The freedom of expression index reflects the government's level of respect for press and media freedom, the freedom to discuss political matters in the public sphere, and freedom of academic and cultural expressions.
fed_uni (v2x_feduni in V-Dem). The division of power index reflects whether local and regional governments are elected and how independent they are in the decision-making process.
inflation (FP.CPI.TOTL.ZG in World Bank API). Inflation based on consumer prices (annual percentage).
gdp (NY.GDP.MKTP.KD.ZG in World Bank API). GDP growth (annual percentage).
gdp_pc (NY.GDP.PCAP.KD.ZG in World Bank API). GDP per capita growth (annual percentage).
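The variable list above can also be used as a quick structural check on a data frame. The sketch below is only an illustration: `check_codebook()` is a hypothetical helper, the toy data frame holds invented values, and we have not verified that the real file contains exactly these columns.

```r
# Column names as listed in the codebook above.
codebook_vars <- c("country", "year", "egal_dem", "corruption", "military",
                   "free_exp", "fed_uni", "inflation", "gdp", "gdp_pc")

# Hypothetical helper: report columns that are missing or unexpected.
check_codebook <- function(df, expected = codebook_vars) {
  list(
    missing    = setdiff(expected, names(df)),
    unexpected = setdiff(names(df), expected)
  )
}

# Toy data frame standing in for the real file (values are invented).
toy <- data.frame(country = "Chile", year = 2019, egal_dem = 0.68,
                  corruption = 0.11, inflation = 2.6)

result <- check_codebook(toy)
print(result$missing)

# With the repository checked out, the same check could be applied to
# the real file:
# vdem_wb <- read.csv("data/tidy/vdem_wb.csv")
# check_codebook(vdem_wb)
```

Running the check on the toy data frame lists the codebook variables it lacks, which is the behaviour one would want before fitting any model to the real file.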
License
This R code and the merged, sliced data set from V-Dem and the World Bank are released under a Creative Commons Attribution 4.0 International license (CC BY 4.0). This open-access license allows the data to be shared, reused, and adapted as long as appropriate acknowledgement is given.
Contribute
Contributions are entirely welcome. You just need to open an issue with your comment or idea.
For more substantial contributions, please fork this repository and make changes. Pull requests are also welcome.
Please read our code of conduct first. Minor contributions will be acknowledged, and significant ones will be considered for our contributor roles taxonomy.
Citation
González-Bustamante, B. (2022). Regularisation and Cross-Validation of Determinants of Egalitarian Democracy: Demonstration for R (Version 1.2.1 -- White Waterfall) [Computer software]. DOI: 10.5281/zenodo.5708892
Author
Bastián González-Bustamante
bastian.gonzalezbustamante@politics.ox.ac.uk
ORCID iD 0000-0003-1510-6820
https://bgonzalezbustamante.com
CRediT - Contributor Roles Taxonomy
Bastián González-Bustamante (ORCID iD 0000-0003-1510-6820): Conceptualisation, data curation, formal analysis, methodology, project administration, resources, software, validation, and visualisation.
Latest Revision
[^1]: V-Dem [Country–Year/Country–Date] Dataset v10 (Coppedge et al., 2020; DOI: 10.23696/vdemds20) is downloaded from our OSF-project on COVID-19 in South America (DOI: 10.17605/OSF.IO/6FM7X) during the first stage. However, it is important to bear in mind that the current data set is v11.1.
[^2]: Data downloaded during the first stage from the World Bank API.
[^3]: This data set has been updated running the first stage code on February 12, 2022 (v1.2.1 -- White Waterfall). Therefore, there are slight differences with the previous version compiled on November 30, 2021 (v1.1.1 -- Autumn Mode). The latter is used in the demonstration and is stored in the "data/raw" subfolder.
Owner
- Name: Bastián González-Bustamante
- Login: bgonzalezbustamante
- Kind: user
- Location: Oxford
- Company: University of Oxford
- Website: https://bgonzalezbustamante.com
- Twitter: bastiangb
- Repositories: 8
- Profile: https://github.com/bgonzalezbustamante
DPhil (PhD) in Politics programme, Department of Politics and International Relations and St Hilda's College, University of Oxford.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this code, please cite it as below."
authors:
  - family-names: "González-Bustamante"
    given-names: "Bastián"
    orcid: "https://orcid.org/0000-0003-1510-6820"
title: "Regularisation and Cross-Validation of Determinants of Egalitarian Democracy: Demonstration for R"
version: "1.2.1 -- White Waterfall"
doi: 10.5281/zenodo.5708892
date-released: 2022-02-12
url: "https://github.com/bgonzalezbustamante/demo-regularisation"
type: software-code