pandemic-immigrant-mortality
Replication kit for "Immigrant Mortality Advantage in the United States during the First Year of the COVID-19 Pandemic" in Demographic Research
https://github.com/eugeniopaglino/pandemic-immigrant-mortality
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (9.8%) to scientific vocabulary
Keywords
immigration
mortality
r
Last synced: 6 months ago
·
JSON representation
·
Repository
Replication kit for "Immigrant Mortality Advantage in the United States during the First Year of the COVID-19 Pandemic" in Demographic Research
Basic Info
- Host: GitHub
- Owner: eugeniopaglino
- Default Branch: main
- Homepage: https://www.demographic-research.org/articles/volume/50/7
- Size: 25.1 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
immigration
mortality
r
Created over 3 years ago
· Last pushed about 2 years ago
Metadata Files
Readme
Citation
README.Rmd
---
title: "Immigrant Mortality Advantage in the United States during the First Year of the COVID-19 Pandemic"
author: "Eugenio Paglino and Irma T. Elo"
output: github_document
always_allow_html: true
---
```{r, echo=F, include=F}
knitr::opts_chunk$set(
echo = FALSE,
warning = FALSE,
message = FALSE
)
```
```{r}
# Loading necessary packages
library(here)
library(vtable)
library(labelled)
library(fs)
library(tidyverse)
```
```{r}
# Do not rely on this to completely clean your environment
# Better to do a full restart of R before running
rm(list=ls())
i_am('README.Rmd')
```
# Immigrant Mortality during the COVID-19 Pandemic
We used death records for 2017-2020 from the National Center for Health Statistics (NCHS) under a data user agreement. We classified deaths by sex, age, bridged race, Hispanic origin, nativity, and cause of death. Deaths of US-born residents include deaths of individuals who were born and resided in the 50 US states, and deaths of foreign-born residents include deaths of foreign-born individuals residing in the 50 US states.
We obtained US population counts by age, bridged race, Hispanic origin, and sex for 2017-2020 from CDC WONDER. To estimate population by nativity we pooled the 2017-2019 American Community Survey (ACS) 1-year files to estimate the proportion foreign-born by age, sex, race, and Hispanic origin. We then applied these proportions to the population counts to obtain populations by 10-year age groups, sex, race, Hispanic origin, and nativity. Because the pandemic affected the 2020 ACS data collection, we applied the 2019 proportions to the 2020 Census population estimates. Supplementary Tables 1a and 1b present the number of deaths and populations by nativity, race, Hispanic origin, and sex for broad age groups in 2017-2019 and 2020.
We included nine exhaustive and mutually exclusive cause-of-death categories based on the underlying cause of death. These are respiratory diseases, circulatory diseases, cancers, Alzheimer disease and other dementias, diabetes, COVID-19, external causes, and all other causes combined (Supplementary Table 2). We chose these causes because mortality from them increased during the pandemic.
Using the average 2017-2020 age distribution as the standard, we calculated age-standardized death rates (ASDR) by sex, race, Hispanic-origin, and nativity for all causes combined, COVID-19, and all causes other than COVID-19 at ages 25+, 25-64, and 65+ and by the more detailed causes at ages 25+ for 2017-2019 and 2020. We pooled three pre-pandemic years to adjust for year-to-year fluctuations in death rates. For each ASDR, we computed standard errors (SD) and coefficients of variation (SD/ASDR), which are included in the Supplementary Material. Because the size our smallest group (US-born Asian females in 2020) exceeds one million, all standard errors are small, and the largest coefficient of variation is 3.75%.
Based on these data, we examined the contribution to COVID-19 and all other causes of death, other than COVID-19, to the change in all-cause mortality between 2017-2019 and 2020 by sex, nativity, and race/ethnicity at ages 25+, 25-64 and 65+. We further decomposed the cause-of-death contributions by the more detailed cause-of-death categories to US-born and foreign-born difference in ASDRs at ages 25+ by race/ethnicity and sex in 2017-2019 and 2020 and their contributions to the change in these differences between 2017-2019 and 2020.
## Repository Structure
In this repository we stored all the codes needed to reproduce the figures and tables in the paper as well as supplementary material. Here is the repository structure.
```{r}
dir_tree(
path=here(),
regexp='ACS20201Year|ACS20211Year|mort2021|mortPop|utilities|CDCComorbidities|presentation',
invert=T
)
```
The `data/output` folder contains all the datasets produced in the project. The most important ones are the `ageStdRates*.csv` files that store the age standardized crude death rates by sex, race/ethnicity, nativity, cause of death, and period together with uncertainty measures for each rate. The structure of these files is described below. The remaining files are produced in the project as intermediate steps. The `figures` folder contains all figures and tables of the project. Finally, the `R` folder contains all the code used to clean and analyse the data. This repository does not contain the individual level death records because they are not publicly available. The public version of these data, in which information on place of birth has been masked, can be obtained from the [Vital Statistics Online Data Portal](https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm#Mortality_Multiple). They have the same structure as the data we used.
## Content of Main Data Files
The table below details the structure of the `ageStdRates*.csv` files.
```{r variable descriptions}
variableDescriptions <-
list(
period = 'Years or year for which the mortality measure was computed.',
sex = 'Sex',
raceEthnicity ='Race/Ethnicity',
foreign = 'Whether the mortality measure was computed for the Foreign-Born or the US-Born population.',
majorCauses = 'Group of the underlying cause of death reported on the death certificates.',
age = 'Ages over which the mortality measure was computed, 100 is used as an upper bound.',
deaths = 'Number of deaths.',
pop = 'Mid-period population.',
ASDR = 'Age-Standardized Crude Death Rate.',
logASDR = 'Logarithm of the Age-Standardized Crude Death Rate.',
ASDR.SD = 'Standard deviation of the ASDR computed using the formula in Chiang (1984) “The Life Table and Its Application” (pp. 105-106).',
coefOfVar = 'Coefficient of variation for the ASDR computed as ASDR.SD/ASDR.'
)
```
```{r data dictionary function}
# Code adapted from an original script by Joe Wasserman https://github.com/mymil
autoDictionary <- function(.file_path,
.labels = NA,
parent_dir = FALSE,
...) {
.title <- str_remove(basename(.file_path), ".csv")
if(parent_dir) {
dir <- str_extract(dirname(.file_path), "[[:alpha:]]+$")
.title <- glue::glue("{dir}/{.title}")
}
message(glue::glue("Processing { .title }"))
.data <- readr::read_csv(
file.path(.file_path),
lazy = TRUE
)
datLabelled <- .data %>%
labelled::set_variable_labels(
.,
.labels = .labels,
.strict = FALSE,
...
)
rm(.data)
datLabelled %>%
vtable::vtable(
.,
out = "kable",
data.title = .title,
lush = TRUE,
desc = "",
...
) %>%
kableExtra::kable_styling(
font_size = 12
) %>%
print()
rm(datLabelled)
}
```
```{r print dictionary, results='asis'}
fileNames <- list.files(
here::here('data', 'output'),
pattern = 'age.+\\.csv$',
full.names = TRUE,
recursive = TRUE
)
walk(
fileNames,
~ autoDictionary(
.x,
.labels = variableDescriptions,
parent_dir = TRUE
)
)
```
## Description of R Code
Below we briefly describe each of the R files. Ideally, they would be run in the order in whcih they are listed below.
- `createMortTable.Rmd`: Reads and cleans the death records creating death counts.
- `createPopTableCDC.Rmd`: Reads and cleans the population data from CDC.
- `createPropTable.Rmd`: Reads and cleans the ACS data to compute proportion foreign-born by sex, race, Hispanic origin, age, and year to be applied to the CDC population estimates.
- `createFinalTable.Rmd`: Combines the death and population counts in a unique cleaned file.
- `sampleAndSummaryTables.Rmd`: Computes age-specific and age-standardized rates and creates the sample and summary tables (Supplementary Tables 1a, 1b, 3a, and 3b).
- `mortalityDifferentialsPlotsNoRatios.Rmd`: Creates Figure 1 and Figure 2.
- `compWithWhitesPlots.Rmd`: Creates Figure 3 and the corresponding Supplementary Table 4.
- `contributionsPlots.Rmd`: Creates Figure 4a and 4b and the corresponding Supplementary Tables 5a and 5b.
As a note to help readers navigate the scripts, we used a functional approach, defining the functions to create each figure and table at the start of the file and then running the functions and saving the output below.
Owner
- Name: Eugenio Paglino
- Login: eugeniopaglino
- Kind: user
- Repositories: 4
- Profile: https://github.com/eugeniopaglino
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: ' Replication kit for "Immigrant mortality advantage in the United States during the first year of the COVID-19 pandemic"'
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Eugenio
family-names: Paglino
email: paglino@sas.upenn.edu
affiliation: University of Pennsylvania
orcid: 'https://orcid.org/0000-0001-8733-6460'
- given-names: Irma T.
family-names: Elo
email: popelo@pop.upenn.edu
affiliation: University of Pennsylvania
repository-code: >-
https://github.com/eugeniopaglino/pandemic-immigrant-mortality
abstract: >-
OBJECTIVES
Eugenio Paglino1 Irma T. Elo2
To investigate the mortality impact of the COVID-19
pandemic on US-born and foreign- born populations by race
and Hispanic origin in the United States in 2020.
METHODS
Death records from the National Center for Health
Statistics and population data from CDC WONDER were used
to estimate (1) age-standardized all-cause and
cause-specific mortality at ages 25+, 25–64, and 65+ in
2017–2019 and 2020 by nativity, race, Hispanic origin, and
sex; (2) changes in mortality between these two periods;
and (3) the cause- specific contributions to these
changes.
RESULTS
Mortality increased in 2020 relative to 2017–2019 for all
racial and Hispanic-origin groups. Adjusting for age,
mortality increases were larger at ages 25+ among foreign-
born males (390 deaths for 100,000) and females (189) than
among US-born males (223) and females (144). The large
mortality rise among foreign-born Hispanic men (593)
contributed to the narrowing of their mortality advantage
relative to White men, from 426 to 134. An increase in
mortality among both foreign-born and US-born Black males
and females increased the Black–White mortality
disparities by 318 for males and by 180 for females.
Although COVID-19 mortality was the main driver of the
increase among foreign-born residents, circulatory
diseases and malignant neoplasms also contributed.
CONTRIBUTION
We show that the COVID-19 pandemic had a greater impact on
foreign-born populations than on their US-born
counterparts. These findings highlight the need to address
the underlying inequalities and unique challenges faced by
foreign-born populations.
keywords:
- Mortality
- Immigrant Mortality Advantage
- COVID-19