Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting
Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting - Published in JOSS (2018)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: joss.theoj.org
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Last synced: 6 months ago
Repository
Blending a Probability Sample and a Non-Probability Sample in a Capture-Recapture Setting
Basic Info
- Host: GitHub
- Owner: williamsbenjamin
- License: gpl-3.0
- Language: R
- Default Branch: master
- Size: 338 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 0
- Releases: 2
Created over 7 years ago · Last pushed almost 5 years ago
Metadata Files
Readme
License
README.Rmd
---
output: github_document
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
# blendR
[DOI: 10.21105/joss.00886](https://doi.org/10.21105/joss.00886)
The goal of blendR is to provide statistically valid estimators of a population total, along with their standard errors, when blending a non-probability sample with a probability sample. The two samples are treated as a capture-recapture pair: the non-probability sample is the capture sample and the probability sample is the recapture sample. The package is based on research by Liu et al. (2017), Breidt, Opsomer, and Huang (2018), and the package author's dissertation research. Additional estimators will be released in future versions.
Texas Parks and Wildlife currently uses these estimators to estimate the total number of Red Snapper caught. Combining data sources is an active research area because of the prevalence of big data in both industry and academia, and the estimators extend readily to other settings, for example the internet of things, insurance claims, and estimating the death toll of a natural disaster.
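For readers new to capture-recapture, the underlying idea can be illustrated with the classic Lincoln-Petersen estimate of a population size. This is a minimal textbook sketch, not one of the blendR estimators: the function `lincoln_petersen()` and the counts are hypothetical, and the sketch ignores the survey design that blendR accounts for.

```r
# Classic Lincoln-Petersen estimate of a population size.
# n_capture:   units in the capture sample (here, self-reports)
# n_recapture: units in the recapture sample (here, dockside intercepts)
# n_both:      units observed in both samples
lincoln_petersen <- function(n_capture, n_recapture, n_both) {
  stopifnot(n_both > 0, n_both <= min(n_capture, n_recapture))
  n_capture * n_recapture / n_both
}

# Made-up counts, for illustration only
lincoln_petersen(n_capture = 200, n_recapture = 150, n_both = 30)
#> [1] 1000
```

The smaller the overlap between the two samples, the larger the implied population, which is why the recapture sample has to measure the overlap reliably.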
## Installation
You can install blendR from GitHub with:
```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("williamsbenjamin/blendR")
```
## Example
This example uses data from a capture-recapture sampling program run by Texas Parks and Wildlife in 2016.
Captains could voluntarily self-report their Red Snapper catch via a smartphone app (the non-probability sample), and their boats could also be selected in a dockside intercept sample (the probability sample). The self-reports form the capture sample, and the dockside intercept forms the recapture sample.
```{r example}
library(tibble)
library(blendR)
## Boats sampled in the dockside intercept; where a captain also
## self-reported, the self-reported data are included as well
red_snapper_sampled
## Boats whose captains self-reported
self_reports
## Survey design for the dockside (probability) sample
s_design <- survey::svydesign(ids = ~psu,
                              strata = ~stratum,
                              probs = ~prob,
                              nest = TRUE,
                              data = red_snapper_sampled)
## Estimate the total; capture_units is the number of self-reports
t_p(data = red_snapper_sampled,
    recapture_total = number_caught_ps,
    captured = captured_indicator,
    survey_design = s_design,
    capture_units = nrow(self_reports))
```
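To make the roles of these arguments more concrete, here is a minimal base-R sketch of a ratio-type capture-recapture estimate of a total on made-up data. It is an assumption for illustration, not a confirmed description of what `t_p()` computes internally: the data frame `toy_sample` and the helper `ratio_total()` are hypothetical, only the column names are borrowed from the example above, and no standard error is computed.

```r
# Hypothetical toy recapture (dockside) sample; all values are made up.
toy_sample <- data.frame(
  prob               = c(0.10, 0.10, 0.20, 0.20, 0.25),  # inclusion probabilities
  number_caught_ps   = c(12, 8, 20, 5, 10),              # catch observed dockside
  captured_indicator = c(1, 0, 1, 1, 0)                  # 1 if the boat also self-reported
)

# Scale the number of self-reports by the ratio of the design-weighted
# catch to the design-weighted count of boats that self-reported.
ratio_total <- function(data, capture_units) {
  w <- 1 / data$prob                                  # design weights
  weighted_catch    <- sum(w * data$number_caught_ps)
  weighted_reported <- sum(w * data$captured_indicator)
  capture_units * weighted_catch / weighted_reported
}

# Assume 40 self-reports were received in total (made-up number)
ratio_total(toy_sample, capture_units = 40)
#> [1] 730
```

In this form the design weights enter only through a ratio, and the size of the capture sample sets the overall scale of the estimate.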
Owner
- Name: Benjamin Williams
- Login: williamsbenjamin
- Kind: user
- Location: Denver
- Website: www.statswithben.com
- Repositories: 2
- Profile: https://github.com/williamsbenjamin
Assistant Professor, Business Information & Analytics, University of Denver
JOSS Publication
Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting
Published
August 14, 2018
Volume 3, Issue 28, Page 886
Tags
non-probability sampling, combining data sources, capture-recapture sampling
GitHub Events
Total
Last Year
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| williamsbenjamin | b****n@s****u | 66 |
| mark padgham | m****m@e****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 1 year
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- mpadge (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
DESCRIPTION
cran
- stats * imports
- knitr * suggests
- rmarkdown * suggests
- testthat * suggests
- tibble * suggests
