Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting

Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting - Published in JOSS (2018)

https://github.com/williamsbenjamin/blendr

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Last synced: 11 months ago

Repository

Blending a Probability Sample and a Non-Probability Sample in a Capture-Recapture Setting

Basic Info

Host: GitHub
Owner: williamsbenjamin
License: gpl-3.0
Language: R
Default Branch: master
Size: 338 KB

Statistics

Stars: 0
Watchers: 0
Forks: 1
Open Issues: 0
Releases: 2

Created about 8 years ago · Last pushed over 5 years ago

Metadata Files

Readme License

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# blendR
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00886/status.svg)](https://doi.org/10.21105/joss.00886)

The goal of blendR is to provide statistically valid estimators of a total (with standard errors) when blending a non-probability sample with a probability sample. The two samples are treated as a capture-recapture pair: the non-probability sample is the capture sample and the probability sample is the recapture sample. This package is based on research by Liu et al. (2017), Breidt, Opsomer, and Huang (2018), and the package author's dissertation research. Additional estimators will be released in future versions.
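
As a rough illustration of the capture-recapture idea described above, and not necessarily the exact form of any estimator in the package, a Lincoln-Petersen-style ratio estimator of a total can be written as below. All symbols are introduced here for illustration only: $n_1$ is the number of units in the capture (non-probability) sample, $s$ is the recapture (probability) sample with design weights $w_i$, $y_i$ is the value observed for unit $i$, and $r_i$ is 1 if unit $i$ also appears in the capture sample and 0 otherwise.

$$
\hat{t} = n_1 \, \frac{\sum_{i \in s} w_i \, y_i}{\sum_{i \in s} w_i \, r_i}
$$

With equal weights and $y_i = 1$ for every unit, this reduces to the classical Lincoln-Petersen population-size estimator $n_1 n_2 / m$, where $n_2$ is the recapture sample size and $m$ is the number of recaptured units that were also captured.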

These estimators are currently used by Texas Parks and Wildlife to estimate the total number of Red Snapper caught. Combining data sources is an important research area because of the prevalence of big data in both industry and academia. The estimators extend readily to other settings, for example the internet of things, insurance claims, and estimating the death toll of a natural disaster.

## Installation

You can install blendR from GitHub with:

```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("williamsbenjamin/blendR")
```

## Example

This example uses data from a 2016 capture-recapture sampling program run by Texas Parks and Wildlife. Captains could voluntarily self-report their Red Snapper catch via a smartphone app (the non-probability sample) and could also be sampled in a dockside intercept survey (the probability sample). The self-reports are the capture sample and the dockside intercept is the recapture sample.

```{r example}
library(tibble)
library(blendR)

## Boats sampled in the dockside intercept. If a boat's captain also
## self-reported, the self-reported data are included as well.
red_snapper_sampled

## Self-reported boats
self_reports

## Survey design object for the dockside intercept (probability) sample
s_design <- survey::svydesign(ids = ~psu,
                              strata = ~stratum,
                              probs = ~prob,
                              nest = TRUE,
                              data = red_snapper_sampled)

## Call t_p(), passing the number of self-reports as the capture units
t_p(data = red_snapper_sampled,
    recapture_total = number_caught_ps,
    captured = captured_indicator,
    survey_design = s_design,
    capture_units = nrow(self_reports))
```
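
To make the arithmetic behind a capture-recapture total concrete, here is a minimal base-R sketch on made-up numbers. It is an assumption-laden illustration, not the package's implementation: the helper `ratio_total()` and the toy data frame are hypothetical, the data are not the Texas Parks and Wildlife data, and no standard error is computed. The sketch scales the number of self-reports by the ratio of two weighted sample totals: estimated total catch over the estimated number of self-reporting boats.

```r
# Hypothetical helper (not part of blendR): Lincoln-Petersen-style ratio
# estimator of a total from a weighted recapture sample.
ratio_total <- function(weight, y, captured, n_capture) {
  t_hat_y <- sum(weight * y)         # weighted estimate of the total of y
  n_hat_1 <- sum(weight * captured)  # weighted estimate of the number of captured units
  n_capture * t_hat_y / n_hat_1
}

# Toy recapture (dockside) sample; all values are invented for illustration.
sampled <- data.frame(
  weight   = c(10, 10, 20, 20, 20),  # design weights (1 / inclusion probability)
  catch    = c(4, 0, 7, 3, 6),       # fish counted at the dock
  reported = c(1, 0, 1, 0, 1)        # 1 if the captain also self-reported
)
n_reports <- 60  # invented size of the capture (self-report) sample

est <- ratio_total(sampled$weight, sampled$catch, sampled$reported, n_reports)
est
#> [1] 432
```

Here the weighted catch total is 360 and the weighted number of reporting boats is 50, so the estimate is 60 * 360 / 50 = 432. A design-based standard error would need the stratum and PSU structure, which this sketch deliberately leaves out.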

Owner

Name: Benjamin Williams
Login: williamsbenjamin
Kind: user
Location: Denver

Website: www.statswithben.com
Repositories: 2
Profile: https://github.com/williamsbenjamin

Assistant Professor, Business Information & Analytics, University of Denver

JOSS Publication

Combining a Probability and a Non-Probability Sample in a Capture-Recapture Setting

Published

August 14, 2018

DOI

10.21105/joss.00886

Volume 3, Issue 28, Page 886

Authors

Benjamin Williams

Department of Statistical Science, Southern Methodist University

Editor

Thomas J. Leeper

Committers

Last synced: 12 months ago

All Time

Total Commits: 67
Total Committers: 2
Avg Commits per committer: 33.5
Development Distribution Score (DDS): 0.015

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
williamsbenjamin	b**n@s**u	66
mark padgham	m**m@e**m	1

Committer Domains (Top 20 + Academic)

email.com: 1, smu.edu: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 0
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: about 1 year
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 1.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Pull Request Authors

mpadge (1)

Dependencies

DESCRIPTION cran

stats * imports
knitr * suggests
rmarkdown * suggests
testthat * suggests
tibble * suggests
