cohortconstructor

https://github.com/ohdsi/cohortconstructor

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (Found codemeta.json file)
✓ .zenodo.json file (Found .zenodo.json file)
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (Low similarity (18.4%) to scientific vocabulary)

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: OHDSI
License: apache-2.0
Language: R
Default Branch: main
Homepage: https://ohdsi.github.io/CohortConstructor/
Size: 73 MB

Statistics

Stars: 7
Watchers: 42
Forks: 1
Open Issues: 44
Releases: 6

Created almost 3 years ago · Last pushed 11 months ago

Metadata Files

Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, warning = FALSE, message = FALSE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# CohortConstructor 


[![CRAN status](https://www.r-pkg.org/badges/version/CohortConstructor)](https://CRAN.R-project.org/package=CohortConstructor)
[![R-CMD-check](https://github.com/OHDSI/CohortConstructor/workflows/R-CMD-check/badge.svg)](https://github.com/OHDSI/CohortConstructor/actions)
[![Codecov test coverage](https://codecov.io/gh/OHDSI/CohortConstructor/branch/main/graph/badge.svg)](https://app.codecov.io/gh/OHDSI/CohortConstructor?branch=main)
[![Lifecycle:Experimental](https://img.shields.io/badge/Lifecycle-Experimental-339999)](https://lifecycle.r-lib.org/articles/stages.html#experimental)



The goal of CohortConstructor is to support the creation and manipulation of study cohorts in data mapped to the OMOP CDM.  

## Installation

The package can be installed from CRAN:
```{r, eval = FALSE}
install.packages("CohortConstructor")
```

Or you can install the development version of the package from GitHub:

```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("ohdsi/CohortConstructor")
```

## Creating and manipulating cohorts

To illustrate the functionality provided by CohortConstructor let's create a cohort of people with a fracture using the Eunomia dataset. We'll first load required packages and create a cdm reference for the data.

```{r, message=FALSE, warning=FALSE}
library(omopgenerics)
library(CDMConnector)
library(PatientProfiles)
library(dplyr)
library(CohortConstructor)
library(CohortCharacteristics)
```

```{r, message=TRUE, warning=FALSE}
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())
cdm <- cdmFromCon(con, cdmSchema = "main", 
                    writeSchema = c(prefix = "my_study_", schema = "main"))
cdm
```

### Generating concept-based fracture cohorts

We will first need to identify codes that could be used to represent fractures of interest. To find these we'll use the CodelistGenerator package (note, we will just find a few codes because we are using synthetic data with a subset of the full vocabularies). 

```{r, message=TRUE}
library(CodelistGenerator)

hip_fx_codes <- getCandidateCodes(cdm, "hip fracture")
forearm_fx_codes <- getCandidateCodes(cdm, "forearm fracture")

fx_codes <- newCodelist(list("hip_fracture" = hip_fx_codes$concept_id,
                             "forearm_fracture" = forearm_fx_codes$concept_id))
fx_codes
```
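As a rough illustration of the shape of such a codelist, the sketch below builds a plain named list in base R. The concept IDs are made-up placeholders, not real OMOP concepts, and the assumption that each list name ends up as one cohort name is inferred from the examples in this README rather than from the package documentation.

```r
# Hypothetical concept IDs, used only to show the structure of a codelist.
fx_codes_sketch <- list(
  hip_fracture = c(1001L, 1002L),
  forearm_fracture = c(2001L)
)

# Each name is assumed to become one cohort; each vector holds its concept IDs.
names(fx_codes_sketch)
lengths(fx_codes_sketch)
```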

Now we can quickly create our cohorts. We only need to provide the codes we have defined to get a cohort back. We start by setting cohort exit to the same day as event start (the date of the fracture).

```{r}
cdm$fractures <- cdm |> 
  conceptCohort(conceptSet = fx_codes, 
                exit = "event_start_date", 
                name = "fractures")
```

After creating our initial cohort we will update it so that exit is set at up to 180 days after start (so long as individuals' observation end date is on or after this - if not, exit will be at observation period end).

```{r}
cdm$fractures <- cdm$fractures |> 
  padCohortEnd(days = 180)
```
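The capping rule described above can be sketched in base R on toy data. This is only an illustration of the date arithmetic, not the package's implementation; the `observation_end` vector and all dates are hypothetical.

```r
# Toy cohort: exit initially equals the event start date.
cohort <- data.frame(
  subject_id = c(1L, 2L),
  cohort_start_date = as.Date(c("2010-01-01", "2015-06-01")),
  cohort_end_date = as.Date(c("2010-01-01", "2015-06-01"))
)

# Hypothetical observation period end dates for the same two people.
observation_end <- as.Date(c("2012-12-31", "2015-07-15"))

# Add 180 days to the exit date, then cap it at the end of observation.
cohort$cohort_end_date <- pmin(cohort$cohort_end_date + 180, observation_end)
cohort
```

Person 1 stays in observation long enough to get the full 180 days, while person 2 exits at their observation period end.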

We can see that our starting cohorts, before we add any additional restrictions, have the following associated settings, counts, and attrition.

```{r}
settings(cdm$fractures) |> glimpse()
cohortCount(cdm$fractures) |> glimpse()
attrition(cdm$fractures) |> glimpse()
```

### Create an overall fracture cohort

So far we have created two separate fracture cohorts (hip and forearm). Let's say we also want a cohort of people with any of the fractures. We could union our two cohorts to create this overall cohort like so:

```{r, message=FALSE}
cdm$fractures <- unionCohorts(cdm$fractures,
                              cohortName = "any_fracture", 
                              keepOriginalCohorts = TRUE,
                              name = "fractures")
```

```{r, message=FALSE}
settings(cdm$fractures)
cohortCount(cdm$fractures)
```
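To give an intuition for what a union of cohorts involves, the base R sketch below pools toy entries and merges overlapping records for the same person into a single entry. It assumes overlapping records are collapsed; the helper `merge_overlaps()` and the data are hypothetical and are not part of CohortConstructor.

```r
# Toy entries pooled from several cohorts; person 1 has overlapping records.
entries <- data.frame(
  subject_id = c(1L, 1L, 2L),
  cohort_start_date = as.Date(c("2010-01-01", "2010-03-01", "2012-05-01")),
  cohort_end_date = as.Date(c("2010-06-30", "2010-08-28", "2012-10-28"))
)

# Merge overlapping intervals for one person (illustrative helper).
merge_overlaps <- function(person) {
  person <- person[order(person$cohort_start_date), ]
  starts <- as.numeric(person$cohort_start_date)
  ends <- as.numeric(person$cohort_end_date)
  # A new group begins when a record starts after every earlier record ended.
  previous_end <- c(-Inf, head(cummax(ends), -1))
  group <- cumsum(starts > previous_end)
  data.frame(
    subject_id = person$subject_id[1],
    cohort_start_date = as.Date(as.vector(tapply(starts, group, min)),
                                origin = "1970-01-01"),
    cohort_end_date = as.Date(as.vector(tapply(ends, group, max)),
                              origin = "1970-01-01")
  )
}

any_fracture <- do.call(
  rbind,
  lapply(split(entries, entries$subject_id), merge_overlaps)
)
rownames(any_fracture) <- NULL
any_fracture
```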

### Require in date range

Once we have created our base fracture cohort, we can then start applying additional cohort requirements. For example, first we can require that individuals' cohort start date fall within a certain date range.

```{r}
cdm$fractures <- cdm$fractures |> 
  requireInDateRange(dateRange = as.Date(c("2000-01-01", "2020-01-01")))
```

Now that we've applied this date restriction, we can see that our cohort attributes have been updated.

```{r}
cohortCount(cdm$fractures) |> glimpse()
attrition(cdm$fractures) |> 
  filter(reason == "cohort_start_date between 2000-01-01 & 2020-01-01") |> 
  glimpse()
```
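The date-range requirement can be pictured as a simple filter on cohort start date. The base R sketch below uses toy data and assumes both bounds are inclusive; the summary columns at the end are illustrative and only loosely mirror an attrition record.

```r
cohort <- data.frame(
  subject_id = 1:3,
  cohort_start_date = as.Date(c("1995-04-12", "2005-09-30", "2021-02-01"))
)
date_range <- as.Date(c("2000-01-01", "2020-01-01"))

# Keep entries whose start date falls inside the range (bounds assumed inclusive).
in_range <- cohort$cohort_start_date >= date_range[1] &
  cohort$cohort_start_date <= date_range[2]
kept <- cohort[in_range, ]

# One-row summary of what the step removed (column names are illustrative).
step_summary <- data.frame(
  reason = "cohort_start_date between 2000-01-01 & 2020-01-01",
  number_records = nrow(kept),
  excluded_records = nrow(cohort) - nrow(kept)
)
step_summary
```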

### Applying demographic requirements

We can also add restrictions on patient characteristics such as age (on cohort start date by default) and sex.

```{r}
cdm$fractures <- cdm$fractures |> 
  requireDemographics(ageRange = list(c(40, 65)),
                      sex = "Female")
```

Again we can see how many individuals we've lost after applying these criteria.

```{r}
attrition(cdm$fractures) |> 
  filter(reason == "Age requirement: 40 to 65") |> 
  glimpse()

attrition(cdm$fractures) |> 
  filter(reason == "Sex requirement: Female") |> 
  glimpse()
```
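The demographic requirements above amount to computing age at cohort start and filtering on age and sex. The base R sketch below shows one way to do that on toy data; the helper `age_at()` is hypothetical, and the package may compute age differently (for example when only a year of birth is available).

```r
people <- data.frame(
  subject_id = 1:3,
  sex = c("Female", "Male", "Female"),
  date_of_birth = as.Date(c("1960-05-20", "1962-01-10", "1985-11-02")),
  cohort_start_date = as.Date(c("2010-05-19", "2010-03-01", "2010-06-15"))
)

# Age in completed years: subtract one if the birthday has not yet occurred.
age_at <- function(birth, index) {
  b <- as.POSIXlt(birth)
  i <- as.POSIXlt(index)
  before_birthday <- (i$mon < b$mon) | (i$mon == b$mon & i$mday < b$mday)
  (i$year - b$year) - as.integer(before_birthday)
}

people$age <- age_at(people$date_of_birth, people$cohort_start_date)

# Apply the same criteria as above: age 40 to 65 and female.
eligible <- people[people$age >= 40 & people$age <= 65 &
                     people$sex == "Female", ]
eligible
```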

### Require presence in another cohort

We can also require that individuals are (or are not) in another cohort over some window. Here, for example, we require that study participants are not in a GI bleed cohort at any time up to and including their entry into the fractures cohort (`intersections = 0`).

```{r}
cdm$gibleed <- cdm |> 
  conceptCohort(conceptSet = list("gibleed" = 192671L),
                name = "gibleed")

cdm$fractures <- cdm$fractures |> 
  requireCohortIntersect(targetCohortTable = "gibleed",
                         intersections = 0,
                         window = c(-Inf, 0))
```

```{r}
attrition(cdm$fractures) |> 
  filter(reason == "Not in cohort gibleed between -Inf & 0 days relative to cohort_start_date") |> 
  glimpse()
```
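The logic of `intersections = 0` with `window = c(-Inf, 0)` can be sketched in base R: count target cohort entries on or before each index date and keep only records with none. This toy version looks at the target cohort's start date only, which is a simplifying assumption; the data are hypothetical.

```r
fractures <- data.frame(
  subject_id = 1:3,
  cohort_start_date = as.Date(c("2010-05-19", "2012-03-01", "2015-06-15"))
)
gibleed <- data.frame(
  subject_id = c(1L, 3L),
  cohort_start_date = as.Date(c("2008-01-10", "2016-02-01"))
)

# Count GI bleed entries starting on or before each fracture start date,
# i.e. within the window c(-Inf, 0) relative to cohort_start_date.
fractures$n_gibleed <- vapply(seq_len(nrow(fractures)), function(i) {
  sum(gibleed$subject_id == fractures$subject_id[i] &
        gibleed$cohort_start_date <= fractures$cohort_start_date[i])
}, integer(1))

# Zero intersections: keep only entries with no GI bleed in the window.
kept <- fractures[fractures$n_gibleed == 0, ]
kept
```

Person 1 is dropped because of a prior GI bleed, while person 3 is kept because their GI bleed occurs after the fracture.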

```{r}
cdmDisconnect(cdm)
```

### More information
CohortConstructor provides much more functionality for creating and manipulating cohorts. See the package vignettes for more details.

Owner

Name: Observational Health Data Sciences and Informatics
Login: OHDSI
Kind: organization

Website: http://ohdsi.org
Repositories: 285
Profile: https://github.com/OHDSI

GitHub Events

Total

Create event: 84
Release event: 3
Issues event: 211
Watch event: 4
Delete event: 64
Issue comment event: 93
Push event: 306
Pull request review event: 10
Pull request review comment event: 9
Pull request event: 171

Last Year

Create event: 84
Release event: 3
Issues event: 211
Watch event: 4
Delete event: 64
Issue comment event: 93
Push event: 306
Pull request review event: 10
Pull request review comment event: 9
Pull request event: 171

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 281
Total pull requests: 326
Average time to close issues: about 1 month
Average time to close pull requests: 2 days
Total issue authors: 17
Total pull request authors: 10
Average comments per issue: 0.63
Average comments per pull request: 0.19
Merged pull requests: 257
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 132
Pull requests: 188
Average time to close issues: 13 days
Average time to close pull requests: 2 days
Issue authors: 14
Pull request authors: 7
Average comments per issue: 0.29
Average comments per pull request: 0.14
Merged pull requests: 158
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

edward-burn (126)
nmercadeb (62)
catalamarti (32)
martaalcalde (15)
KimLopezGuell (15)
elinrow (6)
ilovemane (5)
daniellenewby (4)
annasaura (3)
xihang-chen (3)
zcuccu (3)
patripedregal (2)
martapineda (1)
cecicampanile (1)
haugmarkus (1)

Pull Request Authors

nmercadeb (135)
edward-burn (108)
catalamarti (24)
elinrow (20)
martaalcalde (12)
cecicampanile (10)
ilovemane (7)
xihang-chen (4)
KimLopezGuell (3)
MimiYuchenGuo (2)
sharmake124 (1)

Top Labels

Issue Labels

enhancement (25) documentation (20) needs discussion (10) bug (9) question (4)

Pull Request Labels

bug (1) enhancement (1) needs discussion (1)

Packages

Total packages: 1
Total downloads:
- cran 1,427 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 12
Total maintainers: 1

cran.r-project.org: CohortConstructor

Build and Manipulate Study Cohorts Using a Common Data Model

Homepage: https://ohdsi.github.io/CohortConstructor/
Documentation: http://cran.r-project.org/web/packages/CohortConstructor/CohortConstructor.pdf
License: Apache License (≥ 2)
Latest release: 0.5.0
published 11 months ago

Versions: 12
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 1,427 Last month

Rankings

Dependent packages count: 28.8%

Dependent repos count: 35.5%

Average: 49.9%

Downloads: 85.4%

Maintainers (1)

edward.burn@ndorms.ox.ac.uk