Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.2%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
Statistics
  • Stars: 5
  • Watchers: 5
  • Forks: 1
  • Open Issues: 25
  • Releases: 7
Created almost 2 years ago · Last pushed 10 months ago
Metadata Files
Readme License

README.Rmd

---
output: github_document
editor_options: 
  markdown: 
    wrap: 72
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, warning = FALSE, message = FALSE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# PhenotypeR 



[![CRAN
status](https://www.r-pkg.org/badges/version/PhenotypeR)](https://CRAN.R-project.org/package=PhenotypeR)
[![R-CMD-check](https://github.com/ohdsi/PhenotypeR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ohdsi/PhenotypeR/actions/workflows/R-CMD-check.yaml)
[![Lifecycle:Experimental](https://img.shields.io/badge/Lifecycle-Experimental-339999)](https://lifecycle.r-lib.org/articles/stages.html#experimental)



The PhenotypeR package helps us to assess the research-readiness of a
set of cohorts we have defined. This assessment includes:

-   ***Database diagnostics*** which help us to better understand the
    database in which the cohorts have been created. This includes
    information about the size of the data, the time period covered, and
    the number of people in the data as a whole. More granular
    information that may influence analytic decisions, such as the
    number of observation periods per person, is also described.\
-   ***Codelist diagnostics*** which help to answer questions like which
    concepts from our codelist are used in the database? Which concepts
    led to individuals' entry into the cohort? Are there any concepts
    being used in the database that we didn't include in our codelist
    but maybe should have?\
-   ***Cohort diagnostics*** which help to answer questions like how
    many individuals did we include in our cohort and how many were
    excluded because of our inclusion criteria? If we have multiple
    cohorts, is there overlap between them and when do people enter one
    cohort relative to another? What is the incidence of cohort entry
    and what is the prevalence of the cohort in the database? Cohort
    diagnostics can also compare our study cohorts to the general
    population by matching people with similar age and sex.\
-   ***Population diagnostics*** which estimate the frequency of our
    study cohorts in the database in terms of their incidence rates and
    prevalence.

## Installation

You can install PhenotypeR from CRAN:

```{r, eval = FALSE}
install.packages("PhenotypeR")
```

Or you can install the development version from GitHub:

```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("OHDSI/PhenotypeR")
```

## Example usage

To illustrate the functionality of PhenotypeR, let's create a cohort
using the Eunomia Synpuf dataset. We'll first load the required packages and
create the cdm reference for the data.

```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(CohortConstructor)
library(PhenotypeR)
library(CodelistGenerator)
library(duckdb)
library(CDMConnector)
library(DBI)
```
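
Before running the example, it can help to confirm that these packages
are available. The following is a minimal sketch using only base R; the
helper name `missingPackages()` is hypothetical and not part of
PhenotypeR.

```r
# Packages attached in the chunk above
required <- c("dplyr", "CohortConstructor", "PhenotypeR", "CodelistGenerator",
              "duckdb", "CDMConnector", "DBI")

# Hypothetical helper (not part of PhenotypeR): returns the names of the
# packages whose namespace cannot be loaded from the current library
missingPackages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

notInstalled <- missingPackages(required)
if (length(notInstalled) > 0) {
  message("Install before continuing: ", paste(notInstalled, collapse = ", "))
}
```

If the message lists any packages, install them with
`install.packages()` before moving on.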

```{r, message=FALSE, warning=FALSE}
# Connect to the database and create the cdm object
con <- dbConnect(duckdb(), dbdir = eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con, 
                                cdmName = "Eunomia Synpuf",
                                cdmSchema   = "main",
                                writeSchema = "main",
                                achillesSchema = "main")
```

Note that we've included Achilles results in our cdm reference. Where we can, we'll use these precomputed counts to speed up our analysis.

```{r, message=TRUE, warning=FALSE}
cdm
```

```{r, message=FALSE, warning=FALSE}
# Create the codelists
codes <- list("user_of_warfarin" = c(1310149L, 40163554L),
              "user_of_acetaminophen" = c(1125315L, 1127078L, 1127433L, 40229134L, 40231925L, 40162522L, 19133768L),
              "user_of_morphine" = c(1110410L, 35605858L, 40169988L),
              "measurements_cohort" = c(40660437L, 2617206L, 4034850L,  2617239L, 4098179L))

# Instantiate cohorts with CohortConstructor
cdm$my_cohort <- conceptCohort(cdm = cdm,
                               conceptSet = codes, 
                               exit = "event_end_date",
                               overlap = "merge",
                               name = "my_cohort")
```

We can easily run all the analyses explained above (**database diagnostics**, **codelist diagnostics**, **cohort diagnostics**, and **population diagnostics**) using
`phenotypeDiagnostics()`:

```{r, message = FALSE}
result <- phenotypeDiagnostics(cdm$my_cohort, survival = TRUE)
```

You can also create a table of expected results, so that you can later compare them with the actual results.

```{r, message = FALSE}
expectations <- tibble(
  "cohort_name" = c("user_of_warfarin", "user_of_acetaminophen", "user_of_morphine", "measurements_cohort"),
  "estimate" = c("Male percentage", "Survival probability after 5y", "Median age", "Median age"),
  "value" = c("56%", "96%", "57-58", "42-45"),
  "source" = c("A clinician", "A clinician", "A clinician", "A clinician"),
  "diagnostic" = c("cohort_characteristics", "cohort_survival", "cohort_characteristics", "cohort_characteristics") 
)
```
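
If expectations are matched to cohorts by `cohort_name` (an assumption
here, not something this README states), a quick check that every name
in the table corresponds to a defined cohort can catch typos early. A
minimal base R sketch, using a hypothetical helper `unmatchedCohorts()`
and toy inputs:

```r
# Hypothetical helper (not part of PhenotypeR): cohort names that appear in
# an expectations table but not among the defined cohorts
unmatchedCohorts <- function(expectations, cohortNames) {
  setdiff(unique(expectations$cohort_name), cohortNames)
}

# Toy inputs for illustration only
toyExpectations <- data.frame(
  cohort_name = c("user_of_warfarin", "morphine"),
  estimate    = c("Male percentage", "Median age"),
  value       = c("56%", "57-58")
)
toyCohorts <- c("user_of_warfarin", "user_of_morphine")

unmatchedCohorts(toyExpectations, toyCohorts)
#> [1] "morphine"
```

In the example above, the cohort names could be taken from
`names(codes)`.
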

Alternatively, you can use AI to generate the expectations:

```{r, message = FALSE}
library(ellmer)
# Note: you may need to generate a Google Gemini API key at
# https://aistudio.google.com/app/apikey and add it to your R environment:
# usethis::edit_r_environ()
# GEMINI_API_KEY = "your API key"

chat <- chat("google_gemini")

expectations <- getCohortExpectations(chat = chat,
                                      phenotypes = result)
```
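
Before creating the chat object, you may want to confirm that the key is
visible to your R session. This is a small base R sketch; it assumes the
variable name `GEMINI_API_KEY` mentioned in the comment above, and
`hasGeminiKey` is just an illustrative name.

```r
# Sys.getenv() returns "" when a variable is unset, so nzchar() is TRUE
# only if some value has been provided
hasGeminiKey <- nzchar(Sys.getenv("GEMINI_API_KEY"))

if (!hasGeminiKey) {
  message("GEMINI_API_KEY is not set; add it to .Renviron and restart R.")
}
```

Changes to `.Renviron` only take effect in a new R session.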

Once we have our results we can quickly view them in an interactive
application. Here we'll apply a minimum cell count of 2 to our results and save our Shiny app to a temporary directory.

```{r, eval=FALSE}
shinyDiagnostics(result = result, minCellCount = 2, directory = tempdir(), expectations = expectations)
```
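
The idea behind a minimum cell count is that small, non-zero counts are
hidden so that individuals cannot be identified from the results. The
base R sketch below illustrates the principle only: `suppressCounts()`
is a hypothetical helper, and the package's own suppression may differ
in detail (for example, in how suppressed values are displayed).

```r
# Hypothetical helper (not part of PhenotypeR): mask counts that are above
# zero but below the minimum cell count
suppressCounts <- function(counts, minCellCount) {
  out <- as.character(counts)
  # Zero is kept as is; counts from 1 to minCellCount - 1 are masked
  out[counts > 0 & counts < minCellCount] <- paste0("<", minCellCount)
  out
}

suppressCounts(c(0, 1, 2, 15), minCellCount = 2)
```

With `minCellCount = 2`, only the count of 1 is masked; a larger
threshold hides more of the small cells.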

The Shiny app generated from the example cohorts is available
[here](https://dpa-pde-oxford.shinyapps.io/PhenotypeRShiny/).

### More information

To see more details regarding each one of the analyses, please refer to
the package vignettes.

Owner

  • Name: Observational Health Data Sciences and Informatics
  • Login: OHDSI
  • Kind: organization

GitHub Events

Total
  • Create event: 138
  • Release event: 5
  • Issues event: 269
  • Watch event: 4
  • Delete event: 101
  • Member event: 3
  • Issue comment event: 77
  • Push event: 460
  • Pull request review event: 12
  • Pull request review comment event: 16
  • Pull request event: 294
  • Fork event: 1
Last Year
  • Create event: 138
  • Release event: 5
  • Issues event: 269
  • Watch event: 4
  • Delete event: 101
  • Member event: 3
  • Issue comment event: 77
  • Push event: 460
  • Pull request review event: 12
  • Pull request review comment event: 16
  • Pull request event: 294
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 177
  • Total pull requests: 272
  • Average time to close issues: 22 days
  • Average time to close pull requests: about 17 hours
  • Total issue authors: 12
  • Total pull request authors: 6
  • Average comments per issue: 0.39
  • Average comments per pull request: 0.03
  • Merged pull requests: 228
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 165
  • Pull requests: 259
  • Average time to close issues: 19 days
  • Average time to close pull requests: about 18 hours
  • Issue authors: 12
  • Pull request authors: 6
  • Average comments per issue: 0.39
  • Average comments per pull request: 0.03
  • Merged pull requests: 215
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • edward-burn (118)
  • martaalcalde (37)
  • catalamarti (10)
  • xihang-chen (8)
  • nmercadeb (5)
  • daniellenewby (4)
  • daniprietoalhambra (3)
  • martapineda (2)
  • wanningwang (2)
  • elinrow (2)
  • ablack3 (1)
  • albertpratsu (1)
Pull Request Authors
  • edward-burn (191)
  • martaalcalde (108)
  • catalamarti (14)
  • xihang-chen (10)
  • nmercadeb (9)
  • cecicampanile (2)
  • daniellenewby (1)
Top Labels
Issue Labels
enhancement (9) documentation (4) bug (2) needs discussion (1) duplicate (1) good first issue (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 403 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 8
  • Total maintainers: 1
cran.r-project.org: PhenotypeR

Assess Study Cohorts Using a Common Data Model

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 403 Last month
Rankings
Dependent packages count: 27.6%
Dependent repos count: 34.0%
Average: 49.5%
Downloads: 86.9%
Maintainers (1)
Last synced: 10 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.5.0 composite
  • actions/checkout v4 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • CDMConnector * imports
  • CohortCharacteristics * imports
  • CohortConstructor * imports
  • PatientProfiles * imports
  • cli * imports
  • dplyr * imports
  • here * imports
  • magrittr * imports
  • omopgenerics * imports
  • rlang * imports
  • rmarkdown * imports
  • visOmopResults * imports
  • DBI * suggests
  • duckdb * suggests
  • gt * suggests
  • knitr * suggests
  • omock * suggests
  • testthat >= 3.0.0 suggests