PhenotypeR
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (16.2%) to scientific vocabulary
Last synced: 9 months ago
Repository
Basic Info
- Host: GitHub
- Owner: OHDSI
- License: apache-2.0
- Language: R
- Default Branch: main
- Homepage: https://ohdsi.github.io/PhenotypeR/
- Size: 32.6 MB
Statistics
- Stars: 5
- Watchers: 5
- Forks: 1
- Open Issues: 25
- Releases: 7
Created almost 2 years ago · Last pushed 10 months ago
Metadata Files
Readme
License
README.Rmd
---
output: github_document
editor_options:
markdown:
wrap: 72
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE, warning = FALSE, message = FALSE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# PhenotypeR
[CRAN status](https://CRAN.R-project.org/package=PhenotypeR)
[R-CMD-check](https://github.com/ohdsi/PhenotypeR/actions/workflows/R-CMD-check.yaml)
[Lifecycle: experimental](https://lifecycle.r-lib.org/articles/stages.html#experimental)
The PhenotypeR package helps us to assess the research-readiness of a
set of cohorts we have defined. This assessment includes:
- ***Database diagnostics*** which help us to better understand the
database in which the cohorts have been created. This includes
information about the size of the data, the time period covered, and
the number of people in the data as a whole. More granular information
that may influence analytic decisions, such as the number of
observation periods per person, is also described.\
- ***Codelist diagnostics*** which help to answer questions like: which
concepts from our codelist are used in the database? Which concepts
led to individuals' entry into the cohort? Are there any concepts
being used in the database that we didn't include in our codelist but
maybe should have?\
- ***Cohort diagnostics*** which help to answer questions like: how
many individuals did we include in our cohort and how many were
excluded because of our inclusion criteria? If we have multiple
cohorts, is there overlap between them and when do people enter one
cohort relative to another? What is the incidence of cohort entry
and what is the prevalence of the cohort in the database? These
diagnostics can also compare our study cohorts to the general
population by matching people with similar age and sex.\
- ***Population diagnostics*** which estimate the frequency of our
study cohorts in the database in terms of their incidence rates and
prevalence.
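To make the two population-level measures concrete, here is a minimal
base R sketch of the arithmetic behind an incidence rate and a point
prevalence. All numbers are hypothetical and chosen only for
illustration; this is not how PhenotypeR computes its estimates
internally.

```r
# Hypothetical inputs, for illustration only
new_cases <- 25              # people entering the cohort during follow-up
person_years <- 5000         # total time at risk contributed by the population
cases_at_index <- 120        # people in the cohort on a given date
population_at_index <- 10000 # people under observation on that date

# Incidence rate, expressed per 100,000 person-years
incidence_rate <- new_cases / person_years * 1e5

# Point prevalence, expressed as a proportion
point_prevalence <- cases_at_index / population_at_index

incidence_rate
point_prevalence
```

Incidence uses person-time in the denominator, whereas prevalence uses
a head count at a point in time, which is why the two are reported
separately.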
## Installation
You can install PhenotypeR from CRAN:
```{r, eval = FALSE}
install.packages("PhenotypeR")
```
Or you can install the development version from GitHub:
```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("OHDSI/PhenotypeR")
```
## Example usage
To illustrate the functionality of PhenotypeR, let's create a cohort
using the Eunomia Synpuf dataset. We'll first load the required packages and
create the cdm reference for the data.
```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(CohortConstructor)
library(PhenotypeR)
library(CodelistGenerator)
library(duckdb)
library(CDMConnector)
library(DBI)
```
```{r, message=FALSE, warning=FALSE}
# Connect to the database and create the cdm object
con <- dbConnect(duckdb(), dbdir = eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con,
cdmName = "Eunomia Synpuf",
cdmSchema = "main",
writeSchema = "main",
achillesSchema = "main")
```
Note that we've included Achilles results in our cdm reference. Where we can, we'll use these precomputed counts to speed up our analysis.
```{r, message=TRUE, warning=FALSE}
cdm
```
```{r, message=FALSE, warning=FALSE}
# Create the codelists
codes <- list("user_of_warfarin" = c(1310149L, 40163554L),
"user_of_acetaminophen" = c(1125315L, 1127078L, 1127433L, 40229134L, 40231925L, 40162522L, 19133768L),
"user_of_morphine" = c(1110410L, 35605858L, 40169988L),
"measurements_cohort" = c(40660437L, 2617206L, 4034850L, 2617239L, 4098179L))
# Instantiate cohorts with CohortConstructor
cdm$my_cohort <- conceptCohort(cdm = cdm,
conceptSet = codes,
exit = "event_end_date",
overlap = "merge",
name = "my_cohort")
```
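Before instantiating cohorts it can be worth sanity-checking the
codelist itself. The sketch below uses only base R; `checkCodelist()`
is a hypothetical helper written for this example and is not part of
PhenotypeR, CohortConstructor, or CodelistGenerator. It assumes the
codelist is a named list of integer concept IDs, as in the chunk above.

```r
# A shortened copy of the codelist structure used above
codes_example <- list(
  "user_of_warfarin" = c(1310149L, 40163554L),
  "user_of_morphine" = c(1110410L, 35605858L, 40169988L)
)

# Hypothetical helper: summarise each codelist and flag common problems
checkCodelist <- function(x) {
  # Every codelist must be a named element of a list
  stopifnot(is.list(x), !is.null(names(x)), all(nzchar(names(x))))
  data.frame(
    codelist_name = names(x),
    n_concepts = lengths(x, use.names = FALSE),
    all_integer = vapply(x, is.integer, logical(1), USE.NAMES = FALSE),
    has_duplicates = vapply(x, anyDuplicated, integer(1), USE.NAMES = FALSE) > 0
  )
}

summary_codes <- checkCodelist(codes_example)
summary_codes
```

The codelist names matter because they are used as the cohort names
when the cohorts are created.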
We can easily run all the analyses explained above (**database diagnostics**, **codelist diagnostics**, **cohort diagnostics**, and **population diagnostics**) using
`phenotypeDiagnostics()`:
```{r, message = FALSE}
result <- phenotypeDiagnostics(cdm$my_cohort, survival = TRUE)
```
You can also create a table of expected results, so that you can later compare them with the actual results.
```{r, message = FALSE}
expectations <- tibble(
"cohort_name" = c("user_of_warfarin", "user_of_acetaminophen", "user_of_morphine", "measurements_cohort"),
"estimate" = c("Male percentage", "Survival probability after 5y", "Median age", "Median age"),
"value" = c("56%", "96%", "57-58", "42-45"),
"source" = c("A clinician", "A clinician", "A clinician", "A clinician"),
"diagnostic" = c("cohort_characteristics", "cohort_survival", "cohort_characteristics", "cohort_characteristics")
)
```
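A quick consistency check on an expectations table can catch typos
before the results are reviewed. The base R sketch below assumes that
expectations are matched to cohorts by `cohort_name`, and treats the
five columns used in the table above as the expected layout; both are
assumptions made for this example rather than documented requirements.
The table contents are hypothetical.

```r
# Cohort names as they follow from the codelist names used earlier
cohort_names <- c("user_of_warfarin", "user_of_acetaminophen",
                  "user_of_morphine", "measurements_cohort")

# A small hypothetical expectations table with one deliberate mismatch
expectations_check <- data.frame(
  cohort_name = c("user_of_warfarin", "morphine"),
  estimate = c("Male percentage", "Median age"),
  value = c("56%", "57-58"),
  source = "A clinician",
  diagnostic = "cohort_characteristics"
)

# Columns used in the example table above
expected_cols <- c("cohort_name", "estimate", "value", "source", "diagnostic")
missing_cols <- setdiff(expected_cols, names(expectations_check))

# Expectations that refer to a cohort name we did not create
unmatched <- setdiff(expectations_check$cohort_name, cohort_names)

missing_cols
unmatched
```

Here `unmatched` flags the row whose `cohort_name` does not correspond
to any of the cohorts defined by the codelist.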
Alternatively, you can use AI to generate expectations:
```{r, message = FALSE}
library(ellmer)
# Note: you may need to create a Google Gemini API key at https://aistudio.google.com/app/apikey and add it to your R environment:
# usethis::edit_r_environ()
# GEMINI_API_KEY = "your API key"
chat <- chat("google_gemini")
expectations <- getCohortExpectations(chat = chat,
phenotypes = result)
```
Once we have our results we can quickly view them in an interactive
application. Here we'll apply a minimum cell count of 2 to our results and save our Shiny app to a temporary directory.
```{r, eval=FALSE}
shinyDiagnostics(result = result, minCellCount = 2, directory = tempdir(), expectations = expectations)
```
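To illustrate what a minimum cell count does, here is a base R sketch
of small-cell suppression. `suppressCounts()` is a hypothetical
function written for this example and is not the implementation used by
PhenotypeR; it assumes that zero counts are kept and that non-zero
counts below the threshold are hidden. The counts are made up.

```r
# Hypothetical counts, for illustration only
counts <- data.frame(
  cohort_name = c("a", "b", "c", "d"),
  n = c(0L, 1L, 2L, 15L)
)

# Sketch of small-cell suppression: hide non-zero counts below the threshold
suppressCounts <- function(n, minCellCount) {
  hide <- n > 0 & n < minCellCount
  out <- as.character(n)
  out[hide] <- paste0("<", minCellCount)
  out
}

counts$n_reported <- suppressCounts(counts$n, minCellCount = 2)
counts
```

A higher threshold hides more cells, which trades detail for a lower
risk of identifying individuals.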
See the Shiny app generated from the example cohort
[here](https://dpa-pde-oxford.shinyapps.io/PhenotypeRShiny/).
### More information
To see more details regarding each one of the analyses, please refer to
the package vignettes.
Owner
- Name: Observational Health Data Sciences and Informatics
- Login: OHDSI
- Kind: organization
- Website: http://ohdsi.org
- Repositories: 285
- Profile: https://github.com/OHDSI
GitHub Events
Total
- Create event: 138
- Release event: 5
- Issues event: 269
- Watch event: 4
- Delete event: 101
- Member event: 3
- Issue comment event: 77
- Push event: 460
- Pull request review event: 12
- Pull request review comment event: 16
- Pull request event: 294
- Fork event: 1
Last Year
- Create event: 138
- Release event: 5
- Issues event: 269
- Watch event: 4
- Delete event: 101
- Member event: 3
- Issue comment event: 77
- Push event: 460
- Pull request review event: 12
- Pull request review comment event: 16
- Pull request event: 294
- Fork event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 177
- Total pull requests: 272
- Average time to close issues: 22 days
- Average time to close pull requests: about 17 hours
- Total issue authors: 12
- Total pull request authors: 6
- Average comments per issue: 0.39
- Average comments per pull request: 0.03
- Merged pull requests: 228
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 165
- Pull requests: 259
- Average time to close issues: 19 days
- Average time to close pull requests: about 18 hours
- Issue authors: 12
- Pull request authors: 6
- Average comments per issue: 0.39
- Average comments per pull request: 0.03
- Merged pull requests: 215
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- edward-burn (118)
- martaalcalde (37)
- catalamarti (10)
- xihang-chen (8)
- nmercadeb (5)
- daniellenewby (4)
- daniprietoalhambra (3)
- martapineda (2)
- wanningwang (2)
- elinrow (2)
- ablack3 (1)
- albertpratsu (1)
Pull Request Authors
- edward-burn (191)
- martaalcalde (108)
- catalamarti (14)
- xihang-chen (10)
- nmercadeb (9)
- cecicampanile (2)
- daniellenewby (1)
Top Labels
Issue Labels
enhancement (9)
documentation (4)
bug (2)
needs discussion (1)
duplicate (1)
good first issue (1)
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 403 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 8
- Total maintainers: 1
cran.r-project.org: PhenotypeR
Assess Study Cohorts Using a Common Data Model
- Homepage: https://ohdsi.github.io/PhenotypeR/
- Documentation: http://cran.r-project.org/web/packages/PhenotypeR/PhenotypeR.pdf
- License: Apache License (≥ 2)
- Latest release: 0.2.0 (published 11 months ago)
Rankings
Dependent packages count: 27.6%
Dependent repos count: 34.0%
Average: 49.5%
Downloads: 86.9%
Maintainers (1)
Last synced: 10 months ago
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.5.0 composite
- actions/checkout v4 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- CDMConnector * imports
- CohortCharacteristics * imports
- CohortConstructor * imports
- PatientProfiles * imports
- cli * imports
- dplyr * imports
- here * imports
- magrittr * imports
- omopgenerics * imports
- rlang * imports
- rmarkdown * imports
- visOmopResults * imports
- DBI * suggests
- duckdb * suggests
- gt * suggests
- knitr * suggests
- omock * suggests
- testthat >= 3.0.0 suggests