gtregression

https://github.com/thinkdenominator/gtregression

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: ThinkDenominator
License: other
Language: R
Default Branch: main
Size: 9.7 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 1

Created about 1 year ago · Last pushed 10 months ago

Metadata Files

Readme Changelog License Zenodo

README.Rmd

---
output: github_document
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
# %\VignetteIndexEntry{Getting Started with gtregression}
# %\VignetteEngine{knitr::rmarkdown}
# %\VignetteEncoding{UTF-8}
```

# gtregression

[![R-CMD-check](https://github.com/ThinkDenominator/gtregression/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ThinkDenominator/gtregression/actions/workflows/R-CMD-check.yaml)
[![pkgdown](https://github.com/ThinkDenominator/gtregression/actions/workflows/pkgdown.yaml/badge.svg)](https://ThinkDenominator.github.io/gtregression/)
[![CRAN status](https://www.r-pkg.org/badges/version/gtregression)](https://CRAN.R-project.org/package=gtregression)
[![CRAN checks](https://badges.cranchecks.info/worst/gtregression.svg)](https://cran.r-project.org/web/checks/check_results_gtregression.html)

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE.md)
[![Downloads - month](https://cranlogs.r-pkg.org/badges/last-month/gtregression)](https://cran.r-project.org/package=gtregression)
[![Downloads - total](https://cranlogs.r-pkg.org/badges/grand-total/gtregression)](https://cran.r-project.org/package=gtregression)
[![Codecov](https://codecov.io/gh/ThinkDenominator/gtregression/branch/main/graph/badge.svg)](https://app.codecov.io/gh/ThinkDenominator/gtregression)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.16905350.svg)](https://doi.org/10.5281/zenodo.16905350)


## Motivation 

Many academics and public health professionals in low- and middle-income 
countries (LMICs) hesitate to use R due to its steep learning curve. Instead, 
they often rely on menu-driven software like SPSS or Epi Info, which limits 
their ability to perform reproducible and advanced analyses. As a step towards 
addressing this gap, we created the gtregression package to simplify regression 
modelling in R. The package offers user-friendly syntax, intuitive functions, 
and publication-ready outputs—empowering analysts to adopt open-source tools 
with confidence.

## About the Package

`gtregression` is an R package that simplifies regression modeling and
generates publication-ready tables using the `gtsummary` ecosystem. It
supports a variety of regression approaches with built-in tools for
model diagnostics, selection, and confounder identification—all designed
to provide beginner and intermediate R users with clean, interpretable
output.

This package was created to empower R users in low- and middle-income
countries (LMICs) by offering a simpler and more accessible coding
experience. We sincerely thank the authors and contributors of
foundational R packages such as `gtsummary`, `MASS`, `risks`, `dplyr`,
and others, without which this project would not have been possible.

## Table of Contents

-   [Vision](#vision)
-   [Features](#features)
-   [Installation](#installation)
-   [Quick Start](#quick-start)
-   [Key Functions](#key-functions)
-   [Contributing](#contributing)
-   [Authors](#authors)
-   [License](#license)

## Vision {#vision}

At its core, `gtregression` is more than just a statistical tool—it is a
commitment to open access, simplicity, and inclusivity in health data
science. Our team is driven by the vision of empowering researchers,
students, and public health professionals in LMICs through
user-friendly, well-documented tools that minimize coding burden and
maximize interpretability.

We believe in the democratization of data science and aim to promote
open-source resources for impactful and equitable research globally.

## Features {#features}

-   Supports multiple regression approaches:
    -   Logistic (logit)
    -   Log-binomial
    -   Poisson / Robust Poisson
    -   Negative Binomial
    -   Linear Regression
-   Univariable and multivariable regression
-   Confounder identification using crude and adjusted estimates
-   Stepwise model selection (AIC/BIC/adjusted R²)
-   Stratified regression support
-   Formatted outputs using `gtsummary`
-   Built-in example datasets: `PimaIndiansDiabetes2`, `birthwt`, `epil`
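
For orientation, the approaches listed above correspond to standard
model families in base R. The sketch below fits three of them with
`glm()` and `lm()` on the built-in `mtcars` data. The dataset and the
variable choices are illustrative assumptions, and the code shows the
conventional base R calls rather than how `gtregression` works
internally.

```r
# Illustrative data only: mtcars ships with base R, not with gtregression
data(mtcars)

# Logistic (logit): binary outcome; exp(coef) is read as an odds ratio
fit_logit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)
or_wt <- exp(coef(fit_logit))["wt"]

# Poisson: count outcome; exp(coef) is read as an incidence rate ratio
fit_pois <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)
irr_wt <- exp(coef(fit_pois))["wt"]

# Linear: continuous outcome; the coefficient is on the outcome's own scale
fit_lm <- lm(mpg ~ wt, data = mtcars)
beta_wt <- coef(fit_lm)["wt"]

round(c(OR = unname(or_wt), IRR = unname(irr_wt), beta = unname(beta_wt)), 3)
```

Log-binomial and negative binomial models follow the same pattern with
`binomial(link = "log")` and `MASS::glm.nb()` respectively, though
log-binomial fits may need starting values to converge.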

## Installation {#installation}

``` r
# Install from CRAN
install.packages("gtregression")

# Or install the development version from GitHub
devtools::install_github("ThinkDenominator/gtregression")
```

## Quick Start {#quick-start}

``` r
# Load necessary libraries
library(gtregression)
library(dplyr) # provides mutate() and case_when() used below

# Load example dataset
data("data_PimaIndiansDiabetes", package="gtregression")

# Convert diabetes outcome to binary and create categorical variables
pima_data <- data_PimaIndiansDiabetes |>
  mutate(diabetes = ifelse(diabetes == "pos", 1, 0)) |>
  mutate(bmi = case_when(
    mass < 25 ~ "Normal",
    mass >= 25 & mass < 30 ~ "Overweight",
    mass >= 30 ~ "Obese",
    TRUE ~ NA_character_),                                       
    bmi = factor(bmi, levels = c("Normal", "Overweight", "Obese")),
    age_cat = case_when(
      age < 30 ~ "Young",
      age >= 30 & age < 50 ~ "Middle-aged",
      age >= 50 ~ "Older"),
    age_cat = factor(age_cat, levels = c("Young", "Middle-aged", "Older")),
    npreg_cat = ifelse(pregnant > 2, "High parity", "Low parity"),
    npreg_cat = factor(npreg_cat, levels = c("Low parity", "High parity")),
    glucose_cat = case_when(
      glucose <= 140 ~ "Normal",
      glucose > 140 ~ "High"
    ),
    glucose_cat = factor(glucose_cat, levels = c("Normal", "High")),
    bp_cat = case_when(
      pressure < 80 ~ "Normal",
      pressure >= 80 ~ "High"
    ),
    bp_cat= factor(bp_cat, levels = c("Normal", "High")),
    triceps_cat = case_when(
      triceps < 23 ~ "Normal",
      triceps >= 23 ~ "High"
    ),
    triceps_cat= factor(triceps_cat, levels = c("Normal", "High")),
    insulin_cat = case_when(
      insulin < 30 ~ "Low",
      insulin >= 30 & insulin < 150 ~ "Normal",
      insulin >= 150 ~ "High"
    ),
    insulin_cat = factor(insulin_cat, levels = c("Low", "Normal", "High"))
  ) |>
  mutate(
    dpf_cat = case_when(
      pedigree <= 0.2 ~ "Low Genetic Risk",
      pedigree > 0.2 & pedigree <= 0.5 ~ "Moderate Genetic Risk",
      pedigree > 0.5 ~ "High Genetic Risk"
    )
  ) |>
  mutate(dpf_cat = factor(dpf_cat, 
              levels = c("Low Genetic Risk", 
                          "Moderate Genetic Risk", 
                          "High Genetic Risk"))) |>
  mutate(diabetes_cat= case_when(diabetes== 1~ "Diabetes positive", 
                                TRUE~ "Diabetes negative")) |>
  mutate(diabetes_cat= factor(diabetes_cat, 
                        levels = c("Diabetes negative","Diabetes positive" )))

# Descriptive statistics table
exposures <- c("bmi", "age_cat", "npreg_cat", "bp_cat", "triceps_cat",
               "insulin_cat", "dpf_cat")

# Create a descriptive table by diabetes category
des_tbl <- descriptive_table(data = pima_data,
                             exposures = exposures,
                             by = "diabetes_cat")
                             
# Check the data compatibility
dissect(pima_data)

# Univariable regression
uni_tbl <- uni_reg(
  data = pima_data,
  outcome = "diabetes",
  exposures = exposures,
  approach = "logit"
)

# check models and summaries
uni_tbl$models
uni_tbl$model_summaries

# Plot univariable regression results
plot_reg(uni_tbl, 
         title = "Univariable Regression Results")
         
# multivariable regression
multi_tbl <- multi_reg(
  data = pima_data,
  outcome = "diabetes",
  exposures = exposures,
  approach = "logit"
)

# check models and summaries
multi_tbl$models
multi_tbl$model_summaries

# Plot multivariable regression results
plot_reg(multi_tbl, 
         title = "Multivariable Regression Results")

# combined plots
plot_reg_combine(
  uni_tbl, 
  multi_tbl, 
  title = "Univariable vs Multivariable Regression Results")
  
# combine the tables
merge_tables(des_tbl, uni_tbl, multi_tbl,
             spanners = c("**Descriptive**",
                          "**Univariate**",
                          "**Multivariable**"))

# Save the table as a Word document
save_table(des_tbl, filename = "des_tbl", format = "docx")

save_docx(
  tables = list(des_tbl, uni_tbl, multi_tbl),
  filename = "Outputs.docx")
  
# Stratified regression
stratified_uni_reg(pima_data,
                     outcome= "diabetes",
                     exposures =c("bmi", "insulin_cat", "age_cat", "dpf_cat"),
                     approach = "logit",
                     stratifier = "glucose_cat")
                     
stratified_multi_reg(pima_data,
                     outcome= "diabetes",
                     exposures =c("bmi", "insulin_cat", "age_cat", "dpf_cat"),
                     approach = "logit",
                     stratifier = "glucose_cat")
                     
# Check model convergence
check_convergence(pima_data, 
                  exposures = exposures, 
                  outcome = "diabetes", 
                  approach = "logit", 
                  multivariate = FALSE)
                  
check_convergence(pima_data, 
                  exposures = exposures, 
                  outcome = "diabetes", 
                  approach = "logit", 
                  multivariate = TRUE)


# identify confounders
identify_confounder(pima_data,
                    outcome = "diabetes",
                    exposure = "npreg_cat",
                    potential_confounder = "bp_cat",
                    approach = "logit")
                     
# check interactions
interaction_models(pima_data,
                   outcome = "diabetes",
                   exposure = "bmi",
                   effect_modifier = "glucose_cat",
                   covariates = c("insulin_cat", "age_cat", "dpf_cat"),
                   approach = "logit")
```
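
The Quick Start builds each category with `case_when()`. As a compact
alternative for numeric cut-points, base R's `cut()` can express the
same BMI bands in one call. This is a minimal sketch on a made-up
`mass` vector, not part of the package's own workflow.

```r
# Hypothetical BMI values covering both boundaries and a missing value
mass <- c(20, 25, 29.9, 30, NA)

# right = FALSE makes every interval closed on the left: [a, b),
# which matches "mass < 25", "25 <= mass < 30", "mass >= 30"
bmi <- cut(
  mass,
  breaks = c(-Inf, 25, 30, Inf),
  labels = c("Normal", "Overweight", "Obese"),
  right = FALSE
)

as.character(bmi)
# "Normal" "Overweight" "Overweight" "Obese" NA
```

The result is already a factor with levels in the order given by
`labels`, so no separate `factor()` call is needed.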

## Key Functions {#key-functions}

### Descriptive & Compatibility Tools

| Function Name        | Purpose                               |
|----------------------|---------------------------------------|
| `descriptive_table()`| Summarise exposures by outcome groups |
| `dissect()`          | Check outcome-exposure compatibility  |

### Regression Functions - Fit univariable and multivariable models

| Function Name | Purpose                              |
|---------------|--------------------------------------|
| `uni_reg()`   | Univariable regression (OR/RR/IRR/β) |
| `multi_reg()` | Multivariable regression             |
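
The difference between the two rows is the number of models fitted: a
univariable analysis fits one model per exposure, while a multivariable
analysis fits a single model containing all exposures. The base R
sketch below illustrates that distinction on the built-in `mtcars`
data. The variable names are illustrative assumptions, and this is not
the internal code of `uni_reg()` or `multi_reg()`.

```r
data(mtcars)

outcome <- "am"            # binary outcome in mtcars (illustrative)
exposures <- c("wt", "hp") # illustrative exposures

# Univariable: one logistic model per exposure
uni_fits <- lapply(exposures, function(x) {
  glm(reformulate(x, response = outcome), family = binomial, data = mtcars)
})
names(uni_fits) <- exposures

# Multivariable: a single logistic model with every exposure
multi_fit <- glm(reformulate(exposures, response = outcome),
                 family = binomial, data = mtcars)

# Compare the formulas that were actually fitted
lapply(uni_fits, formula)
formula(multi_fit)
```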

### Regression Functions by stratifier

| Function Name            | Purpose                             |
|--------------------------|-------------------------------------|
| `stratified_uni_reg()`   | Stratified univariable regression   |
| `stratified_multi_reg()` | Stratified multivariable regression |
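
Conceptually, a stratified analysis repeats the same model within each
level of the stratifier. A minimal base R sketch of that idea follows,
using `split()` and `lapply()` on `mtcars`. A linear model is used here
only because the strata are small; the dataset and variables are
assumptions for illustration, not the package's implementation.

```r
data(mtcars)

# Split the data by the stratifier (number of cylinders)
strata <- split(mtcars, mtcars$cyl)

# Fit the same model within each stratum
fits <- lapply(strata, function(d) lm(mpg ~ wt, data = d))

# Collect the stratum-specific slope for the exposure
slopes <- sapply(fits, function(f) coef(f)[["wt"]])
round(slopes, 2)
```

Differences between the stratum-specific estimates are what one would
inspect when looking for effect modification.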

### Model Diagnostics & Selection

| Function Name         | Purpose                                          |
|-----------------------|--------------------------------------------------|
| `check_convergence()` | Evaluate model convergence and max fitted values |
| `select_models()`     | Stepwise model selection (AIC/BIC/adjusted R²)   |
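
To make the selection criteria concrete, the sketch below runs backward
stepwise selection with base R's `step()`, once with the AIC penalty
and once with the BIC penalty (`k = log(n)`). It uses `mtcars` and an
arbitrary starting model as assumptions, and it illustrates the general
technique rather than the behaviour of `select_models()` itself.

```r
data(mtcars)

# Arbitrary full model for illustration
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Backward elimination using AIC (the default penalty, k = 2)
sel_aic <- step(full, direction = "backward", trace = 0)

# Backward elimination using BIC (penalty k = log(n))
sel_bic <- step(full, direction = "backward",
                k = log(nrow(mtcars)), trace = 0)

formula(sel_aic)
formula(sel_bic)

# Adjusted R-squared of the AIC-selected model
summary(sel_aic)$adj.r.squared
```

BIC penalizes each extra parameter more heavily than AIC once the
sample has more than about seven observations, so it tends to select
the same or a smaller model.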

### Confounding & Interaction

| Function Name           | Purpose                                           |
|-------------------------|---------------------------------------------------|
| `identify_confounder()` | Confounding assessment via % change or MH method  |
| `interaction_models()`  | Compare models with and without interaction terms |
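
The "% change" approach compares the crude estimate for an exposure
with the estimate adjusted for a potential confounder. A hedged sketch
in base R is shown below. The 10% threshold and the choice of the crude
estimate as the denominator are common conventions assumed here, not
values confirmed for `identify_confounder()`, and `mtcars` stands in
for real data.

```r
data(mtcars)

# Crude model: exposure only
crude <- glm(am ~ wt, family = binomial, data = mtcars)

# Adjusted model: exposure plus the potential confounder
adjusted <- glm(am ~ wt + hp, family = binomial, data = mtcars)

or_crude <- exp(coef(crude)[["wt"]])
or_adj <- exp(coef(adjusted)[["wt"]])

# Percent change in the odds ratio after adjustment
pct_change <- abs(or_adj - or_crude) / or_crude * 100

threshold <- 10 # assumed convention, not taken from the package
is_confounder <- pct_change >= threshold

c(crude = or_crude, adjusted = or_adj, pct_change = pct_change)
is_confounder
```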

### Plots & Exports

| Function Name        | Purpose                                        |
|----------------------|------------------------------------------------|
| `plot_reg()`         | Forest plot for a single regression model      |
| `plot_reg_combine()` | Side-by-side forest plots for uni/multi models |
| `modify_table()`     | Customize column labels or output structure    |
| `save_table()`       | Export table to `.html`, `.csv`, `.docx`       |
| `save_docx()`        | Save table as Word document (`.docx`)          |
| `save_plot()`        | Save plot as `.png`, `.pdf`, etc.              |
| `merge_tables()`     | Combine descriptive and regression tables      |
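
A forest plot draws one point estimate and one confidence interval per
model term. The sketch below assembles those quantities by hand for a
logistic model, using Wald intervals from `confint.default()`. It is an
assumption-laden illustration on `mtcars` of what such a plot displays,
not a description of how `plot_reg()` computes its values.

```r
data(mtcars)

fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)

# Wald confidence intervals on the log-odds scale
ci <- confint.default(fit)

# Exponentiate to the odds ratio scale and drop the intercept
est <- data.frame(
  term = names(coef(fit)),
  or = exp(coef(fit)),
  lower = exp(ci[, 1]),
  upper = exp(ci[, 2]),
  row.names = NULL
)
est <- est[est$term != "(Intercept)", ]

est
```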

## Contributing {#contributing}

We welcome issues, feature requests, and pull requests.

1.  Fork the repository
2.  Create a new branch: `git checkout -b feature/my-feature`
3.  Commit your changes: `git commit -m "Add feature"`
4.  Push to GitHub: `git push origin feature/my-feature`
5.  Open a Pull Request

## Authors {#authors}

The `gtregression` package is developed and maintained by a
collaborative team committed to making regression modeling accessible,
especially for public health professionals and researchers in LMICs.

-   **Rubeshkumar Polani**\
    [rubesh.pc\@gmail.com](mailto:rubesh@thinkdenominator.com){.email}\
    ORCID: [0000-0002-0418-7592](https://orcid.org/0000-0002-0418-7592)\
    *Creator and Author*

-   **Salin K Eliyas**\
    [salins13\@gmail.com](mailto:salins13@gmail.com){.email}\
    ORCID: [0000-0002-8020-5860](https://orcid.org/0000-0002-8020-5860)\
    *Author*

-   **Manikandanesan Sakthivel**\
    [nesanmbbs\@gmail.com](mailto:nesanmbbs@gmail.com){.email}\
    ORCID: [0000-0002-5438-3970](https://orcid.org/0000-0002-5438-3970)\
    *Author*

-   **Yuvaraj Krishnamoorthy**\
    [yuvaraj\@propulevidence.org](mailto:yuvaraj@propulevidence.org){.email}\
    ORCID: [0000-0003-4688-510X](https://orcid.org/0000-0003-4688-510X)\
    *Author*

-   **Marie Gilbert Majella**\
    [gilbert2691\@gmail.com](mailto:gilbert2691@gmail.com){.email}\
    ORCID: [0000-0003-4036-5162](https://orcid.org/0000-0003-4036-5162)\
    *Author*

## License {#license}

MIT License. See LICENSE for details.

## Citation

If you use `gtregression` in your work, please cite it as:

> Rubeshkumar, P., Eliyas, S. K., Sakthivel, M., Krishnamoorthy, Y., & Majella, M. G. (2025). *ThinkDenominator/gtregression: CRAN v1.1.0 (CRAN)*. Zenodo. https://doi.org/10.5281/zenodo.16905350

## Acknowledgements

The gtregression package icon uses the **“Hearts”** symbol created by 
[Kim Sun Young](https://thenounproject.com/creator/hookeeak/) from 
[The Noun Project](https://thenounproject.com), used under the Creative Commons 
Attribution (CC BY 3.0) license.

Owner

Name: ThinkDenominator
Login: ThinkDenominator
Kind: organization
Email: rubesh@thinkdenominator.com
Location: United Kingdom

Repositories: 1
Profile: https://github.com/ThinkDenominator

GitHub Events

Total

Push event: 29
Pull request event: 1
Create event: 2

Last Year

Push event: 29
Pull request event: 1
Create event: 2

Committers

Last synced: 10 months ago

All Time

Total Commits: 58
Total Committers: 2
Avg Commits per committer: 29.0
Development Distribution Score (DDS): 0.207

Past Year

Commits: 58
Committers: 2
Avg Commits per committer: 29.0
Development Distribution Score (DDS): 0.207

Top Committers

Name	Email	Commits
Rubeshkumar Polani	r**c@g**m	46
Rubeshkumar Polani	8**h@u**m	12

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 7 days
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 7 days
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Pull Request Authors

drrubesh (2)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 222 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

cran.r-project.org: gtregression

Tools for Creating Publication-Ready Regression Tables

Homepage: https://thinkdenominator.github.io/gtregression/
Documentation: http://cran.r-project.org/web/packages/gtregression/gtregression.pdf
License: MIT + file LICENSE
Latest release: 1.0.0
published 10 months ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 222 Last month

Rankings

Dependent packages count: 25.7%

Dependent repos count: 31.6%

Average: 47.6%

Downloads: 85.4%

Maintainers (1)

rubesh@thinkdenominator.com

Last synced: 10 months ago


gtregression

Science Score: 49.0%
