midr

midr: Learning from Black-Box Models by Maximum Interpretation Decomposition

https://github.com/ryo-asashi/midr

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.5%) to scientific vocabulary

Keywords

iml interpretable-machine-learning r r-package xai
Last synced: 6 months ago

Repository

midr: Learning from Black-Box Models by Maximum Interpretation Decomposition

Basic Info
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 0
  • Open Issues: 2
  • Releases: 1
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  message = FALSE,
  warning = FALSE
)
```

# midr 



[![R-CMD-check](https://github.com/ryo-asashi/midr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ryo-asashi/midr/actions/workflows/R-CMD-check.yaml) [![CRAN status](https://www.r-pkg.org/badges/version/midr)](https://CRAN.R-project.org/package=midr)



The goal of 'midr' is to provide a model-agnostic method for interpreting and explaining black-box predictive models by creating a globally interpretable surrogate model. The package implements 'Maximum Interpretation Decomposition' (MID), a functional decomposition technique that finds an optimal additive approximation of the original model. This approximation is achieved by minimizing the squared error between the predictions of the black-box model and the surrogate model. The theoretical foundations of MID are described in Iwasawa & Matsumori (2025) [Forthcoming], and the package itself is detailed in [Asashiba et al. (2025)](https://arxiv.org/abs/2506.08338).
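
The least-squares idea behind a global surrogate can be illustrated with base R alone. The sketch below is not midr's algorithm: `black_box()` is a hypothetical prediction function standing in for a fitted model, and the surrogate is a plain `lm()` with one quadratic term per variable. It only shows the principle of fitting an additive model to the black-box predictions rather than to the observed response.

```r
# Minimal base-R sketch of a global additive surrogate; this is not midr's algorithm.
# Assumption: `black_box()` is a hypothetical prediction function standing in for a fitted model.
black_box <- function(d) 40 * exp(-0.3 * d$wt) - 0.02 * d$hp + 0.004 * d$wt * d$hp

# Targets are the black-box predictions, not the observed response
target <- black_box(mtcars)

# Additive surrogate: one quadratic term per variable, fitted by least squares
surrogate <- lm(target ~ poly(wt, 2) + poly(hp, 2), data = mtcars)

# Squared error between black-box and surrogate predictions (the minimized criterion)
sse <- sum((target - fitted(surrogate))^2)

# Share of the black-box variation that the additive surrogate fails to reproduce
unexplained <- sse / sum((target - mean(target))^2)
print(unexplained)
```

A value of `unexplained` close to zero means the additive form captures almost all of the black-box behavior; the remainder here comes from the `wt * hp` interaction that the purely additive surrogate cannot represent.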

## Installation

You can install the released version of midr from [CRAN](https://cran.r-project.org/) with:

``` r
install.packages("midr")
```

and the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("ryo-asashi/midr")
```

## Examples

In the following example, we fit a random forest model to the `Boston` dataset from the ISLR2 package and then interpret it with the functions of midr.

```{r data_and_model}
# load required packages
library(midr)
library(ggplot2)
library(gridExtra)
library(ISLR2)
library(ranger)
theme_set(theme_midr())
# split the Boston dataset
data("Boston", package = "ISLR2")
set.seed(42)
idx <- sample(nrow(Boston), nrow(Boston) * .75)
train <- Boston[ idx, ]
valid <- Boston[-idx, ]
# fit a random forest model
rf <- ranger(medv ~ ., train, mtry = 5)
preds_rf <- predict(rf, valid)$predictions
cat("RMSE: ", weighted.loss(valid$medv, preds_rf))
```

The first step is to create a MID model as a global surrogate of the target model using `interpret()`.

```{r interpret}
# fit a two-dimensional MID model
mid <- interpret(medv ~ .^2, train, rf, lambda = .1)
mid
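preds_mid <- predict(mid, valid)
# fidelity: MID surrogate vs. random forest predictions
cat("RMSE: ", weighted.loss(preds_rf, preds_mid))
# accuracy: MID surrogate vs. observed values
cat("RMSE: ", weighted.loss(valid$medv, preds_mid))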
```
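
The two RMSE values printed for the MID model answer different questions: the first compares the surrogate with the random forest (fidelity), the second compares it with the observed response (accuracy). The following base-R sketch makes the distinction explicit. It assumes, as the printed labels suggest, that `weighted.loss()` returns an RMSE by default; the helper `rmse()` and all three vectors are hypothetical and are not results from the Boston example.

```r
# Base-R sketch; the vectors below are made-up numbers, not results from the Boston example.
observed  <- c(24.0, 21.6, 34.7, 33.4, 36.2)  # hypothetical observed responses
black_box <- c(25.1, 22.0, 33.9, 32.8, 34.5)  # hypothetical black-box predictions
surrogate <- c(24.8, 22.3, 33.5, 33.0, 34.9)  # hypothetical surrogate predictions

# Root mean squared error between two numeric vectors (hypothetical helper)
rmse <- function(x, y) sqrt(mean((x - y)^2))

fidelity <- rmse(black_box, surrogate)  # how closely the surrogate mimics the black box
accuracy <- rmse(observed, surrogate)   # how well the surrogate predicts the data
baseline <- rmse(observed, black_box)   # black-box accuracy, for comparison
print(c(fidelity = fidelity, accuracy = accuracy, baseline = baseline))
```

A small fidelity error indicates that explanations read off the surrogate carry over to the black box; the accuracy and baseline values only show how much predictive performance the two models have on the data.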

To visualize the main and interaction effects of the variables, apply `ggmid()` or `plot()` to the fitted MID model.

```{r ggmid}
# visualize the main and interaction effects of the MID model
grid.arrange(
  ggmid(mid, "lstat") +
    ggtitle("main effect of lstat"),
  ggmid(mid, "dis") +
    ggtitle("main effect of dis"),
  ggmid(mid, "lstat:dis") +
    ggtitle("interaction of lstat:dis"),
  ggmid(mid, "lstat:dis", main.effects = TRUE, type = "compound") +
    ggtitle("interaction + main effects")
)
# visualize all main effects
grid.arrange(grobs = mid.plots(mid), nrow = 3)
```

`mid.importance()` computes the importance of each main and interaction effect so that the effects can be compared.

```{r mid_importance}
# visualize the MID importance of the component functions
imp <- mid.importance(mid)
grid.arrange(nrow = 1L,
  ggmid(imp, "dotchart", theme = "highlight") +
    theme(legend.position = "bottom") +
    ggtitle("importance of variable effects"),
  ggmid(imp, "heatmap") +
    theme(legend.position = "bottom") +
    ggtitle("heatmap of variable importance")
)
```
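
One common way to score a term of an additive model is its mean absolute contribution over the data. The base-R sketch below illustrates that idea under the assumption that the importance is measured this way; it is not midr's implementation, and an ordinary `lm()` fit stands in for the surrogate.

```r
# Base-R sketch (not midr's implementation): importance as mean absolute effect.
# An additive linear model stands in for the surrogate.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Matrix of per-observation term contributions, centered around the mean prediction
effects <- predict(fit, type = "terms")

# Importance of each term: mean absolute contribution over the data
importance <- sort(colMeans(abs(effects)), decreasing = TRUE)
print(importance)
```

Because every contribution is expressed on the scale of the prediction, the scores are directly comparable across terms.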

`mid.breakdown()` provides a way to analyze individual predictions by decomposing the difference between the intercept and the predicted value into variable effects.

```{r mid_breakdown}
# visualize the MID breakdown of the model predictions
bd1 <- mid.breakdown(mid, data = train, row = 1L)
bd9 <- mid.breakdown(mid, data = train, row = 9L)
grid.arrange(nrow = 1L,
  ggmid(bd1, "waterfall", theme = "midr", max.nterms = 14L) +
    theme(legend.position = "bottom") +
    ggtitle("breakdown of prediction 1"),
  ggmid(bd9, "waterfall", theme = "midr", max.nterms = 14L) +
    theme(legend.position = "bottom") +
    ggtitle("breakdown of prediction 9")
)
```
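
The arithmetic behind such a breakdown can be checked on any additive model: the baseline plus the per-term contributions must add up to the prediction. The sketch below uses base R only and is not midr's implementation; an `lm()` fit with one interaction term stands in for the surrogate.

```r
# Base-R sketch (not midr's implementation) of an additive prediction breakdown.
fit <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)

effects <- predict(fit, type = "terms")  # centered contribution of each term
intercept <- attr(effects, "constant")   # mean prediction, used as the baseline

row <- 1L
contribution <- effects[row, ]           # main and interaction effects for one observation
prediction <- unname(predict(fit)[row])

# The baseline plus the term contributions reproduces the prediction
breakdown <- c(intercept = intercept, contribution, prediction = prediction)
print(round(breakdown, 3))
```

A waterfall plot is one way to draw this vector: it starts at the intercept and adds one contribution per step until it reaches the prediction.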

`mid.conditional()` computes the individual conditional expectation (ICE) curves (Goldstein et al. 2015) of the fitted MID model, as well as the breakdown of the ICE curves by main and interaction effects.

```{r mid_conditional}
# visualize the ICE curves of the MID model
ice <- mid.conditional(mid, "lstat")
grid.arrange(
  ggmid(ice, alpha = .1) +
    ggtitle("ICE of lstat"),
  ggmid(ice, "centered", "mako", var.color = dis) +
    ggtitle("c-ICE of lstat"),
  ggmid(ice, term = "lstat:dis", theme = "mako", var.color = dis) +
    ggtitle("ICE of interaction with dis"),
  ggmid(ice, term = "lstat:age", theme = "mako", var.color = age) +
    ggtitle("ICE of interaction with age")
)
```
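
For readers unfamiliar with ICE curves, the construction is simple: for each observation, one variable is varied over a grid while the others stay fixed, and the model is evaluated at every grid point. The base-R sketch below shows this for a stand-in `lm()` model with an interaction; it is not midr's implementation, and the grid size of 20 is an arbitrary choice.

```r
# Base-R sketch of ICE and centered ICE curves (not midr's implementation).
fit <- lm(mpg ~ wt * hp, data = mtcars)  # stand-in model with an interaction
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)

# One curve per observation: vary wt over the grid, hold the other variables fixed
ice <- sapply(seq_len(nrow(mtcars)), function(i) {
  newdata <- mtcars[rep(i, length(grid)), ]
  newdata$wt <- grid
  predict(fit, newdata)
})  # rows: grid points, columns: observations

# Centered ICE: subtract each curve's value at the first grid point
c_ice <- sweep(ice, 2, ice[1, ])
```

Without the interaction all curves would be parallel, so after centering they would coincide; the spread of the centered curves therefore reflects how the effect of `wt` depends on `hp`. The curves can be drawn with, for example, `matplot(grid, c_ice, type = "l")`.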

## References

[1] Iwasawa, H. & Matsumori, Y. (2025). "A Functional Decomposition Approach to Maximize the Interpretability of Black-Box Models". [Forthcoming]

[2] Asashiba, R., Kozuma, R. & Iwasawa, H. (2025). "midr: Learning from Black-Box Models by Maximum Interpretation Decomposition". arXiv:2506.08338. https://arxiv.org/abs/2506.08338

[3] Goldstein, A., Kapelner, A., Bleich, J., & Pitkin, E. (2015). "Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation". *Journal of Computational and Graphical Statistics*, *24*(1), 44–65. 

Owner

  • Name: Ryoichi Asashiba
  • Login: ryo-asashi
  • Kind: user
  • Location: Yokohama, Japan

Actuary working in the Far East. R, Python, Data Visualization, History and philosophy of science.

GitHub Events

Total
  • Create event: 3
  • Release event: 1
  • Issues event: 12
  • Watch event: 8
  • Member event: 2
  • Issue comment event: 16
  • Push event: 161
Last Year
  • Create event: 3
  • Release event: 1
  • Issues event: 12
  • Watch event: 8
  • Member event: 2
  • Issue comment event: 16
  • Push event: 161

Issues and Pull Requests


All Time
  • Total issues: 6
  • Total pull requests: 0
  • Average time to close issues: about 2 months
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 1.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 0
  • Average time to close issues: about 2 months
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 1.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ryo-asashi (5)
  • rktkdm (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 180 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
cran.r-project.org: midr

Learning from Black-Box Models by Maximum Interpretation Decomposition

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 180 Last month
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.2%
Average: 48.3%
Downloads: 86.5%
Maintainers (1)