modelStudio

modelStudio: Interactive Studio with Explanations for ML Predictive Models - Published in JOSS (2019)

https://github.com/modeloriented/modelstudio

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

ai explainable explainable-ai explainable-machine-learning explanatory-model-analysis human iml interactive interactivity interpretability interpretable interpretable-machine-learning learning machine model model-visualization r visualization xai

Scientific Fields

Earth and Environmental Sciences / Physical Sciences - 40% confidence
Engineering / Computer Science - 40% confidence
Last synced: 4 months ago

Repository

📍 Interactive Studio for Explanatory Model Analysis

Basic Info
Statistics
  • Stars: 333
  • Watchers: 20
  • Forks: 32
  • Open Issues: 4
  • Releases: 14
Topics
ai explainable explainable-ai explainable-machine-learning explanatory-model-analysis human iml interactive interactivity interpretability interpretable interpretable-machine-learning learning machine model model-visualization r visualization xai
Created over 6 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog Contributing Funding License

README.md

Interactive Studio for Explanatory Model Analysis

[Badges: CRAN status, R build status, Codecov test coverage, JOSS status]

Overview

The `modelStudio` package automates the explanatory analysis of machine learning predictive models. With a single line of code, it generates advanced, interactive model explanations in the form of a serverless HTML site. The tool is model-agnostic and therefore compatible with most black-box predictive models and frameworks (e.g. mlr/mlr3, xgboost, caret, h2o, parsnip, tidymodels, scikit-learn, lightgbm, keras/tensorflow).

The main `modelStudio()` function computes various instance-level and model-level explanations and produces a customisable dashboard consisting of multiple panels for plots with their short descriptions. The dashboard can easily be saved and shared with others. Tools for Explanatory Model Analysis unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.
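As a hedged illustration of that customisability (not taken from the README), the sketch below assumes the `facet_dim`, `N`, `B`, and `options = ms_options(...)` arguments described in the package documentation; verify the names against your installed version.

```r
library(modelStudio)

# assumes `explainer` was created with DALEX::explain(), as in the demo further below,
# and `new_obs` holds hypothetical rows to explain
modelStudio(explainer,
            new_observation = new_obs,
            facet_dim = c(2, 3),     # layout of the plot grid
            N = 300, B = 10,         # sample sizes used when computing explanations
            options = ms_options(ms_title = "Customised dashboard"))
```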

explain COVID-19 · R & Python examples · More resources · Interactive EMA

[![](man/figures/demo_small.gif)](https://modelstudio.drwhy.ai/demo.html)

The `modelStudio` package is a part of the [**DrWhy.AI**](http://drwhy.ai) universe.

## Installation

```r
# Install from CRAN:
install.packages("modelStudio")

# Install the development version from GitHub:
devtools::install_github("ModelOriented/modelStudio")
```

## Simple demo

```r
library("DALEX")
library("ranger")
library("modelStudio")

# fit a model
model <- ranger(score ~., data = happiness_train)

# create an explainer for the model
explainer <- explain(model,
                     data = happiness_test,
                     y = happiness_test$score,
                     label = "Random Forest")

# make a studio for the model
modelStudio(explainer)
```

[Save the output](https://modelstudio.drwhy.ai/#save--share) in the form of an HTML file - [**Demo Dashboard**](https://modelstudio.drwhy.ai/demo.html).

[![](man/figures/demo_big.gif)](https://modelstudio.drwhy.ai/demo.html)

## R & Python examples [more](https://modelstudio.drwhy.ai/articles/ms-r-python-examples.html)

-------------------------------

The `modelStudio()` function uses `DALEX` explainers created with `DALEX::explain()` or `DALEXtra::explain_*()`.

```r
# packages for the explainer objects
install.packages("DALEX")
install.packages("DALEXtra")
```

### mlr [dashboard](https://modelstudio.drwhy.ai/mlr.html)

Make a studio for the regression `ranger` model on the `apartments` data.
```r
# load packages and data
library(mlr)
library(DALEXtra)
library(modelStudio)

data <- DALEX::apartments

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

# fit a model
task <- makeRegrTask(id = "apartments", data = train, target = "m2.price")
learner <- makeLearner("regr.ranger", predict.type = "response")
model <- train(learner, task)

# create an explainer for the model
explainer <- explain_mlr(model,
                         data = test,
                         y = test$m2.price,
                         label = "mlr")

# pick observations
new_observation <- test[1:2,]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```
### xgboost [dashboard](https://modelstudio.drwhy.ai/xgboost.html)

Make a studio for the classification `xgboost` model on the `titanic` data.
```r
# load packages and data
library(xgboost)
library(DALEX)
library(modelStudio)

data <- DALEX::titanic_imputed

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

train_matrix <- model.matrix(survived ~.-1, train)
test_matrix <- model.matrix(survived ~.-1, test)

# fit a model
xgb_matrix <- xgb.DMatrix(train_matrix, label = train$survived)
params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
model <- xgb.train(params, xgb_matrix, nrounds = 500)

# create an explainer for the model
explainer <- explain(model,
                     data = test_matrix,
                     y = test$survived,
                     type = "classification",
                     label = "xgboost")

# pick observations
new_observation <- test_matrix[1:2, , drop=FALSE]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```
-------------------------------

The `modelStudio()` function uses `dalex` explainers created with `dalex.Explainer()`.

```console
:: package for the Explainer object
pip install dalex -U
```

Use the `pickle` Python module and the `reticulate` R package to easily make a studio for a model.

```r
# package for pickle load
install.packages("reticulate")
```

### scikit-learn [dashboard](https://modelstudio.drwhy.ai/scikitlearn.html)

Make a studio for the regression `Pipeline SVR` model on the `fifa` data.
First, use `dalex` in Python:

```python
# load packages and data
import dalex as dx

from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from numpy import log

data = dx.datasets.load_fifa()
X = data.drop(columns=['overall', 'potential', 'value_eur', 'wage_eur', 'nationality'], axis=1)
y = log(data.value_eur)

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
model = Pipeline([('scale', StandardScaler()), ('svm', SVR())])
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='scikit-learn')

# pack the explainer into a pickle file
explainer.dump(open('explainer_scikitlearn.pickle', 'wb'))
```

Then, use `modelStudio` in R:

```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_scikitlearn.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer, B = 5)
```
### lightgbm [dashboard](https://modelstudio.drwhy.ai/lightgbm.html)

Make a studio for the classification `Pipeline LGBMClassifier` model on the `titanic` data.
First, use `dalex` in Python:

```python
# load packages and data
import dalex as dx

from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from lightgbm import LGBMClassifier

data = dx.datasets.load_titanic()
X = data.drop(columns='survived')
y = data.survived

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
numerical_features = ['age', 'fare', 'sibsp', 'parch']
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)
categorical_features = ['gender', 'class', 'embarked']
categorical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
        ('onehot', OneHotEncoder(handle_unknown='ignore'))
    ]
)
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)
classifier = LGBMClassifier(n_estimators=300)
model = Pipeline(
    steps=[
        ('preprocessor', preprocessor),
        ('classifier', classifier)
    ]
)
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='lightgbm')

# pack the explainer into a pickle file
explainer.dump(open('explainer_lightgbm.pickle', 'wb'))
```

Then, use `modelStudio` in R:

```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_lightgbm.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer)
```
-------------------------------

## Save & share

Save `modelStudio` as an HTML file using the buttons on the top of the RStudio Viewer or with [`r2d3::save_d3_html()`](https://rstudio.github.io/r2d3/articles/publishing.html#save-as-html).
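A minimal sketch of the programmatic route (not from the README): it assumes an explainer built as in the Simple demo above and passes only the `file` argument to `r2d3::save_d3_html()`; see `?r2d3::save_d3_html` for further options.

```r
library(modelStudio)

# assumes `explainer` was created as in the Simple demo above
ms <- modelStudio(explainer)                      # the studio is an r2d3/htmlwidget object
r2d3::save_d3_html(ms, file = "dashboard.html")   # write it out as a standalone HTML file
```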

Citations

If you use modelStudio, please cite our JOSS article:

@article{baniecki2019modelstudio,
  title   = {{modelStudio: Interactive Studio with Explanations for ML Predictive Models}},
  author  = {Hubert Baniecki and Przemyslaw Biecek},
  journal = {Journal of Open Source Software},
  year    = {2019},
  volume  = {4},
  number  = {43},
  pages   = {1798},
  url     = {https://doi.org/10.21105/joss.01798}
}

For a description and evaluation of the Interactive EMA process, refer to our DAMI article:

@article{baniecki2023grammar,
  title   = {The grammar of interactive explanatory model analysis},
  author  = {Hubert Baniecki and Dariusz Parzych and Przemyslaw Biecek},
  journal = {Data Mining and Knowledge Discovery},
  year    = {2023},
  pages   = {1--37},
  url     = {https://doi.org/10.1007/s10618-023-00924-w}
}

More resources

Acknowledgments

Work on this package was financially supported by the National Science Centre (Poland) grant 2016/21/B/ST6/02176 and National Centre for Research and Development grant POIR.01.01.01-00-0328/17.

Owner

  • Name: Model Oriented
  • Login: ModelOriented
  • Kind: organization
  • Location: MI2DataLab @ Warsaw University of Technology

JOSS Publication

modelStudio: Interactive Studio with Explanations for ML Predictive Models
Published
November 05, 2019
Volume 4, Issue 43, Page 1798
Authors
Hubert Baniecki ORCID
Faculty of Mathematics and Information Science, Warsaw University of Technology
Przemyslaw Biecek ORCID
Faculty of Mathematics and Information Science, Warsaw University of Technology
Editor
Yuan Tang ORCID
Tags
automated data analysis model visualization explainable artificial intelligence predictive modeling interpretable machine learning

Papers & Mentions

Total mentions: 1

Feature Importance of Stabilised Rammed Earth Components Affecting the Compressive Strength Calculated with Explainable Artificial Intelligence Tools
Last synced: 3 months ago

GitHub Events

Total
  • Watch event: 8
Last Year
  • Watch event: 8

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 346
  • Total Committers: 5
  • Avg Commits per committer: 69.2
  • Development Distribution Score (DDS): 0.46 (a worked check follows the committer table below)
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
hbaniecki h****i@s****l 187
Hubert Baniecki h****i@g****m 147
Przemysław Biecek p****k@g****m 8
Kyle Niemeyer k****r@g****m 3
Piotr Piątyszek 8****k 1
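The DDS reported above is consistent with a common definition, 1 minus the top committer's share of all commits; the R sketch below is a hedged check that assumes this is the formula the registry uses.

```r
# hedged check of the Development Distribution Score (DDS) reported above,
# assuming DDS = 1 - (commits by the most active committer / total commits)
commits <- c(187, 147, 8, 3, 1)           # per-committer commit counts from the table
dds <- 1 - max(commits) / sum(commits)
round(dds, 2)                             # 0.46, matching the reported value
```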
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 81
  • Total pull requests: 20
  • Average time to close issues: 23 days
  • Average time to close pull requests: about 3 hours
  • Total issue authors: 19
  • Total pull request authors: 4
  • Average comments per issue: 1.07
  • Average comments per pull request: 0.4
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hbaniecki (59)
  • pbiecek (3)
  • andreassot10 (2)
  • expectopatronum (2)
  • evoree (1)
  • MJimitater (1)
  • CoolShades (1)
  • ananya231284 (1)
  • fkgruber (1)
  • set92 (1)
  • nutle (1)
  • agosiewska (1)
  • arodionoff (1)
  • JohnsonHsieh (1)
  • bgu1997 (1)
Pull Request Authors
  • hbaniecki (14)
  • kyleniemeyer (3)
  • piotrpiatyszek (2)
  • dominik-aigora (1)
Top Labels
Issue Labels
feature 💡 (28) short term ⏰ (22) before release 📌 (11) bug 💣 (10) documentation 📚 (10) question ❔ (6) long term 📆 (5) maintenance 🔨 (5) invalid ❕ (4) wontfix (3)
Pull Request Labels

Packages

  • Total packages: 3
  • Total downloads:
    • cran 428 last-month
  • Total docker downloads: 10
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 46
  • Total maintainers: 1
proxy.golang.org: github.com/ModelOriented/modelStudio
  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.8%
Last synced: 4 months ago
proxy.golang.org: github.com/modeloriented/modelstudio
  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.8%
Last synced: 4 months ago
cran.r-project.org: modelStudio

Interactive Studio for Explanatory Model Analysis

  • Versions: 18
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 428 Last month
  • Docker Downloads: 10
Rankings
Stargazers count: 1.2%
Forks count: 2.4%
Average: 18.2%
Dependent repos count: 24.0%
Downloads: 25.5%
Docker downloads count: 27.4%
Dependent packages count: 28.8%
Maintainers (1)
Last synced: 4 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.6 depends
  • DALEX >= 2.2.1 imports
  • digest * imports
  • iBreakDown >= 2.0.1 imports
  • ingredients >= 2.2.0 imports
  • jsonlite * imports
  • progress * imports
  • r2d3 * imports
  • knitr * suggests
  • parallelMap * suggests
  • ranger * suggests
  • rmarkdown * suggests
  • spelling * suggests
  • testthat * suggests
  • xgboost * suggests