modelStudio
modelStudio: Interactive Studio with Explanations for ML Predictive Models - Published in JOSS (2019)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 8 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org
- ✓ Committers with academic emails: 1 of 5 committers (20.0%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
📍 Interactive Studio for Explanatory Model Analysis
Basic Info
- Host: GitHub
- Owner: ModelOriented
- License: gpl-3.0
- Language: R
- Default Branch: master
- Homepage: https://doi.org/10.1007/s10618-023-00924-w
- Size: 36.2 MB
Statistics
- Stars: 333
- Watchers: 20
- Forks: 32
- Open Issues: 4
- Releases: 14
Metadata Files
README.md
Interactive Studio for Explanatory Model Analysis
Overview
The modelStudio package automates the explanatory analysis of machine learning predictive models. With a single line of code, it generates advanced, interactive model explanations in the form of a serverless HTML site. The tool is model-agnostic and therefore compatible with most black-box predictive models and frameworks (e.g. mlr/mlr3, xgboost, caret, h2o, parsnip, tidymodels, scikit-learn, lightgbm, keras/tensorflow).
The main modelStudio() function computes various instance-level and model-level explanations and produces a customisable dashboard consisting of multiple panels with plots and their short descriptions. The dashboard can easily be saved and shared with others. Tools for Explanatory Model Analysis unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.
explain COVID-19 R & Python examples More resources Interactive EMA
The `modelStudio` package is a part of the [**DrWhy.AI**](http://drwhy.ai) universe.

## Installation

```r
# Install from CRAN:
install.packages("modelStudio")

# Install the development version from GitHub:
devtools::install_github("ModelOriented/modelStudio")
```

## Simple demo

```r
library("DALEX")
library("ranger")
library("modelStudio")

# fit a model
model <- ranger(score ~., data = happiness_train)

# create an explainer for the model
explainer <- explain(model,
                     data = happiness_test,
                     y = happiness_test$score,
                     label = "Random Forest")

# make a studio for the model
modelStudio(explainer)
```

[Save the output](https://modelstudio.drwhy.ai/#save--share) in the form of an HTML file - [**Demo Dashboard**](https://modelstudio.drwhy.ai/demo.html).

## R & Python examples [more](https://modelstudio.drwhy.ai/articles/ms-r-python-examples.html)

The `modelStudio()` function uses `DALEX` explainers created with `DALEX::explain()` or `DALEXtra::explain_*()`.

```r
# packages for the explainer objects
install.packages("DALEX")
install.packages("DALEXtra")
```

### mlr [dashboard](https://modelstudio.drwhy.ai/mlr.html)

Make a studio for the regression `ranger` model on the `apartments` data.
```r
# load packages and data
library(mlr)
library(DALEXtra)
library(modelStudio)

data <- DALEX::apartments

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

# fit a model
task <- makeRegrTask(id = "apartments", data = train, target = "m2.price")
learner <- makeLearner("regr.ranger", predict.type = "response")
model <- train(learner, task)

# create an explainer for the model
explainer <- explain_mlr(model,
                         data = test,
                         y = test$m2.price,
                         label = "mlr")

# pick observations
new_observation <- test[1:2,]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```
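Computing all of the explanations can take a while for larger datasets. The package lists `parallelMap` among its suggested dependencies, and `modelStudio()` can distribute the computation over several workers. The sketch below assumes a `parallel` argument and the `parallelMap.default.cpus` option; check `?modelStudio` for the exact interface.

```r
# a minimal sketch, assuming modelStudio()'s `parallel` argument
# (parallelMap is an optional dependency) -- see ?modelStudio
library(parallelMap)
library(modelStudio)

# use 4 local workers for the explanation computations (assumed option name)
options(parallelMap.default.cpus = 4)

# reuse the explainer and observations from the mlr example above
modelStudio(explainer, new_observation, parallel = TRUE)
```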
### xgboost

Make a studio for the classification `xgboost` model on the `titanic` data.

```r
# load packages and data
library(xgboost)
library(DALEX)
library(modelStudio)

data <- DALEX::titanic_imputed

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

train_matrix <- model.matrix(survived ~.-1, train)
test_matrix <- model.matrix(survived ~.-1, test)

# fit a model
xgb_matrix <- xgb.DMatrix(train_matrix, label = train$survived)
params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
model <- xgb.train(params, xgb_matrix, nrounds = 500)

# create an explainer for the model
explainer <- explain(model,
                     data = test_matrix,
                     y = test$survived,
                     type = "classification",
                     label = "xgboost")

# pick observations
new_observation <- test_matrix[1:2, , drop=FALSE]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```
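Another way to shorten the computation is to lower the sampling parameters. `B` (the number of permutation rounds) appears with the value 5 in the scikit-learn example below; `N` (the number of observations sampled for model-level profiles) is assumed here from the package documentation, so verify both names in `?modelStudio`.

```r
# a minimal sketch of trading precision for speed; `B` is used in the
# scikit-learn example below, `N` is assumed from the documentation
library(modelStudio)

modelStudio(explainer, new_observation,
            N = 200,  # observations sampled for model-level profiles (assumed)
            B = 5)    # permutation rounds for importance-type explanations
```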
### scikit-learn

Make a studio for the `SVR` pipeline model on the `fifa` data. First, use `dalex` in Python:

```python
# load packages and data
import dalex as dx
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from numpy import log

data = dx.datasets.load_fifa()
X = data.drop(columns=['overall', 'potential', 'value_eur', 'wage_eur', 'nationality'], axis=1)
y = log(data.value_eur)

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
model = Pipeline([('scale', StandardScaler()), ('svm', SVR())])
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='scikit-learn')

# pack the explainer into a pickle file
explainer.dump(open('explainer_scikitlearn.pickle', 'wb'))
```

Then, use `modelStudio` in R:

```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_scikitlearn.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer, B = 5)
```
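The dashboard layout and animations can also be tuned when the studio is created. The `facet_dim` and `time` arguments below are assumptions based on the package documentation, so confirm them in `?modelStudio` before relying on them.

```r
# a minimal sketch, assuming the `facet_dim` (panel grid) and `time`
# (D3 animation length, in ms) arguments from the documentation
library(modelStudio)

modelStudio(explainer,
            facet_dim = c(2, 3),  # 2 x 3 grid of plot panels (assumed)
            time = 0)             # disable plot animations (assumed)
```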
### lightgbm

Make a studio for the `LGBMClassifier` pipeline model on the `titanic` data. First, use `dalex` in Python:

```python
# load packages and data
import dalex as dx
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from lightgbm import LGBMClassifier

data = dx.datasets.load_titanic()
X = data.drop(columns='survived')
y = data.survived

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
numerical_features = ['age', 'fare', 'sibsp', 'parch']
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)
categorical_features = ['gender', 'class', 'embarked']
categorical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
        ('onehot', OneHotEncoder(handle_unknown='ignore'))
    ]
)
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)
classifier = LGBMClassifier(n_estimators=300)
model = Pipeline(
    steps=[
        ('preprocessor', preprocessor),
        ('classifier', classifier)
    ]
)
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='lightgbm')

# pack the explainer into a pickle file
explainer.dump(open('explainer_lightgbm.pickle', 'wb'))
```

Then, use `modelStudio` in R:

```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_lightgbm.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer)
```
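The overview notes that the dashboard can be saved as a standalone HTML file (see the Save & Share link above). Since `modelStudio` imports `r2d3`, one plausible route is `r2d3::save_d3_html()`; the sketch assumes the object returned by `modelStudio()` is a regular `r2d3` widget.

```r
# a minimal sketch, assuming modelStudio() returns an r2d3/htmlwidget object
library(modelStudio)

ms <- modelStudio(explainer)

# write a self-contained HTML file that can be shared or hosted as-is
r2d3::save_d3_html(ms, file = "modelStudio.html")
```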
Citations
If you use modelStudio, please cite our JOSS article:
```bibtex
@article{baniecki2019modelstudio,
  title   = {{modelStudio: Interactive Studio with Explanations for ML Predictive Models}},
  author  = {Hubert Baniecki and Przemyslaw Biecek},
  journal = {Journal of Open Source Software},
  year    = {2019},
  volume  = {4},
  number  = {43},
  pages   = {1798},
  url     = {https://doi.org/10.21105/joss.01798}
}
```
For a description and evaluation of the Interactive EMA process, refer to our DAMI article:
```bibtex
@article{baniecki2023grammar,
  title   = {The grammar of interactive explanatory model analysis},
  author  = {Hubert Baniecki and Dariusz Parzych and Przemyslaw Biecek},
  journal = {Data Mining and Knowledge Discovery},
  year    = {2023},
  pages   = {1--37},
  url     = {https://doi.org/10.1007/s10618-023-00924-w}
}
```
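If the package is installed and ships a CITATION file (an assumption here), base R can also print the preferred reference, including its BibTeX entry:

```r
# print the citation entry provided by the installed package
citation("modelStudio")
```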
More resources
- Introduction to the plots: Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models
- Vignettes: perks and features, R & Python examples, modelStudio in R Markdown HTML
- Changelog: NEWS
- Conference poster: ML in PL 2019
Acknowledgments
Work on this package was financially supported by the National Science Centre (Poland) grant 2016/21/B/ST6/02176 and National Centre for Research and Development grant POIR.01.01.01-00-0328/17.
Owner
- Name: Model Oriented
- Login: ModelOriented
- Kind: organization
- Location: MI2DataLab @ Warsaw University of Technology
- Website: https://mi2.ai/
- Repositories: 41
- Profile: https://github.com/ModelOriented
JOSS Publication
modelStudio: Interactive Studio with Explanations for ML Predictive Models
Authors
Hubert Baniecki, Przemyslaw Biecek
Tags
automated data analysis, model visualization, explainable artificial intelligence, predictive modeling, interpretable machine learning
Papers & Mentions
Total mentions: 1
Feature Importance of Stabilised Rammed Earth Components Affecting the Compressive Strength Calculated with Explainable Artificial Intelligence Tools
- DOI: 10.3390/ma13102317
- OpenAlex ID: https://openalex.org/W3025406257
- Published: May 2020
GitHub Events
Total
- Watch event: 8
Last Year
- Watch event: 8
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| hbaniecki | h****i@s****l | 187 |
| Hubert Baniecki | h****i@g****m | 147 |
| Przemysław Biecek | p****k@g****m | 8 |
| Kyle Niemeyer | k****r@g****m | 3 |
| Piotr Piątyszek | 8****k | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 81
- Total pull requests: 20
- Average time to close issues: 23 days
- Average time to close pull requests: about 3 hours
- Total issue authors: 19
- Total pull request authors: 4
- Average comments per issue: 1.07
- Average comments per pull request: 0.4
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- hbaniecki (59)
- pbiecek (3)
- andreassot10 (2)
- expectopatronum (2)
- evoree (1)
- MJimitater (1)
- CoolShades (1)
- ananya231284 (1)
- fkgruber (1)
- set92 (1)
- nutle (1)
- agosiewska (1)
- arodionoff (1)
- JohnsonHsieh (1)
- bgu1997 (1)
Pull Request Authors
- hbaniecki (14)
- kyleniemeyer (3)
- piotrpiatyszek (2)
- dominik-aigora (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 3
- Total downloads: cran 428 last-month
- Total docker downloads: 10
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 1 (may contain duplicates)
- Total versions: 46
- Total maintainers: 1
proxy.golang.org: github.com/ModelOriented/modelStudio
- Documentation: https://pkg.go.dev/github.com/ModelOriented/modelStudio#section-documentation
- License: gpl-3.0
- Latest release: v3.1.2+incompatible (published almost 3 years ago)
Rankings
proxy.golang.org: github.com/modeloriented/modelstudio
- Documentation: https://pkg.go.dev/github.com/modeloriented/modelstudio#section-documentation
- License: gpl-3.0
- Latest release: v3.1.2+incompatible (published almost 3 years ago)
Rankings
cran.r-project.org: modelStudio
Interactive Studio for Explanatory Model Analysis
- Homepage: https://modelstudio.drwhy.ai
- Documentation: http://cran.r-project.org/web/packages/modelStudio/modelStudio.pdf
- License: GPL-3
- Latest release: 3.1.2 (published almost 3 years ago)
Rankings
Maintainers (1)
Dependencies
| Package | Version | Type |
|---|---|---|
| R | >= 3.6 | Depends |
| DALEX | >= 2.2.1 | Imports |
| digest | * | Imports |
| iBreakDown | >= 2.0.1 | Imports |
| ingredients | >= 2.2.0 | Imports |
| jsonlite | * | Imports |
| progress | * | Imports |
| r2d3 | * | Imports |
| knitr | * | Suggests |
| parallelMap | * | Suggests |
| ranger | * | Suggests |
| rmarkdown | * | Suggests |
| spelling | * | Suggests |
| testthat | * | Suggests |
| xgboost | * | Suggests |
