slise

Robust regression algorithm that can be used for explaining black box models (R implementation)

Keywords

black-box-explanations explain-classifiers explainable-ai explainable-ml interpretable-machine-learning local-explanations machine-learning model-agnostic r r-package research research-paper robust-regression slise slise-algorithm sparse-regression

Repository

Basic Info

Host: GitHub
Owner: edahelsinki
License: mit
Language: R
Default Branch: master
Homepage: http://edahelsinki.fi/slise/
Size: 3.72 MB

Statistics

Stars: 5
Watchers: 4
Forks: 1
Open Issues: 0
Releases: 0

Created over 6 years ago · Last pushed over 2 years ago

Metadata Files

Readme License Citation

README.md

SLISE - Sparse Linear Subset Explanations

R implementation of the SLISE algorithm. SLISE can be used both for robust regression and for explaining outcomes from black box models. For more details, see the conference paper, the robust regression paper, or the local explanations paper (all three are cited below). For a more informal overview, see the presentation or the poster. Finally, there is also the documentation.

Björklund A., Henelius A., Oikarinen E., Kallonen K., Puolamäki K. (2019)
Sparse Robust Regression for Explaining Classifiers.
Discovery Science (DS 2019).
Lecture Notes in Computer Science, vol 11828, Springer.
https://doi.org/10.1007/978-3-030-33778-0_27

Björklund A., Henelius A., Oikarinen E., Kallonen K., Puolamäki K. (2022)
Robust regression via error tolerance.
Data Mining and Knowledge Discovery.
https://doi.org/10.1007/s10618-022-00819-2

Björklund A., Henelius A., Oikarinen E., Kallonen K., Puolamäki K. (2023)
Explaining any black box model using real data.
Frontiers in Computer Science 5:1143904.
https://doi.org/10.3389/fcomp.2023.1143904

The idea

In robust regression we fit regression models that can handle data containing outliers (the example below shows why outliers are problematic for ordinary regression). SLISE accomplishes this by fitting a model such that the largest possible subset of the data items has an error smaller than a given threshold (the epsilon parameter in the example below). All items with a larger error are considered potential outliers and do not affect the resulting model.
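
To make the subset criterion concrete, the following is a minimal sketch in base R that only counts how many items a given linear model fits within the tolerance. It is not the SLISE optimisation itself; `subset_size` is a hypothetical helper that is not part of the slise package, and the data mirrors the robust regression example further down.

```R
# Count the items whose absolute error is below epsilon for a candidate model.
# `subset_size` is a hypothetical helper, not a function from the slise package.
subset_size <- function(X, y, alpha, epsilon) {
    residuals <- y - cbind(1, X) %*% alpha  # alpha[1] is the intercept
    sum(abs(residuals) < epsilon)
}

set.seed(42)
x <- seq(-1, 1, length.out = 50)
y <- -x + rnorm(50, 0, 0.15)
# Append six outliers that lie far from the main trend
x <- c(x, rep(seq(1.6, 1.8, 0.1), 2))
y <- c(y, rep(c(1.8, 1.95), each = 3))

ols <- unname(coef(lm(y ~ x)))  # ordinary least squares, pulled by the outliers
trend <- c(0, -1)               # the line that generated the inliers

n_ols <- subset_size(x, y, ols, epsilon = 0.5)
n_trend <- subset_size(x, y, trend, epsilon = 0.5)
```

Comparing `n_ols` with `n_trend` shows which of the two candidate models covers the larger subset. SLISE searches for the model that maximises this kind of count, rather than evaluating two fixed candidates as done here.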

SLISE can also be used to provide local, model-agnostic explanations for outcomes from black box models. To do this, we replace the ground truth response vector with the predictions from the complex model. Furthermore, we force the model to fit a selected item, which makes the explanation local. This gives us a local approximation of the complex model with a simpler linear model (similar to, e.g., LIME and SHAP). In contrast to other methods, SLISE creates explanations using real data (not discretised and randomly sampled data), so we can be sure that all inputs are valid, i.e., that they follow the same constraints as when the data was generated (e.g., the laws of physics).
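
The two steps described above (using predictions as the response and forcing the model through a selected item) can be sketched in base R. This is an illustration under assumptions: `black_box` is a made-up stand-in for a complex model, the index `i` is arbitrary, and ordinary least squares (`lm`) is used as a placeholder where SLISE would perform its robust, sparse fit. One simple way to force a linear model through an item, used here, is to centre the data on that item and fit without an intercept.

```R
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)

# Hypothetical stand-in for a black box model; any prediction function would do
black_box <- function(X) sin(X[, 1]) + X[, 2]^2
p <- black_box(X)  # the predictions replace the ground truth response

i <- 17  # the item to explain (arbitrary choice)

# Centre the data and the predictions on the selected item
Xc <- sweep(X, 2, X[i, ])
pc <- p - p[i]

# Least squares without an intercept, as a placeholder for the robust fit
coefs <- coef(lm(pc ~ Xc + 0))

# The local linear approximation passes exactly through the selected item
local_predict <- function(x_new) p[i] + sum((x_new - X[i, ]) * coefs)
```

Evaluating `local_predict(X[i, ])` returns `p[i]`, because the centred item is the origin and the model has no intercept. Note that all rows of `X` are real data items; no synthetic neighbourhood is sampled.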

Installation

Install the slise package using the devtools package (which can be installed with install.packages("devtools")):

```R
devtools::install_github("edahelsinki/slise")
```

After installation, load the package using:

```R
library(slise)
```

Other Languages

The official Python version can be found here.

Example

In order to use SLISE you need to have your data in a numerical matrix (or something that can be cast to a matrix), and the response as a numerical vector. Below is an example of SLISE being used for robust regression:

```R
library(slise)
library(ggplot2)
set.seed(42)

# Fifty noisy points on the line y = -x, plus six outliers
x <- seq(-1, 1, length.out = 50)
y <- -x + rnorm(50, 0, 0.15)
x <- c(x, rep(seq(1.6, 1.8, 0.1), 2))
y <- c(y, rep(c(1.8, 1.95), each = 3))

# Fit ordinary least squares and SLISE to the same data
ols <- lm(y ~ x)$coefficients
slise <- slise.fit(x, y, epsilon = 0.5)

# Plot the SLISE fit and overlay the OLS line for comparison
plot(slise, title = "", partial = TRUE, size = 2) +
    geom_abline(aes(intercept = ols[1], slope = ols[2], color = "OLS", linetype = "OLS"), size = 2) +
    scale_color_manual(values = c("#1b9e77", "#fda411"), name = NULL) +
    scale_linetype_manual(values = 2:1, name = NULL) +
    theme(axis.title.y = element_text(angle = 0, vjust = 0.5), legend.key.size = grid::unit(2, "line")) +
    guides(shape = "none", color = "legend", linetype = "legend")
```

Robust Regression Example Plot
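
The example above passes a plain numeric vector as the data. When the data lives in a data frame, it first has to be converted to a numerical matrix and a numerical response vector. A minimal sketch using the built-in `mtcars` dataset follows; the choice of dataset and columns is arbitrary and only serves as an illustration.

```R
# Select numeric predictor columns and convert them to a matrix
predictors <- c("wt", "hp", "disp")  # arbitrary illustrative choice
X <- as.matrix(mtcars[, predictors])

# The response must be a numeric vector with one value per row of X
y <- mtcars$mpg

# Sanity checks before handing the data to a fitting function
stopifnot(is.matrix(X), is.numeric(X), is.numeric(y))
stopifnot(nrow(X) == length(y), !anyNA(X), !anyNA(y))
```

The resulting `X` and `y` could then be passed to `slise.fit` in the same way as in the example, with an `epsilon` chosen to suit the scale of the response.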

SLISE can also be used to explain predictions from black box models such as convolutional neural networks:

```R
library(slise)
set.seed(42)

# Load the EMNIST helper; the path is relative to the working directory
source("experiments/explanations/data.R")
emnist <- data_emnist(digit=2)

# Explain the prediction for item 17 and plot the explanation as an image
slise <- slise.explain(emnist$X, emnist$Y, 0.5, emnist$X[17,], emnist$Y[17], logit=TRUE, lambda1=3, lambda2=6)
plot(slise, "image", "", c("not 2", "is 2"), plots = 1)
```

Explanation Example Plot

Dependencies

SLISE depends on the following R-packages:

Rcpp
RcppArmadillo
lbfgs

The following R-packages are optional, but needed for some of the built-in visualisations:

ggplot2
grid
gridExtra
reshape2
wordcloud
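
Since the visualisation packages are optional, it can be useful to check which of them are available before plotting. The following is a small sketch using base R's `requireNamespace`; it only reports what is missing and does not install anything.

```R
# Optional packages used by some of the built-in visualisations
optional <- c("ggplot2", "grid", "gridExtra", "reshape2", "wordcloud")

# requireNamespace returns TRUE if the package can be loaded, FALSE otherwise
installed <- vapply(optional, requireNamespace, logical(1), quietly = TRUE)

not_installed <- optional[!installed]
if (length(not_installed) > 0) {
    message("Missing optional packages: ", paste(not_installed, collapse = ", "))
}
```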

Owner

Name: EDA Helsinki
Login: edahelsinki
Kind: organization

Website: https://www.helsinki.fi/en/researchgroups/exploratory-data-analysis
Repositories: 4
Profile: https://github.com/edahelsinki

The Exploratory Data Analysis group, led by Associate Professor Kai Puolamäki, is located at the University of Helsinki (CS and INAR).

Citation (CITATIONS.bib)

@article{bjorklund2023explaining,
  author   = {Bj{\"o}rklund, Anton and Henelius, Andreas and Oikarinen, Emilia and Kallonen, Kimmo and Puolam{\"a}ki, Kai},
  title    = {Explaining any black box model using real data},
  year     = {2023},
  journal  = {Frontiers in Computer Science},
  volume   = {5},
  url      = {https://www.frontiersin.org/articles/10.3389/fcomp.2023.1143904},
  doi      = {10.3389/fcomp.2023.1143904},
  issn     = {2624-9898}
}

@article{bjorklund2022robust,
  title   = {Robust regression via error tolerance},
  author  = {Bj{\"o}rklund, Anton and Henelius, Andreas and Oikarinen, Emilia and Kallonen, Kimmo and Puolam{\"a}ki, Kai},
  year    = {2022},
  month   = jan,
  journal = {Data Mining and Knowledge Discovery},
  issn    = {1384-5810, 1573-756X},
  doi     = {10.1007/s10618-022-00819-2}
}

@inproceedings{bjorklund2019sparse,
  title     = {Sparse Robust Regression for Explaining Classifiers},
  booktitle = {Discovery Science},
  author    = {Bj{\"o}rklund, Anton and Henelius, Andreas and Oikarinen, Emilia and Kallonen, Kimmo and Puolam{\"a}ki, Kai},
  year      = {2019},
  series    = {Lecture Notes in Computer Science},
  volume    = {11828},
  pages     = {351--366},
  publisher = {Springer International Publishing},
  doi       = {10.1007/978-3-030-33778-0_27},
  isbn      = {978-3-030-33777-3 978-3-030-33778-0}
}

Dependencies

DESCRIPTION cran

R >= 3.5 depends
Rcpp * depends
base * imports
graphics * imports
lbfgs * imports
methods * imports
stats * imports
utils * imports
R.rsp * suggests
ggplot2 * suggests
grid * suggests
gridExtra * suggests
numDeriv * suggests
reshape2 * suggests
testthat * suggests
wordcloud * suggests

.github/workflows/pkgdown.yaml actions

JamesIves/github-pages-deploy-action v4.4.1 composite
actions/checkout v3 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/r_check.yml actions

actions/checkout v2 composite
r-lib/actions/check-r-package v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite
