SIRUS.jl

SIRUS.jl: Interpretable Machine Learning via Rule Extraction - Published in JOSS (2023)

https://github.com/rikhuijzer/sirus.jl

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

explainable-ai explainable-ml machine-learning

Keywords from Contributors

standardization meshing pde interpretability parallel simulations correlation graphics fluxes hydrology

Repository

Interpretable Machine Learning via Rule Extraction

Basic Info
Statistics
  • Stars: 38
  • Watchers: 2
  • Forks: 3
  • Open Issues: 19
  • Releases: 12
Created over 3 years ago · Last pushed 7 months ago

README.md

Visual representation of the algorithm which converts decision trees to rule sets. Created with DALL·E 3 and Photopea

SIRUS.jl



This package is a pure Julia implementation of the Stable and Interpretable RUle Sets (SIRUS) algorithm. The algorithm was originally created by Clément Bénard, Gérard Biau, Sébastien Da Veiga, and Erwan Scornet (Bénard et al., 2021). SIRUS.jl implements both classification and regression, but we found that performance is generally best on classification tasks.

The main benefit of this algorithm is that it is fully explainable. Model-agnostic explainability techniques such as SHAP instead explain a complex model through a simplified representation, while the complex model itself is still used for predictions. This mismatch can lead to hidden biases or reliability issues. The SIRUS algorithm avoids this by using the same simplified model for both prediction and explanation.

Installation

```julia
julia> ]

pkg> add SIRUS
```

Getting Started

This package defines two rule-based models that implement the MLJ.jl (Machine Learning in Julia) interface: StableRulesClassifier and StableRulesRegressor.

Example

```julia
julia> using MLJ, SIRUS

julia> X, y = make_blobs(200, 10; centers=2);

julia> X
Tables.MatrixTable{Matrix{Float64}} with 200 rows, 10 columns, and schema:
 :x1   Float64
 :x2   Float64
 :x3   Float64
 :x4   Float64
 :x5   Float64
 :x6   Float64
 :x7   Float64
 :x8   Float64
 :x9   Float64
 :x10  Float64

julia> y
200-element CategoricalArrays.CategoricalArray{Int64,1,UInt32}:
 2
 1
 1
 ⋮
 2
 1
 2

julia> model = StableRulesClassifier();

julia> mach = machine(model, X, y);

julia> fit!(mach);

julia> mach.fitresult
StableRules model with 7 rules:
 if X[i, :x5] < -1.552594 then 0.129 else 0.0 +
 if X[i, :x8] < 0.72402614 then 0.117 else 0.0 +
 if X[i, :x2] < 7.1123967 then 0.123 else 0.0 +
 if X[i, :x8] < 8.840833 then 0.115 else 0.0 +
 if X[i, :x9] < 7.985747 then 0.0 else 0.001 +
 if X[i, :x7] < 6.4651833 then 0.107 else 0.0 +
 if X[i, :x7] < 2.2220817 then 0.119 else 0.024
and 2 classes: [1, 2].
Note: showing only the probability for class 2 since class 1 has probability 1 - p.
```
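To make the printed model concrete, the following sketch evaluates the seven rules above by hand for a single observation. It reads the output literally, as a sum of per-rule contributions towards class 2. The `rules` table, the `rule_score` helper, and the example `row` are hypothetical names introduced only for illustration; they are not part of the SIRUS.jl API, and any extra processing that the package may apply when predicting is not reproduced here.

```julia
# Rules copied from the printed model above, one tuple per rule:
# (feature, threshold, value if X[i, feature] < threshold, value otherwise).
rules = [
    (:x5, -1.552594, 0.129, 0.0),
    (:x8, 0.72402614, 0.117, 0.0),
    (:x2, 7.1123967, 0.123, 0.0),
    (:x8, 8.840833, 0.115, 0.0),
    (:x9, 7.985747, 0.0, 0.001),
    (:x7, 6.4651833, 0.107, 0.0),
    (:x7, 2.2220817, 0.119, 0.024),
]

# Hypothetical helper: add up the contribution of every rule for one
# observation, where `row` is a NamedTuple keyed by feature name.
function rule_score(row, rules)
    total = 0.0
    for (feature, threshold, then_value, else_value) in rules
        total += row[feature] < threshold ? then_value : else_value
    end
    return total
end

# Made-up observation; only the features used by the rules are needed.
row = (x2=0.0, x5=-2.0, x7=3.0, x8=1.0, x9=9.0)

score = rule_score(row, rules)
println("Summed rule contributions for class 2: ", round(score; digits=3))
```

Because every prediction is such a sum of a few threshold checks, each rule's contribution can be inspected directly, which is the sense in which the model is used for both prediction and explanation.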

This is a basic example. In most cases, you will want to tune the max_depth, max_rules, and lambda hyperparameters. See ?StableRulesClassifier, ?StableRulesRegressor, or the API documentation for more information about the models and their hyperparameters. A full guide through binary classification can be found in the Simple Binary Classification example.
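As a rough sketch of what setting these hyperparameters might look like, the snippet below passes max_depth, max_rules, and lambda as keyword arguments and scores the model with MLJ's cross-validation. The specific values, the number of folds, and the choice of `auc` as the measure are assumptions made for illustration, not recommendations from the package authors. It assumes that both MLJ and SIRUS are installed.

```julia
using MLJ, SIRUS

# Same synthetic two-class data as in the example above.
X, y = make_blobs(200, 10; centers=2)

# Illustrative values only; see ?StableRulesClassifier for the actual
# defaults and the allowed ranges.
model = StableRulesClassifier(; max_depth=2, max_rules=8, lambda=1.0)

# 5-fold cross-validation. `auc` is used because the classifier returns
# probabilistic predictions.
result = evaluate(model, X, y; resampling=CV(nfolds=5), measure=auc, verbosity=0)

println("Cross-validated AUC: ", result.measurement[1])
```

Repeating this for a few candidate settings and comparing the scores is a simple way to pick values by hand; MLJ's tuning utilities could automate the same loop.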

Citation

    @article{huijzer2023sirus,
      title={{SIRUS.jl}: Interpretable Machine Learning via Rule Extraction},
      author={Huijzer, Rik and Blaauw, Frank and den Hartigh, Ruud JR},
      journal={Journal of Open Source Software},
      volume={8},
      number={90},
      pages={5786},
      year={2023},
      doi={10.21105/joss.05786}
    }

Documentation

Documentation is at sirus.jl.huijzer.xyz.

Contributing

Thank you for your interest in contributing to SIRUS.jl! There are multiple ways to contribute.

Questions and Bug Reports

For questions or bug reports, you can open an issue. Questions can also be asked on the Julia forum or by sending an email to github@huijzer.xyz. Tag @rikh in the forum to ensure a quick reply.

Pull Requests

To submit patches, use pull requests (PRs) here on GitHub. In general:

  • Try to keep PRs limited to one feature or bug; otherwise they become hard to review/verify.
  • Try to follow the code style used in the rest of the codebase. See also Code Style Blue.
  • Try to update documentation when updating code, but feel free to leave documentation updates for a separate PR.
  • When possible, make PRs easy to revert. A change that can easily be reverted later poses little risk and can, therefore, be merged more easily.

As long as the PR moves the codebase forward, merging will likely happen.

Owner

  • Name: Rik Huijzer
  • Login: rikhuijzer
  • Kind: user

JOSS Publication

SIRUS.jl: Interpretable Machine Learning via Rule Extraction
Published
October 12, 2023
Volume 8, Issue 90, Page 5786
Authors
  • Rik Huijzer (University of Groningen, Groningen, the Netherlands)
  • Frank Blaauw (Researchable, Assen, the Netherlands)
  • Ruud J.R. den Hartigh (University of Groningen, Groningen, the Netherlands)
Editor
  • Mehmet Hakan Satman

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Huijzer
  given-names: Rik
  orcid: "https://orcid.org/0000-0001-9445-8466"
- family-names: Blaauw
  given-names: Frank
  orcid: "https://orcid.org/0000-0002-6588-5079"
- family-names: Hartigh
  given-names: Ruud J. R.
  name-particle: den
  orcid: "https://orcid.org/0000-0002-0094-8307"
contact:
- family-names: Huijzer
  given-names: Rik
  orcid: "https://orcid.org/0000-0001-9445-8466"
doi: 10.5281/zenodo.8398350
message: "If you use this software, please cite our article in the Journal of Open Source Software."
preferred-citation:
  authors:
  - family-names: Huijzer
    given-names: Rik
    orcid: "https://orcid.org/0000-0001-9445-8466"
  - family-names: Blaauw
    given-names: Frank
    orcid: "https://orcid.org/0000-0002-6588-5079"
  - family-names: Hartigh
    given-names: Ruud J. R.
    name-particle: den
    orcid: "https://orcid.org/0000-0002-0094-8307"
  date-published: 2023-10-12
  doi: 10.21105/joss.05786
  issn: 2475-9066
  issue: 90
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5786
  title: "SIRUS.jl: Interpretable Machine Learning via Rule Extraction"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05786"
  volume: 8
title: "SIRUS.jl: Interpretable Machine Learning via Rule Extraction"

GitHub Events

Total
  • Create event: 3
  • Commit comment event: 2
  • Issues event: 1
  • Watch event: 7
  • Delete event: 1
  • Push event: 10
  • Pull request event: 5
Last Year
  • Create event: 3
  • Commit comment event: 2
  • Issues event: 1
  • Watch event: 7
  • Delete event: 1
  • Push event: 10
  • Pull request event: 5

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 251
  • Total Committers: 5
  • Avg Commits per committer: 50.2
  • Development Distribution Score (DDS): 0.052
Past Year
  • Commits: 7
  • Committers: 1
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| Rik Huijzer | r****r@p****e | 238 |
| dependabot[bot] | 4****] | 6 |
| github-actions[bot] | 4****] | 5 |
| Okon Samuel | 3****l | 1 |
| Jose Storopoli | j****e@s****o | 1 |

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 40
  • Total pull requests: 54
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 8
  • Total pull request authors: 6
  • Average comments per issue: 1.73
  • Average comments per pull request: 0.61
  • Merged pull requests: 48
  • Bot issues: 0
  • Bot pull requests: 14
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 hour
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • rikhuijzer (22)
  • gdalle (6)
  • ablaom (5)
  • sylvaticus (2)
  • OkonSamuel (1)
  • ericphanson (1)
  • JuliaTagBot (1)
  • Zapiano (1)
Pull Request Authors
  • rikhuijzer (36)
  • dependabot[bot] (16)
  • github-actions[bot] (5)
  • jbytecode (2)
  • storopoli (2)
  • OkonSamuel (1)
Top Labels
Issue Labels
enhancement (3)
Pull Request Labels
dependencies (16)

Packages

  • Total packages: 1
  • Total downloads:
    • julia 9 total
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 13
juliahub.com: SIRUS

Interpretable Machine Learning via Rule Extraction

  • Versions: 13
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 9 Total
Rankings
Dependent repos count: 9.9%
Average: 36.9%
Dependent packages count: 38.9%
Stargazers count: 45.1%
Forks count: 53.5%
Last synced: 6 months ago

Dependencies

.github/workflows/CI.yml actions
  • actions/checkout v2 composite
  • julia-actions/cache v1 composite
  • julia-actions/julia-buildpkg latest composite
  • julia-actions/julia-runtest v1 composite
  • julia-actions/setup-julia v1 composite
.github/workflows/CompatHelper.yml actions
  • julia-actions/setup-julia v1 composite
.github/workflows/Docs.yml actions
  • actions/checkout v2 composite
  • julia-actions/julia-buildpkg v1 composite
  • julia-actions/julia-docdeploy v1 composite
.github/workflows/TagBot.yml actions
  • JuliaRegistries/TagBot v1 composite