araucana-xai
Tree-based local explanations of machine learning model predictions
Repository
Basic Info
- Host: GitHub
- Owner: bmi-labmedinfo
- License: mit
- Language: Jupyter Notebook
- Default Branch: master
- Homepage: https://pypi.org/project/araucanaxai/
- Size: 1.06 MB
Statistics
- Stars: 5
- Watchers: 5
- Forks: 3
- Open Issues: 1
- Releases: 8
Metadata Files
README.md
Araucana XAI
Tree-based local explanations of machine learning model predictions
Repository for the araucanaxai package. Implementation of the pipeline first described in Parimbelli et al., 2023.
About The Project
Increasingly complex learning methods such as boosting, bagging and deep learning have made ML models more accurate, but also harder to understand and interpret. Practitioners often face a tradeoff between performance and intelligibility, especially in high-stakes applications like medicine. This project proposes a novel methodological approach for explaining the prediction that a generic ML model makes for a specific instance, applicable to both classification and regression tasks. Advantages of the proposed XAI approach include improved fidelity to the original model, the ability to deal with non-linear decision boundaries, and native support for both classification and regression problems.
Keywords: explainable AI, explanations, local explanation, fidelity, interpretability, transparency, trustworthy AI, black-box, machine learning, feature importance, decision tree, CART, AIM.
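To make the idea of a tree-based local explanation concrete, here is a minimal sketch written with scikit-learn only. It is not the araucanaxai implementation: the random forest black box, the Euclidean neighbourhood search, and all variable names are assumptions chosen for illustration, and the data augmentation step is left out. It selects the training points closest to the target instance, labels them with the black-box predictions, fits a shallow decision tree on them, and reports how often the tree agrees with the black box on that neighbourhood.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

# Black-box model to explain (an assumption: any model with a predict function would do).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, random_state=42)
black_box = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_train, y_train)

# Instance whose prediction we want to explain.
target = X_test[0].reshape(1, -1)

# 1. Select the training points closest to the target.
#    Euclidean distance is an assumption made for brevity.
neighbours = NearestNeighbors(n_neighbors=150).fit(X_train)
_, idx = neighbours.kneighbors(target)
X_local = np.vstack([X_train[idx[0]], target])

# 2. Label the neighbourhood with the black-box predictions, not the true labels.
y_local = black_box.predict(X_local)

# 3. Fit a shallow tree that mimics the black box around the target.
surrogate = DecisionTreeClassifier(max_depth=3, min_samples_leaf=1, random_state=42)
surrogate.fit(X_local, y_local)

# Local fidelity: agreement between surrogate and black box on the neighbourhood.
fidelity = float(np.mean(surrogate.predict(X_local) == y_local))
print(f"local fidelity: {fidelity:.2f}")
```

Because the surrogate is trained on the black-box outputs rather than on the ground truth, its splits describe how the model behaves near the target instance, which is what a local explanation is meant to capture.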
Installation
- Make sure you have the latest version of pip installed: `pip install --upgrade pip`
- Install araucanaxai through pip: `pip install araucanaxai`
Usage
Here's a basic example, using a built-in toy dataset, that illustrates common usage of Araucana XAI.
First, train a classifier on the data. Araucana XAI is model-agnostic: you only have to provide a function that takes data as input and outputs binary labels.
Then, declare the example whose classification you want to explain.
Finally, run Araucana XAI and plot the XAI tree to explain the model's decision as a set of IF-ELSE rules.
```python
import araucanaxai
from sklearn.linear_model import LogisticRegression
from sklearn import tree
from sklearn.metrics import precision_score, recall_score
import matplotlib.pyplot as plt

# load toy dataset with both categorical and numerical features
cat_data = True  # set to False if you don't need categorical features
data = araucanaxai.load_breast_cancer(train_split=.75, cat=cat_data)

# specify which features are categorical
cat = data["feature_names"][0:5]
is_cat = [x in cat for x in data["feature_names"]]  # set to None if you don't need categorical data

# train logistic regression classifier: this is the model to explain
classifier = LogisticRegression(random_state=42, solver='liblinear', penalty='l1', max_iter=500)
classifier.fit(data["X_train"], data["y_train"])
y_test_pred = classifier.predict(data["X_test"])

print('precision: ' + str(precision_score(data["y_test"], y_test_pred)) +
      ', recall: ' + str(recall_score(data["y_test"], y_test_pred)))

# declare the instance we want to explain
index = 65
instance = data["X_test"][index, :].reshape(1, data["X_test"].shape[1])
instance_pred_y = y_test_pred[index]

# build xai tree to explain the instance classification
# the neighbourhood size determines the number of closest instances to consider for the local explanation
# different oversampling strategies are available for data augmentation: SMOTE, random uniform and random non-uniform (based on sample statistics)
# it is possible to control the xai tree pruning in terms of maximum depth and minimum number of instances in a leaf
xai_tree = araucanaxai.run(x_target=instance, y_pred_target=instance_pred_y,
                           x_train=data["X_train"], feature_names=data["feature_names"], cat_list=is_cat,
                           neighbourhood_size=150, oversampling=True,
                           oversampling_type="smote", oversampling_size=100,
                           max_depth=3, min_samples_leaf=1,
                           predict_fun=classifier.predict)

# plot the tree
fig, ax = plt.subplots(figsize=(10, 10))
tree.plot_tree(xai_tree['tree'], feature_names=data["feature_names"], filled=True,
               class_names=data["target_names"])
plt.tight_layout()
plt.show()
```
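The tree in the example above is drawn with scikit-learn's tree plotting function, which suggests it is a regular scikit-learn decision tree. Under that assumption, the same IF-ELSE rules could also be printed as plain text with `sklearn.tree.export_text`. The sketch below uses a stand-in tree fitted directly on scikit-learn's breast cancer data (the name `stand_in_tree` is hypothetical), so that it runs without araucanaxai installed.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in for the explanation tree: a shallow tree fitted on the toy data.
dataset = load_breast_cancer()
stand_in_tree = DecisionTreeClassifier(max_depth=3, random_state=42)
stand_in_tree.fit(dataset.data, dataset.target)

# Render the tree as nested, indented rules, one line per split or leaf.
rules = export_text(stand_in_tree, feature_names=list(dataset.feature_names))
print(rules)
```

A text rendering like this can be handy when the explanation has to go into a log or a report rather than a figure.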
You can also check the notebook here.
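Since the only requirement on the model is a function that maps data to binary labels, a model that natively returns probabilities can be adapted with a thin wrapper. The following is a minimal sketch of that idea; `make_predict_fun` and its `threshold` argument are hypothetical helpers introduced for illustration and are not part of araucanaxai.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression


def make_predict_fun(model, threshold=0.5):
    """Wrap a probabilistic model into a function that returns 0/1 labels."""
    def predict_fun(X):
        # Probability of the positive class for every row of X.
        positive_proba = model.predict_proba(np.asarray(X))[:, 1]
        return (positive_proba >= threshold).astype(int)
    return predict_fun


X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(solver="liblinear", max_iter=500).fit(X, y)

# The wrapped function takes data as input and outputs binary labels.
predict_fun = make_predict_fun(model, threshold=0.5)
labels = predict_fun(X[:5])
print(labels)
```

A function built this way could then be supplied wherever a prediction function is expected, for instance in place of `classifier.predict` in the usage example.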
See the open issues for a full list of proposed features (and known issues).
Publications
List of publications involving AraucanaXAI
- E Parimbelli, TM Buonocore, G Nicora, W Michalowski, S Wilk, R Bellazzi - Why did AI get this one wrong? Tree-based explanations of machine learning model predictions - Artificial Intelligence in Medicine, Volume 135, 2023
- TM Buonocore, G Nicora, A Dagliati, E Parimbelli - Evaluation of XAI on ALS 6-months Mortality Prediction - Proceedings of the Working Notes of CLEF 2022, Volume 3180, 2022
- E Parimbelli, G Nicora, S Wilk, W Michalowski, R Bellazzi - Tree-based Local Explanations of Machine Learning Model Predictions - XAI Healthcare workshop, AIME 2021
If you use the AraucanaXAI software for your projects, please cite it as:
@software{Buonocore_Araucana_XAI_2022,
author = {Buonocore, Tommaso Mario and Nicora, Giovanna and Parimbelli, Enea},
doi = {10.5281/zenodo.10715476},
month = {9},
title = {{Araucana XAI}},
url = {https://github.com/detsutut/AraucanaXAI},
version = {1.0.0},
year = {2022}
}
Contacts and Useful Links
Project Link: https://github.com/detsutut/AraucanaXAI
Package Link: https://pypi.org/project/araucanaxai/
License
Distributed under the MIT License. See LICENSE for more information.
Owner
- Name: BMI "Mario Stefanelli" Lab - UNIPV
- Login: bmi-labmedinfo
- Kind: organization
- Email: labmedinfo@unipv.it
- Location: Italy
- Website: http://www.labmedinfo.org
- Repositories: 1
- Profile: https://github.com/bmi-labmedinfo
Repository for BMI lab code and software products
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Buonocore"
    given-names: "Tommaso Mario"
    orcid: "https://orcid.org/0000-0002-2887-088X"
  - family-names: "Giovanna"
    given-names: "Nicora"
    orcid: "https://orcid.org/0000-0001-7007-0862"
  - family-names: "Enea"
    given-names: "Parimbelli"
    orcid: "https://orcid.org/0000-0003-0679-828X"
title: "Araucana XAI"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2022-09-09
url: "https://github.com/detsutut/AraucanaXAI"
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 6
- Total pull requests: 5
- Average time to close issues: 4 months
- Average time to close pull requests: 1 day
- Total issue authors: 3
- Total pull request authors: 4
- Average comments per issue: 0.83
- Average comments per pull request: 0.2
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- KuzonFyre (2)
- ripankundu (1)
Pull Request Authors
- LorenzoPeracchio (4)
- marcozullich (1)
Dependencies
- gower *
- imblearn *
- numpy *
- pandas *
- sklearn *