data_impact_on_xai

Project to analyze how dataset characteristics impact agreement levels and costs of XAI techniques

https://github.com/flodercc/data_impact_on_xai

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.6%) to scientific vocabulary
Last synced: 7 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: FloderCC
  • Language: Python
  • Default Branch: main
  • Size: 4.75 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created almost 2 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • Citation

README.md

Data Impact On XAI

Repository of the work entitled "Understanding the Influence of Data Characteristics on Explainable AI"

DOI

Structure

This repository has the following structure:

```
├── src/datasets
├── src/plots
├── src/results
├── src/dataset_descriptors.py
├── src/dataset_setup.py
├── src/dataset_utils.py
├── src/dnn_model.py
├── src/experiment.py
├── src/model_utils.py
├── src/results_analyzer.py
└── src/xai_methods.py
```

  • src/datasets/ contains the datasets and their sources.
  • src/plots/ contains all plots generated by the script src/results_analyzer.py.
  • src/results/ contains all logs from the experimental process.
  • src/dataset_descriptors.py contains the dataset descriptors.
  • src/dataset_setup.py defines the datasets to be used in the experiments.
  • src/dataset_utils.py contains the dataset loading and preprocessing functions.
  • src/dnn_model.py contains the Deep Neural Network code.
  • src/experiment.py contains the main experiment code.
  • src/model_utils.py contains auxiliary methods to create and evaluate the models.
  • src/results_analyzer.py contains the script that generates the plots.
  • src/xai_methods.py contains the XAI methods' code.

Hyperparameter space explored for the ML models

| Model | Hyperparameters |
|------------------|-----------------|
| LR | C: {0.1, 1, 10}; penalty: {l1, l2, None}; class_weight: {None, balanced} |
| DT | criterion: {gini, entropy, log_loss}; max_depth: {None, 10, 15, 20}; max_features: {None, sqrt, log2}; class_weight: {None, balanced} |
| ExtraTree | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; max_depth: {None, 10, 15, 20}; class_weight: {None, balanced} |
| SVM | kernel: {poly, rbf}; C: {0.1, 1, 10}; class_weight: {None, balanced} |
| GaussianNB | var_smoothing: {1e-12, 1e-10, 1e-8, 1e-6, 1e-4, 1e-2, 1} |
| BernoulliNB | alpha: {0.1, 1, 10} |
| RF | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; class_weight: {None, balanced}; n_estimators: {25, 50, 100, 200}; max_depth: {None, 10, 15, 20} |
| ExtraTrees | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; class_weight: {None, balanced}; n_estimators: {25, 50, 100, 200}; max_depth: {None, 10, 15, 20} |
| AdaBoost | n_estimators: {25, 50, 100, 200}; learning_rate: {0.1, 0.5, 1} |
| GradientBoosting | criterion: {friedman_mse, squared_error}; n_estimators: {50, 100, 200}; learning_rate: {0.1, 0.5, 1}; max_depth: {None, 10, 15, 20}; max_features: {None, sqrt, log2} |
| Bagging | n_estimators: {10, 50, 100}; max_samples: {1.0} |
| MLP | hidden_layer_sizes: {(50,), (100,)}; activation: {logistic, relu}; alpha: {0.0001, 0.001, 0.01} |
| DNN | hidden_layers_size: {(16, 16, 16); (8, 16, 8); (32, 16, 8)}; activation_function: {sigmoid, tanh, relu, silu}; loss_function: {categorical_crossentropy}; nn_optimizer: {Adam, RMSprop, Nadam}; epochs: {100, 500}; batch_size: {10, 50} |
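As an illustration of how one of these search spaces could be explored (the repository's own search procedure may differ; this only reuses the LR grid from the table above with scikit-learn's GridSearchCV and a synthetic dataset):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; the repository uses its own datasets under src/datasets/.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The LR grid exactly as listed in the table above.
param_grid = {
    "C": [0.1, 1, 10],
    "penalty": ["l1", "l2", None],
    "class_weight": [None, "balanced"],
}

# The 'saga' solver supports both l1 and l2 penalties.
search = GridSearchCV(
    LogisticRegression(solver="saga", max_iter=5000),
    param_grid,
    cv=3,
    scoring="f1_macro",
)
search.fit(X, y)
print(search.best_params_)
```

The cross-validation folds and scoring metric here are assumptions for the sketch, not values taken from the repository.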

Summary of Best Hyperparameter Settings

Owner

  • Login: FloderCC
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it using these metadata.
title: "To be defined"
authors:
  - family-names: Corona
    given-names: Julio
  - family-names: Teixeira
    given-names: Rafael
  - family-names: Antunes
    given-names: Mário
  - family-names: Aguiar
    given-names: Rui L.
version: "1.0" # Adjust the version accordingly
doi: "To be defined" # Adjust the DOI accordingly
date-released: "To be defined" # Adjust the release date if needed
license: Apache-2.0 # Adjust the license if needed
url: "To be defined" # Add your repository URL

GitHub Events

Total
  • Release event: 1
  • Push event: 1
  • Create event: 1
Last Year
  • Release event: 1
  • Push event: 1
  • Create event: 1