data_impact_on_xai

Project to analyze how dataset characteristics impact agreement levels and costs of XAI techniques

https://github.com/flodercc/data_impact_on_xai

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.6%) to scientific vocabulary
Last synced: 7 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: FloderCC
  • Language: Python
  • Default Branch: main
  • Size: 4.75 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created almost 2 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • Citation

README.md

Data Impact On XAI

Repository of the work entitled "Understanding the Influence of Data Characteristics on Explainable AI"

DOI

Structure

This repository has the following structure:

```
├── src/datasets
├── src/plots
├── src/results
├── src/dataset_descriptors.py
├── src/dataset_setup.py
├── src/dataset_utils.py
├── src/dnn_model.py
├── src/experiment.py
├── src/model_utils.py
├── src/results_analyzer.py
└── src/xai_methods.py
```

  • src/datasets/ contains the datasets and their sources.
  • src/plots/ contains all plots generated by the script src/results_analyzer.py.
  • src/results/ contains all logs from the experimental process.
  • src/dataset_descriptors.py contains the dataset descriptors.
  • src/dataset_setup.py defines the datasets to be used in the experiments.
  • src/dataset_utils.py contains the dataset loading and preprocessing functions.
  • src/dnn_model.py contains the Deep Neural Network code.
  • src/experiment.py contains the main experiment code.
  • src/model_utils.py contains auxiliary methods to create and evaluate the models.
  • src/results_analyzer.py contains the script that generates the plots.
  • src/xai_methods.py contains the XAI methods' code.

Hyperparameter space explored for the ML models

| Model | Hyperparameters |
|------------------|-----------------|
| LR | C: {0.1, 1, 10}; penalty: {l1, l2, None}; class_weight: {None, balanced} |
| DT | criterion: {gini, entropy, log_loss}; max_depth: {None, 10, 15, 20}; max_features: {None, sqrt, log2}; class_weight: {None, balanced} |
| ExtraTree | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; max_depth: {None, 10, 15, 20}; class_weight: {None, balanced} |
| SVM | kernel: {poly, rbf}; C: {0.1, 1, 10}; class_weight: {None, balanced} |
| GaussianNB | var_smoothing: {1e-12, 1e-10, 1e-8, 1e-6, 1e-4, 1e-2, 1} |
| BernoulliNB | alpha: {0.1, 1, 10} |
| RF | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; class_weight: {None, balanced}; n_estimators: {25, 50, 100, 200}; max_depth: {None, 10, 15, 20} |
| ExtraTrees | criterion: {gini, entropy, log_loss}; max_features: {None, sqrt, log2}; class_weight: {None, balanced}; n_estimators: {25, 50, 100, 200}; max_depth: {None, 10, 15, 20} |
| AdaBoost | n_estimators: {25, 50, 100, 200}; learning_rate: {0.1, 0.5, 1} |
| GradientBoosting | criterion: {friedman_mse, squared_error}; n_estimators: {50, 100, 200}; learning_rate: {0.1, 0.5, 1}; max_depth: {None, 10, 15, 20}; max_features: {None, sqrt, log2} |
| Bagging | n_estimators: {10, 50, 100}; max_samples: {1.0} |
| MLP | hidden_layer_sizes: {(50,), (100,)}; activation: {logistic, relu}; alpha: {0.0001, 0.001, 0.01} |
| DNN | hidden_layers_size: {(16, 16, 16); (8, 16, 8); (32, 16, 8)}; activation_function: {sigmoid, tanh, relu, silu}; loss_function: {categorical_crossentropy}; nn_optimizer: {Adam, RMSprop, Nadam}; epochs: {100, 500}; batch_size: {10, 50} |
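As an illustration of how one of these search spaces could be explored (the repository's own search procedure may differ; this only reuses the LR grid from the table above with scikit-learn's GridSearchCV and a synthetic dataset):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; the repository uses its own datasets under src/datasets/.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The LR grid exactly as listed in the table above.
param_grid = {
    "C": [0.1, 1, 10],
    "penalty": ["l1", "l2", None],
    "class_weight": [None, "balanced"],
}

# The 'saga' solver supports both l1 and l2 penalties.
search = GridSearchCV(
    LogisticRegression(solver="saga", max_iter=5000),
    param_grid,
    cv=3,
    scoring="f1_macro",
)
search.fit(X, y)
print(search.best_params_)
```

The cross-validation folds and scoring metric here are assumptions for the sketch, not values taken from the repository.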

Summary of Best Hyperparameter Settings

Owner

  • Login: FloderCC
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it using these metadata.
title: "To be defined"
authors:
  - family-names: Corona
    given-names: Julio
  - family-names: Teixeira
    given-names: Rafael
  - family-names: Antunes
    given-names: Mário
  - family-names: Aguiar
    given-names: Rui L.
version: "1.0" # Adjust the version accordingly
doi: "To be defined" # Adjust the DOI accordingly
date-released: "To be defined" # Adjust the release date if needed
license: Apache-2.0 # Adjust the license if needed
url: "To be defined" # Add your repository URL

GitHub Events

Total
  • Release event: 1
  • Push event: 1
  • Create event: 1
Last Year
  • Release event: 1
  • Push event: 1
  • Create event: 1