Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.0%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: mxbraun4
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 10.9 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

AutoConvXAI: Interactive Explanations in AI

A Multi-Agent Conversational XAI System

Bachelor's Thesis by Maximilian Braun

Overview

This project presents a novel multi-agent approach to intent parsing in conversational XAI. It aims to bridge the gap between complex AI models and human understanding with an interactive dialogue interface that makes machine learning explanations accessible in natural language.

Architecture

The system implements a three-component pipeline (decoder, dispatcher, formatter):

Natural Language Query → AutoGen Multi-Agent System → Action Execution → LLM Formatter → Natural Response

Core Components

  1. AutoGen Multi-Agent Decoder (nlu/autogen_decoder.py)

    • Extraction Agent: Analyzes user queries for intentions and entities
    • Validation Agent: Ensures action correctness and handles edge cases
    • Collaborative decision-making with configurable conversation rounds
  2. Action Dispatcher (app/core.py)

    • Routes parsed intents to appropriate explainability functions
    • Manages conversational context and data filtering
    • Handles 20+ specialized explanation actions
  3. LLM Response Formatter (formatter/llm_formatter.py)

    • Transforms technical results into natural language
    • Maintains conversational flow and context
    • Adapts tone and detail level based on user intent
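
The hand-off between the Extraction Agent and the Validation Agent can be pictured roughly as below. This is a minimal sketch in plain Python, not the AutoGen-based code in nlu/autogen_decoder.py: `extract`, `validate`, `decode`, `KNOWN_ACTIONS`, and `max_rounds` are hypothetical names, and simple keyword matching stands in for the LLM calls.

```python
from typing import Optional

# Hypothetical subset of action names, taken from the README's action table.
KNOWN_ACTIONS = {"important", "predict", "what_if", "filter", "data_summary"}


def extract(query: str, feedback: Optional[str] = None) -> dict:
    """Stand-in for the Extraction Agent: guess an action from keywords."""
    text = query.lower()
    if "what if" in text:
        action = "what_if"
    elif "important" in text:
        action = "important"
    elif "predict" in text:
        action = "predict"
    else:
        # After a rejection, fall back to a safe default action.
        action = "data_summary" if feedback else "unknown"
    return {"action": action, "entities": {}}


def validate(candidate: dict) -> Optional[str]:
    """Stand-in for the Validation Agent: return feedback, or None if valid."""
    if candidate["action"] not in KNOWN_ACTIONS:
        return f"unknown action: {candidate['action']}"
    return None


def decode(query: str, max_rounds: int = 3) -> dict:
    """Alternate extraction and validation for at most max_rounds rounds."""
    feedback = None
    for _ in range(max_rounds):
        candidate = extract(query, feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate
    raise ValueError(f"no valid action after {max_rounds} rounds: {feedback}")
```

Capping the number of rounds keeps the cost of a query bounded: a query that never validates fails with an explicit error instead of looping.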

Explainability Actions

| Category | Actions | Description |
|----------|---------|-------------|
| Data Exploration | data_summary, show_data, feature_stats | Dataset understanding and statistics |
| Model Analysis | important, score, mistakes, interaction_effects | Model performance and behavior |
| Predictions | predict, prediction_likelihood, what_if | Individual and scenario-based predictions |
| Counterfactuals | counterfactual | Alternative scenarios and decision boundaries |
| Context Management | filter, define, labels | Data filtering and feature definitions |
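
One common way to route a parsed intent to an action is a name-to-function registry. The sketch below illustrates that idea only; it is not the dispatcher in app/core.py. The `register` decorator, the `dispatch` function, the intent layout (`action` plus `params`), and the two toy handlers are assumptions made for this example.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
ActionFn = Callable[[List[Record], dict], dict]

# Hypothetical registry mapping action names to handler functions.
ACTIONS: Dict[str, ActionFn] = {}


def register(name: str) -> Callable[[ActionFn], ActionFn]:
    """Decorator that records a handler under the given action name."""
    def decorator(fn: ActionFn) -> ActionFn:
        ACTIONS[name] = fn
        return fn
    return decorator


@register("data_summary")
def data_summary(rows: List[Record], params: dict) -> dict:
    # Toy handler: report the row count and the feature names seen.
    features = sorted({key for row in rows for key in row})
    return {"n_rows": len(rows), "features": features}


@register("feature_stats")
def feature_stats(rows: List[Record], params: dict) -> dict:
    # Toy handler: min, max, and mean of one numeric feature.
    values = [row[params["feature"]] for row in rows]
    return {
        "feature": params["feature"],
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


def dispatch(intent: dict, rows: List[Record]) -> dict:
    """Look up the handler for intent['action'] and call it."""
    try:
        handler = ACTIONS[intent["action"]]
    except KeyError:
        raise ValueError(f"unsupported action: {intent.get('action')!r}") from None
    return handler(rows, intent.get("params", {}))
```

With a registry like this, adding an action means writing one decorated function; the dispatch logic itself stays unchanged.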

Key Features

  • Natural Language Interface: Ask questions like "What are the most important features?" or "What if this patient had lower BMI?"
  • Multi-Agent Intelligence: Two specialized agents collaborate for robust query understanding
  • Comprehensive Explanations: LIME, SHAP, counterfactuals, feature importance, and statistical analysis
  • Context Awareness: Maintains conversation history and applies filters across interactions
  • Real-time Processing: Fast response times with efficient caching
  • Web Interface: Clean, accessible chat interface with sample questions
  • Extensible Design: Easy to add new explanation methods and actions
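
To make the context-awareness feature concrete, here is a minimal sketch of a conversation state that keeps the query history and re-applies accumulated filters on each turn. It assumes rows are plain dictionaries; `ConversationContext` and its methods are hypothetical names and do not mirror app/conversation.py.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]
Predicate = Callable[[Row], bool]


@dataclass
class ConversationContext:
    """Hypothetical per-session state: past queries plus active filters."""
    history: List[str] = field(default_factory=list)
    filters: List[Predicate] = field(default_factory=list)

    def add_turn(self, query: str) -> None:
        self.history.append(query)

    def add_filter(self, predicate: Predicate) -> None:
        # Filters accumulate, so later questions see the narrowed data.
        self.filters.append(predicate)

    def clear_filters(self) -> None:
        self.filters.clear()

    def apply(self, rows: List[Row]) -> List[Row]:
        """Return only the rows that satisfy every active filter."""
        return [row for row in rows if all(f(row) for f in self.filters)]
```

For example, after a filter for patients older than 50 is added, a follow-up question about BMI would be answered on the filtered rows until the filters are cleared.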

Quick Start

Docker is recommended; see the Docker Usage section for the easiest setup.

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • 4GB+ RAM recommended

Installation

  1. Clone the repository:

     ```bash
     git clone https://github.com/mxbraun4/AutoConvXAI.git
     cd AutoConvXAI
     ```

  2. Install dependencies:

     ```bash
     pip install -r requirements.txt
     ```

  3. Set up environment:

     ```bash
     export OPENAI_API_KEY="your-openai-api-key"
     export GPT_MODEL="gpt-4o"  # Optional, defaults to gpt-4o-mini
     ```

  4. Run the application:

     ```bash
     python main.py
     ```

  5. Access the interface:

    • Web UI: http://localhost:5000
    • API: POST to /query with JSON {"query": "your question"}
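
A request to the API endpoint can be built with the Python standard library, as sketched below. The URL and payload shape come from the steps above; the helper names `build_query_request` and `ask` are made up for this example, and the response is assumed to be JSON because its schema is not documented here.

```python
import json
import urllib.request


def build_query_request(
    question: str, base_url: str = "http://localhost:5000"
) -> urllib.request.Request:
    """Build a POST request for /query with a {"query": ...} JSON body."""
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:5000", timeout: float = 60.0):
    """Send the question to a running server and decode the reply.

    Assumption: the server answers with a JSON document.
    """
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running locally, `ask("What are the most important features?")` would send the question and return the decoded reply.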

Research Evaluation

The system was evaluated by comparing the multi-agent approach with a single-agent baseline:

Evaluation Metrics

  • Parsing Accuracy: Intent extraction and action parameter accuracy
  • Response Quality: Naturalness and informativeness of explanations
  • Conversation Flow: Context maintenance and coherence
  • Robustness: Handling of edge cases and ambiguous queries
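
As an illustration of the parsing accuracy metric, the sketch below scores a parse as correct only when both the action and its parameters match the reference exactly. This exact-match rule and the `action`/`params` record layout are assumptions; the scripts in evaluation/ may score parses differently.

```python
from typing import List


def parsing_accuracy(predicted: List[dict], gold: List[dict]) -> float:
    """Fraction of parses whose action and params both match the reference."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(
        1
        for p, g in zip(predicted, gold)
        if p.get("action") == g.get("action")
        and p.get("params", {}) == g.get("params", {})
    )
    return correct / len(gold)
```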

Results Summary

  • Multi-Agent System: 85.47% parsing accuracy
  • Single-Agent Baseline: 84.46% parsing accuracy

Detailed results are available in evaluation/results/.

Development

Project Structure

├── app/                     # Flask application core
│   ├── conversation.py      # Context and state management
│   ├── core.py              # Action dispatcher and business logic
│   └── routes.py            # API endpoints
├── nlu/                     # Natural Language Understanding
│   └── autogen_decoder.py   # Multi-agent intent parser
├── explainability/          # ML Explanation Engine
│   ├── actions/             # 20+ explanation functions
│   ├── core/                # Base explainer classes
│   └── mega_explainer/      # LIME/SHAP implementations
├── formatter/               # Response Generation
│   └── llm_formatter.py     # Natural language formatter
├── ui/                      # Web Interface
│   ├── static/              # CSS, JavaScript
│   └── templates/           # HTML templates
├── data/                    # Datasets and models
├── evaluation/              # Research evaluation scripts
└── docs/                    # Documentation

Running Tests

```bash
python evaluation/run_full_evaluation.py
python evaluation/parsing_accuracy/autogen_evaluator.py
```

Academic Context

This work contributes to the growing field of Explainable AI (XAI) and Human-AI Interaction:

Research Contributions

  1. Novel Multi-Agent Architecture for natural language explanations
  2. Comprehensive Evaluation Framework for conversational explainability systems
  3. Practical Implementation demonstrating real-world applicability

Related Work

  • TalkToModel (Slack et al., 2022): Original concept foundation
  • AutoGen (Microsoft, 2023): Multi-agent conversation framework

Based on the TalkToModel framework by Slack et al. (2022).

Docker Usage

Prerequisites

  • Docker installed (Docker Desktop recommended)
  • OpenAI API key
  • 4GB+ RAM, 2GB free disk space

Quick Start

```bash
# Build images
docker build -t ttm-gpt4 .
docker build -t ttm-gpt4-test .

# Run web application
docker run -p 5000:5000 -e OPENAI_API_KEY="your-key" ttm-gpt4

# Access at http://localhost:5000

# Run evaluations
docker run -e OPENAI_API_KEY="your-key" -v $(pwd):/app -w /app ttm-gpt4-test python evaluation/run_full_evaluation.py
```

Docker Images

  • ttm-gpt4: Main web application
  • ttm-gpt4-test: Evaluation and testing suite

Contributing

This is a research project developed for academic purposes. For questions about the implementation or research methodology:

  1. Issues: Use GitHub issues for bugs or questions
  2. Extensions: Fork the repository for your own research

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Original TalkToModel framework by Slack, Krishna, Lakkaraju, and Singh
  • Microsoft AutoGen for multi-agent conversation capabilities

Contact

Maximilian Braun
Bachelor's Thesis Project
Interactive Explanations in AI

maximilian3.braun@stud.uni-regensburg.de


This README provides comprehensive documentation for AutoConvXAI, developed as part of a bachelor's thesis exploring conversational XAI for machine learning explainability.

Owner

  • Login: mxbraun4
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this research, please cite it as below."
authors:
- family-names: "Slack"
  given-names: "Dylan"
- family-names: "Krishna"
  given-names: "Satyapriya"
- family-names: "Lakkaraju"
  given-names: "Himabindu"
- family-names: "Singh"
  given-names: "Sameer"
title: "TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations"
date-released: 2022-08-27
url: "https://github.com/dylan-slack/TalkToModel"
preferred-citation:
  type: article
  authors:
  - family-names: "Slack"
    given-names: "Dylan"
  - family-names: "Krishna"
    given-names: "Satyapriya"
  - family-names: "Lakkaraju"
    given-names: "Himabindu"
  - family-names: "Singh"
    given-names: "Sameer"
  journal: "TSRML @ NeurIPS"
  archivePrefix: "tsrml"
  eprint: "2207.04154"
  title: "TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations"
  year: 2022

GitHub Events

Total
  • Push event: 5
Last Year
  • Push event: 5

Dependencies

.github/workflows/python-app.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
Dockerfile docker
  • python 3.9.7 build
requirements.txt pypi
  • Flask ==2.0.3
  • GitPython ==3.1.27
  • Jinja2 ==3.1.1
  • MarkupSafe ==2.1.1
  • Pillow ==9.0.1
  • PyJWT ==2.3.0
  • PyWavelets ==1.3.0
  • PyYAML ==6.0
  • Werkzeug ==2.0.3
  • attrs ==21.4.0
  • boto3 ==1.21.27
  • botocore ==1.24.27
  • certifi ==2021.10.8
  • charset-normalizer ==2.0.12
  • click ==8.0.4
  • cloudpickle ==2.1.0
  • cycler ==0.11.0
  • dice-ml ==0.7.2
  • docker-pycreds ==0.4.0
  • et-xmlfile ==1.1.0
  • filelock ==3.6.0
  • fonttools ==4.31.2
  • gin-config ==0.5.0
  • gitdb ==4.0.9
  • gunicorn ==20.1.0
  • h5py ==3.6.0
  • huggingface-hub ==0.4.0
  • idna ==3.3
  • imageio ==2.16.1
  • inflect ==5.4.0
  • iniconfig ==1.1.1
  • itsdangerous ==2.1.2
  • jmespath ==1.0.0
  • joblib ==1.1.0
  • jsonschema ==4.4.0
  • kiwisolver ==1.4.0
  • lark ==1.1.2
  • lime ==0.2.0.1
  • llvmlite ==0.39.0
  • matplotlib ==3.5.1
  • networkx ==2.7.1
  • nltk ==3.7
  • numba ==0.56.0
  • numpy ==1.21.6
  • openai ==0.16.0
  • openpyxl ==3.0.9
  • packaging ==21.3
  • pandas ==1.4.1
  • pandas-stubs ==1.2.0.53
  • pathtools ==0.1.2
  • patsy ==0.5.2
  • pluggy ==1.0.0
  • promise ==2.3
  • protobuf ==3.20.1
  • psutil ==5.9.1
  • py ==1.11.0
  • pyparsing ==3.0.7
  • pyrsistent ==0.18.1
  • pytest ==7.1.1
  • python-dateutil ==2.8.2
  • pytz ==2022.1
  • regex ==2022.3.15
  • requests ==2.27.1
  • s3transfer ==0.5.2
  • sacremoses ==0.0.49
  • scikit-image ==0.19.2
  • scikit-learn ==1.0.2
  • scipy ==1.8.0
  • sentence-transformers ==2.2.0
  • sentencepiece ==0.1.95
  • sentry-sdk ==1.9.5
  • setproctitle ==1.3.2
  • shap ==0.40.0
  • shortuuid ==1.0.9
  • six ==1.16.0
  • sklearn ==0.0
  • slicer ==0.0.7
  • smmap ==5.0.0
  • statsmodels ==0.13.2
  • threadpoolctl ==3.1.0
  • tifffile ==2022.3.25
  • tokenizers ==0.11.6
  • tomli ==2.0.1
  • torch ==1.12.1
  • torchvision ==0.13.1
  • tqdm ==4.63.1
  • transformers ==4.17.0
  • twilio ==7.8.0
  • typing_extensions ==4.1.1
  • urllib3 ==1.26.12
  • wandb ==0.13.2
  • word2number ==1.1
  • wordninja ==2.0.0
requirements-clean.txt pypi
  • autogen-agentchat ==0.4.0
  • autogen-core ==0.4.0
  • autogen-ext ==0.4.0
  • flask >=2.0.0
  • inflect >=7.0.0
  • lime >=0.2.0.1
  • numpy >=1.24.0
  • openai >=1.0.0
  • pandas >=1.5.0
  • pydantic >=2.0.0
  • scikit-learn >=1.3.0
  • statsmodels >=0.14.0
  • typing-extensions >=4.0.0