autoconvxai
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mxbraun4
- License: mit
- Language: Python
- Default Branch: main
- Size: 10.9 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AutoConvXAI: Interactive Explanations in AI
A Multi-Agent Conversational XAI System
Bachelor's Thesis by Maximilian Braun
Overview
This project presents a novel approach to intent parsing in conversational XAI. It bridges the gap between complex AI models and human understanding through an interactive dialogue interface that makes machine learning explanations broadly accessible.
Architecture
The system implements a clean 3-component pipeline:
Natural Language Query → AutoGen Multi-Agent System → Action Execution → LLM Formatter → Natural Response
Core Components
AutoGen Multi-Agent Decoder (nlu/autogen_decoder.py)
- Extraction Agent: Analyzes user queries for intentions and entities
- Validation Agent: Ensures action correctness and handles edge cases
- Collaborative decision-making with configurable conversation rounds
Action Dispatcher (app/core.py)
- Routes parsed intents to appropriate explainability functions
- Manages conversational context and data filtering
- Handles 20+ specialized explanation actions
LLM Response Formatter (formatter/llm_formatter.py)
- Transforms technical results into natural language
- Maintains conversational flow and context
- Adapts tone and detail level based on user intent
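The flow through the three components can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the real decoder and formatter are LLM-backed, whereas the stand-ins below use keyword rules and a fixed template. All names here (`ParsedIntent`, `extract`, `validate`, `decode`, `execute`, `format_response`, `answer`, `KNOWN_ACTIONS`, `max_rounds`) are hypothetical and do not reflect the actual API of nlu/autogen_decoder.py, app/core.py, or formatter/llm_formatter.py.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class ParsedIntent:
    action: str
    entities: dict = field(default_factory=dict)


# Hypothetical subset of the action names listed under Explainability Actions.
KNOWN_ACTIONS = {"important", "predict", "data_summary"}


def extract(query: str) -> ParsedIntent:
    """Stand-in for the Extraction Agent: keyword rules instead of an LLM."""
    text = query.lower()
    for action in ("important", "predict"):
        if action in text:
            return ParsedIntent(action)
    return ParsedIntent("unknown")


def validate(intent: ParsedIntent) -> Tuple[bool, ParsedIntent]:
    """Stand-in for the Validation Agent: accept known actions, else fall back."""
    if intent.action in KNOWN_ACTIONS:
        return True, intent
    return False, ParsedIntent("data_summary")


def decode(query: str, max_rounds: int = 2) -> ParsedIntent:
    """Run validation on the extracted intent for at most `max_rounds` rounds."""
    intent = extract(query)
    for _ in range(max_rounds):
        accepted, intent = validate(intent)
        if accepted:
            break
    return intent


def execute(intent: ParsedIntent) -> dict:
    """Stand-in for action execution: returns a placeholder result."""
    return {"action": intent.action, "entities": intent.entities}


def format_response(result: dict) -> str:
    """Stand-in for the LLM formatter: a fixed template instead of an LLM."""
    return f"Ran action '{result['action']}'."


def answer(query: str) -> str:
    """Query -> decoder -> action execution -> formatter -> response."""
    return format_response(execute(decode(query)))
```

Calling `answer("What are the most important features?")` walks the query through all three stages. Capping the loop with `max_rounds` mirrors the idea of configurable conversation rounds: the agents get a bounded number of attempts to agree on an action.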
Explainability Actions
| Category | Actions | Description |
|----------|---------|-------------|
| Data Exploration | data_summary, show_data, feature_stats | Dataset understanding and statistics |
| Model Analysis | important, score, mistakes, interaction_effects | Model performance and behavior |
| Predictions | predict, prediction_likelihood, what_if | Individual and scenario-based predictions |
| Counterfactuals | counterfactual | Alternative scenarios and decision boundaries |
| Context Management | filter, define, labels | Data filtering and feature definitions |
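One common way to route action names like those in the table to functions is a registry keyed by name. The sketch below assumes that design for illustration only; the actual dispatcher in app/core.py may be organized differently. The `register` decorator, the `ACTIONS` dictionary, the `dispatch` function, and the simplified `data_summary` and `feature_stats` handlers are all hypothetical, and the dataset is represented as a plain list of dictionaries.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical registry mapping action names to handler functions.
ACTIONS: Dict[str, Callable[..., dict]] = {}


def register(name: str):
    """Decorator that adds a handler to the registry under `name`."""
    def decorator(func: Callable[..., dict]) -> Callable[..., dict]:
        ACTIONS[name] = func
        return func
    return decorator


@register("data_summary")
def data_summary(rows: List[Row], **_) -> dict:
    """Report the row count and the feature names of the dataset."""
    return {"n_rows": len(rows), "features": sorted(rows[0]) if rows else []}


@register("feature_stats")
def feature_stats(rows: List[Row], feature: str, **_) -> dict:
    """Report min, max, and mean of one numeric feature."""
    values = [row[feature] for row in rows]
    return {
        "feature": feature,
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


def dispatch(action: str, rows: List[Row], **params) -> dict:
    """Look up the handler for `action` and call it with the parsed parameters."""
    try:
        handler = ACTIONS[action]
    except KeyError:
        raise ValueError(f"Unknown action: {action!r}") from None
    return handler(rows, **params)
```

With this layout, adding a new explanation action means writing one function and decorating it; the dispatch logic itself does not change.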
Key Features
- Natural Language Interface: Ask questions like "What are the most important features?" or "What if this patient had lower BMI?"
- Multi-Agent Intelligence: Two specialized agents collaborate for robust query understanding
- Comprehensive Explanations: LIME, SHAP, counterfactuals, feature importance, and statistical analysis
- Context Awareness: Maintains conversation history and applies filters across interactions
- Real-time Processing: Fast response times with efficient caching
- Web Interface: Clean, accessible chat interface with sample questions
- Extensible Design: Easy to add new explanation methods and actions
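The context-awareness feature, where filters persist across turns of a conversation, can be illustrated with a small state object. This is a sketch under stated assumptions, not the implementation in app/conversation.py: the `ConversationContext` class, its methods, and the list-of-dictionaries dataset are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class ConversationContext:
    """Hypothetical per-conversation state: data, active filters, and history."""
    rows: List[Row]
    filters: List[Callable[[Row], bool]] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def add_filter(self, predicate: Callable[[Row], bool]) -> None:
        """Keep the filter active for all later questions."""
        self.filters.append(predicate)

    def current_rows(self) -> List[Row]:
        """Return only the rows that pass every active filter."""
        return [r for r in self.rows if all(f(r) for f in self.filters)]

    def reset(self) -> None:
        """Drop all filters and return to the full dataset."""
        self.filters.clear()
```

For example, after a user asks about "patients with BMI over 30", a call such as `ctx.add_filter(lambda r: r["bmi"] > 30)` would narrow the data, and a follow-up question would then be answered on `ctx.current_rows()` without the user repeating the condition.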
Quick Start
Docker is recommended; see the Docker Usage section for the easiest setup.
Prerequisites
- Python 3.8+
- OpenAI API key
- 4GB+ RAM recommended
Installation
Clone the repository:
bash
git clone https://github.com/mxbraun4/AutoConvXAI.git
cd AutoConvXAI
Install dependencies:
bash
pip install -r requirements.txt
Set up environment:
bash
export OPENAI_API_KEY="your-openai-api-key"
export GPT_MODEL="gpt-4o"  # Optional, defaults to gpt-4o-mini
Run the application:
bash
python main.py
Access the interface:
- Web UI: http://localhost:5000
- API: POST to /query with JSON {"query": "your question"}
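A query can also be sent to the API from Python. The sketch below uses only the standard library and follows the endpoint and request body given above. The helper names `build_query_request` and `ask` are hypothetical, and the assumption that the server replies with JSON is not confirmed by this README, so inspect the actual response before relying on its shape.

```python
import json
import urllib.request


def build_query_request(question: str, base_url: str = "http://localhost:5000"):
    """Build a POST request for /query with the body {"query": question}."""
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:5000", timeout: float = 60):
    """Send the question to a running server and decode the reply.

    Assumption: the reply body is JSON; its fields are not documented here.
    """
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running locally, `ask("What are the most important features?")` would send the same request as a manual POST to http://localhost:5000/query.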
Research Evaluation
The evaluation compared the multi-agent approach against a single-agent baseline:
Evaluation Metrics
- Parsing Accuracy: Intent extraction and action parameter accuracy
- Response Quality: Naturalness and informativeness of explanations
- Conversation Flow: Context maintenance and coherence
- Robustness: Handling of edge cases and ambiguous queries
Results Summary
- Multi-Agent System: 85.47% parsing accuracy
- Single-Agent Baseline: 84.46% parsing accuracy
Detailed results available in evaluation/results/
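To make the numbers concrete, the sketch below shows one plausible way to compute a parsing accuracy and the gap between the two reported figures. It assumes exact-match scoring of predicted actions against reference actions, which is an assumption for illustration; the scripts under evaluation/ may score parses differently. The function name `parsing_accuracy` and the variable names are hypothetical.

```python
from typing import List


def parsing_accuracy(predictions: List[dict], references: List[dict]) -> float:
    """Percentage of predicted parses that exactly match the reference parse.

    Assumption: a parse counts as correct only if it equals the reference.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Figures reported in the Results Summary above.
multi_agent = 85.47
single_agent = 84.46

# Difference in percentage points between the two systems.
gap = round(multi_agent - single_agent, 2)
```

On the reported figures, `gap` is 1.01 percentage points in favor of the multi-agent system.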
Development
Project Structure
├── app/ # Flask application core
│ ├── conversation.py # Context and state management
│ ├── core.py # Action dispatcher and business logic
│ └── routes.py # API endpoints
├── nlu/ # Natural Language Understanding
│ └── autogen_decoder.py # Multi-agent intent parser
├── explainability/ # ML Explanation Engine
│ ├── actions/ # 20+ explanation functions
│ ├── core/ # Base explainer classes
│ └── mega_explainer/ # LIME/SHAP implementations
├── formatter/ # Response Generation
│ └── llm_formatter.py # Natural language formatter
├── ui/ # Web Interface
│ ├── static/ # CSS, JavaScript
│ └── templates/ # HTML templates
├── data/ # Datasets and models
├── evaluation/ # Research evaluation scripts
└── docs/ # Documentation
Running Tests
bash
python evaluation/run_full_evaluation.py
python evaluation/parsing_accuracy/autogen_evaluator.py
Academic Context
This work contributes to the growing field of Explainable AI (XAI) and Human-AI Interaction:
Research Contributions
- Novel Multi-Agent Architecture for natural language explanations
- Comprehensive Evaluation Framework for conversational explainability systems
- Practical Implementation demonstrating real-world applicability
Related Work
- TalkToModel (Slack et al., 2022): Original concept foundation
- AutoGen (Microsoft, 2023): Multi-agent conversation framework
Based on the TalkToModel framework by Slack et al. (2022)
Docker Usage
Prerequisites
- Docker installed (Docker Desktop recommended)
- OpenAI API key
- 4GB+ RAM, 2GB free disk space
Quick Start
```bash
# Build images
docker build -t ttm-gpt4 .
docker build -t ttm-gpt4-test .

# Run web application
docker run -p 5000:5000 -e OPENAI_API_KEY="your-key" ttm-gpt4
# Access at http://localhost:5000

# Run evaluations
docker run -e OPENAI_API_KEY="your-key" -v $(pwd):/app -w /app ttm-gpt4-test python evaluation/run_full_evaluation.py
```
Docker Images
- ttm-gpt4: Main web application
- ttm-gpt4-test: Evaluation and testing suite
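Both the local and the Docker setup pass configuration through environment variables. The sketch below shows how an application could read them, with the model falling back to gpt-4o-mini as described in the installation steps. It is an illustration only: the `load_settings` helper and `DEFAULT_MODEL` constant are hypothetical, and the project's own startup code may read these variables differently.

```python
import os
from typing import Mapping, Optional

# Default model named in the installation steps.
DEFAULT_MODEL = "gpt-4o-mini"


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the API key and model name, failing early if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {"api_key": api_key, "model": env.get("GPT_MODEL", DEFAULT_MODEL)}
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test, and failing early on a missing key gives a clearer error than a rejected API call later on.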
Contributing
This is a research project developed for academic purposes. For questions about the implementation or research methodology:
- Issues: Use GitHub issues for bugs or questions
- Extensions: Fork the repository for your own research
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Original TalkToModel framework by Slack, Krishna, Lakkaraju, and Singh
- Microsoft AutoGen for multi-agent conversation capabilities
Contact
Maximilian Braun
Bachelor's Thesis Project
Interactive Explanations in AI
maximilian3.braun@stud.uni-regensburg.de
This README provides comprehensive documentation for AutoConvXAI, developed as part of a bachelor's thesis exploring conversational XAI for machine learning explainability.
Owner
- Login: mxbraun4
- Kind: user
- Repositories: 1
- Profile: https://github.com/mxbraun4
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this research, please cite it as below."
authors:
  - family-names: "Slack"
    given-names: "Dylan"
  - family-names: "Krishna"
    given-names: "Satyapriya"
  - family-names: "Lakkaraju"
    given-names: "Himabindu"
  - family-names: "Singh"
    given-names: "Sameer"
title: "TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations"
date-released: 2022-08-27
url: "https://github.com/dylan-slack/TalkToModel"
preferred-citation:
  type: article
  authors:
    - family-names: "Slack"
      given-names: "Dylan"
    - family-names: "Krishna"
      given-names: "Satyapriya"
    - family-names: "Lakkaraju"
      given-names: "Himabindu"
    - family-names: "Singh"
      given-names: "Sameer"
  journal: "TSRML @ NeurIPS"
  archivePrefix: "tsrml"
  eprint: "2207.04154"
  title: "TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations"
  year: 2022
GitHub Events
Total
- Push event: 5
Last Year
- Push event: 5
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite
- python 3.9.7 build
- Flask ==2.0.3
- GitPython ==3.1.27
- Jinja2 ==3.1.1
- MarkupSafe ==2.1.1
- Pillow ==9.0.1
- PyJWT ==2.3.0
- PyWavelets ==1.3.0
- PyYAML ==6.0
- Werkzeug ==2.0.3
- attrs ==21.4.0
- boto3 ==1.21.27
- botocore ==1.24.27
- certifi ==2021.10.8
- charset-normalizer ==2.0.12
- click ==8.0.4
- cloudpickle ==2.1.0
- cycler ==0.11.0
- dice-ml ==0.7.2
- docker-pycreds ==0.4.0
- et-xmlfile ==1.1.0
- filelock ==3.6.0
- fonttools ==4.31.2
- gin-config ==0.5.0
- gitdb ==4.0.9
- gunicorn ==20.1.0
- h5py ==3.6.0
- huggingface-hub ==0.4.0
- idna ==3.3
- imageio ==2.16.1
- inflect ==5.4.0
- iniconfig ==1.1.1
- itsdangerous ==2.1.2
- jmespath ==1.0.0
- joblib ==1.1.0
- jsonschema ==4.4.0
- kiwisolver ==1.4.0
- lark ==1.1.2
- lime ==0.2.0.1
- llvmlite ==0.39.0
- matplotlib ==3.5.1
- networkx ==2.7.1
- nltk ==3.7
- numba ==0.56.0
- numpy ==1.21.6
- openai ==0.16.0
- openpyxl ==3.0.9
- packaging ==21.3
- pandas ==1.4.1
- pandas-stubs ==1.2.0.53
- pathtools ==0.1.2
- patsy ==0.5.2
- pluggy ==1.0.0
- promise ==2.3
- protobuf ==3.20.1
- psutil ==5.9.1
- py ==1.11.0
- pyparsing ==3.0.7
- pyrsistent ==0.18.1
- pytest ==7.1.1
- python-dateutil ==2.8.2
- pytz ==2022.1
- regex ==2022.3.15
- requests ==2.27.1
- s3transfer ==0.5.2
- sacremoses ==0.0.49
- scikit-image ==0.19.2
- scikit-learn ==1.0.2
- scipy ==1.8.0
- sentence-transformers ==2.2.0
- sentencepiece ==0.1.95
- sentry-sdk ==1.9.5
- setproctitle ==1.3.2
- shap ==0.40.0
- shortuuid ==1.0.9
- six ==1.16.0
- sklearn ==0.0
- slicer ==0.0.7
- smmap ==5.0.0
- statsmodels ==0.13.2
- threadpoolctl ==3.1.0
- tifffile ==2022.3.25
- tokenizers ==0.11.6
- tomli ==2.0.1
- torch ==1.12.1
- torchvision ==0.13.1
- tqdm ==4.63.1
- transformers ==4.17.0
- twilio ==7.8.0
- typing_extensions ==4.1.1
- urllib3 ==1.26.12
- wandb ==0.13.2
- word2number ==1.1
- wordninja ==2.0.0
- autogen-agentchat ==0.4.0
- autogen-core ==0.4.0
- autogen-ext ==0.4.0
- flask >=2.0.0
- inflect >=7.0.0
- lime >=0.2.0.1
- numpy >=1.24.0
- openai >=1.0.0
- pandas >=1.5.0
- pydantic >=2.0.0
- scikit-learn >=1.3.0
- statsmodels >=0.14.0
- typing-extensions >=4.0.0