banzhaf-ensemble-thesis
Game Theory meets Ensemble ML (2017)
Repository
Basic Info
- Host: GitHub
- Owner: sethuiyer
- License: other
- Language: Python
- Default Branch: main
- Size: 486 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Banzhaf Ensemble Learning
This repository contains the source code and implementation details for my master's thesis, completed with Dr. Jajati Keshari Sahoo, on the application of game-theoretic methods to ensemble learning. The work was finished in 2017 and is now open-sourced under a Creative Commons license.
Overview
This research applies two concepts from game theory and voting, the Banzhaf Power Index and the Borda count, to ensemble learning. It introduces a feature selection method based on conditional mutual information and an ensemble pruning method built on Banzhaf Random Forests, and reports that both improve classification performance.
For detailed methodology, results, and theoretical background, refer to the full thesis: report.pdf
Key Concepts
Game Theoretic Methods
- Banzhaf Power Index: A measure of voting power used to evaluate the importance of individual classifiers in an ensemble
- Borda Count: A voting method applied to aggregate predictions from multiple classifiers
- Coalition Games: Framework for analyzing classifier interactions and contributions
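As a concrete illustration of the Banzhaf Power Index, the sketch below computes the normalized index for a small weighted voting game by brute-force enumeration of coalitions. It is not taken from banzhaf_dt.py or banzhaf_rf.py: the function name `banzhaf_index`, the integer weights, and the quota are assumptions chosen for the example. A player counts as a swing whenever adding it turns a losing coalition into a winning one.

```python
from itertools import combinations


def banzhaf_index(weights, quota):
    """Normalized Banzhaf index for a weighted voting game.

    weights: voting weight of each player (hypothetical example values)
    quota:   total weight a coalition needs in order to win
    """
    n = len(weights)
    swings = [0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Enumerate every coalition that does not contain player i.
        for size in range(len(others) + 1):
            for coalition in combinations(others, size):
                total = sum(weights[j] for j in coalition)
                # Player i is a swing if the coalition loses without it
                # and wins once it joins.
                if total < quota <= total + weights[i]:
                    swings[i] += 1
    total_swings = sum(swings)
    if total_swings == 0:
        return [0.0] * n
    return [s / total_swings for s in swings]


if __name__ == "__main__":
    print(banzhaf_index([4, 3, 2, 1], quota=6))
```

For weights [4, 3, 2, 1] and quota 6 the swing counts are 5, 3, 3, and 1, so the indices are 5/12, 1/4, 1/4, and 1/12. The enumeration grows exponentially with the number of players, so this form is only practical for small games.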
Technical Innovations
- Feature Selection: Novel method using conditional mutual information for feature pruning
- Ensemble Pruning: Implementation of Banzhaf Random Forests with strategic classifier selection
- Voting Mechanisms: Integration of Borda count for ensemble prediction aggregation
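Conditional mutual information I(X; Y | Z) measures how much a feature X still says about the label Y once another variable Z is already known. The sketch below is a simple plug-in (count-based) estimate for discrete variables, included only to make the quantity concrete. It is an assumption for illustration, not the estimator in entropy_estimators.py, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences.

    Uses I(X;Y|Z) = sum p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ),
    with every probability replaced by an empirical frequency.
    """
    n = len(x)
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), n_xyz in count_xyz.items():
        # The 1/n factors cancel inside the logarithm, leaving raw counts.
        ratio = (n_xyz * count_z[zi]) / (count_xz[(xi, zi)] * count_yz[(yi, zi)])
        cmi += (n_xyz / n) * log2(ratio)
    return cmi


if __name__ == "__main__":
    label = [0, 1, 1, 1]
    selected = [0, 0, 1, 1]    # a feature that has already been chosen
    duplicate = [0, 0, 1, 1]   # a candidate that copies the selected feature
    print(conditional_mutual_information(duplicate, label, selected))
```

A candidate feature that merely duplicates an already selected one scores 0 under this measure, which is the intuition behind using it to prune redundant features.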
Project Structure
.
├── Programs/ # Source code implementation
│ ├── Banzhaf Decision Tree/ # Implementation of Banzhaf-based decision trees
│ │ ├── Banzhaf_Decision_Tree.ipynb # Jupyter notebook with examples
│ │ ├── banzhaf_dt.py # Core Banzhaf decision tree implementation
│ │ ├── banzhaf_rf.py # Banzhaf Random Forests implementation
│ │ └── entropy_estimators.py # Entropy calculation utilities
│ │
│ ├── Borda Count/ # Borda count ensemble implementation
│ │ ├── Borda Ensemble.ipynb # Jupyter notebook with examples
│ │ ├── banzhaf_dt.py # Banzhaf decision tree for Borda ensemble
│ │ └── entropy_estimators.py # Entropy calculation utilities
│ │
│ └── SCG Pruning with LAE/ # Strategic Classifier Grouping with Local Accuracy Estimates
│ └── WMG with LAC.ipynb # Weighted Majority Game with Local Accuracy
│
├── report.pdf # Detailed thesis document
└── LICENSE # Creative Commons License
Code Components
1. Banzhaf Decision Tree
- banzhaf_dt.py: Core implementation of decision trees using Banzhaf Power Index for feature selection
- banzhaf_rf.py: Implementation of Banzhaf Random Forests
- entropy_estimators.py: Utilities for calculating entropy and information gain
- Banzhaf_Decision_Tree.ipynb: Interactive examples and experiments
2. Borda Count Ensemble
- Borda Ensemble.ipynb: Implementation and experiments with Borda count voting
- banzhaf_dt.py: Banzhaf decision trees used as base classifiers
- entropy_estimators.py: Shared entropy calculation utilities
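To show what Borda count aggregation looks like in an ensemble, here is a minimal sketch that is independent of the notebook. It assumes each classifier supplies a full ranking of the class labels, most preferred first. The function name `borda_aggregate` and the example labels are hypothetical.

```python
def borda_aggregate(rankings, classes):
    """Combine per-classifier class rankings with the Borda count.

    rankings: one list per classifier, ordering all labels in `classes`
              from most to least preferred
    classes:  every class label; on a tie the label listed first wins
    """
    k = len(classes)
    scores = {label: 0 for label in classes}
    for ranking in rankings:
        for position, label in enumerate(ranking):
            # First place earns k - 1 points, last place earns 0.
            scores[label] += k - 1 - position
    winner = max(classes, key=lambda label: scores[label])
    return winner, scores


if __name__ == "__main__":
    classes = ["cat", "dog", "bird"]
    rankings = [
        ["cat", "dog", "bird"],
        ["dog", "cat", "bird"],
        ["bird", "dog", "cat"],
    ]
    print(borda_aggregate(rankings, classes))
```

In this example each class receives exactly one first-place vote, so plurality voting ties. The Borda count picks "dog" (4 points against 3 for "cat" and 2 for "bird") because it also uses the lower-ranked preferences.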
3. SCG Pruning with LAE
- WMG with LAC.ipynb: Implementation of Weighted Majority Game with Local Accuracy Estimates
- Focuses on ensemble pruning using game theoretic approaches
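One way to picture pruning in a weighted majority game is to rank classifiers by how often they swing the vote and keep only the most pivotal ones. The sketch below follows that idea under stated assumptions: the integer voting weights are hypothetical (they could, for instance, be derived from validation accuracy), a coalition wins when it holds more than half of the total weight, and the `keep` parameter is invented for the example. It is not the procedure from the notebook.

```python
from itertools import combinations


def swing_counts(weights):
    """Count, for each voter, the coalitions it turns from losing to winning."""
    total = sum(weights)

    def wins(weight_sum):
        # Strict majority of the total voting weight.
        return 2 * weight_sum > total

    counts = []
    for i, w in enumerate(weights):
        rest = weights[:i] + weights[i + 1:]
        swings = 0
        for size in range(len(rest) + 1):
            for coalition in combinations(rest, size):
                s = sum(coalition)
                if not wins(s) and wins(s + w):
                    swings += 1
        counts.append(swings)
    return counts


def prune_ensemble(names, weights, keep):
    """Keep the `keep` classifiers with the highest swing counts."""
    counts = swing_counts(weights)
    # Stable sort: ties keep their original order.
    order = sorted(range(len(names)), key=lambda i: -counts[i])
    return [names[i] for i in sorted(order[:keep])]


if __name__ == "__main__":
    names = ["t1", "t2", "t3", "t4"]
    weights = [5, 4, 3, 1]  # hypothetical voting weights
    print(swing_counts(weights))
    print(prune_ensemble(names, weights, keep=3))
```

With weights [5, 4, 3, 1] the swing counts are 4, 4, 4, and 0. The last classifier never changes the outcome of any vote, so it is the one removed.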
Key Findings
- The Banzhaf Power Index provides an effective framework for evaluating classifier importance
- Feature selection based on conditional mutual information improves ensemble performance
- Borda count method offers a robust approach to ensemble prediction aggregation
- Banzhaf Random Forests demonstrate competitive performance against traditional ensemble methods
Future Work
Several promising directions for future research include:
Feature Selection Comparison
- Compare the proposed feature selection method with other established methods
- Evaluate performance across different datasets and domains
Ensemble Pruning Optimization
- Compare SCG pruning performance with LAE in AdaBoost contexts
- Investigate hybrid approaches combining multiple pruning strategies
Scalability Improvements
- Optimize computational efficiency for large-scale datasets
- Develop parallel processing capabilities
Theoretical Analysis
- Deepen theoretical understanding of game theoretic approaches
- Establish mathematical bounds for performance guarantees
License
This project is licensed under a Creative Commons License - see the LICENSE file for details.
Citation
If you use this code or reference the work in your research, please cite the original thesis:
@thesis{banzhaf-ensemble-2017,
title={Game Theoretic Approaches in Ensemble Learning},
author={Iyer, Sethu},
year={2017},
type={Master's Thesis},
institution={BITS-Pilani},
keywords={ensemble learning, game theory, Banzhaf Power Index, feature selection, random forests}
}
Owner
- Name: Sethu Iyer
- Login: sethuiyer
- Kind: user
- Repositories: 26
- Profile: https://github.com/sethuiyer
Data Scientist at Reliance Jio. Previously R&D Engineer at Amelia. BITS Pilani Math and CSE
Citation (citation.bib)
@thesis{banzhaf-ensemble-2017,
title={Game Theoretic Approaches in Ensemble Learning},
author={Iyer, Sethu},
year={2017},
type={Master's Thesis},
institution={BITS-Pilani},
abstract={This thesis explores the application of game theoretic methods in Ensemble Learning, specifically focusing on Banzhaf Power Index and Borda count methods for classifier importance evaluation and prediction aggregation. The research introduces novel techniques for feature selection using conditional mutual information and ensemble pruning through Banzhaf Random Forests.},
keywords={ensemble learning, game theory, Banzhaf Power Index, feature selection, random forests}
}
@article{banzhaf1965weighted,
title={Weighted voting doesn't work: A mathematical analysis},
author={Banzhaf III, John F},
journal={Rutgers Law Review},
volume={19},
number={2},
pages={317--343},
year={1965},
publisher={Rutgers Law Review},
abstract={The seminal paper introducing the Banzhaf Power Index, which forms the theoretical foundation for the voting power analysis in this thesis.},
keywords={voting power, game theory, Banzhaf Power Index}
}
@article{breiman2001random,
title={Random forests},
author={Breiman, Leo},
journal={Machine learning},
volume={45},
number={1},
pages={5--32},
year={2001},
publisher={Springer},
doi={10.1023/A:1010933404324},
abstract={The foundational paper on Random Forests, which serves as the base algorithm for the Banzhaf Random Forests implementation in this thesis.},
keywords={random forests, ensemble learning, machine learning}
}