GATree
GATree: Evolutionary decision tree classifier in Python - Published in JOSS (2024)
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 6 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org, zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Evolutionary decision trees
Basic Info
- Host: GitHub
- Owner: lahovniktadej
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://pypi.org/project/gatree
- Size: 2.39 MB
Statistics
- Stars: 11
- Watchers: 2
- Forks: 3
- Open Issues: 0
- Releases: 7
Metadata Files
README.md
GATree
📋 About • 📦 Installation • 🚀 Usage • 🧬 Genetic Operators • 🫂 Community Guidelines • 📜 License
📋 About
GATree is a Python library designed for implementing evolutionary decision trees using a standard genetic algorithm approach. The library provides functionalities for selection, mutation, and crossover operations within the decision tree structure, allowing users to evolve and optimise decision trees for various classification and clustering tasks. 🌲🧬
The library's core objective is to empower users in creating and fine-tuning decision trees through an evolutionary process, opening avenues for innovative approaches to classification and clustering problems. GATree enables the dynamic growth and adaptation of decision trees, offering a flexible and powerful tool for machine learning enthusiasts and practitioners. 🚀🌿
GATree is currently limited to classification and clustering tasks, with support for regression tasks planned for future releases. 💡
- Free software: MIT license
- Documentation: http://gatree.readthedocs.io
- Python: 3.9, 3.10, 3.11, 3.12
- Dependencies: listed in CONTRIBUTING.md
- Operating systems: Windows, Ubuntu, macOS
📦 Installation
pip
To install GATree using pip, run the following command:
```bash
pip install gatree
```
🚀 Usage
The following example demonstrates how to perform classification of the iris dataset using GATree. More examples can be found in the examples directory.
```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from gatree.methods.gatreeclassifier import GATreeClassifier

# Load the iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='target')

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10)

# Create and fit the GATree classifier
gatree = GATreeClassifier(n_jobs=16, random_state=32)
gatree.fit(X=X_train, y=y_train, population_size=100, max_iter=100)

# Make predictions on the testing set
y_pred = gatree.predict(X_test)

# Evaluate the accuracy of the classifier
print(accuracy_score(y_test, y_pred))
```
🧬 Genetic Operators in GATree
The genetic algorithm for decision trees in GATree involves several key operators: selection, elitism, crossover, and mutation. Each of these operators plays a crucial role in the evolution and optimisation of the decision trees. Below is a detailed description of each operator within the context of the GATree class.
Selection
Selection is the process of choosing parent trees from the current population to produce offspring for the next generation. By default, the GATree class uses tournament selection, in which a subset of the population is chosen at random and the best individual in that subset is selected as a parent.
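As an illustration of the idea, tournament selection can be sketched in a few lines of plain Python. This is not GATree's internal implementation: the function name `tournament_select`, the tournament size `k`, and the convention that a lower fitness value is better are all assumptions made for this example.

```python
import random


def tournament_select(population, fitness, k=3, rng=None):
    """Return the fittest of k individuals drawn at random without replacement.

    Assumption for this sketch: lower fitness values are better.
    """
    rng = rng or random.Random()
    # Draw k distinct contender indices from the population.
    contenders = rng.sample(range(len(population)), k)
    # The contender with the lowest fitness wins the tournament.
    winner = min(contenders, key=lambda i: fitness[i])
    return population[winner]


# Hypothetical population: strings stand in for decision trees.
population = ["tree_a", "tree_b", "tree_c", "tree_d"]
fitness = [0.30, 0.10, 0.25, 0.40]

parent = tournament_select(population, fitness, k=2, rng=random.Random(0))
print(parent)
```

A larger `k` increases selection pressure: when `k` equals the population size, the best individual always wins, while `k=1` reduces to uniform random selection.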
Elitism
Elitism ensures that the best-performing individuals (trees) from the current generation are carried over to the next generation without any modification. This guarantees that the fitness of the best individual in the population does not worsen from one generation to the next.
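The following minimal sketch shows how elitism might be applied when a new generation is assembled. The helper `apply_elitism` and its `n_elite` parameter are hypothetical names chosen for this example, and lower fitness is again assumed to be better; GATree's own implementation may differ.

```python
def apply_elitism(population, fitness, offspring, n_elite=1):
    """Build the next generation: n_elite best individuals plus offspring.

    Assumption for this sketch: lower fitness values are better, and the
    population size is kept constant.
    """
    # Rank individuals from best to worst by fitness.
    ranked = sorted(range(len(population)), key=lambda i: fitness[i])
    elites = [population[i] for i in ranked[:n_elite]]
    # Fill the remaining slots with newly created offspring.
    return elites + offspring[: len(population) - n_elite]


# Hypothetical data: strings stand in for decision trees.
population = ["tree_a", "tree_b", "tree_c", "tree_d"]
fitness = [0.30, 0.10, 0.25, 0.40]
offspring = ["child_1", "child_2", "child_3", "child_4"]

next_generation = apply_elitism(population, fitness, offspring, n_elite=1)
print(next_generation)
```

Because the elite individuals are copied unchanged, the best fitness found so far can never be lost to a disruptive crossover or mutation.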
Crossover
Crossover is a genetic operator used to combine the genetic information of two parent trees to generate new offspring. This enables exploration, which helps in creating diversity in the population and combining good traits from both parents.
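One common way to realise crossover on trees is subtree exchange: a randomly chosen subtree of one parent is replaced by a randomly chosen subtree of the other. The sketch below illustrates that general technique under stated assumptions. The `Node` class and the functions `nodes` and `subtree_crossover` are hypothetical and are not taken from GATree's API.

```python
import copy
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy tree node; the label holds a split rule or a class name."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def nodes(tree):
    """Return all nodes of the tree in pre-order."""
    if tree is None:
        return []
    return [tree] + nodes(tree.left) + nodes(tree.right)


def subtree_crossover(parent_a, parent_b, rng):
    """Return a copy of parent_a with one subtree replaced by one from parent_b."""
    child = copy.deepcopy(parent_a)          # parents stay untouched
    target = rng.choice(nodes(child))        # crossover point in the child
    donor = copy.deepcopy(rng.choice(nodes(parent_b)))
    # Overwrite the target node in place with the donated subtree.
    target.label, target.left, target.right = donor.label, donor.left, donor.right
    return child


parent_a = Node("x0 < 1.0", Node("class A"), Node("class B"))
parent_b = Node("x1 < 2.0", Node("class C"), Node("class D"))

child = subtree_crossover(parent_a, parent_b, random.Random(0))
print([n.label for n in nodes(child)])
```

Working on deep copies keeps both parents intact, so they can still take part in further selection rounds or be preserved by elitism.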
Mutation
Mutation introduces random changes to a tree to maintain genetic diversity and explore new solutions. This helps in avoiding local optima by introducing new genetic structures.
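A simple form of tree mutation re-draws the contents of one randomly chosen node. The sketch below is an illustration only: the `Node` class, the `mutate` function, and the assumption that feature values are scaled to the range [0, 1] are all hypothetical and do not describe GATree's actual mutation operator.

```python
import copy
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    feature: Optional[int] = None      # split feature index (None for a leaf)
    threshold: Optional[float] = None  # split threshold (None for a leaf)
    label: Optional[int] = None        # predicted class (None for a split node)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def collect(tree):
    """Return all nodes of the tree in pre-order."""
    if tree is None:
        return []
    return [tree] + collect(tree.left) + collect(tree.right)


def mutate(tree, n_features, n_classes, rng):
    """Return a mutated copy of the tree with one node re-drawn at random."""
    mutant = copy.deepcopy(tree)             # the original stays untouched
    node = rng.choice(collect(mutant))
    if node.label is None:
        # Split node: pick a new feature and a new threshold.
        # Assumption: feature values are scaled to [0, 1].
        node.feature = rng.randrange(n_features)
        node.threshold = rng.uniform(0.0, 1.0)
    else:
        # Leaf: pick a new predicted class.
        node.label = rng.randrange(n_classes)
    return mutant


tree = Node(feature=0, threshold=0.5, left=Node(label=0), right=Node(label=1))
mutant = mutate(tree, n_features=4, n_classes=3, rng=random.Random(0))
print(mutant)
```

This variant changes node contents but not the tree's shape; operators that grow or prune subtrees are another option when structural diversity is needed.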
🫂 Community Guidelines
Contributing
To contribute to the software, please read the contributing guidelines.
Reporting Issues
If you encounter any issues with the library, please report them using the issue tracker. Include a detailed description of the problem, including the steps to reproduce the problem, the stack trace, and details about your operating system and software version.
Seeking Support
If you need support, please first refer to the documentation. If you still require assistance, please open an issue on the issue tracker with the question tag. For private inquiries, you can contact us via e-mail at tadej.lahovnik1@um.si or saso.karakatic@um.si.
📜 License
This package is distributed under the MIT License. This license can be found online at http://www.opensource.org/licenses/MIT.
Disclaimer
This framework is provided as-is, and there are no guarantees that it fits your purposes or that it is bug-free. Use it at your own risk!
Owner
- Name: Tadej Lahovnik
- Login: lahovniktadej
- Kind: user
- Location: Maribor, Slovenia
- Company: UM FERI
- Repositories: 1
- Profile: https://github.com/lahovniktadej
JOSS Publication
GATree: Evolutionary decision tree classifier in Python
Authors
Tags
genetic algorithm, evolutionary algorithm, classifier, machine learning
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Lahovnik
  given-names: Tadej
  orcid: "https://orcid.org/0009-0005-9689-2991"
- family-names: Karakatič
  given-names: Sašo
  orcid: "https://orcid.org/0000-0003-4441-9690"
doi: 10.5281/zenodo.13307404
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Lahovnik
    given-names: Tadej
    orcid: "https://orcid.org/0009-0005-9689-2991"
  - family-names: Karakatič
    given-names: Sašo
    orcid: "https://orcid.org/0000-0003-4441-9690"
  date-published: 2024-08-12
  doi: 10.21105/joss.06748
  issn: 2475-9066
  issue: 100
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6748
  title: "GATree: Evolutionary decision tree classifier in Python"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06748"
  volume: 9
title: "GATree: Evolutionary decision tree classifier in Python"
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 3
- Watch event: 3
- Push event: 3
- Pull request event: 5
- Fork event: 2
Last Year
- Create event: 1
- Release event: 1
- Issues event: 3
- Watch event: 3
- Push event: 3
- Pull request event: 5
- Fork event: 2
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Tadej Lahovnik | t****k@s****i | 144 |
| zala-lahovnik | z****k@g****m | 12 |
| karakatic | k****c@g****m | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 21
- Total pull requests: 12
- Average time to close issues: 29 days
- Average time to close pull requests: about 6 hours
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.86
- Average comments per pull request: 0.17
- Merged pull requests: 12
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 4
- Average time to close issues: 16 days
- Average time to close pull requests: about 18 hours
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.5
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- lahovniktadej (17)
Pull Request Authors
- lahovniktadej (14)
- zala-lahovnik (7)
Packages
- Total packages: 1
- Total downloads: pypi 20 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
pypi.org: gatree
- Documentation: https://gatree.readthedocs.io/
- License: mit
- Latest release: 0.2.0 (published about 1 year ago)
