genepro
A baseline implementation of genetic programming (using trees to encode programs) with some examples of usage.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 2 of 5 committers (40.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary
Repository
Statistics
- Stars: 33
- Watchers: 1
- Forks: 7
- Open Issues: 0
- Releases: 9
Metadata Files
README.md
genepro
In brief
genepro is a Python library providing a baseline implementation of genetic programming, an evolutionary algorithm specialized to evolve programs.
This library includes a classifier and a regressor that are compatible with scikit-learn (see examples of usage below).
Evolving programs are represented as trees. The leaf nodes (also called terminals) of such trees represent some form of input, e.g., a feature for classification or regression, or a type of environmental observation for reinforcement learning. The internal nodes represent possible atomic instructions, e.g., summation, subtraction, multiplication, division, but also if-then-else or similar programming constructs.
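To make the tree encoding concrete, here is a minimal sketch in plain Python. The class names (`Feature`, `Constant`, `Op`) and the `if_then_else` helper are hypothetical and chosen for illustration only; they are not genepro's actual node classes.

```python
import operator


class Feature:
    """Leaf node (terminal): returns the value of one input feature."""

    def __init__(self, index):
        self.index = index

    def evaluate(self, x):
        return x[self.index]

    def __str__(self):
        return f"x_{self.index}"


class Constant:
    """Leaf node (terminal): returns a fixed value."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, x):
        return self.value

    def __str__(self):
        return str(self.value)


class Op:
    """Internal node: applies an atomic instruction to its children's outputs."""

    def __init__(self, symbol, fn, *children):
        self.symbol = symbol
        self.fn = fn
        self.children = children

    def evaluate(self, x):
        # All children are evaluated first, then combined by the instruction.
        return self.fn(*(child.evaluate(x) for child in self.children))

    def __str__(self):
        if len(self.children) == 2:
            return f"({self.children[0]} {self.symbol} {self.children[1]})"
        return f"{self.symbol}({', '.join(str(c) for c in self.children)})"


def if_then_else(condition, then_value, else_value):
    # Hypothetical convention: a positive condition selects the first branch.
    return then_value if condition > 0 else else_value


# The program (x_0 + 2.0) * x_1 encoded as a tree.
tree = Op("*", operator.mul, Op("+", operator.add, Feature(0), Constant(2.0)), Feature(1))
print(tree)                       # ((x_0 + 2.0) * x_1)
print(tree.evaluate([1.0, 3.0]))  # 9.0

# A tree that uses a programming construct as its internal node.
branch = Op("if", if_then_else, Feature(0), Constant(1.0), Constant(-1.0))
print(branch.evaluate([0.5]))     # 1.0
```

Note that this sketch evaluates both branches of `if_then_else` before choosing one, which is acceptable here because the nodes have no side effects.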
Genetic programming operates on a population of trees, typically initialized at random. Every iteration (called generation), promising trees undergo random modifications (e.g., forms of crossover, mutation, and tuning) that result in a population of offspring trees. This new population is then used for the next generation.
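The generational loop described above can be sketched as follows. This is a deliberately simplified illustration, not genepro's implementation: trees are nested tuples, the only variation operator is subtree mutation (no crossover or tuning), parents are picked by tournament selection, and every function name and parameter value is an assumption made for the example.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
N_FEATURES = 1  # assumption: a single input feature


def random_tree(rng, depth=3):
    """Grow a random tree; a leaf is a feature index (int) or a constant (float)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return rng.randrange(N_FEATURES)
        return rng.uniform(-1.0, 1.0)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, int):    # feature terminal
        return x[tree]
    if isinstance(tree, float):  # constant terminal
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))


def mutate(tree, rng):
    """Subtree mutation: replace one randomly chosen subtree with a fresh one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth=2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))


def fitness(tree, data):
    """Negative mean squared error, so that higher is better."""
    errors = [evaluate(tree, x) - y for x, y in data]
    return -sum(e * e for e in errors) / len(data)


def evolve(data, pop_size=50, generations=20, seed=0):
    rng = random.Random(seed)
    # The population is initialized at random.
    population = [random_tree(rng) for _ in range(pop_size)]
    best = max(population, key=lambda t: fitness(t, data))
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            # Tournament selection: the fittest of 4 random trees is the parent.
            parent = max(rng.sample(population, 4), key=lambda t: fitness(t, data))
            offspring.append(mutate(parent, rng))
        # The offspring population is used for the next generation.
        population = offspring
        gen_best = max(population, key=lambda t: fitness(t, data))
        if fitness(gen_best, data) > fitness(best, data):
            best = gen_best
    return best


# Toy regression target: y = x^2 + x, sampled on [-1, 1].
data = [([i / 10], (i / 10) ** 2 + i / 10) for i in range(-10, 11)]
best = evolve(data)
print("best tree:", best)
print("best fitness:", fitness(best, data))
```

The sketch keeps track of the best tree seen so far because a purely generational scheme like this one does not guarantee that the best tree survives into the next population.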
Installation
For classification or regression, genepro relies only on a few libraries (numpy, joblib, and scikit-learn).
However, additional libraries (e.g., gym) are required to run the reinforcement learning example.
Thus, you can choose to perform a minimal or full installation.
Minimal installation
To perform a minimal installation, run:
pip install genepro
Full installation
For a full installation, clone this repo locally and install it together with the dependencies listed in the file requirements.txt, as follows:
git clone https://github.com/marcovirgolin/genepro
cd genepro
pip install -r requirements.txt .
Wish to use conda?
A conda virtual environment can easily be set up with:
git clone https://github.com/marcovirgolin/genepro
cd genepro
conda env create
conda activate genepro
pip install .
Examples of usage
Classification and regression
The notebook classification and regression.ipynb shows how to use genepro for classification and regression, via scikit-learn estimators.
These estimators are intended for data sets with a small number of (relevant) features, as the evolved program can be written as a compact (and potentially interpretable) symbolic expression.
...
gen: 39, best of gen fitness: -2952.999, best of gen size: 46
gen: 40, best of gen fitness: -2950.453, best of gen size: 44
The mean squared error on the test set is 2964.646 (respective R^2 score is 0.512)
Obtained by the (simplified) model: 146.527 + -5.797*(-x_2**2 - 4*x_2 - 3*x_3 + 2*x_4 - x_5 - x_6*(x_4 - x_5) + x_6 - 5*x_8)
Example of output of a symbolic regression model discovered for the Diabetes data set.
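Because the discovered model is an ordinary symbolic expression, it can be transcribed by hand and evaluated without genepro. The sketch below does this in plain Python. It assumes that `x_i` refers to the feature at index `i` of the input vector (0-based) and that inputs are preprocessed as in the notebook; neither is confirmed here, and the function name `diabetes_model` is hypothetical.

```python
def diabetes_model(x):
    """Evaluate the simplified expression printed above.

    Assumption: x is indexable and x[i] holds the value of feature x_i.
    """
    x_2, x_3, x_4, x_5, x_6, x_8 = (x[i] for i in (2, 3, 4, 5, 6, 8))
    inner = (
        -x_2**2 - 4*x_2 - 3*x_3 + 2*x_4 - x_5
        - x_6*(x_4 - x_5) + x_6 - 5*x_8
    )
    return 146.527 + -5.797 * inner


# With all features at zero, only the intercept remains.
print(diabetes_model([0.0] * 10))  # 146.527
```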
Reinforcement learning
The notebook gym.ipynb shows how genepro can be used to evolve a controller for the CartPole-v1 environment of the OpenAI gym library.
Citation
If you use this software, please cite it with:
@software{Virgolin_genepro_2022,
author = {Virgolin, Marco},
month = {9},
title = {{genepro}},
url = {https://github.com/marcovirgolin/genepro},
version = {0.1.3},
year = {2024}
}
Owner
- Name: Marco
- Login: marcovirgolin
- Kind: user
- Location: Amsterdam
- Company: Centrum Wiskunde & Informatica (CWI)
- Website: http://marcovirgolin.github.io
- Twitter: MarcoVirgolin
- Repositories: 6
- Profile: https://github.com/marcovirgolin
Researcher on Evolutionary and Explainable Machine Learning @ Dutch National Math & CS center (CWI). Pic: stable diffusion + dreambooth.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Virgolin"
    given-names: "Marco"
    orcid: "https://orcid.org/0000-0001-8905-9313"
title: "genepro"
version: 0.1.0
date-released: 2022-09-01
url: "https://github.com/marcovirgolin/genepro"
GitHub Events
Total
- Watch event: 5
- Issue comment event: 1
- Fork event: 1
Last Year
- Watch event: 5
- Issue comment event: 1
- Fork event: 1
Committers
Last synced: about 3 years ago
All Time
- Total Commits: 40
- Total Committers: 5
- Avg Commits per committer: 8.0
- Development Distribution Score (DDS): 0.625
Top Committers
| Name | Email | Commits |
|---|---|---|
| Marco | m****n@u****m | 15 |
| Marco | m****o@M****l | 13 |
| Marco | m****o@d****l | 7 |
| Marco | m****o@d****l | 4 |
| Giorgia Nadizar | g****r@g****m | 1 |
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 3
- Total pull requests: 2
- Average time to close issues: about 2 months
- Average time to close pull requests: about 17 hours
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 1.33
- Average comments per pull request: 2.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- hengzhe-zhang (1)
- giorgia-nadizar (1)
- chenyuxin1999 (1)
Pull Request Authors
- giorgia-nadizar (1)
- gandreadis (1)
Dependencies
- gym ==0.22.0
- joblib >=1.1.0
- matplotlib >=3.5.1
- numpy >=1.21.0
- pygame ==2.1.0
- pyglet ==1.5.21
- scikit-learn >=1.0.2
- sympy >=1.9
- joblib >=1.1.0
- numpy >=1.22.2
- scikit-learn >=1.0.2
- imagemagick
- ipykernel
- pip 21.2.4.*
- python 3.10.*