autofeat

Linear Prediction Model with Automated Feature Engineering and Selection Capabilities

https://github.com/cod3licious/autofeat

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.7%) to scientific vocabulary

Keywords

automated-data-science automated-feature-engineering automated-machine-learning automl feature-engineering feature-selection linear-regression machine-learning machine-learning-models
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cod3licious
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 1 MB
Statistics
  • Stars: 523
  • Watchers: 19
  • Forks: 64
  • Open Issues: 6
  • Releases: 2
Created about 7 years ago · Last pushed 11 months ago
Metadata Files
Readme License

README.md

Autofeat

Autofeat is a Python library that provides sklearn-compatible linear prediction models with automated feature engineering and selection capabilities.

Overview

Autofeat simplifies the process of improving linear model performance by automating feature generation and selection. It first generates a wide range of non-linear features, then selects a small, robust subset of meaningful features that enhance the predictive power of linear models. This multi-step approach allows you to harness the interpretability of linear models without sacrificing accuracy.
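The generate-then-select idea described above can be illustrated with a small, standard-library-only sketch. This is not autofeat's implementation: the helper names (`generate_features`, `select_features`, `pearson`) are hypothetical, the set of transforms is an arbitrary choice for illustration, and the selection step is swapped for a plain correlation ranking rather than whatever procedure the library actually uses.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def generate_features(rows):
    """Step 1: expand each row with non-linear candidate features.

    Assumes all input values are strictly positive, so that log, sqrt
    and the reciprocal are defined.
    """
    expanded = []
    for row in rows:
        feats = dict(row)
        for name, x in row.items():
            feats[f"{name}**2"] = x ** 2
            feats[f"sqrt({name})"] = math.sqrt(x)
            feats[f"log({name})"] = math.log(x)
            feats[f"1/{name}"] = 1.0 / x
        names = sorted(row)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                feats[f"{a}*{b}"] = row[a] * row[b]
        expanded.append(feats)
    return expanded


def select_features(expanded, y, k=2):
    """Step 2: keep the k candidates most correlated with the target."""
    scores = {}
    for name in expanded[0]:
        column = [row[name] for row in expanded]
        scores[name] = abs(pearson(column, y))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic data: the target depends on the product of the two inputs,
# a relationship that a linear model on the raw inputs cannot represent.
random.seed(0)
rows = [{"x1": random.uniform(1, 5), "x2": random.uniform(1, 5)}
        for _ in range(200)]
y = [3.0 * r["x1"] * r["x2"] for r in rows]

selected = select_features(generate_features(rows), y, k=2)
print(selected)
```

Because the synthetic target is an exact multiple of `x1*x2`, that engineered feature ranks first, and a linear model fitted on it would capture the relationship while staying easy to interpret.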

Key Features:

  • Automated Feature Generation and Selection: Generates candidate features and selects the useful ones automatically, improving the performance of linear models.
  • Improved Performance and Interpretability: The generated features improve prediction accuracy while retaining the intuitive interpretability of linear models.
  • Seamless Integration: Fully compatible with scikit-learn pipelines, making it easy to integrate into your existing machine learning workflows.

Use Cases:

  • Ideal for supervised learning tasks where model transparency is crucial for decision-making.
  • Suitable for feature selection in large datasets, automating the discovery of important variables.
  • Useful in scenarios where non-linear features need to be discovered and leveraged without complicating the model.

Note: The code is intended for research purposes. Results may vary depending on the dataset and use case.

Installation

Autofeat is available on PyPI and can be installed with pip:

pip install autofeat

Other Dependencies

  • numpy
  • pandas
  • scikit-learn
  • sympy
  • joblib
  • pint
  • numba

Documentation and Resources

| Description | Link |
|-------------|------|
| Example Notebooks | examples |
| Documentation | documentation |
| Paper | paper |
| Talk | PyData talk |

If any of this code was helpful for your work, please consider citing the paper:

@inproceedings{horn2019autofeat,
  title={The autofeat Python Library for Automated Feature Engineering and Selection},
  author={Horn, Franziska and Pack, Robert and Rieger, Michael},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  pages={111--120},
  year={2019},
  organization={Springer}
}

If you have any questions, please don't hesitate to send me an email. If you find any bugs or want to contribute other improvements, pull requests are very welcome!

Acknowledgments

This project was made possible thanks to support by BASF.

Owner

  • Name: franzi
  • Login: cod3licious
  • Kind: user
  • Location: Leipzig

Freelance Data Science Consultant with a PhD in Machine Learning.

GitHub Events

Total
  • Watch event: 30
  • Push event: 2
  • Fork event: 3
Last Year
  • Watch event: 30
  • Push event: 2
  • Fork event: 3

Committers

All Time
  • Total Commits: 69
  • Total Committers: 5
  • Avg Commits per committer: 13.8
  • Development Distribution Score (DDS): 0.145
Past Year
  • Commits: 4
  • Committers: 2
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.5
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| Franziska Horn | c****s@g****m | 59 |
| Jeethu Rao | j****u@j****m | 5 |
| OrdoAbChao | m****i@g****m | 2 |
| LocBDinh | 1****h | 2 |
| stephanos-stephani | s****b@g****m | 1 |
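
The committer statistics above can be reproduced from the per-committer commit counts. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; this page does not state the definition, but the assumption reproduces the all-time figures listed here. The function name is hypothetical.

```python
def development_distribution_score(commit_counts):
    """Return 1 - (top committer's commits / total commits).

    Assumed definition: 0 means a single person made every commit,
    values closer to 1 mean the work is spread across many committers.
    """
    total = sum(commit_counts)
    if total == 0:
        raise ValueError("at least one commit is required")
    return 1 - max(commit_counts) / total


# All-time commit counts taken from the Top Committers table.
all_time = [59, 5, 2, 2, 1]

dds = development_distribution_score(all_time)
avg_per_committer = sum(all_time) / len(all_time)

print(round(dds, 3))                # 0.145
print(round(avg_per_committer, 1))  # 13.8
```

Under the same assumption, the past-year score of 0.5 with 4 commits and 2 committers is consistent with an even 2/2 split, since `development_distribution_score([2, 2])` evaluates to 0.5.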
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time
  • Total issues: 36
  • Total pull requests: 11
  • Average time to close issues: 7 months
  • Average time to close pull requests: 3 days
  • Total issue authors: 30
  • Total pull request authors: 5
  • Average comments per issue: 2.81
  • Average comments per pull request: 1.18
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 3.33
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Sandy4321 (4)
  • VaibhavKharare (2)
  • Muhammad4hmed (2)
  • Antitesla (2)
  • rafmacalaba (1)
  • Data-drone (1)
  • sauravsingh243 (1)
  • janezlapajne (1)
  • nickjtch (1)
  • fitzpk (1)
  • cod3licious (1)
  • stephanos-stephani (1)
  • aclementev (1)
  • gautambak (1)
  • lachhebo (1)
Pull Request Authors
  • jeethu (6)
  • jtimko16 (4)
  • LocBDinh (2)
  • stephanos-stephani (1)
  • mglowacki100 (1)
Top Labels
Issue Labels
  • enhancement (3)
  • question (2)
  • wontfix (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 2,788 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 7
  • Total versions: 28
  • Total maintainers: 1
pypi.org: autofeat

Automatic Feature Engineering and Selection Linear Prediction Model

  • Versions: 28
  • Dependent Packages: 1
  • Dependent Repositories: 7
  • Downloads: 2,788 Last month
Rankings
Stargazers count: 2.9%
Downloads: 4.1%
Average: 4.6%
Dependent packages count: 4.8%
Forks count: 5.5%
Dependent repos count: 5.5%
Maintainers (1)