https://github.com/ibm/lale

Library for Semi-Automated Data Science

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 28 committers (3.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.6%) to scientific vocabulary

Keywords

artificial-intelligence automated-machine-learning automl data-science dataquality hyperparameter-optimization hyperparameter-search hyperparameter-tuning ibm-research ibm-research-ai interoperability machine-learning pipeline-testing pipeline-tests python scikit-learn

Keywords from Contributors

bias bias-correction bias-detection bias-finder bias-reduction codait discrimination fairness fairness-ai fairness-awareness-model
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: IBM
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage: https://lale.readthedocs.io
  • Size: 40.8 MB
Statistics
  • Stars: 342
  • Watchers: 21
  • Forks: 80
  • Open Issues: 24
  • Releases: 85
Created over 6 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License

README.md

Lale


README in other languages: 中文, deutsch, français, or contribute your own.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion. If you are a data scientist who wants to experiment with automated machine learning, this library is for you!

Lale adds value beyond scikit-learn along three dimensions: automation, correctness checks, and interoperability. For automation, Lale provides a consistent high-level interface to existing pipeline search tools, including Hyperopt, GridSearchCV, and SMAC. For correctness checks, Lale uses JSON Schema to catch mismatches between hyperparameters and their types, or between data and operators. For interoperability, Lale has a growing library of transformers and estimators from popular libraries such as scikit-learn, XGBoost, and PyTorch.

Lale can be installed just like any other Python package, and Lale code can be edited with off-the-shelf Python tools such as Jupyter notebooks.
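To give a feel for the kind of correctness check described above, here is a minimal sketch that validates hyperparameters against a schema before any training happens. It uses only the Python standard library and handles a tiny subset of JSON Schema keywords. The schema, the hyperparameter names, and the `check_hyperparams` helper are hypothetical and chosen for illustration; this is not Lale's API or implementation.

```python
# Simplified illustration of schema-based hyperparameter checking.
# Assumption: the schema and helper below are hypothetical, not part of Lale.

# Hypothetical schema for a logistic-regression-like operator.
HYPERPARAM_SCHEMA = {
    "type": "object",
    "properties": {
        "C": {"type": "number", "exclusiveMinimum": 0},
        "solver": {"enum": ["lbfgs", "liblinear", "saga"]},
        "max_iter": {"type": "integer", "minimum": 1},
    },
    "additionalProperties": False,
}

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES = {
    "number": (int, float),
    "integer": (int,),
    "string": (str,),
    "boolean": (bool,),
}


def check_hyperparams(hyperparams, schema=HYPERPARAM_SCHEMA):
    """Return a list of error messages; an empty list means the values pass."""
    errors = []
    properties = schema.get("properties", {})
    for name, value in hyperparams.items():
        if name not in properties:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unknown hyperparameter '{name}'")
            continue
        rule = properties[name]
        expected = rule.get("type")
        if expected is not None:
            ok = isinstance(value, _JSON_TYPES[expected])
            # bool is a subclass of int in Python; reject it for numeric types.
            if expected in ("number", "integer") and isinstance(value, bool):
                ok = False
            if not ok:
                errors.append(
                    f"'{name}' should be of type {expected}, "
                    f"got {type(value).__name__}"
                )
                continue  # skip range checks on a value of the wrong type
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"'{name}' must be one of {rule['enum']}, got {value!r}")
        if "minimum" in rule and value < rule["minimum"]:
            errors.append(f"'{name}' must be >= {rule['minimum']}, got {value!r}")
        if "exclusiveMinimum" in rule and value <= rule["exclusiveMinimum"]:
            errors.append(f"'{name}' must be > {rule['exclusiveMinimum']}, got {value!r}")
    return errors


if __name__ == "__main__":
    # A string where a number is expected, an unsupported solver,
    # an out-of-range iteration count, and an unknown hyperparameter.
    for message in check_hyperparams(
        {"C": "high", "solver": "adam", "max_iter": 0, "depth": 3}
    ):
        print(message)
```

The point of the sketch is that a declarative schema lets mistakes surface as readable messages at configuration time rather than as failures deep inside a training run.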

The name Lale, pronounced laleh, comes from the Persian word for tulip. Like popular machine-learning libraries such as scikit-learn, Lale is just a Python library, not a new stand-alone programming language. It does not require users to install new tools or learn new syntax.

Lale is distributed under the terms of the Apache 2.0 License; see LICENSE.txt. It is currently in an Alpha release, without warranties of any kind.

Owner

  • Name: International Business Machines
  • Login: IBM
  • Kind: organization
  • Email: awesome@ibm.com
  • Location: United States of America

GitHub Events

Total
  • Release event: 1
  • Watch event: 12
  • Issue comment event: 2
  • Push event: 16
  • Pull request event: 34
  • Fork event: 1
  • Create event: 1
Last Year
  • Release event: 1
  • Watch event: 12
  • Issue comment event: 2
  • Push event: 16
  • Pull request event: 34
  • Fork event: 1
  • Create event: 1

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 1,489
  • Total Committers: 28
  • Avg Commits per committer: 53.179
  • Development Distribution Score (DDS): 0.641
Past Year
  • Commits: 20
  • Committers: 3
  • Avg Commits per committer: 6.667
  • Development Distribution Score (DDS): 0.45
Top Committers
Name Email Commits
Martin Hirzel h****l@g****m 534
Avi Shinnar s****r@u****m 421
kiran-kate 4****e 330
Guillaume Baudart g****t@i****m 56
Louis Mandel m****l 51
kakate@us.ibm.com k****e@u****m 20
mfeffer f****8@g****m 18
Chirag Sahni s****g@y****n 10
Pari Ram p****m@g****u 8
Daniel Ryszka 6****M 7
Thomas Parnell t****a@z****m 5
Ingkarat Rak-amnouykit i****w@h****m 5
szymonkucharczyk 7****k 4
Jason Tsay j****y@i****m 3
vaisaxena 3****a 2
haodeqi 4****i 2
daniel-karl 5****l 2
Martin Hirzel h****l@u****m 1
ksrinivs64 k****4 1
Vicky Vishal Sahu v****u@g****m 1
Steve Martinelli 4****r 1
Souhit Dey o****d@p****m 1
Rafalll-Maciasz 1****z 1
Panagiotis Hadjidoukas (Chatzidoukas) p****o@g****m 1
MateuszSzymkowskiIBM 8****M 1
Marta Tomzik 1****k 1
BO SONG 1****g 1
Ali Reza Yahyapour l****y@g****m 1
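
The all-time summary figures can be reproduced from the per-committer counts in the table above. The sketch below assumes the Development Distribution Score is defined as 1 minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported values. The variable names are illustrative.

```python
# Commit counts copied from the "Top Committers" table, one per listed identity.
commits_by_committer = [
    534, 421, 330, 56, 51, 20, 18, 10, 8, 7, 5, 5, 4, 3,
    2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
]

total_commits = sum(commits_by_committer)
total_committers = len(commits_by_committer)

# Average commits per committer.
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits_by_committer) / total_commits

print(f"Total commits: {total_commits}")
print(f"Total committers: {total_committers}")
print(f"Avg commits per committer: {avg_commits:.3f}")
print(f"DDS: {dds:.3f}")
```

Under this definition a score near 0 means one committer wrote almost everything, while a score near 1 means the work is spread across many committers.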

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 12
  • Total pull requests: 177
  • Average time to close issues: over 1 year
  • Average time to close pull requests: 9 days
  • Total issue authors: 7
  • Total pull request authors: 10
  • Average comments per issue: 1.75
  • Average comments per pull request: 0.09
  • Merged pull requests: 163
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 1
  • Pull requests: 32
  • Average time to close issues: N/A
  • Average time to close pull requests: 9 days
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 5.0
  • Average comments per pull request: 0.03
  • Merged pull requests: 28
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • kiran-kate (4)
  • rootsmusic (3)
  • tdoublep (2)
  • hirzel (2)
  • josk0 (1)
  • ghost (1)
  • mfeffer (1)
Pull Request Authors
  • shinnar (161)
  • hirzel (59)
  • kiran-kate (8)
  • Rafalll-Maciasz (3)
  • mfeffer (2)
  • marta-tomzik (2)
  • dependabot[bot] (2)
  • DanielRyszkaIBM (2)
  • renovate[bot] (2)
  • TrellixVulnTeam (1)
Top Labels
Issue Labels
help wanted (1)
Pull Request Labels
dependencies (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 4,527 last-month
  • Total docker downloads: 28
  • Total dependent packages: 2
  • Total dependent repositories: 9
  • Total versions: 88
  • Total maintainers: 3
pypi.org: lale

Library for Semi-Automated Data Science

  • Versions: 88
  • Dependent Packages: 2
  • Dependent Repositories: 9
  • Downloads: 4,527 Last month
  • Docker Downloads: 28
Rankings
Dependent packages count: 2.1%
Stargazers count: 3.6%
Average: 4.1%
Forks count: 4.9%
Dependent repos count: 4.9%
Downloads: 5.0%
Maintainers (3)
Last synced: 6 months ago

Dependencies

.github/workflows/build.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pre-commit/action v3.0.0 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/release.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish master composite
  • pypa/gh-action-pypi-publish release/v1 composite
Dockerfile docker
  • python ${python_version}-stretch build
docs/requirements.txt pypi
  • aif360 >=0.4.0
  • astunparse *
  • black ==19.10b0
  • cvxpy >=1.0,<=1.1.7
  • decorator *
  • docutils <0.17
  • graphviz *
  • h5py *
  • hyperopt *
  • imbalanced-learn *
  • jsonschema *
  • jsonsubschema *
  • liac-arff >=2.4.0
  • lightgbm *
  • m2r2 *
  • numba ==0.49.0
  • numpy *
  • pandas *
  • scikit-learn >=0.20.3
  • scipy *
  • smac <=0.10.0
  • sphinx >=5.0.0
  • sphinx_rtd_theme >=0.5.2
  • sphinxcontrib-svg2pdfconverter *
  • sphinxcontrib.apidoc *
  • typing-extensions *
  • xgboost <=1.3.3