https://github.com/uxlfoundation/scikit-learn-intelex

Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 84 committers (1.2%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Keywords

ai-inference ai-machine-learning ai-training analytics big-data data-analysis gpu machine-learning machine-learning-algorithms oneapi python scikit-learn swrepo

Keywords from Contributors

deep-neural-networks distributed mot transformer interaction parallel openmp notebooks reinforcement-learning cryptocurrency
Last synced: 5 months ago

Repository

Basic Info
Statistics
  • Stars: 1,306
  • Watchers: 24
  • Forks: 183
  • Open Issues: 74
  • Releases: 38
Topics
ai-inference ai-machine-learning ai-training analytics big-data data-analysis gpu machine-learning machine-learning-algorithms oneapi python scikit-learn swrepo
Created over 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Codeowners Security Support

README.md

Extension for Scikit-learn*

Speed up your scikit-learn applications for CPUs and GPUs across single- and multi-node configurations

[Releases](https://github.com/uxlfoundation/scikit-learn-intelex/releases) | [Documentation](https://uxlfoundation.github.io/scikit-learn-intelex/) | [Examples](https://github.com/uxlfoundation/scikit-learn-intelex/tree/master/examples/notebooks) | [Support](SUPPORT.md) | [License](https://github.com/uxlfoundation/scikit-learn-intelex/blob/master/LICENSE)

[![Build Status](https://dev.azure.com/daal/daal4py/_apis/build/status/CI?branchName=main)](https://dev.azure.com/daal/daal4py/_build/latest?definitionId=9&branchName=main)
[![Coverity Scan Build Status](https://scan.coverity.com/projects/21716/badge.svg)](https://scan.coverity.com/projects/daal4py)
[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/uxlfoundation/scikit-learn-intelex/badge)](https://securityscorecards.dev/viewer/?uri=github.com/uxlfoundation/scikit-learn-intelex)
[![Join the community on GitHub Discussions](https://badgen.net/badge/join%20the%20discussion/on%20github/black?icon=github)](https://github.com/uxlfoundation/scikit-learn-intelex/discussions)
[![PyPI Version](https://img.shields.io/pypi/v/scikit-learn-intelex)](https://pypi.org/project/scikit-learn-intelex/)
[![Conda Version](https://img.shields.io/conda/vn/conda-forge/scikit-learn-intelex)](https://anaconda.org/conda-forge/scikit-learn-intelex)
[![python version](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue)](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue)
[![scikit-learn supported versions](https://img.shields.io/badge/sklearn-1.0%20%7C%201.4%20%7C%201.5%20%7C%201.6%20%7C%201.7-blue)](https://img.shields.io/badge/sklearn-1.0%20%7C%201.4%20%7C%201.5%20%7C%201.6%20%7C%201.7-blue)

---

Overview

Extension for Scikit-learn is a free software AI accelerator designed to deliver 10-100x acceleration to your existing scikit-learn code. The acceleration is achieved with vector instructions, AI hardware-specific memory optimizations, threading, and other optimizations.

With Extension for Scikit-learn, you can:

  • Speed up training and inference by up to 100x with equivalent mathematical accuracy
  • Benefit from performance improvements across different hardware configurations, including GPUs and multi-GPU configurations
  • Integrate the extension into your existing Scikit-learn applications without code modifications
  • Continue to use the open-source scikit-learn API
  • Enable and disable the extension with a couple of lines of code or at the command line
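The last point can be sketched in a few lines. The helper names `enable_acceleration` and `disable_acceleration` are hypothetical wrappers introduced here for illustration; they rely on the `patch_sklearn` and `unpatch_sklearn` functions from `sklearnex` and do nothing when the extension is not installed:

```python
import importlib.util


def enable_acceleration() -> bool:
    """Patch scikit-learn if sklearnex is installed; return True when patched."""
    if importlib.util.find_spec("sklearnex") is None:
        return False  # extension not installed, stock scikit-learn stays in use
    from sklearnex import patch_sklearn

    patch_sklearn()
    return True


def disable_acceleration() -> bool:
    """Undo the patching if sklearnex is installed; return True when unpatched."""
    if importlib.util.find_spec("sklearnex") is None:
        return False
    from sklearnex import unpatch_sklearn

    unpatch_sklearn()
    return True


enabled = enable_acceleration()    # turn the extension on (when available)
disabled = disable_acceleration()  # and back off again
```

Estimators should be imported from `sklearn` after patching, since patching affects which implementation those imports bind to.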

Acceleration

Benchmarks code

Optimizations

The easiest way to benefit from the extension's accelerations is to patch scikit-learn with it:

  • Enable CPU optimizations

    ```python
    import numpy as np
    from sklearnex import patch_sklearn

    # Patch before importing estimators from sklearn
    patch_sklearn()

    from sklearn.cluster import DBSCAN

    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
    ```

  • Enable GPU optimizations

    Note: executing on GPU has additional system software requirements - see details.

    ```python
    import numpy as np
    from sklearnex import patch_sklearn, config_context

    # Patch before importing estimators from sklearn
    patch_sklearn()

    from sklearn.cluster import DBSCAN

    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    # Offload the computation to the first GPU device
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)
    ```

:eyes: Check out available notebooks for more examples.

Usage without patching

Alternatively, all functionalities are also available under a separate module which can be imported directly, without involving any patching.

  • To run on CPU:

```python
import numpy as np
from sklearnex.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```

  • To run on GPU:

```python
import numpy as np
from sklearnex import config_context
from sklearnex.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# Offload the computation to the first GPU device
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```
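Because GPU execution depends on extra system software, a script that must also run on CPU-only machines may want to choose the context at runtime. The following is a minimal sketch, not part of the extension: `offload_context` is a hypothetical helper, and it assumes that the `dpctl` package (with its `has_gpu_devices()` function) is what exposes GPU devices in your environment.

```python
import contextlib
import importlib.util


def offload_context(target: str = "gpu:0"):
    """Return a GPU offload context when possible, else a no-op context."""
    for module in ("sklearnex", "dpctl"):
        if importlib.util.find_spec(module) is None:
            return contextlib.nullcontext()  # required package missing: stay on CPU
    import dpctl

    if not dpctl.has_gpu_devices():
        return contextlib.nullcontext()  # no GPU visible: stay on CPU
    from sklearnex import config_context

    return config_context(target_offload=target)


with offload_context():
    # Fit estimators here; they run on the GPU only when one is available.
    ran = True
```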

Installation

To install Extension for Scikit-learn, run:

```shell
pip install scikit-learn-intelex
```

The package is also available through other channels, such as conda-forge. See the Installation Guide for all installation instructions.
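To confirm that the installation succeeded, the installed version can be queried with the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and the distribution name is the one used in the `pip` command above:

```python
from importlib import metadata
from typing import Optional


def installed_version(dist: str = "scikit-learn-intelex") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "scikit-learn-intelex is not installed")
```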

Integration

The easiest way to accelerate scikit-learn workflows with the extension is through patching, which replaces the stock scikit-learn algorithms with the optimized versions provided by the extension, using the same namespaces in the same modules as scikit-learn.

Patching only affects supported algorithms and their parameters. You can still use unsupported ones in your code: the package simply falls back to the stock version of scikit-learn.
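The idea of patching with a fallback can be illustrated with a toy model. This is a conceptual sketch only and does not reproduce how the extension is implemented: `toy_cluster`, `StockDBSCAN`, `FastDBSCAN`, `patch`, and `unpatch` are all hypothetical names invented for the illustration.

```python
import types

# Hypothetical stand-in for a stock library module (not scikit-learn itself).
toy_cluster = types.ModuleType("toy_cluster")


class StockDBSCAN:
    """Plays the role of the stock estimator."""

    def __init__(self, eps=0.5, metric="euclidean"):
        self.eps, self.metric = eps, metric

    def fit(self, X):
        self.backend_ = "stock"
        return self


class FastDBSCAN(StockDBSCAN):
    """Plays the role of an accelerated estimator with limited parameter support."""

    _supported_metrics = {"euclidean"}

    def fit(self, X):
        if self.metric not in self._supported_metrics:
            return super().fit(X)  # unsupported parameter: use the stock path
        self.backend_ = "accelerated"
        return self


toy_cluster.DBSCAN = StockDBSCAN
_originals = {}


def patch(module, name, replacement):
    """Swap module.<name> for the replacement, remembering the original."""
    _originals[(module.__name__, name)] = getattr(module, name)
    setattr(module, name, replacement)


def unpatch(module, name):
    """Restore the original object saved by patch()."""
    setattr(module, name, _originals.pop((module.__name__, name)))


patch(toy_cluster, "DBSCAN", FastDBSCAN)
supported = toy_cluster.DBSCAN(metric="euclidean").fit([[0.0]]).backend_
fallback = toy_cluster.DBSCAN(metric="cosine").fit([[0.0]]).backend_
unpatch(toy_cluster, "DBSCAN")
restored = toy_cluster.DBSCAN().fit([[0.0]]).backend_
print(supported, fallback, restored)  # accelerated stock stock
```

The caller keeps using the same name in the same module throughout; only the object bound to that name changes.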

TIP: Enable verbose mode to see which implementation of the algorithm is currently used.
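A minimal sketch of turning on verbose mode from Python, assuming (as the project documentation describes it, to the best of our knowledge) that the extension reports its dispatch decisions at `INFO` level on a logger named `sklearnex`:

```python
import logging

# Attach a default handler to the root logger so that INFO records
# propagated from child loggers are actually printed.
logging.basicConfig()

# Assumption: the extension logs through the "sklearnex" logger.
logger = logging.getLogger("sklearnex")
logger.setLevel(logging.INFO)
```

The documentation also mentions a `SKLEARNEX_VERBOSE=INFO` environment variable as an alternative; treat both the variable name and the logger name as assumptions to verify against the version you have installed.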

To patch scikit-learn, you can:

* Use the following command-line flag:

  ```shell
  python -m sklearnex my_application.py
  ```

* Add the following lines to the script:

  ```python
  from sklearnex import patch_sklearn
  patch_sklearn()
  ```

:eyes: Read about other ways to patch scikit-learn.
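One such variation is patching only selected estimators. The sketch below assumes that `patch_sklearn` accepts a list of estimator names, which should be checked against the documentation for your installed version; `patch_only` is a hypothetical wrapper that does nothing when the extension is missing.

```python
import importlib.util


def patch_only(names) -> bool:
    """Patch just the listed estimators; return False if sklearnex is absent."""
    if importlib.util.find_spec("sklearnex") is None:
        return False
    from sklearnex import patch_sklearn

    patch_sklearn(list(names))  # assumption: a list of estimator names is accepted
    return True


if patch_only(["DBSCAN"]):
    from sklearn.cluster import DBSCAN

    # The module path shows which implementation the name is bound to.
    print(DBSCAN.__module__)

    from sklearnex import unpatch_sklearn

    unpatch_sklearn()  # restore the stock classes
```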

As an alternative, accelerated classes from the extension can be imported directly without patching, which keeps them separate from the stock scikit-learn ones. For example:

```python
from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

...
```
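Keeping both classes importable side by side makes it easy to compare them on your own data. The following sketch uses a hypothetical helper, `time_fit`, that measures the best wall-clock time of `fit` over a few repeats; the data shape and DBSCAN parameters are arbitrary choices for illustration, and no particular speedup is implied.

```python
import importlib.util
import time


def time_fit(make_estimator, X, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds of fit(X) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        estimator = make_estimator()  # fresh estimator for every run
        start = time.perf_counter()
        estimator.fit(X)
        best = min(best, time.perf_counter() - start)
    return best


def have(*modules: str) -> bool:
    """True when every named module can be imported."""
    return all(importlib.util.find_spec(m) is not None for m in modules)


if have("numpy", "sklearn", "sklearnex"):
    import numpy as np
    from sklearn.cluster import DBSCAN as stockDBSCAN
    from sklearnex.cluster import DBSCAN as exDBSCAN

    X = np.random.default_rng(0).random((2000, 2)).astype(np.float32)
    for name, cls in [("stock", stockDBSCAN), ("extension", exDBSCAN)]:
        seconds = time_fit(lambda: cls(eps=0.05, min_samples=5), X)
        print(f"{name}: {seconds:.4f} s")
```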

Documentation

Extension and oneDAL

Acceleration in patched scikit-learn classes is achieved by replacing calls to scikit-learn with calls to oneDAL (oneAPI Data Analytics Library) behind the scenes:

- oneAPI Data Analytics Library

Samples & Examples

How to Contribute

We welcome community contributions. Check our Contributing Guidelines to learn more.


* The Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Owner

  • Name: Unified Acceleration (UXL) Foundation
  • Login: uxlfoundation
  • Kind: organization
  • Location: United States of America

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 2,039
  • Total Committers: 84
  • Avg Commits per committer: 24.274
  • Development Distribution Score (DDS): 0.837
Past Year
  • Commits: 479
  • Committers: 24
  • Avg Commits per committer: 19.958
  • Development Distribution Score (DDS): 0.729
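The averages above follow directly from the totals. The all-time DDS value is also consistent with one common definition, 1 minus the top committer's share of all commits. That definition is an assumption, since this page does not define DDS. A quick check, using the 332 commits of the most active committer in the Top Committers list:

```python
def avg_commits(total_commits: int, committers: int) -> float:
    """Average number of commits per committer."""
    return total_commits / committers


def dds(top_committer_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 minus the top committer's share of commits."""
    return 1 - top_committer_commits / total_commits


all_time_avg = round(avg_commits(2039, 84), 3)
past_year_avg = round(avg_commits(479, 24), 3)
all_time_dds = round(dds(332, 2039), 3)
print(all_time_avg, past_year_avg, all_time_dds)  # 24.274 19.958 0.837
```

The past-year DDS cannot be checked the same way, because the page does not list per-committer counts for that period.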
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| renovate[bot] | 2****] | 332 |
| Frank Schlimbach | f****h@i****m | 236 |
| Ian Faust | i****t@g****m | 158 |
| Oleksandr Pavlyk | o****k@i****m | 133 |
| Alexander Andreev | a****v@i****m | 130 |
| Kirill | k****v@i****m | 113 |
| amakarye | a****v@i****m | 104 |
| Pavel Yakovlev | 3****h | 89 |
| david-cortes-intel | d****s@i****m | 85 |
| Samir Nasibli | s****i@i****m | 85 |
| Nikolay Petrov | n****v@i****m | 72 |
| ethanglaser | 4****r | 69 |
| Kulandin Denis | d****n@i****m | 49 |
| Andreas Huber | 9****1 | 33 |
| Nikita Timakin | n****n@i****m | 31 |
| olegkkruglov | 1****v | 28 |
| Michael Smirnov | m****v@i****m | 21 |
| Ekaterina Mekhnetsova | m****a@g****m | 15 |
| Alexandra | a****a@i****m | 15 |
| Anatoly Volkov | 1****l | 13 |
| Andrey Gorshkov | a****v@i****m | 13 |
| msa | 1****m | 13 |
| KalyanovD | d****v@i****m | 12 |
| KulikovNikita | n****v@i****m | 12 |
| Victoriya Fedotova | v****a@i****m | 10 |
| Dmitry Zagorny | d****y@i****m | 9 |
| Dmitry Razdoburdin | d****n@i****m | 8 |
| dependabot[bot] | 4****] | 8 |
| Ben Moore | b****e@i****m | 8 |
| Maria Petrova | m****a@i****m | 8 |

and 54 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 58
  • Total pull requests: 863
  • Average time to close issues: over 1 year
  • Average time to close pull requests: 20 days
  • Total issue authors: 25
  • Total pull request authors: 33
  • Average comments per issue: 1.9
  • Average comments per pull request: 2.62
  • Merged pull requests: 535
  • Bot issues: 0
  • Bot pull requests: 256
Past Year
  • Issues: 15
  • Pull requests: 773
  • Average time to close issues: 28 days
  • Average time to close pull requests: 8 days
  • Issue authors: 6
  • Pull request authors: 26
  • Average comments per issue: 0.4
  • Average comments per pull request: 2.59
  • Merged pull requests: 474
  • Bot issues: 0
  • Bot pull requests: 256
Top Authors
Issue Authors
  • fschlimb (15)
  • icfaust (9)
  • oleksandr-pavlyk (5)
  • napetrov (4)
  • Alexander-Makaryev (3)
  • triskadecaepyon (2)
  • xwu-intel (2)
  • dguijo (1)
  • m-r-munroe (1)
  • jeremiedbb (1)
  • lukezli (1)
  • Stack-it-up (1)
  • dr-pain (1)
  • remi-braun (1)
  • 392781 (1)
Pull Request Authors
  • renovate[bot] (238)
  • david-cortes-intel (214)
  • icfaust (130)
  • ethanglaser (47)
  • fschlimb (42)
  • Alexsandruss (28)
  • yuejiaointel (26)
  • ahuber21 (16)
  • oleksandr-pavlyk (15)
  • samir-nasibli (14)
  • mergify[bot] (14)
  • Alexander-Makaryev (12)
  • Vika-F (8)
  • napetrov (6)
  • KateBlueSky (6)
Top Labels
Issue Labels
help wanted (16) enhancement (15) bug (14) good first issue (11) hacktoberfest (11) distributed (4) documentation (2) examples (1) conflicts (1) infra (1)
Pull Request Labels
documentation (107) infra (50) testing (39) enhancement (35) bug (33) dependencies (25) model builders (18) Array API (10) sklearn-patch (9) examples (8) distributed (5) python (5) gpu_interfaces (4) conflicts (3)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 104,614 last-month
  • Total docker downloads: 14,109,203
  • Total dependent packages: 18
  • Total dependent repositories: 615
  • Total versions: 40
  • Total maintainers: 2
pypi.org: scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application.

  • Versions: 40
  • Dependent Packages: 18
  • Dependent Repositories: 615
  • Downloads: 104,614 Last month
  • Docker Downloads: 14,109,203
Rankings
Dependent repos count: 0.6%
Downloads: 0.6%
Dependent packages count: 0.7%
Docker downloads count: 1.1%
Average: 1.5%
Stargazers count: 2.0%
Forks count: 3.9%
Last synced: 6 months ago

Dependencies

requirements-dev.txt pypi
  • Cython ==0.29.24
  • Cython ==0.29.25
  • Jinja2 ==3.0.3
  • cmake ==3.21.3
  • numpy ==1.19.2
  • numpy ==1.19.3
  • numpy ==1.21.3
  • pybind11 ==2.8.0
requirements-doc.txt pypi
  • Babel ==2.9.1
  • Jinja2 ==3.0.3
  • MarkupSafe ==2.0.1
  • PyYAML ==6.0
  • Pygments ==2.10.0
  • Sphinx ==3.5.4
  • alabaster ==0.7.12
  • async-generator ==1.10
  • attrs ==21.2.0
  • backcall ==0.2.0
  • beautifulsoup4 ==4.10.0
  • bleach ==4.1.0
  • certifi ==2021.10.8
  • charset-normalizer ==2.0.9
  • click *
  • decorator ==5.1.0
  • defusedxml ==0.7.1
  • docutils ==0.16
  • entrypoints ==0.3
  • idna ==3.3
  • imagesize ==1.3.0
  • importlib-metadata ==4.8.2
  • importlib-resources ==5.4.0
  • ipython ==7.29.0
  • ipython-genutils ==0.2.0
  • jedi ==0.18.1
  • jsonschema ==4.2.1
  • jupyter-client ==7.1.0
  • jupyter-core ==4.9.1
  • jupyterlab-pygments ==0.1.2
  • mistune ==0.8.4
  • nbclient ==0.5.9
  • nbconvert ==6.3.0
  • nbformat ==5.1.3
  • nbsphinx ==0.8.7
  • nest-asyncio ==1.5.4
  • packaging ==21.3
  • pandocfilters ==1.5.0
  • parso ==0.8.3
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • prompt-toolkit ==3.0.24
  • ptyprocess ==0.7.0
  • pydata-sphinx-theme ==0.6.3
  • pyparsing ==2.4.7
  • pyrsistent ==0.18.0
  • python-dateutil ==2.8.2
  • pytz ==2021.3
  • pyzmq ==22.3.0
  • requests ==2.26.0
  • six ==1.16.0
  • snowballstemmer ==2.1.0
  • soupsieve ==2.3.1
  • sphinx-book-theme ==0.1.6
  • sphinx-notfound-page ==0.8
  • sphinx-tabs ==3.2.0
  • sphinx_rtd_theme *
  • sphinxcontrib-applehelp ==1.0.2
  • sphinxcontrib-devhelp ==1.0.2
  • sphinxcontrib-htmlhelp ==2.0.0
  • sphinxcontrib-jsmath ==1.0.1
  • sphinxcontrib-qthelp ==1.0.3
  • sphinxcontrib-serializinghtml ==1.1.5
  • testpath ==0.5.0
  • tornado ==6.1
  • traitlets ==5.1.1
  • typing-extensions ==4.0.1
  • urllib3 ==1.26.7
  • wcwidth ==0.2.5
  • webencodings ==0.5.1
  • zipp ==3.6.0
requirements-dppy.txt pypi
  • dpctl ==0.8.0
requirements-test.txt pypi
  • pandas ==1.4.0
  • pandas ==1.2.2
  • pandas ==1.1.5
  • pytest ==6.2.5
  • scikit-learn ==1.0.2
  • scikit-learn ==0.24
requirements.txt pypi
  • daal ==2021.4.0
  • numpy >=1.15