sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

https://github.com/nok/sklearn-porter

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.9%) to scientific vocabulary

Keywords

data-science machine-learning scikit-learn sklearn
Last synced: 6 months ago

Repository

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

Basic Info
  • Host: GitHub
  • Owner: nok
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 2.91 MB
Statistics
  • Stars: 1,305
  • Watchers: 32
  • Forks: 170
  • Open Issues: 47
  • Releases: 15
Topics
data-science machine-learning scikit-learn sklearn
Created over 9 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License

readme.md

sklearn-porter


Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It's recommended for limited embedded systems and critical applications where performance matters most.


Estimators

This table gives an overview of all supported combinations of estimators, programming languages and templates.

Programming language
C Go Java JS PHP Ruby
svm.SVC × × × × × ×
svm.NuSVC × × × × × ×
svm.LinearSVC × × × × × ×
tree.DecisionTreeClassifier ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ
ensemble.RandomForestClassifier × ✓ᴾ × × ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ×
ensemble.ExtraTreesClassifier × ✓ᴾ × × ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ✓ᴾ ×
ensemble.AdaBoostClassifier × ✓ᴾ × ✓ᴾ ✓ᴾ ✓ᴾ
neighbors.KNeighborsClassifier ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ ×
naive_bayes.BernoulliNB ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ ×
naive_bayes.GaussianNB ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ ×
neural_network.MLPClassifier ✓ᴾ ✓ᴾ × ✓ᴾ ✓ᴾ ×
neural_network.MLPRegressor ×
Template

✓ = support of predict, ᴾ = support of predict_proba, × = not supported or feasible
ᴀ = attached model data, ᴇ = exported model data (JSON), ᴄ = combined model data

Installation

| Purpose | Version | Branch | Command |
| --- | --- | --- | --- |
| Production | v0.7.4 | stable | `pip install sklearn-porter` |
| Development | v1.0.0 | main | `pip install https://github.com/nok/sklearn-porter/zipball/main` |

In both environments the only prerequisite is scikit-learn >= 0.17, <= 0.22.
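A quick way to check this prerequisite before installing is sketched below. The helper names `parse_major_minor` and `is_supported` are made up for illustration and are not part of sklearn-porter. The sketch also assumes that only the major and minor version matter, so patch releases of 0.22 count as supported.

```python
import re

# Bounds taken from the prerequisite stated above.
MIN_VERSION = (0, 17)
MAX_VERSION = (0, 22)


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '0.21.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version):
    """Return True if the major.minor part lies within the stated range.

    Assumption: patch levels are ignored, so '0.22.2' is accepted.
    """
    return MIN_VERSION <= parse_major_minor(version) <= MAX_VERSION


print(is_supported("0.21.3"))  # True
print(is_supported("0.23.0"))  # False
```

With scikit-learn installed, you could pass `sklearn.__version__` to `is_supported`.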

Usage

Binder

Try it out yourself by starting an interactive notebook with Binder.

Basics

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

from sklearn_porter import port, save, make, test

# 1. Load data and train a dummy classifier:
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier()
clf.fit(X, y)

# 2. Port or transpile an estimator:
output = port(clf, language='js', template='attached')
print(output)

# 3. Save the ported estimator:
src_path, json_path = save(clf, language='js', template='exported', directory='/tmp')
print(src_path, json_path)

# 4. Make predictions with the ported estimator:
y_classes, y_probas = make(clf, X[:10], language='js', template='exported')
print(y_classes, y_probas)

# 5. Always test the ported estimator by making an integrity check:
score = test(clf, X[:10], language='js', template='exported')
print(score)
```

OOP

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

from sklearn_porter import Estimator

# 1. Load data and train a dummy classifier:
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier()
clf.fit(X, y)

# 2. Port or transpile an estimator:
est = Estimator(clf, language='js', template='attached')
output = est.port()
print(output)

# 3. Save the ported estimator:
est.template = 'exported'
src_path, json_path = est.save(directory='/tmp')
print(src_path, json_path)

# 4. Make predictions with the ported estimator:
y_classes, y_probas = est.make(X[:10])
print(y_classes, y_probas)

# 5. Always test the ported estimator by making an integrity check:
score = est.test(X[:10])
print(score)
```
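The README does not say how the integrity check in step 5 computes its score. One plausible reading is the share of samples for which the ported estimator predicts the same class as the original. The following sketch illustrates that idea only. The function `integrity_score` and the sample predictions are hypothetical and do not come from sklearn-porter.

```python
def integrity_score(expected, actual):
    """Share of samples where the ported prediction equals the original one.

    Assumption: this mirrors the idea of an integrity check, not the
    library's actual implementation.
    """
    if len(expected) != len(actual):
        raise ValueError("Both prediction lists must have the same length.")
    if len(expected) == 0:
        raise ValueError("At least one prediction is required.")
    matches = sum(1 for e, a in zip(expected, actual) if e == a)
    return matches / len(expected)


# Hypothetical predictions from the original and the ported estimator.
original = [0, 0, 1, 2, 2]
ported = [0, 0, 1, 2, 1]
print(integrity_score(original, ported))  # 0.8
```

A score of 1.0 would mean that both estimators agree on every tested sample.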

CLI

In addition, you can use sklearn-porter on the command line. The command is called porter and is available after the installation.

```
porter {show,port,save} [-h] [-v]

porter show [-l {c,go,java,js,php,ruby}] [-h]

porter port [-l {c,go,java,js,php,ruby}] [-t {attached,combined,exported}] [--skip-warnings] [-h]

porter save [-l {c,go,java,js,php,ruby}] [-t {attached,combined,exported}] [--directory DIRECTORY] [--skip-warnings] [-h]
```

You can serialize an estimator and save it locally. For more details, read the instructions on model persistence.

```python
from joblib import dump

dump(clf, 'estimator.joblib', compress=0)
```

After that the estimator can be transpiled by using the subcommand port:

```bash
porter port estimator.joblib -l js -t attached > estimator.js
```

For further processing you can pass the result to another application, e.g. UglifyJS.

```bash
porter port estimator.joblib -l js -t attached | uglifyjs --compress -o estimator.min.js
```
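If you want to drive the CLI from a Python script, it helps to validate the options before calling it. The sketch below only assembles the argument list for the `port` subcommand from the choices listed in the usage text. The helper `build_port_command` is an illustration and not part of sklearn-porter.

```python
import shlex

# Choices copied from the CLI usage shown above.
LANGUAGES = ("c", "go", "java", "js", "php", "ruby")
TEMPLATES = ("attached", "combined", "exported")


def build_port_command(model_path, language="js", template="attached"):
    """Build the argument list for `porter port` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    if template not in TEMPLATES:
        raise ValueError(f"Unsupported template: {template!r}")
    return ["porter", "port", model_path, "-l", language, "-t", template]


cmd = build_port_command("estimator.joblib")
print(shlex.join(cmd))  # porter port estimator.joblib -l js -t attached
```

With sklearn-porter installed, the list could be passed to `subprocess.run(cmd, capture_output=True, text=True)` to capture the transpiled source.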

Known Issues

  • In some rare cases the regression tests of the support vector machines, SVC and NuSVC, fail with scikit-learn>=0.22. In that case a QualityWarning is raised, which should remind you to evaluate the result by using the test method.
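One way to react to such a warning programmatically is to record warnings while porting and run the integrity check whenever one is caught. The sketch below uses only the standard `warnings` module. The `QualityWarning` class defined here is a stand-in, since the real import path is not given in this README, and `fake_port` is a placeholder for an actual port call.

```python
import warnings


class QualityWarning(UserWarning):
    """Stand-in for the library's warning class (hypothetical definition)."""


def port_with_check(port_fn):
    """Call port_fn and report whether a QualityWarning was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        result = port_fn()
    flagged = any(issubclass(w.category, QualityWarning) for w in caught)
    return result, flagged


def fake_port():
    """Placeholder that behaves like a port call emitting the warning."""
    warnings.warn("regression tests failed for this setup", QualityWarning)
    return "// transpiled source"


source, needs_test = port_with_check(fake_port)
print(needs_test)  # True
```

When `needs_test` is true, the ported estimator should be verified with the test method before it is used.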

Development

Aliases

The following commands are useful time savers in daily development:

```bash
# Install a Python environment with conda:
make setup

# Start a Jupyter notebook with examples:
make notebook

# Start tests on the host or in a separate docker container:
make tests
make tests-docker

# Lint the source code with pylint:
make lint

# Generate notebooks with jupytext:
make examples

# Deploy a new version with twine:
make deploy
```

Dependencies

The prerequisite is Python 3.6, which you can install with conda:

```bash
conda create -n sklearn-porter_3.6 python=3.6
conda activate sklearn-porter_3.6
```

After that you have to install all required packages:

```bash
pip install --no-cache-dir -e ".[development,examples]"
```

Environment

All tests run against these combinations of scikit-learn and Python versions:

| scikit-learn | Python 3.5 | Python 3.6 | Python 3.7 | Python 3.8 |
| --- | --- | --- | --- | --- |
| 0.17 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | not supported by scikit-learn | not supported by scikit-learn |
| 0.18 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | not supported by scikit-learn | not supported by scikit-learn |
| 0.19 | cython 0.27.3, numpy 1.14.5, scipy 1.1.0 | cython 0.27.3, numpy 1.14.5, scipy 1.1.0 | not supported by scikit-learn | not supported by scikit-learn |
| 0.20 | cython 0.27.3, numpy, scipy | cython 0.27.3, numpy, scipy | cython 0.27.3, numpy, scipy | not supported by joblib |
| 0.21 | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy |
| 0.22 | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy |
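For scripted checks, the matrix above can be encoded as a small lookup. This is a sketch for illustration: the names `TESTED` and `is_tested` are made up here, and the entries simply transcribe which combinations the matrix lists as tested.

```python
# Tested combinations transcribed from the matrix above:
# scikit-learn version -> Python versions the tests run against.
TESTED = {
    "0.17": {"3.5", "3.6"},
    "0.18": {"3.5", "3.6"},
    "0.19": {"3.5", "3.6"},
    "0.20": {"3.5", "3.6", "3.7"},
    "0.21": {"3.5", "3.6", "3.7", "3.8"},
    "0.22": {"3.5", "3.6", "3.7", "3.8"},
}


def is_tested(sklearn_version, python_version):
    """Return True if the combination appears in the test matrix."""
    return python_version in TESTED.get(sklearn_version, set())


print(is_tested("0.20", "3.7"))  # True
print(is_tested("0.20", "3.8"))  # False
```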

For the regression tests we have to use specific compilers and interpreters:

Name Source Version
GCC https://gcc.gnu.org 10.2.1
Go https://golang.org 1.15.15
Java (OpenJDK) https://openjdk.java.net 1.8.0
Node.js https://nodejs.org 12.22.5
PHP https://www.php.net 7.4.28
Ruby https://www.ruby-lang.org 2.7.4

Please note that in general you can use older compilers and interpreters with the generated source code. For instance, you can use Java 1.6 to compile and run models.

Logging

You can activate logging by changing the option logging.level.

```python
from sklearn_porter import options

from logging import DEBUG

options['logging.level'] = DEBUG
```

Testing

You can run the unit and regression tests either on your local machine (host) or in a separate running Docker container.

```bash
pytest tests -v \
  --cov=sklearn_porter \
  --disable-warnings \
  --numprocesses=auto \
  -p no:doctest \
  -o python_files="EstimatorTest.py" \
  -o python_functions="test_*"
```

```bash
docker build \
  -t sklearn-porter \
  --build-arg PYTHONVER=${PYTHONVER:-python=3.6} \
  --build-arg SKLEARNVER=${SKLEARNVER:-scikit-learn=0.21} \
  .

docker run \
  -v $(pwd):/home/abc/repo \
  --detach \
  --entrypoint=/bin/bash \
  --name test \
  -t sklearn-porter

docker exec -it test ./docker-entrypoint.sh \
  pytest tests -v \
    --cov=sklearn_porter \
    --disable-warnings \
    --numprocesses=auto \
    -p no:doctest \
    -o python_files="EstimatorTest.py" \
    -o python_functions="test_*"

docker rm -f $(docker ps --all --filter name=test -q)
```

Citation

If you use this implementation in your work, please add a reference or citation. You can use the following BibTeX entry:

```bibtex
@unpublished{sklearn_porter,
  author = {Darius Morawiec},
  title = {sklearn-porter},
  note = {Transpile trained scikit-learn estimators to C, Java, JavaScript and others},
  url = {https://github.com/nok/sklearn-porter}
}
```

License

The package is Open Source Software released under the BSD 3-Clause license.

Owner

  • Name: Darius Morawiec
  • Login: nok
  • Kind: user
  • Location: Germany

Software-Developer (with MSc in CS)

GitHub Events

Total
  • Watch event: 17
  • Pull request event: 7
  • Fork event: 2
Last Year
  • Watch event: 17
  • Pull request event: 7
  • Fork event: 2

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 1,173
  • Total Committers: 13
  • Avg Commits per committer: 90.231
  • Development Distribution Score (DDS): 0.498
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Darius Morawiec n****k 589
Darius Morawiec d****c@n****l 528
Darius Morawiec m****l@n****l 46
der-nico d****o@m****m 1
Paweł Dawczak p****k@g****m 1
Jonas Höchst g****t@j****e 1
Jason Kessler j****r@g****m 1
Cesar A. Bernardini m****e@g****m 1
Antti Pasanen a****n@i****i 1
Tomasz Potęga t****a@w****l 1
ShenDezhou b****h@s****m 1
Jonathan Lancar j****r@a****m 1
Antti Pasanen a****n@h****m 1
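The summary figures above can be reproduced from the commit counts in the table. The sketch below assumes that the Development Distribution Score is defined as one minus the share of commits made by the most active committer entry. That definition is not stated on this page, but it matches the reported value. Each row of the table is treated as a separate committer, even where the same name appears with different email addresses.

```python
# Commit counts per committer entry, copied from the table above.
commits = [589, 528, 46, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

total = sum(commits)            # total commits across all 13 entries
average = total / len(commits)  # average commits per committer entry

# Assumed definition: 1 - (commits of the top entry / total commits).
dds = 1 - max(commits) / total

print(total)              # 1173
print(round(average, 3))  # 90.231
print(round(dds, 3))      # 0.498
```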

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 70
  • Total pull requests: 29
  • Average time to close issues: 5 months
  • Average time to close pull requests: 8 months
  • Total issue authors: 64
  • Total pull request authors: 23
  • Average comments per issue: 2.91
  • Average comments per pull request: 1.1
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: 23 minutes
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.2
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • lichard49 (3)
  • alonsopg (2)
  • vijaykilledar (2)
  • FakeNameSE (2)
  • momo1986 (2)
  • ELind77 (1)
  • Gizmomens (1)
  • JinLi711 (1)
  • gustavoresque (1)
  • oribarilan (1)
  • lucasavila00 (1)
  • IslamSabdelazez (1)
  • magicorlan (1)
  • libo-wu (1)
  • inielse (1)
Pull Request Authors
  • MennaHM123 (9)
  • mesarpe (3)
  • matthieudelaro (2)
  • mstanley103 (2)
  • ghost (1)
  • der-nico (1)
  • elainexmas (1)
  • jonaphin (1)
  • vkhougaz-vifive (1)
  • mikuh (1)
  • tpotega (1)
  • earlsuke (1)
  • pdawczak (1)
  • AndrewJamesTurner (1)
  • AMR-KELEG (1)
Top Labels
Issue Labels
question (19) bug (17) new feature (16) enhancement (14) 1.0.0 (8) high priority (8) duplicate (2)
Pull Request Labels
new feature (3) bug (3) high priority (1) enhancement (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 478 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 16
    (may contain duplicates)
  • Total versions: 34
  • Total maintainers: 1
proxy.golang.org: github.com/nok/sklearn-porter
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.9%
Last synced: 6 months ago
pypi.org: sklearn-porter

Transpile trained scikit-learn models to C, Java, JavaScript and others.

  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 16
  • Downloads: 478 Last month
Rankings
Stargazers count: 1.9%
Dependent repos count: 3.6%
Forks count: 3.9%
Average: 5.8%
Dependent packages count: 9.8%
Downloads: 9.8%
Maintainers (1)
nok
Last synced: 6 months ago

Dependencies

setup.py pypi
  • jinja2 >=2.11
  • joblib >=1
  • loguru >=0.5
  • scikit-learn >=0.17,<=0.22a0
  • tabulate >=0.8
Dockerfile docker
  • continuumio/miniconda3 4.11.0 build