tno.mpc.mpyc.secure-learning

TNO PET Lab - secure Multi-Party Computation (MPC) - MPyC - Secure Learning

https://github.com/tno-mpc/mpyc.secure_learning

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.4%) to scientific vocabulary

Keywords

gradient-descent lasso-regression logistic-regression machine-learning mpc mpc-lab mpyc multi-party-computation pet-lab secret-sharing secure-learning tno
Last synced: 6 months ago

Repository

TNO PET Lab - secure Multi-Party Computation (MPC) - MPyC - Secure Learning

Basic Info
Statistics
  • Stars: 4
  • Watchers: 2
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Topics
gradient-descent lasso-regression logistic-regression machine-learning mpc mpc-lab mpyc multi-party-computation pet-lab secret-sharing secure-learning tno
Created over 4 years ago · Last pushed almost 4 years ago
Metadata Files
Readme License Citation

README.md

TNO MPC Lab - MPyC - Secure Learning

The TNO MPC lab consists of generic software components, procedures, and functionalities developed and maintained on a regular basis to facilitate and aid in the development of MPC solutions. The lab is a cross-project initiative allowing us to integrate and reuse previously developed MPC functionalities to boost the development of new protocols and solutions.

The package tno.mpc.mpyc.secure_learning is part of the TNO Python Toolbox.

This library has been developed with funding from different projects.

In particular, the basic building blocks and an initial version of this library have been developed within the VP AI program (2018) and the ERP AI program (2019), including an SVM model and initial versions of other models.

The current secure logistic regression model has been developed within the TKI HTSM LANCELOT project, a research collaboration between TNO, IKNL and Janssen.

LANCELOT is partly funded by PPS-surcharge for Research and Innovation of the Dutch Ministry of Economic Affairs and Climate Policy.

The secure lasso regression model has been developed in the BigMedilytics project. This project has received funding from the European Union’s Horizon 2020 research and innovation program under Grant Agreement No. 780495.

In collaboration with the MPC Lab, the BigMedilytics, LANCELOT, NLAIC and Appl.AI projects contributed to a restructuring of the codebase, ensuring a generic, reusable library that can be expanded with other models and functionalities.

Limitations in (end-)use: the content of this software package may solely be used for applications that comply with international export control laws.
This implementation of cryptographic software has not been audited. Use at your own risk.

Content Explanation

An implementation based on Secure Multi-Party Computation (MPC) for training and evaluating several machine learning models, built on the MPyC framework.

Features

The library implements secure versions of popular machine learning methods in the form of MPC protocols. The underlying MPC functionalities are provided by the MPyC framework.

The library contains both regression and classification algorithms.

In particular, linear regression is implemented with an l2-penalty (Ridge), an l1-penalty (Lasso), or a combination of both (Elastic Net). For classification problems, Support Vector Machines (SVM) and logistic regression are implemented. For the latter, the user can choose between an accurate implementation of the logistic function and a faster approximation; l1 and/or l2 penalties can also be selected.
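As a plaintext illustration of how these models differ only in their penalty term (function and variable names below are hypothetical, not this package's API):

```python
# Plaintext illustration of the penalized least-squares objectives.
# All names here are illustrative; they are NOT the
# tno.mpc.mpyc.secure_learning API.

def mse(X, y, w):
    """Mean squared error of a linear model with coefficients w."""
    n = len(y)
    total = 0.0
    for row, y_i in zip(X, y):
        r = sum(x * w_j for x, w_j in zip(row, w)) - y_i
        total += r * r
    return total / n

def penalty(w, alpha, l1_ratio):
    """Elastic-Net penalty: l1_ratio=1 is Lasso (l1), l1_ratio=0 is Ridge (l2)."""
    l1 = sum(abs(w_j) for w_j in w)
    l2 = sum(w_j * w_j for w_j in w)
    return alpha * (l1_ratio * l1 + (1 - l1_ratio) * l2)

def objective(X, y, w, alpha, l1_ratio):
    return mse(X, y, w) + penalty(w, alpha, l1_ratio)

# A perfectly fitted toy model, so only the penalty term remains.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
y = [5.0, 4.0, 9.0]
w = [1.0, 2.0]
print(objective(X, y, w, alpha=0.1, l1_ratio=0.0))  # pure l2 penalty (Ridge)
print(objective(X, y, w, alpha=0.1, l1_ratio=1.0))  # pure l1 penalty (Lasso)
```

An `l1_ratio` strictly between 0 and 1 blends both penalties, which is the Elastic Net case.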

The library allows users to choose either the gradient-descent (GD) solver or the SAG solver to train the implemented models.
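The gradient-descent solver can be sketched in plaintext as follows; the secure variant runs comparable update iterations on secret-shared values via MPyC, so this pure-Python version (with hypothetical names) illustrates only the iteration, not the protocol:

```python
# Plaintext sketch of gradient descent for ridge regression.
# Names are illustrative, not the package API.

def ridge_gd(X, y, alpha=0.1, lr=0.05, tolerance=1e-8, max_iter=10_000):
    """Minimise (1/n) * ||Xw - y||^2 + alpha * ||w||^2 by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(max_iter):
        grad = [0.0] * d
        for row, y_i in zip(X, y):
            r = sum(x * w_j for x, w_j in zip(row, w)) - y_i  # residual
            for j, x in enumerate(row):
                grad[j] += 2.0 * r * x / n
        # Add the gradient of the l2 penalty.
        grad = [g + 2.0 * alpha * w_j for g, w_j in zip(grad, w)]
        step = [lr * g for g in grad]
        w = [w_j - s for w_j, s in zip(w, step)]
        if max(abs(s) for s in step) < tolerance:  # converged
            break
    return w

# Noiseless toy data generated by y = 2x: the solver recovers w close to [2.0].
w = ridge_gd([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0], alpha=0.0)
print(w)
```

SAG differs in that each iteration updates the gradient contribution of a single sample while reusing stored contributions for the rest, which typically reduces per-iteration cost on larger datasets.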

Limitations

Currently, no code is provided to securely apply the trained models.

Documentation

Documentation of the tno.mpc.mpyc.secure_learning package can be found here.

Install

Easily install the tno.mpc.mpyc.secure_learning package using pip:

```console
$ python -m pip install tno.mpc.mpyc.secure_learning
```

Note:

A significant performance improvement can be achieved by installing the GMPY2 library:

```console
$ python -m pip install 'tno.mpc.mpyc.secure_learning[gmpy]'
```

If you wish to run the tests you can use:

```console
$ python -m pip install 'tno.mpc.mpyc.secure_learning[tests]'
```

Usage

Run these examples as `python example.py --no-log` to suppress the MPyC barrier logging. Append the argument `-M 3` to simulate a three-party protocol.

Example Usage

Example of securely training a simple linear regression model with L2 penalty (Ridge):

`example.py`

```python
import numpy as np
from mpyc.runtime import mpc
from sklearn import datasets
from sklearn.linear_model import Ridge as RidgeSK

import tno.mpc.mpyc.secure_learning.test.plaintext_utils.plaintext_objective_functions as plain_obj
from tno.mpc.mpyc.secure_learning import PenaltyTypes, Ridge, SolverTypes

# Notice that we use the entire dataset to train the model
n_samples = 50
n_features = 5
# Fixed random state for reproducibility
random_state = 3
tolerance = 1e-4

secnum = mpc.SecFxp(l=64, f=32)


def get_mpc_data(X, y):
    X_mpc = [[secnum(x, integral=False) for x in row] for row in X.tolist()]
    y_mpc = [secnum(y, integral=False) for y in y.tolist()]
    return X_mpc, y_mpc


def distribute_data_over_players(X_mpc, y_mpc):
    X_shared = [mpc.input(row, senders=0) for row in X_mpc]
    y_shared = mpc.input(y_mpc, senders=0)
    return X_shared, y_shared


async def ridge_regression_example():
    print("Ridge regression with gradient descent method")
    alpha = 0.2

    # Create regression dataset
    X, y = datasets.make_regression(
        n_samples=n_samples,
        n_features=n_features,
        noise=25.0,
        random_state=random_state,
    )
    X = np.array(X)
    y = np.array(y)
    X_mpc, y_mpc = get_mpc_data(X, y)

    async with mpc:
        X_shared, y_shared = distribute_data_over_players(X_mpc, y_mpc)

    # Train secure model
    model = Ridge(solver_type=SolverTypes.GD, alpha=alpha)
    async with mpc:
        coef_ = await model.compute_coef_mpc(
            X_shared,
            y_shared,
            tolerance=tolerance,
        )

    # Results of secure model
    objective = plain_obj.objective(X, y, coef_, "linear", PenaltyTypes.L2, alpha)
    print("Securely obtained coefficients:", coef_)
    print("* objective:", objective)

    # Train plaintext model
    model_sk = RidgeSK(
        alpha=len(X) * alpha,
        solver="saga",
        random_state=random_state,
        fit_intercept=True,
    )
    model_sk.fit(X, y)

    # Results of plaintext model
    coef_sk = np.append([model_sk.intercept_], model_sk.coef_).tolist()
    objective_sk = plain_obj.objective(X, y, coef_sk, "linear", PenaltyTypes.L2, alpha)
    print("Sklearn obtained coefficients: ", coef_sk)
    print("* objective:", objective_sk)


if __name__ == "__main__":
    mpc.run(ridge_regression_example())
```
Example of securely training a logistic regression model with L1 penalty:

`example.py`

```python
import numpy as np
from mpyc.runtime import mpc
from sklearn import datasets
from sklearn.linear_model import LogisticRegression as LogisticRegressionSK

import tno.mpc.mpyc.secure_learning.test.plaintext_utils.plaintext_objective_functions as plain_obj
from tno.mpc.mpyc.secure_learning import (
    ClassWeightsTypes,
    ExponentiationTypes,
    Logistic,
    PenaltyTypes,
    SolverTypes,
)

# Notice that we use the entire dataset to train the model
n_samples = 50
n_features = 5
# Fixed random state for reproducibility
random_state = 3
tolerance = 1e-4

secnum = mpc.SecFxp(l=64, f=32)


def get_mpc_data(X, y):
    X_mpc = [[secnum(x, integral=False) for x in row] for row in X.tolist()]
    y_mpc = [secnum(y, integral=False) for y in y.tolist()]
    return X_mpc, y_mpc


def distribute_data_over_players(X_mpc, y_mpc):
    X_shared = [mpc.input(row, senders=0) for row in X_mpc]
    y_shared = mpc.input(y_mpc, senders=0)
    return X_shared, y_shared


def sklearn_class_weights_dict(y):
    n_class_1 = sum([((y_i + 1) / 2) for y_i in y])
    n_class_0 = len(y) - n_class_1

    w_0 = len(y) / (2 * n_class_0)
    w_1 = len(y) / (2 * n_class_1)

    return {-1: w_0, 1: w_1}


async def logistic_regression_example():
    print(
        "Classification (Logistic regression) with l1 penalty, with gradient descent method"
    )
    alpha = 0.1

    # Create classification dataset
    X, y = datasets.make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=1,
        n_redundant=0,
        n_classes=2,
        n_clusters_per_class=1,
        random_state=random_state,
        shift=0,
        weights=[0.25, 0.75],
    )
    # Transform labels from {0, 1} to {-1, +1}.
    y = [-1 if x == 0 else 1 for x in y]
    X = np.array(X)
    y = np.array(y)
    X_mpc, y_mpc = get_mpc_data(X, y)

    async with mpc:
        X_shared, y_shared = distribute_data_over_players(X_mpc, y_mpc)

    # Train secure model with approximation of logistic function (faster, less accurate)
    model = Logistic(
        solver_type=SolverTypes.GD,
        exponentiation=ExponentiationTypes.APPROX,
        penalty=PenaltyTypes.L1,
        alpha=alpha,
        class_weights_type=ClassWeightsTypes.BALANCED,
    )
    async with mpc:
        coef_approx = await model.compute_coef_mpc(
            X_shared, y_shared, tolerance=tolerance
        )

        class_weights_dict = model.reveal_class_weights(y_shared)

    # Results of secure model (approximated logistic function)
    objective_approx = plain_obj.objective(
        X, y, coef_approx, "logistic", PenaltyTypes.L1, alpha, class_weights_dict
    )
    print(
        "Securely obtained coefficients (approximated exponentiation):",
        coef_approx,
    )
    print("* objective:", objective_approx)
    print("Class weights dictionary:", class_weights_dict)

    # Train secure model with exact logistic function (slower, more accurate)
    model = Logistic(
        solver_type=SolverTypes.GD,
        exponentiation=ExponentiationTypes.EXACT,
        penalty=PenaltyTypes.L1,
        alpha=alpha,
        class_weights_type=ClassWeightsTypes.BALANCED,
    )
    async with mpc:
        coef_exact = await model.compute_coef_mpc(
            X_shared, y_shared, tolerance=tolerance
        )

    # Results of secure model (exact logistic function)
    objective_exact = plain_obj.objective(
        X,
        y,
        coef_exact,
        "logistic",
        PenaltyTypes.L1,
        alpha,
        class_weights_dict,
    )
    print(
        "Securely obtained coefficients (exact exponentiation): ",
        coef_exact,
    )
    print("* objective:", objective_exact)
    print("Class weights dictionary:", class_weights_dict)

    # Train plaintext model
    model_sk = LogisticRegressionSK(
        solver="saga",
        random_state=random_state,
        fit_intercept=True,
        penalty="l1",
        C=1 / (len(X) * alpha),
        class_weight="balanced",
    )

    class_weights_dict_sk = sklearn_class_weights_dict(y)
    model_sk.fit(X, y)
    coef_sk = np.append([model_sk.intercept_], model_sk.coef_).tolist()

    # Results of plaintext model
    objective_sk = plain_obj.objective(
        X, y, coef_sk, "logistic", PenaltyTypes.L1, alpha
    )
    print("Sklearn obtained coefficients: ", coef_sk)
    print("* objective:", objective_sk)


if __name__ == "__main__":
    mpc.run(logistic_regression_example())
```

Owner

  • Name: TNO - MPC Lab
  • Login: TNO-MPC
  • Kind: organization
  • Email: mpclab@tno.nl
  • Location: Anna van Buerenplein 1, 2595 DA Den Haag, The Netherlands

TNO - MPC Lab

Citation (CITATION.cff)

cff-version: 1.2.0
license: Apache-2.0
message: If you use this software, please cite it using these metadata.
authors:
      - name: TNO MPC Lab
        city: The Hague
        country: NL
        email: mpclab@tno.nl
        website: https://mpc.tno.nl
type: software
url: https://mpc.tno.nl
contact:
      - name: TNO MPC Lab
        city: The Hague
        country: NL
        email: mpclab@tno.nl
        website: https://mpc.tno.nl
repository-code: https://github.com/TNO-MPC/mpyc.secure_learning
repository-artifact: https://pypi.org/project/tno.mpc.mpyc.secure_learning
title: TNO MPC Lab - MPyC - Secure Learning
version: v1.1.1
date-released: 2022-05-20

GitHub Events

Total
  • Issue comment event: 1
Last Year
  • Issue comment event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 3
  • Total Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Thomas Rooijakkers t****s@t****l 3
Committer Domains (Top 20 + Academic)
tno.nl: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • xQiratNL (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 44 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: tno.mpc.mpyc.secure-learning

Machine learning using Secure Multi-Party Computation

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 44 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 28.2%
Forks count: 30.5%
Dependent repos count: 30.6%
Average: 31.4%
Downloads: 61.0%
Maintainers (1)
Last synced: 6 months ago