sacro-ml

Collection of tools and resources for managing the statistical disclosure control of trained machine learning models

https://github.com/ai-sdc/sacro-ml

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.5%) to scientific vocabulary

Keywords

attribute-inference-attack data-privacy data-protection differential-privacy inference machine-learning membership-inference-attack privacy statistical-disclosure-control
Last synced: 6 months ago

Repository

Statistics
  • Stars: 29
  • Watchers: 2
  • Forks: 6
  • Open Issues: 15
  • Releases: 17
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

SACRO-ML

An increasing body of work has shown that machine learning (ML) models may expose confidential properties of the data on which they are trained. This has resulted in a wide range of proposed attack methods with varying assumptions that exploit the model structure and/or behaviour to infer sensitive information.

The sacroml package is a collection of tools and resources for managing the statistical disclosure control (SDC) of trained ML models. In particular, it provides:

  • A safemodel package that extends commonly used ML models to provide ante-hoc SDC by assessing the theoretical risk posed by the training regime (such as hyperparameter, dataset, and architecture combinations) before (potentially) costly model fitting is performed. In addition, it ensures that best practice is followed with respect to privacy, e.g., using differential privacy optimisers where available. For large models and datasets, ante-hoc analysis has the potential for significant time and cost savings by helping to avoid wasting resources training models that are likely to be found to be disclosive after running intensive post-hoc analysis.
  • An attacks package that provides post-hoc SDC by assessing the empirical disclosure risk of a classification model through a variety of simulated attacks after training. It provides an integrated suite of attacks with a common application programming interface (API) and is designed to support the inclusion of additional state-of-the-art attacks as they become available. In addition to membership inference attacks (MIAs), such as the likelihood ratio attack (LiRA), and attribute inference attacks, the package provides novel structural attacks. These report cheap-to-compute metrics that can serve as indicators of model disclosiveness after model fitting, but before running more computationally expensive MIAs.
  • Summaries of the results, written to a simple human-readable report.
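To make the idea behind a likelihood ratio attack concrete, the following is a minimal, standard-library-only sketch of the per-record scoring step. It is not the sacroml implementation: the function names (`logit`, `gaussian_logpdf`, `lira_score`) and the toy probabilities are hypothetical, and it assumes that shadow-model confidences for a record are available both for shadow models trained with the record ("in") and without it ("out").

```python
import math
from statistics import mean, stdev


def logit(p, eps=1e-6):
    """Map a probability to an unbounded scale, clipping to avoid infinities."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def gaussian_logpdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma at x."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma)
            - 0.5 * math.log(2 * math.pi))


def lira_score(target_prob, in_probs, out_probs):
    """Log-likelihood ratio that a record was in the target's training set.

    target_prob: target model's confidence in the true label of the record.
    in_probs / out_probs: the same confidence from shadow models trained
    with / without the record (hypothetical inputs for this sketch).
    """
    x = logit(target_prob)
    ins = [logit(p) for p in in_probs]
    outs = [logit(p) for p in out_probs]
    return (gaussian_logpdf(x, mean(ins), stdev(ins))
            - gaussian_logpdf(x, mean(outs), stdev(outs)))


# Toy shadow-model confidences (made up for illustration).
in_probs = [0.97, 0.99, 0.95, 0.98]
out_probs = [0.70, 0.55, 0.80, 0.65]

print(lira_score(0.98, in_probs, out_probs) > 0)  # looks like a member
print(lira_score(0.60, in_probs, out_probs) < 0)  # looks like a non-member
```

A positive score means the observed confidence is better explained by the "in" distribution than the "out" distribution. A full attack would repeat this for every record and summarise the scores, for example with a ROC curve.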

Classification models from scikit-learn (including those implementing sklearn.base.BaseEstimator) and PyTorch are broadly supported within the package. Some attacks can still be run if only CSV files of the model predicted probabilities are supplied, e.g., if the model was produced in another language. See the examples for further information.
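As a rough illustration of why predicted probabilities alone can be enough for some attacks, the sketch below reads two small CSV tables of class probabilities (one for training records, one for held-out records) and measures how well the model's top confidence separates them. The CSV layout (no header, one row per record, one column per class), the helper names, and the numbers are assumptions made for this example, not the format or API used by sacroml; see the project's examples for the actual usage.

```python
import csv
import io


def read_max_probs(csv_text):
    """Return the highest class probability on each row of a headerless CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    return [max(float(v) for v in row) for row in reader if row]


def attack_auc(member_scores, nonmember_scores):
    """Chance that a random member outscores a random non-member.

    Ties count as half a win; this equals the area under the ROC curve
    of a simple confidence-threshold membership attack.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Made-up predicted probabilities for a two-class model.
train_csv = "0.99,0.01\n0.05,0.95\n0.90,0.10\n"
test_csv = "0.60,0.40\n0.45,0.55\n0.92,0.08\n"

auc = attack_auc(read_max_probs(train_csv), read_max_probs(test_csv))
print(f"attack AUC: {auc:.2f}")
```

An AUC near 0.5 suggests the confidences reveal little about membership, while values approaching 1.0 indicate that training records are easy to tell apart from unseen ones.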

Installation

Python Package Index

$ pip install sacroml

Note: macOS users may need to install libomp due to a dependency on XGBoost:

$ brew install libomp

Conda

$ conda install sacroml

Usage

Quick-start example:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from sacroml.attacks.likelihood_attack import LIRAAttack
from sacroml.attacks.target import Target

# Load dataset
X, y = load_breast_cancer(return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Fit model
model = RandomForestClassifier(min_samples_split=2, min_samples_leaf=1)
model.fit(X_train, y_train)

# Wrap model and data
target = Target(
    model=model,
    dataset_name="breast cancer",
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
)

# Create an attack object and run the attack
attack = LIRAAttack(n_shadow_models=100, output_dir="output_example")
attack.attack(target)
```

For more information, see the examples.

Documentation

See API documentation.

Contributing

See our contributing guide.

Acknowledgement

This work was supported by UK Research and Innovation as part of the Data and Analytics Research Environments UK (DARE UK) programme, delivered in partnership with Health Data Research UK (HDR UK) and Administrative Data Research UK (ADR UK). The specific projects were Semi-Automated Checking of Research Outputs (SACRO; MCPC23006), Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER; MCPC21033), and TREvolution (MCPC24038). This project has also been supported by MRC and EPSRC (PICTURES; MR/S010351/1).

Owner

  • Name: AI-SDC
  • Login: AI-SDC
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
title: SACRO-ML
version: 1.4.0
doi: 10.5281/zenodo.15806814
date-released: 2025-07-04
license: MIT
repository-code: https://github.com/AI-SDC/SACRO-ML
languages:
  - English
keywords:
  - attribute inference attack
  - data privacy
  - data protection
  - inference
  - machine learning
  - membership inference attack
  - privacy
  - statistical disclosure control
authors:
  - family-names: Smith
    given-names: Jim
    orcid: https://orcid.org/0000-0001-7908-1859
    affiliation: University of the West of England
  - family-names: Preen
    given-names: Richard John
    orcid: https://orcid.org/0000-0003-3351-8132
    affiliation: University of the West of England
  - family-names: McCarthy
    given-names: Andrew
    orcid: https://orcid.org/0000-0001-8054-0776
    affiliation: University of the West of England
  - family-names: Rogers
    given-names: Simon
    orcid: https://orcid.org/0000-0003-3578-4477
    affiliation: NHS National Services Scotland
  - family-names: Crespi-Boixader
    given-names: Alba
    affiliation: University of Dundee
  - family-names: Liley
    given-names: James
    orcid: https://orcid.org/0000-0002-0049-8238
    affiliation: University of Durham
  - family-names: Albashir
    given-names: Maha
    affiliation: University of the West of England
  - family-names: Jones
    given-names: Yola
    affiliation: NHS National Services Scotland
  - family-names: Mumtaz
    given-names: Shahzad
    affiliation: University of Dundee
  - family-names: Migenda
    given-names: Jost
    affiliation: King's College London
    orcid: https://orcid.org/0000-0002-5350-8049
identifiers:
  - type: doi
    value: 10.5281/zenodo.7080279
    description: This DOI represents all versions, and will always resolve to the latest one.
  - type: doi
    value: 10.5281/zenodo.7080280
    description: This is the archived snapshot of SACRO-ML v.1.0.0.
  - type: doi
    value: 10.5281/zenodo.7327390
    description: This is the archived snapshot of SACRO-ML v.1.0.1.
  - type: doi
    value: 10.5281/zenodo.7681783
    description: This is the archived snapshot of SACRO-ML v.1.0.2.
  - type: doi
    value: 10.5281/zenodo.7886945
    description: This is the archived snapshot of SACRO-ML v.1.0.3.
  - type: doi
    value: 10.5281/zenodo.7900410
    description: This is the archived snapshot of SACRO-ML v.1.0.4.
  - type: doi
    value: 10.5281/zenodo.8007939
    description: This is the archived snapshot of SACRO-ML v.1.0.5.
  - type: doi
    value: 10.5281/zenodo.8172371
    description: This is the archived snapshot of SACRO-ML v.1.0.6.
  - type: doi
    value: 10.5281/zenodo.8430143
    description: This is the archived snapshot of SACRO-ML v.1.1.0.
  - type: doi
    value: 10.5281/zenodo.10021954
    description: This is the archived snapshot of SACRO-ML v.1.1.1.
  - type: doi
    value: 10.5281/zenodo.10055182
    description: This is the archived snapshot of SACRO-ML v.1.1.2.
  - type: doi
    value: 10.5281/zenodo.11077535
    description: This is the archived snapshot of SACRO-ML v.1.1.3.
  - type: doi
    value: 10.5281/zenodo.12725798
    description: This is the archived snapshot of SACRO-ML v.1.2.0.
  - type: doi
    value: 10.5281/zenodo.13125928
    description: This is the archived snapshot of SACRO-ML v1.2.1.
  - type: doi
    value: 10.5281/zenodo.14901509
    description: This is the archived snapshot of SACRO-ML v1.2.2.
  - type: doi
    value: 10.5281/zenodo.15236977
    description: This is the archived snapshot of SACRO-ML v1.2.3.
  - type: doi
    value: 10.5281/zenodo.15680971
    description: This is the archived snapshot of SACRO-ML v1.3.0.
  - type: doi
    value: 10.5281/zenodo.15806814
    description: This is the archived snapshot of SACRO-ML v1.4.0.

GitHub Events

Total
  • Create event: 38
  • Release event: 2
  • Issues event: 23
  • Watch event: 5
  • Delete event: 39
  • Issue comment event: 59
  • Push event: 179
  • Pull request review comment event: 21
  • Pull request review event: 39
  • Pull request event: 67
  • Fork event: 3
Last Year
  • Create event: 38
  • Release event: 2
  • Issues event: 23
  • Watch event: 5
  • Delete event: 39
  • Issue comment event: 59
  • Push event: 179
  • Pull request review comment event: 21
  • Pull request review event: 39
  • Pull request event: 67
  • Fork event: 3

Issues and Pull Requests

All Time
  • Total issues: 16
  • Total pull requests: 46
  • Average time to close issues: 4 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 3
  • Total pull request authors: 6
  • Average comments per issue: 0.44
  • Average comments per pull request: 0.5
  • Merged pull requests: 31
  • Bot issues: 0
  • Bot pull requests: 23
Past Year
  • Issues: 14
  • Pull requests: 38
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 11 days
  • Issue authors: 3
  • Pull request authors: 6
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.5
  • Merged pull requests: 25
  • Bot issues: 0
  • Bot pull requests: 17
Top Authors
Issue Authors
  • rpreen (9)
  • jim-smith (6)
  • Joe-Heffer-Shef (1)
Pull Request Authors
  • pre-commit-ci[bot] (21)
  • rpreen (18)
  • dependabot[bot] (3)
  • jim-smith (1)
  • JostMigenda (1)
  • kayatefi (1)
Top Labels
Issue Labels
enhancement (6) bug (4) documentation (1) dependencies (1)
Pull Request Labels
dependencies (3) enhancement (1) github_actions (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 37 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
pypi.org: sacroml

Tools for the statistical disclosure control of machine learning models

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 37 Last month
Rankings
Dependent packages count: 10.6%
Average: 35.0%
Dependent repos count: 59.5%
Maintainers (1)

Dependencies

docs/requirements.txt pypi
  • absl-py *
  • dictdiffer ==0.9.0
  • keras ==2.8.0
  • matplotlib ==3.3.4
  • numpy *
  • numpydoc *
  • pandas ==1.1.5
  • protobuf *
  • pytest ==6.2.4
  • scikit-image ==0.18.3
  • scikit-learn ==1.0.2
  • scipy ==1.5.4
  • sphinx-autopackagesummary ==1.3
  • sphinx-gallery ==0.10.1
  • sphinx-issues ==3.0.1
  • sphinx-prompt ==1.5.0
  • sphinx-rtd-theme ==1.0.0
  • tensorboard ==2.8.0
  • tensorboard-data-server ==0.6.1
  • tensorboard-plugin-wit ==1.8.1
  • tensorflow ==2.8.0
  • tensorflow-datasets ==4.5.2
  • tensorflow-estimator ==2.8.0
  • tensorflow-io-gcs-filesystem ==0.25.0
  • tensorflow-metadata ==1.7.0
  • tensorflow-privacy ==0.8.0
  • tensorflow-probability ==0.15.0
requirements.txt pypi
  • dictdiffer *
  • fpdf *
  • joblib *
  • numpy *
  • pandas *
  • pylint *
  • scikit_learn *
  • tensorflow *
  • tensorflow_privacy *
.github/workflows/lint.yml actions
  • actions/checkout v3 composite
.github/workflows/sphinx-docs.yml actions
  • actions/checkout v3 composite
  • ad-m/github-push-action master composite
.github/workflows/test.yml actions
  • actions/checkout v3 composite
setup.py pypi
  • dictdiffer *
  • fpdf *
  • joblib *
  • multiprocess *
  • numpy *
  • pandas *
  • scikit_learn *
  • scipy *
  • tensorflow *
  • tensorflow_privacy *
.github/workflows/tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite