calzone-tool

Evaluating the Calibration of Probabilistic Models

https://github.com/didsr/calzone

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.0%) to scientific vocabulary

Keywords

calibration, metrics, probability, testing, validation
Last synced: 6 months ago

Repository

Statistics
  • Stars: 6
  • Watchers: 2
  • Forks: 1
  • Open Issues: 2
  • Releases: 2
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

Evaluating the Calibration of Probabilistic Models

Calzone is a comprehensive Python package for calculating and visualizing metrics for assessing the calibration of models with probabilistic output.

Features

  • Supports multiple calibration metrics including Spiegelhalter's Z-test, Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Hosmer-Lemeshow (HL) test, Cox regression analysis, and Loess regression analysis.
  • Provides tools for creating reliability diagrams and ROC curves.
  • Offers equal-space and equal-frequency binning options.
  • Provides bootstrapped confidence intervals for each calibration metric.
  • Supports prevalence adjustment to account for prevalence differences between enriched data and population data.
  • Extends metrics to multiclass classification problems with one-vs-rest or top-class calculations.
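
To make the binning options concrete, here is a minimal NumPy sketch of ECE and MCE for the binary case under both equal-space and equal-frequency binning. It is an illustration only, not Calzone's implementation: the function name `calibration_errors`, its parameters, and the strategy labels are hypothetical and chosen for this example.

```python
import numpy as np


def calibration_errors(y_true, p_pred, n_bins=10, strategy="equal_space"):
    """Return (ECE, MCE) for binary labels and predicted class-1 probabilities.

    Hypothetical helper for illustration; not part of the calzone API.
    """
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)

    if strategy == "equal_space":
        # Bins of equal width on [0, 1].
        edges = np.linspace(0.0, 1.0, n_bins + 1)
    elif strategy == "equal_freq":
        # Bins holding roughly the same number of predictions.
        edges = np.quantile(p_pred, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        raise ValueError("strategy must be 'equal_space' or 'equal_freq'")

    # Use only the interior edges so every prediction maps to 0..n_bins-1.
    bin_ids = np.digitize(p_pred, edges[1:-1])

    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between observed event rate and mean predicted probability.
        gap = abs(y_true[mask].mean() - p_pred[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
        mce = max(mce, gap)
    return ece, mce


# Simulated, well-calibrated predictions: labels are drawn from the predicted probabilities.
rng = np.random.default_rng(0)
p = rng.beta(0.5, 0.5, size=1000)
y = rng.binomial(1, p)
for strategy in ("equal_space", "equal_freq"):
    ece, mce = calibration_errors(y, p, n_bins=10, strategy=strategy)
    print(f"{strategy}: ECE={ece:.3f}, MCE={mce:.3f}")
```

Both estimates depend on the number of bins and on the binning scheme, which is one reason to report them alongside the binning choices used.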

To accurately assess the calibration of machine learning models, it is essential to have a comprehensive and representative testing dataset that covers the prediction space sufficiently and is independent of the model development dataset (used for training, tuning, and calibration). The calibration metrics are not meaningful if the dataset is not representative of the true intended population.

Installation

You can install the package using pip: pip install calzone-tool

Usage

Using Calzone in Python:

```python
import numpy as np
from scipy.stats import beta
from calzone.metrics import CalibrationMetrics

# Generate simulated data with beta-binomial distribution.
class1_proba = beta.rvs(0.5, 0.5, size=1000)
class0_proba = 1 - class1_proba
X = np.concatenate(
    (class0_proba.reshape(-1, 1), class1_proba.reshape(-1, 1)), axis=1
)
Y = np.random.binomial(1, p=class1_proba)

# Calculate calibration metrics.
cal_metrics = CalibrationMetrics(class_to_calculate=1)
cal_metrics.calculate_metrics(Y, X, metrics='all')
```
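
The simulated labels in the usage example are drawn from the predicted probabilities themselves, so that data is well calibrated by construction. As a rough independent check, Spiegelhalter's Z statistic can be sketched directly with NumPy and SciPy. The helper name `spiegelhalter_z` and the overconfident transformation below are assumptions made for this illustration, not part of the Calzone API.

```python
import numpy as np
from scipy.stats import beta, norm


def spiegelhalter_z(y_true, p_pred):
    """Return (z, two-sided p-value) of Spiegelhalter's Z-test.

    Hypothetical helper for illustration; not part of the calzone API.
    Under the null hypothesis of perfect calibration, z is approximately N(0, 1).
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    numerator = np.sum((y - p) * (1.0 - 2.0 * p))
    denominator = np.sqrt(np.sum((1.0 - 2.0 * p) ** 2 * p * (1.0 - p)))
    z = numerator / denominator
    p_value = 2.0 * norm.sf(abs(z))
    return z, p_value


rng = np.random.default_rng(0)
class1_proba = beta.rvs(0.5, 0.5, size=1000, random_state=rng)
Y = rng.binomial(1, class1_proba)

# Calibrated case: the labels were generated from these probabilities.
z_cal, p_cal = spiegelhalter_z(Y, class1_proba)
print(f"calibrated:    z={z_cal:.2f}, p={p_cal:.3f}")

# Overconfident case (assumed distortion): push probabilities toward 0 and 1.
over = class1_proba**2 / (class1_proba**2 + (1.0 - class1_proba) ** 2)
z_over, p_over = spiegelhalter_z(Y, over)
print(f"overconfident: z={z_over:.2f}, p={p_over:.3g}")
```

A small p-value suggests miscalibration, while a large one only means the test found no evidence against calibration on this dataset.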

An experimental build of the graphical user interface is also available for download at https://github.com/DIDSR/calzone/releases/tag/v0.0.1-alpha.

Alternatively, you can run cal_metrics to use the command-line interface.

Documentation

For a detailed manual and API reference, please visit our documentation page.

Support

If you encounter any issues or have questions about the package, please open an issue or contact:

  • Kwok Lung (Jason) Fan
  • Qian Cao

Disclaimer

This software and documentation (the "Software") were developed at the Food and Drug Administration (FDA) by employees of the Federal Government in the course of their official duties. Pursuant to Title 17, Section 105 of the United States Code, this work is not subject to copyright protection and is in the public domain. Permission is hereby granted, free of charge, to any person obtaining a copy of the Software, to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, or sell copies of the Software or derivatives, and to permit persons to whom the Software is furnished to do so. FDA assumes no responsibility whatsoever for use by other parties of the Software, its source code, documentation or compiled executables, and makes no guarantees, expressed or implied, about its quality, reliability, or any other characteristic. Further, use of this code in no way implies endorsement by the FDA or confers any advantage in regulatory decisions. Although this software can be redistributed and/or modified freely, we ask that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified.

Owner

  • Name: DIDSR (Aldo Badano, Director)
  • Login: DIDSR
  • Kind: organization
  • Location: United States of America

FDA, CDRH, OSEL, Division of Imaging, Diagnostics, and Software Reliability

JOSS Publication

Calzone: A Python package for measuring calibration of probabilistic models for classification
Published: October 24, 2025
Volume 10, Issue 114, Page 8026
Authors
  • Kwok Lung Fan, U.S. Food and Drug Administration
  • Gene Pennello, U.S. Food and Drug Administration
  • Qi Liu, U.S. Food and Drug Administration
  • Nicholas Petrick, U.S. Food and Drug Administration
  • Ravi K. Samala, U.S. Food and Drug Administration
  • Frank W. Samuelson, U.S. Food and Drug Administration
  • Yee Lam Elim Thompson, U.S. Food and Drug Administration
  • Qian Cao, U.S. Food and Drug Administration
Editor
  • Fabian Scheipl
Tags
Machine Learning Artificial Intelligence Calibration Probabilistic models Metric Evaluation

Citation (citation.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Fan"
  given-names: "Kwok Lung"
  orcid: "https://orcid.org/0000-0002-2180-9082"
- family-names: "Cao"
  given-names: "Qian"
title: "calzone: A Python package for measuring calibration of probabilistic models for classification"
date-released: 2024-10-18
url: "https://github.com/DIDSR/calzone"

GitHub Events

Total
  • Release event: 4
  • Watch event: 4
  • Delete event: 4
  • Issue comment event: 4
  • Public event: 1
  • Push event: 67
  • Pull request event: 5
  • Fork event: 1
  • Create event: 7
Last Year
  • Release event: 4
  • Watch event: 4
  • Delete event: 4
  • Issue comment event: 4
  • Public event: 1
  • Push event: 67
  • Pull request event: 5
  • Fork event: 1
  • Create event: 7

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 17 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: calzone-tool

A package for calibration measurement and analysis

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 17 Last month
Rankings
Dependent packages count: 9.9%
Average: 32.8%
Dependent repos count: 55.8%