calibre

Advanced Calibration Models

https://github.com/finite-sample/calibre

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary

Keywords

calibration near-isotonic pava relaxed-pava
Last synced: 7 months ago

Repository

Advanced Calibration Models

Basic Info
  • Host: GitHub
  • Owner: finite-sample
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 271 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
calibration near-isotonic pava relaxed-pava
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

Calibre: Advanced Calibration Models


Calibration is a critical step in deploying machine learning models. While techniques like isotonic regression have been standard for this task, they come with significant limitations:

  1. Loss of granularity: Traditional isotonic regression often collapses many distinct probability values into a small number of unique values, which can be problematic for decision-making.

  2. Rigid monotonicity: Perfect monotonicity might not always be necessary or beneficial; small violations might be acceptable if they better preserve the information content of the original predictions.

Calibre addresses these limitations by implementing a suite of advanced calibration techniques that provide more nuanced control over model probability calibration. Its methods are designed to preserve granularity while still favoring a generally monotonic trend.

  • Nearly-isotonic regression: Allows controlled violations of monotonicity to better preserve data granularity
  • I-spline calibration: Uses monotonic splines for smooth calibration functions
  • Relaxed PAVA: Ignores "small" violations based on percentile thresholds in the data
  • Regularized isotonic regression: Adds L2 regularization to standard isotonic regression for smoother calibration curves while maintaining monotonicity.
  • Locally smoothed isotonic: Applies Savitzky-Golay filtering to isotonic regression results to reduce the "staircase effect" while preserving monotonicity.
  • Adaptive smoothed isotonic: Uses variable-sized smoothing windows based on data density to provide better detail in dense regions and smoother curves in sparse regions.
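The granularity loss that motivates these methods can be seen directly with scikit-learn's standard `IsotonicRegression`, whose pool-adjacent-violators step merges neighboring points into shared flat levels. A minimal sketch, independent of calibre:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Noisy, mostly increasing scores: 200 distinct values going in.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = x + rng.normal(0, 0.1, 200)

iso = IsotonicRegression(out_of_bounds="clip")
y_iso = iso.fit_transform(x, y)

# PAVA pools adjacent violators into constant blocks, so many of the
# 200 distinct inputs collapse onto a much smaller set of levels.
print(len(np.unique(y)), "->", len(np.unique(y_iso)))
```

The calibre methods below trade away some of this strict pooling to keep more distinct output values.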

Benchmark

Benchmark results are included in the repository's notebook.

Installation

```bash
pip install calibre
```

Usage Examples

Nearly Isotonic Regression with CVXPY

```python
import numpy as np
from calibre import NearlyIsotonicRegression

# Example data
np.random.seed(42)
x = np.sort(np.random.uniform(0, 1, 1000))
y_true = np.sin(2 * np.pi * x)
y = y_true + np.random.normal(0, 0.1, size=1000)

# Calibrate with different lambda values
cal_strict = NearlyIsotonicRegression(lam=10.0, method='cvx')
cal_strict.fit(x, y)
y_calibrated_strict = cal_strict.transform(x)

cal_relaxed = NearlyIsotonicRegression(lam=0.1, method='cvx')
cal_relaxed.fit(x, y)
y_calibrated_relaxed = cal_relaxed.transform(x)

# Now y_calibrated_relaxed will preserve more unique values,
# while y_calibrated_strict will be more strictly monotonic
```

I-Spline Calibration

```python
from calibre import ISplineCalibrator

# Smooth calibration using I-splines with cross-validation
cal_ispline = ISplineCalibrator(n_splines=10, degree=3, cv=5)
cal_ispline.fit(x, y)
y_ispline = cal_ispline.transform(x)
```

Relaxed PAVA

```python
from calibre import RelaxedPAVA

# Calibrate allowing small violations (threshold at 10th percentile)
cal_relaxed_pava = RelaxedPAVA(percentile=10, adaptive=True)
cal_relaxed_pava.fit(x, y)
y_relaxed = cal_relaxed_pava.transform(x)

# This preserves more structure than standard isotonic regression,
# while still correcting larger violations of monotonicity
```

Regularized Isotonic

```python
from calibre import RegularizedIsotonicRegression

# Calibrate with L2 regularization
cal_reg_iso = RegularizedIsotonicRegression(alpha=0.1)
cal_reg_iso.fit(x, y)
y_reg_iso = cal_reg_iso.transform(x)
```

Locally Smoothed Isotonic

```python
from calibre import SmoothedIsotonicRegression

# Apply local smoothing to reduce the "staircase" effect
cal_smoothed = SmoothedIsotonicRegression(
    window_length=7, poly_order=3, interp_method='linear'
)
cal_smoothed.fit(x, y)
y_smoothed = cal_smoothed.transform(x)
```

Evaluating Calibration Quality

```python
from calibre import (
    mean_calibration_error,
    binned_calibration_error,
    correlation_metrics,
    unique_value_counts,
)

# Calculate error metrics
mce = mean_calibration_error(y_true, y_calibrated)
bce = binned_calibration_error(y_true, y_calibrated, n_bins=10)

# Check correlations
corr = correlation_metrics(y_true, y_calibrated, x=x, y_orig=y)
print(f"Correlation with true values: {corr['spearman_corr_to_y_true']:.4f}")
print(f"Correlation with original predictions: {corr['spearman_corr_to_y_orig']:.4f}")

# Check granularity preservation
counts = unique_value_counts(y_calibrated, y_orig=y)
print(f"Original unique values: {counts['n_unique_y_orig']}")
print(f"Calibrated unique values: {counts['n_unique_y_pred']}")
print(f"Preservation ratio: {counts['unique_value_ratio']:.2f}")
```

Evaluation Metrics

mean_calibration_error(y_true, y_pred)

Calculates the mean calibration error.

binned_calibration_error(y_true, y_pred, x=None, n_bins=10)

Calculates binned calibration error.

correlation_metrics(y_true, y_pred, x=None, y_orig=None)

Calculates Spearman's correlation metrics.

unique_value_counts(y_pred, y_orig=None, precision=6)

Counts unique values in predictions to assess granularity preservation.
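To illustrate what binned calibration error measures, here is a plain-NumPy sketch of the usual definition: bucket predictions, then average each bin's gap between mean prediction and mean outcome, weighted by bin size. The helper name and details below are assumptions for illustration; calibre's own `binned_calibration_error` may differ in binning strategy and edge handling.

```python
import numpy as np

def binned_calibration_error_sketch(y_true, y_pred, n_bins=10):
    """Size-weighted mean |mean prediction - mean outcome| per bin."""
    edges = np.linspace(0, 1, n_bins + 1)
    idx = np.clip(np.digitize(y_pred, edges) - 1, 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            # Weight each bin's calibration gap by its share of samples.
            err += mask.mean() * abs(y_pred[mask].mean() - y_true[mask].mean())
    return err

# Perfectly calibrated predictions should score near zero.
rng = np.random.default_rng(1)
p = rng.uniform(0, 1, 1000)
y = (rng.uniform(0, 1, 1000) < p).astype(float)
print(binned_calibration_error_sketch(y, p))
```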

When to Use Which Method

  • NearlyIsotonicRegression (method='cvx'): When you want precise control over the monotonicity/granularity trade-off and can afford the computational cost of convex optimization.

  • NearlyIsotonicRegression (method='path'): When you need an efficient algorithm for larger datasets that still provides control over monotonicity.

  • ISplineCalibrator: When you want a smooth calibration function rather than a step function, particularly for visualization and interpretation.

  • RelaxedPAVA: When you want a simple, efficient approach that ignores "small" violations while correcting larger ones.

  • RegularizedIsotonicRegression: When you need smoother calibration curves with L2 regularization to prevent overfitting.

  • SmoothedIsotonicRegression: When you want to reduce the "staircase effect" of standard isotonic regression while preserving monotonicity.

References

  1. Nearly-isotonic regression. Tibshirani, R. J., Hoefling, H., & Tibshirani, R. (2011). Technometrics, 53(1), 54–61. DOI:10.1198/TECH.2010.09281

  2. A path algorithm for the fused lasso signal approximator. Hoefling, H. (2010). Journal of Computational and Graphical Statistics, 19(4), 984–1006. DOI:10.1198/jcgs.2010.09208

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT

🔗 Adjacent Repositories

  • gojiplus/robust_pava — Increase uniqueness in isotonic regression by ignoring small violations
  • gojiplus/pyppur — pyppur: Python Projection Pursuit Unsupervised (Dimension) Reduction To Min. Reconstruction Loss or DIstance DIstortion
  • gojiplus/rmcp — R MCP Server
  • gojiplus/bloomjoin — bloomjoin: An R package implementing Bloom filter-based joins for improved performance with large datasets.
  • gojiplus/incline — Estimate Trend at a Point in a Noisy Time Series

Owner

  • Name: finite-sample
  • Login: finite-sample
  • Kind: organization

Citation (citation.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Calibre: Advanced Calibration Models"
abstract: "A suite of advanced calibration techniques for machine learning models that address limitations in traditional methods like isotonic regression by preserving probability granularity and allowing for controlled flexibility in monotonicity constraints."
authors:
  - family-names: "Sood"
    given-names: "Gaurav"
keywords:
  - machine learning
  - calibration
  - probability calibration
  - isotonic regression
  - model deployment
repository-code: "https://github.com/gojiplus/calibre"  # Replace with actual repository URL
license: MIT  # Replace with actual license
version: 0.2.1
date-released: "2025-04-05"  # Replace with actual release date

GitHub Events


Committers

Last synced: 8 months ago

All Time
  • Total Commits: 27
  • Total Committers: 2
  • Avg Commits per committer: 13.5
  • Development Distribution Score (DDS): 0.111
Past Year
  • Commits: 27
  • Committers: 2
  • Avg Commits per committer: 13.5
  • Development Distribution Score (DDS): 0.111
Top Committers
Name Email Commits
gaurav g****7@g****m 24
github-actions a****s@g****m 3
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 7 months ago


Dependencies

pyproject.toml pypi
  • cvxpy >=1.2.0
  • matplotlib >=3.4.0
  • numpy >=1.20.0
  • pandas >=1.3.0
  • scikit-learn >=1.0.0
  • scipy >=1.7.0
.github/workflows/adjacent_repo_recommender.yaml actions
  • actions/checkout v4 composite
  • gojiplus/adjacent v1.3 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pypa/gh-action-pypi-publish release/v1 composite