Recent Releases of https://github.com/sebhaan/tabpfgen

https://github.com/sebhaan/tabpfgen - v0.1.4

Minor release with expanded Python support

  • Python 3.10+ support (previously 3.11+)
  • Code cleanup and linting fixes
  • Verified compatibility across Python 3.10-3.13

- Python
Published by sebhaan 9 months ago

https://github.com/sebhaan/tabpfgen - v0.1.3

TabPFGen v0.1.3 Release Notes

New Features

Dataset Balancing: Added balance_dataset() method for automatic class balancing in imbalanced datasets.

  • Balances classes to majority class size by default
  • Custom target per class option via target_per_class parameter
  • Automatic filtering of small classes below min_class_size threshold
  • Returns both synthetic samples and combined dataset

API Changes

New Method:

    X_synthetic, y_synthetic, X_combined, y_combined = generator.balance_dataset(X, y)

Parameters:

  • target_per_class: Target samples per class (default: majority class size)
  • min_class_size: Minimum class size for balancing (default: 5)
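To make the two parameters concrete, here is a minimal sketch of how the number of synthetic samples per class could be derived from them. It is not tabpfgen's implementation: synthetic_counts_needed is a hypothetical helper, and it assumes that classes with fewer than min_class_size samples are skipped entirely and that classes already at or above the target receive no synthetic samples.

```python
from collections import Counter


def synthetic_counts_needed(y, target_per_class=None, min_class_size=5):
    """Return {class_label: number of synthetic samples to add}.

    Hypothetical helper for illustration only; not part of tabpfgen.
    """
    counts = Counter(y)
    # Assumption: classes smaller than min_class_size are filtered out.
    kept = {label: n for label, n in counts.items() if n >= min_class_size}
    if not kept:
        return {}
    # Default target is the size of the majority class.
    target = max(kept.values()) if target_per_class is None else target_per_class
    # Assumption: classes already at or above the target need nothing added.
    return {label: max(target - n, 0) for label, n in kept.items()}


# Example: class "c" has only 3 samples, below the default threshold of 5.
y = ["a"] * 50 + ["b"] * 20 + ["c"] * 3
print(synthetic_counts_needed(y))                       # {'a': 0, 'b': 30}
print(synthetic_counts_needed(y, target_per_class=60))  # {'a': 10, 'b': 40}
```

The counts above are requested targets only. As the Notes section explains, the class counts that balance_dataset() actually returns may differ after label refinement.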

Improvements

  • Enhanced class-specific synthetic sample generation using SGLD
  • Improved TabPFN integration for label refinement
  • Comprehensive validation statistics and progress reporting
  • Updated documentation with balancing examples
  • Screen-optimised visualisation plots (max 15×10 size)

Notes

Final class distributions are only approximately balanced, because TabPFN's quality-focused label refinement process prioritises data quality over exact class counts.

Compatibility

Fully backwards compatible. The existing generate_classification() and generate_regression() methods are unchanged.

- Python
Published by sebhaan 9 months ago

https://github.com/sebhaan/tabpfgen - v0.1.2

Minor update including fix for tabpfn version compatibility.

- Python
Published by sebhaan about 1 year ago

https://github.com/sebhaan/tabpfgen - v0.1.1

- Python
Published by sebhaan about 1 year ago