Recent Releases of https://github.com/sebhaan/tabpfgen
https://github.com/sebhaan/tabpfgen - v0.1.4
Minor release with expanded Python support
- Python 3.10+ support (previously 3.11+)
- Code cleanup and linting fixes
- Verified compatibility across Python 3.10-3.13
- Python
Published by sebhaan 9 months ago
https://github.com/sebhaan/tabpfgen - v0.1.3
TabPFGen v0.1.3 Release Notes
New Features
Dataset Balancing: Added balance_dataset() method for automatic class balancing in imbalanced datasets.
- Balances classes to majority class size by default
- Custom target per class option via target_per_class parameter
- Automatic filtering of small classes below min_class_size threshold
- Returns both the synthetic samples and the combined (original plus synthetic) dataset
API Changes
New Method:
```python
X_synthetic, y_synthetic, X_combined, y_combined = generator.balance_dataset(X, y)
```
Parameters:
- target_per_class: Target samples per class (default: majority class size)
- min_class_size: Minimum class size for balancing (default: 5)
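The effect of the two parameters can be sketched in plain Python. The helper below is hypothetical and is not part of TabPFGen. It only counts how many synthetic samples each class would need, assuming that classes smaller than min_class_size are skipped and that classes already at or above the target are left alone.

```python
from collections import Counter


def synthetic_counts_needed(y, target_per_class=None, min_class_size=5):
    """Hypothetical helper (not part of TabPFGen): samples to add per class."""
    counts = Counter(y)
    # Assumption: classes below min_class_size are excluded from balancing.
    eligible = {label: n for label, n in counts.items() if n >= min_class_size}
    if not eligible:
        return {}
    # Default target follows the release notes: the majority class size.
    if target_per_class is None:
        target = max(eligible.values())
    else:
        target = target_per_class
    # Classes already at or above the target need no synthetic samples.
    return {label: max(target - n, 0) for label, n in eligible.items()}


# Made-up labels: class 2 has only 3 samples and is filtered out.
labels = [0] * 50 + [1] * 20 + [2] * 3
print(synthetic_counts_needed(labels))
print(synthetic_counts_needed(labels, target_per_class=60))
```

With the default target, only class 1 needs new samples (30 of them). With target_per_class=60, both remaining classes are topped up.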
Improvements
- Enhanced class-specific synthetic sample generation using SGLD
- Improved TabPFN integration for label refinement
- Comprehensive validation statistics and progress reporting
- Updated documentation with balancing examples
- Screen-optimised visualisation plots (max 15×10 size)
Notes
Final class distributions are approximately, not exactly, balanced. TabPFN's label refinement step prioritises data quality over exact class counts, so some classes may end up slightly above or below the target.
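Because the result is only approximately balanced, it can help to inspect the class counts after balancing. The snippet below is a minimal sketch using only the standard library. The function name and the example labels are made up for illustration, and a list stands in for the y_combined array (for a NumPy array, y_combined.tolist() could be passed instead).

```python
from collections import Counter


def class_balance_report(y):
    """Return per-class counts and the minority/majority ratio (1.0 = exactly balanced)."""
    counts = Counter(y)
    ratio = min(counts.values()) / max(counts.values())
    return dict(counts), ratio


# Made-up labels standing in for y_combined after balancing.
y_combined_example = [0] * 100 + [1] * 96 + [2] * 98
counts, ratio = class_balance_report(y_combined_example)
print(counts, round(ratio, 2))
```

A ratio close to 1.0 indicates that the classes are nearly balanced.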
Compatibility
Fully backwards compatible. The existing generate_classification() and generate_regression() methods are unchanged.
Published by sebhaan 9 months ago
https://github.com/sebhaan/tabpfgen - v0.1.2
Minor update that includes a fix for tabpfn version compatibility.
Published by sebhaan about 1 year ago