Recent Releases of simpleml
simpleml - 0.14.0
0.14.0 (2022-07-13)
- Standardized formatting with Black
- Split up ORM into a standalone swappable backend
- Persistables maintain weakrefs for lineage
- Persistables are normal python objects now
- Hashing flag to reject non-serializable objects
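The hashing flag can be pictured with a minimal sketch. It assumes a hypothetical helper named content_hash with a hypothetical strict parameter; neither name comes from SimpleML, and the pickle-then-md5 approach is only an illustration of the idea of rejecting content whose hash would not be reproducible.

```python
import hashlib
import pickle


class UnhashableContentError(TypeError):
    """Hypothetical error raised when content cannot be hashed reproducibly."""


def content_hash(obj, strict=False):
    """Return an md5 hex digest of the pickled bytes of obj.

    Hypothetical helper, not SimpleML's API. With strict=True, objects that
    cannot be pickled raise an error instead of falling back to repr(),
    which may embed a memory address and so differ between runs.
    """
    try:
        payload = pickle.dumps(obj, protocol=4)
    except Exception as exc:
        if strict:
            raise UnhashableContentError(
                f"cannot hash {type(obj).__name__} reproducibly"
            ) from exc
        # Lenient fallback: always yields a hash, but it may be inconsistent.
        payload = repr(obj).encode("utf-8")
    return hashlib.md5(payload).hexdigest()
```

Calling content_hash(lambda: 0) returns a digest through the lenient fallback, while content_hash(lambda: 0, strict=True) raises UnhashableContentError.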
What's Changed
- Black formatting by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/103
- dask tweaks by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/104
- Orm separation by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/99
- Allow user to request a raised exception if hash(content) will be inconsistent by @ptoman-pa in https://github.com/eyadgaran/SimpleML/pull/107
- version bump and changelog by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/109
New Contributors
- @ptoman-pa made their first contribution in https://github.com/eyadgaran/SimpleML/pull/107
Full Changelog: https://github.com/eyadgaran/SimpleML/compare/0.13.0...0.14.0
- Python
Published by eyadgaran over 3 years ago
simpleml - 0.12.0
- Changed internal dataset structure from mixins to direct inheritance
- Condensed all pandas dataset types into a single base class
- Adds support for dask datasets
- Placeholders for additional dataset libraries
- Adds hashing support for dask dataframes
- Refactored persistence ("save_patterns") package into standalone extensible framework
- Adds context manager support to registries for temporary overwrite
- Refactor pipelines into library based subclasses
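Context manager support for temporary registry overwrites could look roughly like the sketch below. The Registry class and its temporary_overwrite method are hypothetical names chosen for illustration and are not SimpleML's actual registry API.

```python
from contextlib import contextmanager


class Registry:
    """Hypothetical minimal registry; not SimpleML's actual class."""

    def __init__(self):
        self._items = {}

    def register(self, name, value):
        self._items[name] = value

    def get(self, name):
        return self._items[name]

    @contextmanager
    def temporary_overwrite(self, name, value):
        """Swap in value for name and restore the prior state on exit."""
        missing = object()  # sentinel: distinguishes "absent" from None
        previous = self._items.get(name, missing)
        self._items[name] = value
        try:
            yield self
        finally:
            if previous is missing:
                self._items.pop(name, None)
            else:
                self._items[name] = previous
```

Inside a with registry.temporary_overwrite("model", Replacement) block, lookups see the replacement; once the block exits, even through an exception, the original entry is back.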
BREAKING CHANGES
- Pandas dataset will default param squeeze_return to False (classes expecting to return a series will need to be updated)
- Numpy dataset is considered unstable and will be redesigned in a future release
- Onedrive, Hickle, and database save patterns are removed (the functionality is still available, but a composed pattern is not predefined; these can be trivially added in user code if needed)
- Changed pandas hash output to int from numpy.int64 (due to breaking change in NumpyHasher)
- Changed primitive deterministic hash from pickle to md5
- Extracted data iterators into utility wrappers. Pipelines no longer have flags to return iterators
- Random split defaults are computed at runtime instead of precalculated (affects hash)
What's Changed
- Ml management structure by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/87
- Python eol by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/92
- Dataset libraries by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/90
- Pipeline refactor by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/96
- additional testing coverage by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/83
- Adding Ensemble Model Histogram-based Gradient Boosting Classifier by @aolopez in https://github.com/eyadgaran/SimpleML/pull/91
- version bump by @eyadgaran in https://github.com/eyadgaran/SimpleML/pull/98
New Contributors
- @aolopez made their first contribution in https://github.com/eyadgaran/SimpleML/pull/91
Full Changelog: https://github.com/eyadgaran/SimpleML/compare/0.11.0...0.12.0
- Python
Published by eyadgaran about 4 years ago
simpleml - 0.11.0
- Added support to hasher for initialized objects
- Adds support for arbitrary dataset splits and sections
- Dataset hooks to validate dataframe setting
- Pipelines no longer cache dataset splits and proxy directly to dataset on every call
- Introduces pipeline splits as reproducible projections over dataset splits
- Database utility to recalculate hashes for existing persistables
BREAKING CHANGES
- Hash for an uninitialized class changed from repr(cls) to "cls.__module__.cls.__name__"
- Database migrations no longer recalculate hashes. That has to be done manually via a utility
- Python
Published by eyadgaran over 4 years ago
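To illustrate the 0.11.0 breaking change to class hashing, the sketch below compares the two hash inputs. It assumes the new string is built from the class's __module__ and __name__ attributes joined by a dot; class_identifier is a hypothetical helper name, not a SimpleML function.

```python
from collections import OrderedDict


def class_identifier(cls):
    """Build a dotted identifier for an uninitialized class.

    Hypothetical helper: assumes the new hash input is the module path
    and the class name joined by a dot.
    """
    return f"{cls.__module__}.{cls.__name__}"


# Old hash input: the repr of the class object.
old_style = repr(OrderedDict)

# New hash input under the assumption above.
new_style = class_identifier(OrderedDict)
```

Because the two strings differ for the same class, hashes stored before the upgrade no longer match freshly computed ones, which is consistent with the note that existing hashes have to be recalculated with the utility.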
simpleml - 0.10.0
- Dataset external file setter with validation hooks
- Pandas changes to always return dataframe copies (does not extend to underlying python objects, e.g. lists, objects, etc.)
- Pandas Dataset Subclasses for Single and Multi label datasets
- PersistableLoader methods do not require name as a parameter
BREAKING CHANGES
- PandasDataset is deprecated and will be dropped in a future release. Use SingleLabelPandasDataset or MultiLabelPandasDataset instead
- Pandas Dataset Classes require dataframe objects of type pd.DataFrame and will validate input (containers of pd.DataFrames are no longer supported)
- Python
Published by eyadgaran over 4 years ago
simpleml - 0.9.0
- Refactored save patterns. Supports multiple concurrent save locations and arbitrary artifact declaration
- Registry centric model for easier extension and third party contrib
- Support for in-memory sqlite db
- Changed database JSON mapping class and dependency to support mutability tracking
- New import wrapper class to manage optional dependencies
- Added dataset_id as a Metric reference. Breaking workflow change! Will raise an error if a dataset is not added and the metric depends on it
- Dropped default Train pipeline split. Will return an empty split for split pipelines and a singleton full dataset split for NoSplitPipelines
- Explicitly migrated to tensorflow 2 and tf.keras
- Python
Published by eyadgaran over 5 years ago
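The 0.9.0 import wrapper for optional dependencies can be approximated with a lazy proxy. This is a sketch under stated assumptions: OptionalImport and its extra parameter are hypothetical names, and the real wrapper class in SimpleML may behave differently.

```python
import importlib


class OptionalImport:
    """Hypothetical lazy wrapper around an optional dependency.

    The import is deferred until the first attribute access, so a missing
    package only raises when the feature that needs it is actually used.
    """

    def __init__(self, module_name, extra=None):
        self._module_name = module_name
        self._extra = extra  # hypothetical name of a pip extra to suggest
        self._module = None

    def _load(self):
        if self._module is None:
            try:
                self._module = importlib.import_module(self._module_name)
            except ImportError as exc:
                hint = f" (install the '{self._extra}' extra)" if self._extra else ""
                raise ImportError(
                    f"optional dependency '{self._module_name}' is not installed{hint}"
                ) from exc
        return self._module

    def __getattr__(self, attr):
        # Only called for attributes not found on the wrapper itself.
        return getattr(self._load(), attr)
```

A module-level json = OptionalImport("json") behaves like the real module on use, while wrapping a package that is not installed costs nothing until one of its attributes is touched.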
simpleml - 0.7
- Thread-safe Keras Sequence dataset splits
- Additional Seq2Seq models
- Bastion tunneling support for SSH db connections
- Explicit modules for constants and imports
- Additional base classes for database connections (plain and alembic)
- Database independent sqlalchemy types
- Switched pickle library from dill to cloudpickle
- SQLite support
- Changed default DB connection to SQLite
- Python
Published by eyadgaran over 6 years ago
simpleml - 0.6
- Full database initialization with alembic
- DB schema validation on start
- Main configuration file for all credentials
- Drop official support for python 3.4
- Automatic handling of no data operations
- Remaining cloud provider support
- Feature metadata for classification models
- Runtime environment validation
- Add Split and SplitContainer objects
- Simplejson dependency
- Pipeline generator support
- Library specific model base classes
- Generalized database connection classes
- Python
Published by eyadgaran over 6 years ago
simpleml - 0.5
- Default identity pipeline
- Alembic integration for database migration
- Standardized model inheritance pattern
- Condensed pandas split dataframes into single df
- Remaining classification metrics
- Updated schema with hash datatype
- Updated hash to use joblib code, consistent across initializations
- Generator pipeline and fitted kwarg
- Dropped base prefixes
- Moved composed subclasses to inits
- Unified datasets and pipelines
- Python
Published by eyadgaran about 7 years ago