Recent Releases of Machine Learning Validation via Rational Dataset Sampling with astartes

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.3.2

This patch adds a new optional argument to the SPXY sampler - distance metrics for X and y can now be provided separately to handle cases like continuous features with discrete targets.
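
As a rough illustration of why separate metrics help, the sketch below builds an SPXY-style joint distance (normalized X distance plus normalized y distance) using Euclidean distance on continuous features and a simple mismatch distance on discrete targets. It is plain Python rather than astartes code: every helper name here (pairwise, normalize, spxy_distance, mismatch) is hypothetical, and the real argument names are the ones documented in the pull request linked below.

```python
import math


def pairwise(values, metric):
    """Return the full pairwise distance matrix for `values` under `metric`."""
    return [[metric(a, b) for b in values] for a in values]


def normalize(matrix):
    """Scale a distance matrix so its largest entry is 1 (unchanged if all zero)."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return matrix
    return [[d / largest for d in row] for row in matrix]


def spxy_distance(X, y, metric_X, metric_y):
    """Joint distance: normalized X distance plus normalized y distance."""
    dX = normalize(pairwise(X, metric_X))
    dy = normalize(pairwise(y, metric_y))
    return [[a + b for a, b in zip(row_X, row_y)] for row_X, row_y in zip(dX, dy)]


def mismatch(a, b):
    """Distance for discrete labels: 0 if equal, 1 otherwise (illustrative choice)."""
    return 0.0 if a == b else 1.0


# Toy data (made up for this sketch): continuous features, discrete targets.
X = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
y = ["low", "low", "high"]

joint = spxy_distance(X, y, math.dist, mismatch)
for row in joint:
    print(row)
```

Using one metric for both X and y would force the discrete labels through a distance that is only meaningful for continuous values; keeping the two metrics separate avoids that.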

What's Changed

  • add new optional X and y specific distance options by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/183

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.3.1...v1.3.2

Scientific Software - Peer-reviewed - Python
Published by JacksonBurns 10 months ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.3.1

This patch release fixes a small bug when returning an array with only 0 in it, and removes a backwards-incompatible use of sklearn.
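
The first fix corresponds to the "Use len instead of any" pull request listed below. A minimal sketch of that kind of pitfall, assuming the check was meant to ask whether an index array is non-empty (the names val_indexes and has_entries are illustrative, not taken from the astartes source):

```python
# A validation set that holds only the sample at index 0.
val_indexes = [0]

# any() tests the truthiness of the elements, and the index 0 is falsy,
# so this reports "nothing here" even though one index is present.
print(any(val_indexes))

# len() counts elements regardless of their values.
print(len(val_indexes) > 0)


def has_entries(indexes):
    """Return True when the index collection is non-empty."""
    return len(indexes) > 0
```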

What's Changed

  • fix failing paper reproduction ci by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/178
  • fix usage of meansquarederror by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/179
  • Use len instead of any when returing val indexes by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/181

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.3.0...v1.3.1

Published by JacksonBurns 10 months ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.3.0

This minor release fixes a small bug in the base sampler that incorrectly printed the requested vs. actual validation set size.

A new demo notebook showing the MLM sampler has also been added.

What's Changed

  • Implementing the MLM Sampler by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/134
  • astartes v1.3.0 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/176

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.2.2...v1.3.0

Published by JacksonBurns almost 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.2.2

astartes Patch Release 1.2.2

This patch release significantly reduces the dependencies required by the molecules subpackage by switching the molecule handling backend from aimsim to aimsim_core when installing from PyPI.

What's Changed

  • Change aimsim to aimsim_core for PyPI to lean out dependencies by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/174

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.2.1...v1.2.2

Published by JacksonBurns about 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.2.1

astartes Minor + Patch Release 1.2.1

This release is a combination of 1.2.0 (which was yanked due to a bug) and 1.2.1 (which fixes the bug) - new features include the TargetValue and MolecularWeight samplers and some minor formatting and bug fixes elsewhere. Try out the new samplers to see how well your model can extrapolate outside of the training target space!

What's Changed

  • Fix scaffold formatting by @kspieks in https://github.com/JacksonBurns/astartes/pull/169
  • Adding TargetValue and MolecularWeight Sampler by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/172
  • Fix Incorrectly Delayed Import in 1.2.0 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/173

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.5...v1.2.1

Published by JacksonBurns over 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.5

astartes Patch Release 1.1.5

This patch release contains a bugfix in extrapolative sampling with train_test_split. Previously, random_state was effectively ignored when shuffling clusters inside of train_test_split since the max_shufflable_size was set based on the size of the validation set (which is zero here). Thanks to @kspieks for the quick fix and @PatWalters for reporting it!
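
To make the failure mode concrete, here is a toy model of how a size cap of zero can make a seed irrelevant. It is not the astartes implementation: the function order_clusters, its arguments, and the rule "shuffle only clusters no larger than the cap" are assumptions made for illustration.

```python
import random


def order_clusters(cluster_sizes, max_shufflable_size, random_state):
    """Toy model: shuffle only clusters no larger than the cap; keep the rest in place."""
    rng = random.Random(random_state)
    small = [c for c, n in cluster_sizes.items() if n <= max_shufflable_size]
    large = [c for c, n in cluster_sizes.items() if n > max_shufflable_size]
    rng.shuffle(small)  # the seed only matters if `small` is non-empty
    return large + small


# Hypothetical cluster label -> cluster size.
sizes = {"a": 5, "b": 3, "c": 2, "d": 1}

# Cap derived from an empty validation set: nothing qualifies for shuffling,
# so every seed yields the same order.
print(order_clusters(sizes, max_shufflable_size=0, random_state=1))
print(order_clusters(sizes, max_shufflable_size=0, random_state=2))

# With a meaningful cap, the order is driven by the seed and is reproducible.
print(order_clusters(sizes, max_shufflable_size=5, random_state=1))
```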

What's Changed

  • Fix a bug with the definition of max_shufflable_size by @kspieks in https://github.com/JacksonBurns/astartes/pull/164

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.4...v1.1.5

Published by JacksonBurns over 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.4

astartes Patch Release 1.1.4

This patch release adds support for Python 3.12 as well as the JOSS review badge to the README.

What's Changed

  • Final Wrap-up Edits for pyOpenSci Zenodo Release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/158
  • add joss review status badge to readme by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/159
  • update readme and cff with new JOSS citation by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/161
  • add python 3.12 to ci, pyproject, readme - bump patch version 4 release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/162

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.3...v1.1.4

Published by JacksonBurns over 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.3.post1

astartes Post-Release 1.1.3.post1

This post-release contains a single small documentation update to the README, now including the pyOpenSci approval badge.

What's Changed

  • Final Wrap-up Edits for pyOpenSci Zenodo Release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/158

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.3...v1.1.3.post1

Published by JacksonBurns over 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.3

astartes Patch Release 1.1.3

This patch release contains a number of quality-of-life updates and cleanups related to the submission of astartes to pyOpenSci. See the extended log below.

What's Changed

  • add readme line to explain how to use MDKS by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/149
  • Delete postBuild by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/150
  • Internal Improvements by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/151
  • added a conda-forge package by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/152
  • PyOpenSci Review - Documentation Updates by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/154
  • PyOpenSci Review - Code Updates by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/155
  • bump version for patch release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/157

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.2...v1.1.3

Published by JacksonBurns over 2 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.2

astartes Patch Release 1.1.2

This release contains minor changes, primarily to the documentation, in response to initial editor comments at pyOpenSci. The primary difference in the codebase is that astartes now has a __version__ attribute for backwards compatibility with Python 3.7.

To upgrade, run pip install --upgrade astartes.

To check which version of astartes you have installed, you can run python -c "import astartes; print(astartes.__version__)" on Python 3.7 or python -c "from importlib.metadata import version; print(version('astartes'))" on Python 3.8 or newer.
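
The same check can be done from a script. The sketch below uses only the standard library (Python 3.8 or newer); the helper name installed_version is illustrative and not part of astartes.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when astartes is not installed.
print(installed_version("astartes"))
```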

What's Changed

  • Pre-Submission Fixup for PyOpenSci by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/143
  • pyOpenSci Initial Review Comments Fixup by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/147
  • patch fixes for the previous patch fixes (version 1.1.2) by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/148

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.1...v1.1.2

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.1

astartes Patch Release 1.1.1

This patch release contains minor bugfixes compared to version 1.1.0; see the linked PR below for complete details.

What's Changed

  • v1.1.1 patches by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/140

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.1.0...v1.1.1

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.1.0

astartes Minor Release 1.1.0 :tada:

The first minor release of astartes v1!

This new version adds the generate_regression_results_dict function, which auto-magically subjects your sklearn-compatible model to a variety of different sampling algorithms and then shows you the results. Great for rapid exploration!

There is also now a conda package for astartes base functionality, with more comprehensive packages coming in the future :crossed_fingers:. This version and all previous versions are available from the jacksonburns channel.

Complete details below:

What's Changed

  • add a conda package by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/132
  • Add function to obtain values for a table by @kspieks in https://github.com/JacksonBurns/astartes/pull/122
  • astartes v1.1.0 :tada: by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/136

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.3...v1.1.0

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.0.3

astartes Patch Release 1.0.3

This release adds first-party support for Pandas dataframes and series as input arguments to astartes!

What's Changed

  • Support for Pandas Dataframe and Series by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/127

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.2...v1.0.3

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.0.2

astartes Patch Release 1.0.2

This patch release of astartes includes dramatic speedup and scaling improvements for the Kennard Stone and SPXY samplers (see #126).

What's Changed

  • Improve Performance of Kennard-Stone, SPXY Samplers by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/126

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.1...v1.0.2

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.0.1

astartes Patch Release 1.0.1

This patch release contains minor internal changes, including fixing a typo in a demonstration notebook and more careful internal handling of input arrays to ensure consistent data types.

astartes is now more flexible for the types of X, y, and labels input - if they are not numpy arrays, astartes will attempt to convert them and provide a helpful warning.

What's Changed

  • Fix typo in RDB7 notebook by @kspieks in https://github.com/JacksonBurns/astartes/pull/121
  • add support for X, y, and labels of type other than np.ndarray by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/124

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.0...v1.0.1

Published by JacksonBurns almost 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - astartes v1.0.0

This is the initial production release of astartes. See the complete changelog below for a record of updates since the beta testing.

What's Changed

  • Add example notebook with Sci Dataset example by @kspieks in https://github.com/JacksonBurns/astartes/pull/43
  • Examples: new notebook to demonstrate trainvaltest_split and CI for notebooks by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/67
  • fix issue with missing packages, bump beta version by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/82
  • Set random seed by @kspieks in https://github.com/JacksonBurns/astartes/pull/89
  • Update description by @kspieks in https://github.com/JacksonBurns/astartes/pull/90
  • Update Project Description and Supported Python Subversions in pyproject.toml and README.md by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/92
  • Change behavior of return_indices by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/93
  • Implement random_state effect in extrapolative cluster assignment by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/94
  • Repo meta fixes by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/95
  • Add support for passing RDKit Molecules directly by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/96
  • astartes v1.0 Release Candidate 1 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/97
  • Ensuring Reproducibility in the Paper and Regular Use by @kspieks in https://github.com/JacksonBurns/astartes/pull/99
  • Time splits by @kspieks in https://github.com/JacksonBurns/astartes/pull/101
  • Double-Assignment Bug in main.py Leads to Points being Left Out by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/105
  • Fix scaffold splits by @kspieks in https://github.com/JacksonBurns/astartes/pull/106
  • Backend Changes for Better Clarity and Maintainability by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/107
  • New Example Notebook: Quantitative and Visual Comparisons of Different Sampling Algorithms with Fast Food by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/104
  • Small Bugfixes for v1.0.0 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/108
  • Update notebooks by @kspieks in https://github.com/JacksonBurns/astartes/pull/110
  • Fix KennardStone Sampler by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/112
  • remove kennard_stone from requirements by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/113
  • astartes Demonstration Notebook for MLPDS 2023 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/114
  • Bug fixes for v1.0 by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/117
  • astartes v1.0.0 Release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/109

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.0b0...v1.0.0

Published by JacksonBurns about 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - `astartes` 1.0.0 Initial Beta Release

Initial Beta Release

This release coincides with the publication of astartes v1.0.0b0 on PyPI and contains a number of additions and minor API changes for testing by the broader community.

What's Changed

  • add test for random and sampler factory by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/55
  • Switch 'backend' to train_val_test_split instead of train_test_split by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/56
  • DBSCAN by @kspieks in https://github.com/JacksonBurns/astartes/pull/59
  • add transition to astartes docs page by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/61
  • Addition of Missing (and Improvement of Existing) Docstrings by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/62
  • Avoid redefining Python's native NotImplementedError by @kspieks in https://github.com/JacksonBurns/astartes/pull/64
  • Add OptiSim by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/63
  • Add SPXY Sampler, Fix Bug in Interpolative Sampler Set Filling by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/71
  • Scaffold splits by @kspieks in https://github.com/JacksonBurns/astartes/pull/65
  • README Update, PR and Issue Templates, Developer Instructions by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/66
  • Add Actions Concurrency by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/72
  • Allow users to access the power hyperparameter for DBSCAN by @kspieks in https://github.com/JacksonBurns/astartes/pull/74
  • Improve scaffold by @kspieks in https://github.com/JacksonBurns/astartes/pull/73
  • Allow arguments for KMeans to be accessible to users by @kspieks in https://github.com/JacksonBurns/astartes/pull/75
  • Initial beta release by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/77

Full Changelog: https://github.com/JacksonBurns/astartes/compare/v1.0.0a2...v1.0.0b0

Published by JacksonBurns about 3 years ago

Machine Learning Validation via Rational Dataset Sampling with astartes - `astartes` 1.0.0 Alpha release 2

This is a pre-release of astartes for packaging and distribution testing and is not intended for deployment and/or use.

What's Changed

  • Rename abstract_sampler methods for grammatical consistency by @himaghna in https://github.com/JacksonBurns/astartes/pull/1
  • Set up name changes by @himaghna in https://github.com/JacksonBurns/astartes/pull/3
  • Correct typos in sampler.py by @himaghna in https://github.com/JacksonBurns/astartes/pull/2
  • Fix KEnnardStone and random sampler naming by @himaghna in https://github.com/JacksonBurns/astartes/pull/5
  • Remove random from samplers init.py file by @himaghna in https://github.com/JacksonBurns/astartes/pull/4
  • Add matrix_ops by @himaghna in https://github.com/JacksonBurns/astartes/pull/6
  • Add KS sampler by @himaghna in https://github.com/JacksonBurns/astartes/pull/7
  • Warn about PR's that have low test coverage by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/11
  • remove coverage check from CI on main by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/12
  • Refactor to Leverage ABC Better by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/8
  • Add QM9 test as example for validating all samplers and train_test_split by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/14
  • add test runs on more python versions and OSs by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/38
  • only run pr coverage check on ready to review PRs by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/39
  • switch to pyproject.toml by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/41
  • Interface sampler refactor by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/42
  • Fix minor typos in readme by @kspieks in https://github.com/JacksonBurns/astartes/pull/49
  • Updated Implementation and Unit Testing for Kennard-Stone Sampler by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/47
  • Run isort as a CI Check by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/51
  • Revert back to sklearn train_test split for random splitting by @kspieks in https://github.com/JacksonBurns/astartes/pull/50
  • move sphere_exclusion, add fxn/test headers and imports by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/48
  • Add SamplerFactory to simplify traintestsplit by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/53
  • add pypi build and release action by @JacksonBurns in https://github.com/JacksonBurns/astartes/pull/54

New Contributors

  • @himaghna made their first contribution in https://github.com/JacksonBurns/astartes/pull/1
  • @kspieks made their first contribution in https://github.com/JacksonBurns/astartes/pull/49

Full Changelog: https://github.com/JacksonBurns/astartes/commits/v1.0.0a2

Published by JacksonBurns over 3 years ago