Recent Releases of BetaML

BetaML - v0.12.2

Diff since v0.12.1

Introduced the onfail option to retry fitting (with new random initial parameters) when the loss doesn't decrease. Currently implemented for neural networks.
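
The retry idea can be illustrated with a minimal Julia sketch. This is not BetaML's implementation: `fit_with_retries`, `fit_once` and `toy_fit` are hypothetical names introduced here, and the actual onfail option may accept different values and behave differently.

```julia
using Random

# Hypothetical helper: `fit_once(rng)` is assumed to draw new random initial
# parameters from `rng` and return (params, loss_at_start, loss_at_end).
function fit_with_retries(fit_once; max_tries = 3, rng = Random.default_rng())
    for attempt in 1:max_tries
        params, loss_start, loss_end = fit_once(rng)
        # Accept the fit as soon as training actually reduced the loss.
        if loss_end < loss_start
            return (params = params, loss = loss_end, attempts = attempt)
        end
    end
    error("loss did not decrease in $max_tries attempts")
end

# Toy "training" that only improves when the random start is positive.
function toy_fit(rng)
    w0 = randn(rng)
    w0 > 0 ? (w0 / 2, w0^2, (w0 / 2)^2) : (w0, w0^2, w0^2)
end

result = fit_with_retries(toy_fit; max_tries = 50, rng = MersenneTwister(1))
```

Each failed attempt simply draws a fresh starting point from the same random number generator, so the whole sequence of retries stays reproducible for a given seed.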

Merged pull requests:

  • Update README.md - don't link to PyJulia (#72) (@PallHaraldsson)
  • CompatHelper: bump compat for JLD2 to 0.5, (keep existing compat) (#74) (@github-actions[bot])
  • CompatHelper: bump compat for Zygote to 0.7, (keep existing compat) (#76) (@github-actions[bot])

Closed issues: - [MLJ interface] NeuralNetworkClassifier falsely claims to support Count target (#75)

Scientific Software - Peer-reviewed - Julia
Published by github-actions[bot] 8 months ago

BetaML - v0.12.1

Diff since v0.12.0

  • minor bugfixes

Published by github-actions[bot] over 1 year ago

BetaML - v0.12.0

Diff since v0.11.4

  • Added FeatureRanker, a flexible feature ranking estimator using multiple feature importance metrics
  • new functions kl_divergence and sobol_index
  • added an option for tree-based models to ignore specific variables in prediction, by following both branches of the splits on nodes that use those dimensions, via the keyword ignore_dims of the predict function
  • added option sampling_share to RandomForestEstimator model
  • DOC: added Benchmarks (but then temporarily removed because SystemBenchmark is not installable, see the related issue)
  • DOC: added FeatureRanker tutorial
  • bugfix on l2loss_by_cv for unsupervised models
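
For readers unfamiliar with the quantity behind kl_divergence, here is a minimal sketch of the discrete Kullback-Leibler divergence in plain Julia. The function name `kl_sketch` is hypothetical, and the signature, log base and edge-case handling of BetaML's own kl_divergence are not assumed here.

```julia
# Discrete KL divergence D(p‖q) = Σ p_i * log(p_i / q_i), in nats.
# Terms with p_i == 0 contribute nothing, by the usual convention.
function kl_sketch(p, q)
    length(p) == length(q) || throw(DimensionMismatch("p and q must have the same length"))
    d = 0.0
    for (a, b) in zip(p, q)
        a > 0 && (d += a * log(a / b))
    end
    return d
end

same = kl_sketch([0.5, 0.5], [0.5, 0.5])   # identical distributions
far  = kl_sketch([1.0, 0.0], [0.5, 0.5])   # all mass on one outcome vs. uniform
```

The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments.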

Published by github-actions[bot] over 1 year ago

BetaML - v0.11.4

Diff since v0.11.3

bugfix: cosine_distance was actually computing the cosine similarity
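
The difference between the two quantities is easy to see in a minimal sketch. The function names below are hypothetical, and the sketch assumes the usual convention that the distance is one minus the similarity, which may not match BetaML's code line by line.

```julia
using LinearAlgebra

# Cosine similarity: 1 for parallel vectors, 0 for orthogonal, -1 for opposite.
cosine_similarity_sketch(x, y) = dot(x, y) / (norm(x) * norm(y))

# Cosine distance under the assumed convention: 0 for parallel vectors, up to 2 for opposite.
cosine_distance_sketch(x, y) = 1 - cosine_similarity_sketch(x, y)

d_parallel   = cosine_distance_sketch([1.0, 0.0], [2.0, 0.0])
d_orthogonal = cosine_distance_sketch([1.0, 0.0], [0.0, 3.0])
d_opposite   = cosine_distance_sketch([1.0, 0.0], [-1.0, 0.0])
```

Returning the similarity where a distance is expected inverts the ordering of neighbours, which is why this kind of bug matters for clustering and nearest-neighbour style code.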

Published by github-actions[bot] almost 2 years ago

BetaML - v0.11.3

Diff since v0.11.2

  • bugfixes (removed old, undocumented, unused, type-pirating findfirst and findall functions)

Published by github-actions[bot] almost 2 years ago

BetaML - v0.11.2

Diff since v0.11.1

  • bugfixes

Published by github-actions[bot] almost 2 years ago

BetaML - v0.11.1

Diff since v0.11.0

  • changed some keyword arguments of AutoEncoder and PCAEncoder: outdims => encoded_size and innerdims => layers_size

This shouldn't be breaking, as I tweaked the constructor to accept the older names (until the next breaking version, 0.12).
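
One common way to keep old keyword names working during such a rename is sketched below. `EncoderOptions` is a hypothetical struct introduced only for illustration; it is not the type BetaML uses, and the real constructor may handle the old names differently.

```julia
# Hypothetical options struct using the new field names.
struct EncoderOptions
    encoded_size::Union{Nothing,Int}
    layers_size::Union{Nothing,Int}
end

# Keyword constructor accepting both the new names and the old ones
# (outdims, innerdims). A new name, when given, takes precedence.
function EncoderOptions(; encoded_size = nothing, layers_size = nothing,
                          outdims = nothing, innerdims = nothing)
    es = encoded_size === nothing ? outdims : encoded_size
    ls = layers_size === nothing ? innerdims : layers_size
    return EncoderOptions(es, ls)
end

old_style = EncoderOptions(outdims = 2, innerdims = 8)
new_style = EncoderOptions(encoded_size = 2, layers_size = 8)
```

Both calls produce the same options, so existing user code keeps working until the old keywords are dropped in a breaking release.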

Published by github-actions[bot] almost 2 years ago

BetaML - v0.11.0

Diff since v0.10.4

Attention: many breaking changes in this version!

  • experimental new ConvLayer and PoolLayer for convolutional networks. BetaML neural networks work only on CPU and even on CPU the convolution layers (but not the dense ones) are 2-3 times slower than Flux. Still they have some quite unique characteristics, like working with any dimensions or not requiring AD in most cases, so they may still be useful in some corner situations. Then, if you want to help in porting to GPU... ;-)
  • Isolated MLJ interface models into their own Bmlj submodule
  • Renamed many models in a consistent way
  • Shortened the hyper-parameters and learnable parameters struct names
  • Corrected many doc bugs
  • Several bugfixes

Published by github-actions[bot] almost 2 years ago

BetaML - v0.10.4

Diff since v0.10.3

  • Added the AutoEncoder model and its MLJ wrapper AutoEncoderMLJ, with an m=AutoEncoder(hp); fit!(m,x); x_latent = predict(m,x); x̂ = inverse_predict(m,x_latent) interface. Users can optionally specify the number of dimensions to shrink the data to (outdims), the number of neurons of the inner layers (innerdims), or the full details of the encoding and decoding layers and all the underlying NN options.
  • Adapted the l2loss_by_cv function to unsupervised models with inverse_predict
  • Several bugfixes

Merged pull requests:

  • CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#61) (@github-actions[bot])
  • correct typo in AbstractTrees.printnode (#62) (@roland-KA)

Closed issues: - Deprecation warning from ProgressMeter.jl (#58)

Published by github-actions[bot] about 2 years ago

BetaML - v0.10.3

Diff since v0.10.2

Published by github-actions[bot] over 2 years ago

BetaML - v0.10.2

Diff since v0.10.1

Merged pull requests: - CompatHelper: add new compat entry for DelimitedFiles at version 1, (keep existing compat) (#55) (@github-actions[bot])

Published by github-actions[bot] over 2 years ago

BetaML - v0.10.1

Diff since v0.10.0

Closed issues: - target_scitype for MultitargetNeuralNetworkRegressor is too broad (#53)

Merged pull requests: - CompatHelper: bump compat for StatsBase to 0.34, (keep existing compat) (#54) (@github-actions[bot])

Published by github-actions[bot] over 2 years ago

BetaML - v0.10.0

Diff since v0.9.7

Published by github-actions[bot] over 2 years ago

BetaML - v0.9.7

Diff since v0.9.6

Closed issues:

  • Trouble interpolating feature names in a wrapped tree (#48)
  • MLJ model docstrings (#49)
  • GaussianMixtureModelClusterer docstring has formatting issues (#50)

Published by github-actions[bot] over 2 years ago

BetaML - v0.9.6

Diff since v0.9.5

Merged pull requests: - add test for plotting a tree using TreeRecipe (#47) (@roland-KA)

Published by github-actions[bot] almost 3 years ago

BetaML - v0.9.5

Diff since v0.9.4

  • new function match_known_derivatives to set the corresponding manual derivatives for well-known activation or loss functions in NN; it is now used as the default for the derivatives instead of nothing (i.e. AD)
  • new ConvLayer and PoolLayer. While very flexible (any dimension, any function), they are way too slow for any practical work, even if the convolution ids are cached. A pity!

Published by github-actions[bot] about 3 years ago

BetaML - v0.9.4

Diff since v0.9.3

  • examples on all v2 API models

Closed issues: - Add MLJ-compliant document strings (#39)

Published by github-actions[bot] about 3 years ago

BetaML - v0.9.3

Diff since v0.9.2

  • examples in doc (especially MLJ interface), minor bugfixes

Published by github-actions[bot] about 3 years ago

BetaML - v0.9.2

Diff since v0.9.1

  • removed a warning that slipped into v0.9.1

Published by github-actions[bot] about 3 years ago

BetaML - v0.9.1

Diff since v0.9.0

  • cleaned up "old" deprecated v1 api
  • started implementation (forward pass) of a generic n-dimensional ConvLayer
  • printing of Decision Trees (@roland-KA)
  • improved documentation
  • new dependency on StaticArrays
  • changed verbosity in MLJ models

Closed issues:

  • Example with GaussianMixtureClusterer (#43)
  • Allow verbosity to be any integer? (#44)

Merged pull requests:

  • CompatHelper: add new compat entry for StaticArrays at version 1, (keep existing compat) (#45) (@github-actions[bot])
  • Adapt to new AbstractTrees.AbstractNode type (#46) (@roland-KA)

Published by github-actions[bot] about 3 years ago

BetaML - v0.9.0

Diff since v0.8.0

  • clean up of old functions deemed deprecated: either removed the code or unexported in favour of API v2
  • added some optimisations and support for LoopVectorization
  • renamed shuffle to consistent_shuffle to avoid type piracy
  • completed documentation of MLJ models

Closed issues: - Error during precompilation (ERROR: LoadError: InitError: Evaluation into the closed module Perceptron ...) (#42)

Merged pull requests: - JuliaCall from PythonCall.jl (#41) (@PallHaraldsson)

Published by github-actions[bot] about 3 years ago

BetaML - v0.8.0

Diff since v0.7.1

  • support for all models of the new "V2" API that implements a "standard" mod = Model([Options]), fit!(mod,X,[Y]), predict(mod,[X]) workflow. The classic API is now deprecated, with some of its functions to be removed in the next BetaML 0.9 versions and some unexported.
  • standardised function names to follow the Julia style guidelines and the new BetaML code style guidelines (https://sylvaticus.github.io/BetaML.jl/dev/StyleGuide_templates.html)
  • new hyper-parameter autotuning method:

        mod = ModelXX(autotune=true)  # --> control autotune with the parameter `tunemethod`
        fit!(mod,x,[y])               # --> autotune happens here together with final tuning
        est = predict(mod,xnew)

    Autotune is hyperthreaded with model-specific defaults. For example, for Random Forests the defaults are:

        tunemethod=SuccessiveHalvingSearch(
            hpranges = Dict("n_trees"      => [10, 20, 30, 40],
                            "max_depth"    => [5,10,nothing],
                            "min_gain"     => [0.0, 0.1, 0.5],
                            "min_records"  => [2,3,5],
                            "max_features" => [nothing,5,10,30],
                            "beta"         => [0,0.01,0.1]),
            loss = l2loss_by_cv, # works for both regression and classification
            res_shares = [0.08, 0.1, 0.13, 0.15, 0.2, 0.3, 0.4],
            multithreads = false) # RF are already multi-threaded

    The number of candidate models is reduced at each step so as to arrive at a single model. Only supervised model autotuning is currently implemented, but GMM-based clustering autotuning using BIC or AIC is planned.
  • new functions model_load and model_save to load/save trained models from the filesystem
  • new MinMaxScaler (StandardScaler was already available as classical API functions scale and getScalingFactors)
  • many bugfixes/improvements on corner cases
  • new MLJ interface models for NeuralNetworkEstimator

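The general idea of successive halving, mentioned in the autotuning entry above, can be sketched in a few lines of Julia. This is a generic illustration, not BetaML's SuccessiveHalvingSearch: the function name, the "keep the best half" rule and the toy loss are all assumptions made for the example, and the real reduction schedule may differ.

```julia
# Generic successive halving: score every candidate on a small share of the
# resources, keep the best half, and repeat with larger shares.
# `loss(candidate, share)` is assumed to return a number (lower is better).
function successive_halving_sketch(candidates, loss, shares)
    pool = collect(candidates)
    for share in shares
        length(pool) == 1 && break
        scores = [loss(c, share) for c in pool]
        keep = max(1, length(pool) ÷ 2)
        pool = pool[sortperm(scores)[1:keep]]   # best candidates first
    end
    return first(pool)
end

# Toy loss: candidates are tree depths, the best depth is 7, and evaluating
# on a smaller share of the data adds a constant penalty.
toy_loss(depth, share) = abs(depth - 7) + (1 - share)

best = successive_halving_sketch([1, 3, 5, 7, 9, 11, 13, 15], toy_loss, [0.1, 0.3, 1.0])
```

Cheap early rounds discard most candidates, so only a few of them are ever evaluated on the full data.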
Closed issues:

  • Improve oneHotEncode stability for encoding integers embedding categories (#29)
  • initVarainces! doesn't support mixed-type variances (#33)
  • Error generating MLJ model registry (#37)
  • WARNING: could not import Perceptron ... (#38)
  • MLJ model BetaMLGMMRegressor predicting row vectors instead of column vectors (#40)

Published by github-actions[bot] over 3 years ago

BetaML - v0.7.1

Diff since v0.7.0

  • solve issue #37
  • initial attempt to provide plotting of a decision tree

Merged pull requests:

  • 1st attempt to implement AbstractTrees-interface (#34) (@roland-KA)
  • CompatHelper: add new compat entry for AbstractTrees at version 0.4, (keep existing compat) (#35) (@github-actions[bot])
  • AbstractTrees-interface completed (#36) (@roland-KA)

Published by github-actions[bot] over 3 years ago

BetaML - v0.7.0

Diff since v0.6.1

  • new experimental V2 API that implements a "standard" mod = Model([Options]), train!(mod,X,[Y]), predict(mod,[X]) workflow. In BetaML v0.7 this new API is still experimental, as documentation and implementation are not complete (perceptrons and NeuralNetworks are still missing). We plan to make it the default API in BetaML 0.8, when the current API will be deemed deprecated.
  • new Imputation module with several missing-value imputers (MeanImputer, GMMImputer, RFImputer, GeneralImputer) and the related MLJ interfaces. The last one, in particular, allows using any regressor/classifier (not necessarily from BetaML) for which the API described above is valid
  • Cluster module reorganised with only hard clustering algorithms (K-Means and K-medoids), while GMM clustering and the new GMMRegressor1 and GMMRegressor2 are in the new GMM module
  • Split large files into subfiles, like Trees.jl, where DT and RF are now in separate (included) files
  • New oneHotDecoder(x) function in Utils module
  • New dependency on DocStringExtensions.jl
  • Several bugfixes

Published by github-actions[bot] over 3 years ago

BetaML - v0.6.1

Diff since v0.6.0

bugfix in Kernel Perceptron (binary and multi-class) when a single class is present in training (issue #32)

Published by github-actions[bot] over 3 years ago

BetaML - v0.6.0

Diff since v0.5.6

  • bugfixes in MLJ interface, gmm clustering and others
  • API change for print(confusionMatrix) only

Merged pull requests: - Indent pegasos a bit more in the docstring (#30) (@rikhuijzer)

Published by github-actions[bot] over 3 years ago

BetaML - v0.5.6

Diff since v0.5.5

  • bugfixes in MLJ interface, documentation build and a rare case of segfault on Julia 1.5

Closed issues:

  • MLJ traits for GMMClusterer (#26)
  • The input scitypes for trees are incorrect (#28)

Published by github-actions[bot] about 4 years ago

BetaML - v0.5.5

Diff since v0.5.4

  • Added an optional "learnable" parameter to the activation function of VectorFunctionLayer
  • Added a similar ScalarFunctionLayer (useful for multiclass, multi-label classification, see the test added to Nn_test.jl in the previous commit)

Published by github-actions[bot] over 4 years ago

BetaML - v0.5.4

Diff since v0.5.3

Bugfix on pca(), which was reporting the reprojected matrix (and the reprojection vectors) in the opposite order to the one announced (from the most to the least explained variance)

Published by github-actions[bot] over 4 years ago

BetaML - v0.5.3

Diff since v0.5.2

Bugfix on findfirst(), which was making some Base calls that take functions ambiguous. The BetaML version is now restricted to arrays of AbstractString and numbers.

Closed issues: - Tag a new release to enable use with Distributions 0.25 (#25)

Published by github-actions[bot] over 4 years ago

BetaML - v0.5.2

Diff since v0.5.1

  • Started development on STATS (sub-)module ("classical" statistics)
  • Updated to Distributions 0.25

Merged pull requests: - CompatHelper: bump compat for "Distributions" to "0.25" (#24) (@github-actions[bot])

Published by github-actions[bot] over 4 years ago

BetaML - v0.5.1

Diff since v0.5.0

  • Finalised the JOSS paper
  • Some dependencies updated

Merged pull requests:

  • CompatHelper: bump compat for "CategoricalArrays" to "0.10" (#22) (@github-actions[bot])
  • Minor changes to text (#23) (@arfon)

Published by github-actions[bot] over 4 years ago

BetaML - v0.5.0

Diff since v0.4.1

What’s new in v0.5 (compared to 0.4.1):

  • Documentation

    • Extensive step-by-step tutorial to BetaML algorithms (and in a certain sense to ML and Julia in general), with comparisons with the Clustering.jl, GaussianMixtures.jl, Flux.jl and DecisionTree.jl packages
    • Added option to "preview" the documentation without running the code in the tutorial (push!(ARGS,"preview"); include("make.jl"))
  • MLJ API

    • Integration with the MLJ API. The following models have been made available to the MLJ framework: PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeClassifier, DecisionTreeRegressor, RandomForestClassifier, RandomForestRegressor, KMeans, KMedoids, GMMClusterer, MissingImputator
  • Package reorganisation

    • All the functionality of the different sub-modules is now re-exported at the root level, so the user only needs to run using BetaML to access it
    • The Utils module has been split into different files
  • Stochasticity management

    • Added the parameter rng to all stochastic models to allow fine-tuning of the stochasticity/replicability trade-off
    • Added function generateParallelRngs to allow repeatable results independently of the number of threads used
    • Extended the Random.shuffle function to allow multiple matrices and to specify the dimension over which to shuffle
  • Utilities (BetaML.Utils)

    • Added dims and copy parameters to partition
    • Added crossValidation, with a user-defined function/do block and configurable sampler (SamplerWithData{T <: AbstractDataSampler})
    • Added ConfusionMatrix
    • Added the pool1d activation function
  • Other

    • Improved the grid initialisation for clusters
    • Updated the JOSS paper
    • New package dependencies: StableRNGs and ForceImport
    • Several bugfixes, optimisations and updated dependencies (see the commit log for details)

Closed issues:

  • Avoid observation-by-observation construction of UnivariateFinite objects in MLJ interface (#19)
  • MLJ interface: fit should not mutate model fields (#20)

Merged pull requests: - CompatHelper: bump compat for "MLJModelInterface" to "1.0" (#21) (@github-actions[bot])

Published by github-actions[bot] over 4 years ago

BetaML - v0.4.1

Diff since v0.4.0

News in v0.4.1 compared to 0.4:

  • Clustering (BetaML.Clustering)

    • gmm and predictMissing initialisation default to kmeans
    • added given as initialisation for gmm and initMixtures!(v::Vector{AbstractMixtures})
    • added maxIter parameter to gmm
    • added MLJ interface models: KMeans, KMedoids, GMM, MissingImputator
  • Utilities (BetaML.Utils)

    • expanded accuracy(yhatvector,yvector) to use ignoreLabels
    • added (but not used in accuracy) getPermutations(vector,keepStructure)

Published by github-actions[bot] almost 5 years ago

BetaML - v0.4.0

Diff since v0.2.2

News in v0.4 compared to 0.3:

  • Decision Trees / Random Forests (BetaML.Trees)

    • Added support for fully categorical features (i.e. ones that are not even sortable) to tree models. All tree models now accept almost any type as a feature: continuous, categorical, ordinal, missing data...
    • Added oobEstimation to Random Trees and support for tree weights on Random Forests models
  • Perceptron-like models (BetaML.Perceptron)

    • perceptron, kernelPerceptron and pegasos can now perform multiclass classification and report their outputs as "probabilities" (or better, "normalised scores") for each class. Use their [name]Binary version for binary classification on {-1,+1} labels, and/or mode(y) to retrieve a single class prediction for each record.
  • Utilities (BetaML.Utils)

    • Added issortable(array) to check if an array is sortable, i.e. has the method issort defined
    • Added partition() to partition (by rows) one or more matrices according to the predetermined shares, e.g. ((xtrain,xtest),(ytrain,ytest)) = partition([x,y],[0.7,0.3])
    • Added colsWithMissings to check which columns in a matrix actually contain missing values
    • Expanded error() and accuracy() to work with any T categorical value, not just Int64
  • Clustering (BetaML.Clustering)

    • Renamed the em algorithm to gmm
  • MLJ API

    • Experimental initial integration with the MLJ API. For the time being the following models have been made available to the MLJ framework : PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeClassifier, DecisionTreeRegressor, RandomForestClassifier, RandomForestRegressor.
  • Other

    • Moved Continuous Integration to GitHub actions
    • Renamed all rShuffle and sequential parameters in the various algorithms to shuffle
    • New package dependencies: CategoricalArrays and MLJModelInterface
    • Several bugfixes, optimisations and updated dependencies (see the commit log for details)
    • Updated documentation
    • Added option to run partial testing, e.g.: `using Pkg; Pkg.test("BetaML", test_args=["Trees","Clustering","all"])`
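
The row-wise partitioning described for partition() in the list above can be sketched as follows. This is a standalone illustration with a hypothetical function name and keyword; it mimics the call shape shown in the notes but is not BetaML's implementation, whose options and defaults may differ.

```julia
using Random

# Split several matrices by rows, using the same shuffled row order for all of
# them so that records stay aligned across matrices.
function partition_rows_sketch(mats, shares; rng = Random.default_rng())
    n = size(first(mats), 1)
    all(size(m, 1) == n for m in mats) || throw(DimensionMismatch("all matrices need the same number of rows"))
    isapprox(sum(shares), 1.0) || throw(ArgumentError("shares must sum to 1"))
    idx = randperm(rng, n)
    # Cumulative cut points, e.g. shares [0.7, 0.3] with n = 10 give [0, 7, 10].
    cuts = [0; round.(Int, cumsum(shares) .* n)]
    return [[m[idx[cuts[k]+1:cuts[k+1]], :] for k in eachindex(shares)] for m in mats]
end

x = reshape(collect(1.0:20.0), 10, 2)   # 10 records, 2 features
y = reshape(collect(1.0:10.0), 10, 1)   # 10 targets, equal to the first column of x
((xtrain, xtest), (ytrain, ytest)) = partition_rows_sketch([x, y], [0.7, 0.3]; rng = MersenneTwister(42))
```

Because every matrix is indexed with the same permutation, each training row of x still matches its training row of y.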

Closed issues:

  • FYI: NNlib.jl; depend on it? (#11)
  • Random Forest does not appear to work (#12)

Merged pull requests:

  • CompatHelper: bump compat for "Distributions" to "0.24" (#13) (@github-actions[bot])
  • CompatHelper: bump compat for "Zygote" to "0.6" (#14) (@github-actions[bot])
  • CompatHelper: bump compat for "Reexport" to "1.0" (#15) (@github-actions[bot])
  • CompatHelper: add new compat entry for "CategoricalArrays" at version "0.9" (#16) (@github-actions[bot])
  • CompatHelper: bump compat for "PDMats" to "0.11" (#17) (@github-actions[bot])

Published by github-actions[bot] almost 5 years ago

BetaML - v0.2.2

Diff since v0.2.1

What's new (compared to v0.2.0):

PCA Analysis

You can now transform your data using PCA, specifying either the number of dimensions you want to keep or the maximum error (variance) you are willing to accept

kmeans init strategy for em clustering

The expectation-maximisation algorithm for fitting Generative Mixture Models, used to cluster data and impute missing data, can now be automatically initialised with the output of a kmeans clustering (just pass the parameter initStrategy="kmeans").

ADAM optimisation algorithm for neural networks

In addition to the classical Stochastic Gradient Descent, we added the efficient, moment-based ADAM optimiser. The implementation is the same as in the paper where it was introduced, except that the learning rate can be expressed as a (user-provided) function of the epoch rather than being a constant (but we kept t -> 0.001 as the default, as in the paper). The solution we chose proved to be very flexible: adding an optimiser is just a matter of creating a struct that subtypes OptimisationAlgorithm and implementing singleUpdate!(θ,▽,optAlg::OptimisationAlgorithm;nEpoch,nBatch,nBatches,xbatch,ybatch) and, optionally, initOptAlg!(optAlg::OptimisationAlgorithm;θ,batchSize,x,y).
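
To make the "learning rate as a function of the epoch" idea concrete, here is a minimal, self-contained sketch of an ADAM update in plain Julia. It does not use BetaML's OptimisationAlgorithm or singleUpdate! interface: `adam_step!` and its keyword names are hypothetical, and the other defaults (β1 = 0.9, β2 = 0.999, ϵ = 1e-8) are the ones suggested in the ADAM paper.

```julia
# One ADAM step on the parameters θ given the gradient g.
# m and v are the running first and second moment estimates (updated in place);
# t is the step counter, starting from 1; λ maps the epoch to a learning rate.
function adam_step!(θ, g, m, v, t; λ = epoch -> 0.001, epoch = t,
                    β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    @. m = β1 * m + (1 - β1) * g
    @. v = β2 * v + (1 - β2) * g^2
    mhat = m ./ (1 - β1^t)          # bias-corrected first moment
    vhat = v ./ (1 - β2^t)          # bias-corrected second moment
    θ .-= λ(epoch) .* mhat ./ (sqrt.(vhat) .+ ϵ)
    return θ
end

# Toy problem: minimise f(θ) = sum(θ .^ 2), whose gradient is 2θ,
# with a learning rate that decays with the epoch (one step per epoch here).
θ = [1.0, -2.0]
m = zeros(2)
v = zeros(2)
for t in 1:200
    g = 2 .* θ
    adam_step!(θ, g, m, v, t; λ = epoch -> 0.1 / sqrt(epoch))
end
```

Passing the learning rate as a function keeps the update rule itself unchanged while letting the caller choose any schedule, including the constant default.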

Published by github-actions[bot] over 5 years ago

BetaML - v0.2.1

Diff since v0.2.0

New:

  • Added PCA
  • Added kmeans as initStrategy to em/predictMissing

Bugfix: - kmeans should not go toward a divide by zero error anymore

Published by github-actions[bot] over 5 years ago

BetaML - v0.2.0

  • Added several Neural Networks activation functions (thanks to @PallHaraldsson)
  • Generalised EM algorithm to diagonal and full Gaussian mixtures, with the possibility to use user-generated mixtures. Renamed collaborativeFiltering to predictMissing()

Closed issues:

  • Trigger registration to Julia (#1)
  • Rename Nn to NN (#4)

Merged pull requests:

  • CompatHelper: bump compat for "PDMats" to "0.10" (#5) (@github-actions[bot])
  • New and better implementation of activations (#6) (@PallHaraldsson)

Published by github-actions[bot] over 5 years ago