Recent Releases of BetaML
BetaML - v0.12.2
BetaML v0.12.2
Introduced the onfail option to retry fitting (with new random initial parameters) when the loss doesn't decrease. Implemented for neural networks.
Merged pull requests:
- Update README.md - don't link to PyJulia (#72) (@PallHaraldsson)
- CompatHelper: bump compat for JLD2 to 0.5, (keep existing compat) (#74) (@github-actions[bot])
- CompatHelper: bump compat for Zygote to 0.7, (keep existing compat) (#76) (@github-actions[bot])
Closed issues:
- [MLJ interface] NeuralNetworkClassifier falsely claims to support Count target (#75)
Scientific Software - Peer-reviewed
- Julia
Published by github-actions[bot] 8 months ago
BetaML - v0.12.0
BetaML v0.12.0
- Added FeatureRanker, a flexible feature ranking estimator using multiple feature importance metrics
- new functions kl_divergence and sobol_index
- added option to tree-based models to ignore specific variables in prediction, by following both splits on the nodes that split on those dimensions, via the keyword ignore_dims of the predict function
- added option sampling_share to the RandomForestEstimator model
- DOC: added Benchmarks (but then temporarily removed because SystemBenchmark is not installable, see this issue)
- DOC: added FeatureRanker tutorial
- bugfix on l2loss_by_cv for unsupervised models
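As a rough illustration of the ignore_dims idea (follow both branches whenever a node splits on an ignored dimension), here is a minimal sketch on a toy regression tree. The types ToyLeaf and ToySplit, the function predict_ignoring, and the choice to weight each reached leaf by its record count are assumptions made for this example; they are not BetaML's internal data structures, and BetaML's exact aggregation rule may differ.

```julia
# Toy tree types (hypothetical, for illustration only)
abstract type ToyNode end

struct ToyLeaf <: ToyNode
    value::Float64   # prediction stored in the leaf
    n::Int           # number of training records that ended in the leaf
end

struct ToySplit <: ToyNode
    dim::Int             # feature index used by the split
    threshold::Float64   # records with x[dim] < threshold go left
    left::ToyNode
    right::ToyNode
end

# Return (weighted sum of leaf values, total weight) for record x
reach(node::ToyLeaf, x, ignore_dims) = (node.value * node.n, node.n)

function reach(node::ToySplit, x, ignore_dims)
    if node.dim in ignore_dims
        # Ignored feature: descend into both children and pool their leaves
        sl, wl = reach(node.left, x, ignore_dims)
        sr, wr = reach(node.right, x, ignore_dims)
        return (sl + sr, wl + wr)
    end
    next = x[node.dim] < node.threshold ? node.left : node.right
    return reach(next, x, ignore_dims)
end

function predict_ignoring(tree::ToyNode, x; ignore_dims=Int[])
    s, w = reach(tree, x, ignore_dims)
    return s / w
end

tree = ToySplit(1, 0.5, ToyLeaf(1.0, 1),
                ToySplit(2, 0.0, ToyLeaf(2.0, 1), ToyLeaf(4.0, 3)))
x = [0.7, 1.0]
p_all  = predict_ignoring(tree, x)                      # uses both features
p_no2  = predict_ignoring(tree, x; ignore_dims=[2])     # pools over the split on feature 2
p_none = predict_ignoring(tree, x; ignore_dims=[1, 2])  # pools over every leaf
```

With every dimension ignored the prediction collapses to the record-weighted mean of all leaves, which is a convenient sanity check for this kind of scheme.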
Published by github-actions[bot] over 1 year ago
BetaML - v0.11.1
BetaML v0.11.1
- changed some keyword arguments of AutoEncoder and PCAEncoder: outdims => encoded_size and innerdims => layers_size
This shouldn't be breaking, as the constructors were adapted to keep accepting the older names (until the next breaking version, 0.12)
Published by github-actions[bot] almost 2 years ago
BetaML - v0.11.0
BetaML v0.11.0
Attention: many breaking changes in this version !!
- experimental new ConvLayer and PoolLayer for convolutional networks. BetaML neural networks work only on CPU, and even on CPU the convolution layers (but not the dense ones) are 2-3 times slower than Flux. Still, they have some quite unique characteristics, like working with any number of dimensions or not requiring AD in most cases, so they may still be useful in some corner situations. Then, if you want to help in porting to GPU... ;-)
- Isolated MLJ interface models into their own Bmlj submodule
- Renamed many models in a consistent way
- Shortened the hyper-parameters and learnable parameters struct names
- Corrected many doc bugs
- Several bugfixes
Published by github-actions[bot] almost 2 years ago
BetaML - v0.10.4
BetaML v0.10.4
- Added models AutoEncoder and MLJ wrapper AutoEncoderMLJ with a m=AutoEncoder(hp); fit!(m,x); x_latent = predict(m,x); x̂ = inverse_predict(m,x_latent) interface. Users can specify the number of dimensions to shrink the data to (outdims), the number of neurons of the inner layers (innerdims), or the full details of the encoding and decoding layers and all the underlying NN options, but all of this remains optional.
- Adapted the l2loss_by_cv function to unsupervised models with inverse_predict
- Several bugfixes
Merged pull requests:
- CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#61) (@github-actions[bot])
- correct typo in AbstractTrees.printnode (#62) (@roland-KA)
Closed issues:
- Deprecation warning from ProgressMeter.jl (#58)
Published by github-actions[bot] about 2 years ago
BetaML - v0.10.1
BetaML v0.10.1
Closed issues:
- target_scitype for MultitargetNeuralNetworkRegressor is too broad (#53)
Merged pull requests:
- CompatHelper: bump compat for StatsBase to 0.34, (keep existing compat) (#54) (@github-actions[bot])
Published by github-actions[bot] over 2 years ago
BetaML - v0.9.5
BetaML v0.9.5
- new function match_known_derivatives to set the corresponding manual derivatives for well-known activation or loss functions in NN, now used as the default for the derivatives instead of nothing (i.e. AD)
- new ConvLayer and PoolLayer. While very flexible (any dimension, any function), they are way too slow for any practical work, even if the convolution ids are cached. A pity!
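The idea of matching known functions to hand-written derivatives can be sketched with a simple lookup table. This is only an illustration under stated assumptions: the names KNOWN_DERIVATIVES and lookup_derivative, and the toy activation functions defined here, are hypothetical and are not BetaML's match_known_derivatives implementation. Returning nothing stands for "no manual derivative known, fall back to AD".

```julia
# Toy activation functions and their analytical derivatives (illustrative)
relu(x)     = max(zero(x), x)
drelu(x)    = x <= zero(x) ? zero(x) : one(x)
sigmoid(x)  = 1 / (1 + exp(-x))
dsigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
dtanh(x)    = 1 - tanh(x)^2

# Hypothetical lookup table: function => manual derivative
const KNOWN_DERIVATIVES = Dict{Function,Function}(
    relu     => drelu,
    sigmoid  => dsigmoid,
    tanh     => dtanh,
    identity => (x -> one(x)),
)

# Return the manual derivative if the function is known, otherwise `nothing`
lookup_derivative(f) = get(KNOWN_DERIVATIVES, f, nothing)

d_relu    = lookup_derivative(relu)
d_unknown = lookup_derivative(x -> x^2)   # not in the table
```

A layer constructor could call such a lookup once and only resort to automatic differentiation when the result is nothing.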
Published by github-actions[bot] about 3 years ago
BetaML - v0.9.1
BetaML v0.9.1
- cleaned up "old" deprecated v1 api
- started implementation (forward pass) of a generic n-dimensional ConvLayer
- printing of Decision Trees (@roland-KA)
- improved documentation
- new dependency on StaticArrays
- changed verbosity in MLJ models
Closed issues:
- Example with GaussianMixtureClusterer (#43)
- Allow verbosity to be any integer? (#44)
Merged pull requests:
- CompatHelper: add new compat entry for StaticArrays at version 1, (keep existing compat) (#45) (@github-actions[bot])
- Adapt to new AbstractTrees.AbstractNode type (#46) (@roland-KA)
Published by github-actions[bot] about 3 years ago
BetaML - v0.9.0
BetaML v0.9.0
- clean up of old functions deemed deprecated: either removed the code or unexported in favour of API v2
- added some optimisations and support for LoopVectorization
- renamed shuffle to consistent_shuffle to avoid type piracy
- completed documentation of MLJ models
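Type piracy here means adding methods to a function a package does not own (Random.shuffle) for types it does not own either (plain arrays); a package-owned function name avoids that. The sketch below shows what a "consistent" shuffle does: one shared permutation applied to several arrays. The name shuffle_together and its signature are assumptions for this example and do not reproduce the consistent_shuffle API.

```julia
using Random

# Hypothetical helper: apply one shared permutation along `dims` to every array
function shuffle_together(arrays::AbstractArray...; dims::Integer=1,
                          rng::AbstractRNG=Random.default_rng())
    n = size(first(arrays), dims)
    all(size(a, dims) == n for a in arrays) ||
        throw(DimensionMismatch("all arrays must have the same size along dims=$dims"))
    perm = randperm(rng, n)
    # selectdim returns a view; copy materialises the shuffled array
    return [copy(selectdim(a, dims, perm)) for a in arrays]
end

x = [1 10; 2 20; 3 30; 4 40]
y = [1, 2, 3, 4]
xs, ys = shuffle_together(x, y; rng=MersenneTwister(123))
```

Because both outputs share the permutation, each row of xs still corresponds to the same position in ys.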
Closed issues:
- Error during precompilation (ERROR: LoadError: InitError: Evaluation into the closed module Perceptron ...) (#42)
Merged pull requests:
- JuliaCall from PythonCall.jl (#41) (@PallHaraldsson)
Published by github-actions[bot] about 3 years ago
BetaML - v0.8.0
BetaML v0.8.0
- support for all models of the new "V2" API that implements a "standard" mod = Model([Options]), fit!(mod,X,[Y]), predict(mod,[X]) workflow (details here). The classic API is now deprecated: some of its functions will be removed in the next BetaML 0.9 versions and some will be unexported.
- standardised function names to follow the Julia style guidelines and the new BetaML code style guidelines (https://sylvaticus.github.io/BetaML.jl/dev/StyleGuide_templates.html)
- new hyper-parameter autotuning method:

      mod = ModelXX(autotune=true)   # --> control autotune with the parameter `tunemethod`
      fit!(mod,x,[y])                # --> autotune happens here together with final tuning
      est = predict(mod,xnew)

  Autotune is hyperthreaded with model-specific defaults. For example for Random Forests the defaults are:

      tunemethod=SuccessiveHalvingSearch(
        hpranges = Dict("n_trees"      => [10, 20, 30, 40],
                        "max_depth"    => [5,10,nothing],
                        "min_gain"     => [0.0, 0.1, 0.5],
                        "min_records"  => [2,3,5],
                        "max_features" => [nothing,5,10,30],
                        "beta"         => [0,0.01,0.1]),
        loss = l2loss_by_cv, # works for both regression and classification
        res_shares = [0.08, 0.1, 0.13, 0.15, 0.2, 0.3, 0.4],
        multithreads = false) # RF are already multi-threaded

  The number of candidate models is reduced at each step until a single model remains. Only supervised model autotuning is currently implemented, but GMM-based clustering autotuning is planned using BIC or AIC.
- new functions model_load and model_save to load/save trained models from the filesystem
- new MinMaxScaler (StandardScaler was already available as classical API functions scale and getScalingFactors)
- many bugfixes/improvements on corner situations
- new MLJ interface models to NeuralNetworkEstimator
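To give a feel for how a successive-halving search narrows the candidates down to one model, here is a minimal sketch. It assumes a halving schedule (keep the better half after each round, a single model after the last) and a user-supplied evaluate(candidate, share) function that returns a loss computed on a share of the data; successive_halving_sketch and toy_loss are hypothetical names, and the actual reduction schedule used by SuccessiveHalvingSearch may differ.

```julia
# Hypothetical sketch: evaluate all survivors on a growing share of the data,
# keep the better half, and stop when a single candidate remains.
function successive_halving_sketch(candidates, res_shares, evaluate)
    survivors = collect(candidates)
    n_rounds = length(res_shares)
    for (r, share) in enumerate(res_shares)
        length(survivors) == 1 && break
        losses = [evaluate(c, share) for c in survivors]
        n_keep = r == n_rounds ? 1 : cld(length(survivors), 2)
        survivors = survivors[sortperm(losses)[1:n_keep]]
    end
    return first(survivors)
end

# Toy loss: pretend the best max_depth is 5; `share` only rescales the loss here
toy_loss(max_depth, share) = abs(max_depth - 5) / share

best = successive_halving_sketch(1:8, [0.1, 0.3, 1.0], toy_loss)
```

The point of the growing shares is that cheap, low-resource evaluations discard clearly bad candidates early, so only a few candidates are ever evaluated on most of the data.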
Closed issues:
- Improve oneHotEncode stability for encoding integers embedding categories (#29)
- initVarainces! doesn't support mixed-type variances (#33)
- Error generating MLJ model registry (#37)
- WARNING: could not import Perceptron ... (#38)
- MLJ model BetaMLGMMRegressor predicting row vectors instead of column vectors (#40)
Published by github-actions[bot] over 3 years ago
BetaML - v0.7.1
BetaML v0.7.1
- solve issue #37
- initial attempt to provide plotting of a decision tree
Merged pull requests:
- 1st attempt to implement AbstractTrees-interface (#34) (@roland-KA)
- CompatHelper: add new compat entry for AbstractTrees at version 0.4, (keep existing compat) (#35) (@github-actions[bot])
- AbstractTrees-interface completed (#36) (@roland-KA)
Published by github-actions[bot] over 3 years ago
BetaML - v0.7.0
BetaML v0.7.0
- new experimental V2 API that implements a "standard" mod = Model([Options]), train!(mod,X,[Y]), predict(mod,[X]) workflow. In BetaML v0.7 this new API is still experimental, as documentation and implementation are not complete (perceptrons and NeuralNetworks are still missing). We plan to make it the default API in BetaML 0.8, when the current API will be deemed deprecated.
- new Imputation module with several missing values imputers MeanImputer, GMMImputer, RFImputer, GeneralImputer and relative MLJ interfaces. The last one, in particular, allows using any regressor/classifier (not necessarily from BetaML) for which the API described above is valid
- Cluster module reorganised with only hard clustering algorithms (K-Means and K-medoids), while GMM clustering and the new GMMRegressor1 and GMMRegressor2 are in the new GMM module
- Split large files into subfiles, like Trees.jl, where DT and RF are now in separate (included) files
- New oneHotDecoder(x) function in the Utils module
- New dependency on DocStringExtensions.jl
- Several bugfixes
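The Model([Options]) / train! / predict pattern described in this release can be sketched with a toy column-mean imputer. Everything below (the ToyMeanImputer type and the locally defined train! and predict functions) is a self-contained illustration written for this note under the assumption that a model is a mutable struct holding its learned state; it is not BetaML's MeanImputer.

```julia
using Statistics

# Hypothetical toy model: stores the learned column means
mutable struct ToyMeanImputer
    col_means::Vector{Float64}
    fitted::Bool
end
ToyMeanImputer() = ToyMeanImputer(Float64[], false)

# "train!" step: learn the mean of each column, skipping missing values
function train!(m::ToyMeanImputer, X::AbstractMatrix)
    m.col_means = [mean(skipmissing(col)) for col in eachcol(X)]
    m.fitted = true
    return m
end

# "predict" step: replace each missing value with the learned column mean
function predict(m::ToyMeanImputer, X::AbstractMatrix)
    m.fitted || error("the model must be trained before calling predict")
    return [ismissing(X[i, j]) ? m.col_means[j] : Float64(X[i, j])
            for i in axes(X, 1), j in axes(X, 2)]
end

X = [1.0 missing; 3.0 4.0; missing 8.0]
mod = ToyMeanImputer()
train!(mod, X)
X_full = predict(mod, X)
```

Note that this sketch does not handle a column made only of missing values, for which no mean can be computed.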
Published by github-actions[bot] over 3 years ago
BetaML - v0.6.0
BetaML v0.6.0
- bugfixes in MLJ interface, gmm clustering and other
- API change for print(confusionMatrix) only
Merged pull requests:
- Indent pegasos a bit more in the docstring (#30) (@rikhuijzer)
Published by github-actions[bot] over 3 years ago
BetaML - v0.5.6
BetaML v0.5.6
- bugfixes in MLJ interface, documentation build and a rare case of segfault on Julia 1.5
Closed issues:
- MLJ traits for GMMClusterer (#26)
- The input scitypes for trees are incorrect (#28)
Published by github-actions[bot] about 4 years ago
BetaML - v0.5.5
BetaML v0.5.5
- Added an optional "learnable" parameter to the activation function of VectorFunctionLayer
- Added similar ScalarFunctionLayer (useful for multiclass, multi-label classification, see the test added to Nn_test.jl in the previous commit)
Published by github-actions[bot] over 4 years ago
BetaML - v0.5.3
BetaML v0.5.3
Bugfix on findfirst(), which was making some Base calls that use functions ambiguous. The BetaML version is now restricted to arrays of AbstractStrings and numbers.
Closed issues:
- Tag a new release to enable use with Distributions 0.25 (#25)
Published by github-actions[bot] over 4 years ago
BetaML - v0.5.2
BetaML v0.5.2
- Started development on STATS (sub-)module ("classical" statistics)
- Updated to Distributions 0.25
Merged pull requests:
- CompatHelper: bump compat for "Distributions" to "0.25" (#24) (@github-actions[bot])
Published by github-actions[bot] over 4 years ago
BetaML - v0.5.1
BetaML v0.5.1
- Finalised the JOSS paper
- Some dependencies updated
Merged pull requests:
- CompatHelper: bump compat for "CategoricalArrays" to "0.10" (#22) (@github-actions[bot])
- Minor changes to text (#23) (@arfon)
Published by github-actions[bot] over 4 years ago
BetaML - v0.5.0
BetaML v0.5.0
What’s new in v0.5 (compared to 0.4.1):
Documentation
- Extensive step-by-step tutorial to BetaML algorithms (and in a certain sense to ML and Julia in general), with comparisons with Clustering.jl, GaussianMixtures.jl, Flux.jl and DecisionTree.jl packages;
- Added option to "preview" the documentation without running the code in the tutorial (push!(ARGS,"preview"); include("make.jl"))
MLJ API
- Integration with the MLJ API. The following models have been made available to the MLJ framework : PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeClassifier, DecisionTreeRegressor, RandomForestClassifier, RandomForestRegressor, KMeans, KMedoids, GMMClusterer, MissingImputator
Package reorganisation
- All the functionality of the different sub-modules is now re-exported at the root level, so the user just needs using BetaML to access it
- The Utils module has been split in different files
Stochasticity management
- Added the parameter rng to all stochastic models to allow fine-tuning of the stochasticity/replicability trade-off
- Added function generateParallelRngs to allow repeatable results independently of the number of threads used
- Extended Random.shuffle function to allow multiple matrices and specify the dimension over which to shuffle
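One way to obtain results that do not depend on the number of threads is to give every unit of work its own random number generator, seeded from a single master generator. The sketch below illustrates that general idea; make_task_rngs and noisy_sum are hypothetical names, and this is an assumption-laden illustration rather than the implementation of generateParallelRngs.

```julia
using Random

# Hypothetical helper: derive one independent RNG per chunk of work from a master RNG
function make_task_rngs(rng::AbstractRNG, n::Integer)
    seeds = rand(rng, UInt32, n)
    return [MersenneTwister(s) for s in seeds]
end

# Each chunk draws from its own RNG, so the result is the same whichever
# thread happens to process it and however many threads are available.
function noisy_sum(master_seed::Integer, n_chunks::Integer)
    rngs = make_task_rngs(MersenneTwister(master_seed), n_chunks)
    partial = zeros(n_chunks)
    Threads.@threads for i in 1:n_chunks
        partial[i] = sum(rand(rngs[i], 100))
    end
    return sum(partial)   # summed in a fixed order
end

a = noisy_sum(42, 8)
b = noisy_sum(42, 8)
```

The design choice that matters is tying each generator to the chunk index rather than to the thread id, since the mapping of chunks to threads can change between runs.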
Utilities (BetaML.Utils)
- Added dims and copy parameters to partition
- Added crossValidation, with a user defined function/do block and configurable sampler (SamplerWithData{T <: AbstractDataSampler})
- Added ConfusionMatrix
- Added the pool1d activation function
Other
- Improved the grid initialisation for clusters
- Updated the JOSS paper
- New package dependencies: StableRNGs and ForceImport
- Several bugfixes, optimisations and updated dependencies (see the commit log for details)
Closed issues:
- Avoid observation-by-observation construction of UnivariateFinite objects in MLJ interface (#19)
- MLJ interface: fit should not mutate model fields (#20)
Merged pull requests:
- CompatHelper: bump compat for "MLJModelInterface" to "1.0" (#21) (@github-actions[bot])
Published by github-actions[bot] over 4 years ago
BetaML - v0.4.1
BetaML v0.4.1
News in v0.4.1 compared to 0.4:
Clustering (BetaML.Clustering)
- gmm and predictMissing initialisation now defaults to kmeans
- added given as initialisation for gmm and initMixtures!(v::Vector{AbstractMixtures})
- added maxIter parameter to gmm
- added MLJ interface models: KMeans, KMedoids, GMM, MissingImputator
Utilities (BetaML.Utils)
- expanded accuracy(yhatvector,yvector) to use ignoreLabels
- added (but not used in accuracy) getPermutations(vector,keepStructure)
Published by github-actions[bot] almost 5 years ago
BetaML - v0.4.0
BetaML v0.4.0
News in v0.4 compared to 0.3:
Decision Trees / Random Forests (BetaML.Trees)
- Added support for fully categorical features (i.e. not even sortable ones) to trees models. All Trees models now accept almost any kind of feature type: continuous, categorical, ordinal, missing data...
- Added oobEstimation to Random Trees and support for trees weights on Random Forests models
Perceptron-like models (BetaML.Perceptron)
- perceptron, kernelPerceptron and pegasos can now perform multiclass classification and report their outputs as "probabilities" (or better, "normalised scores") for each class. Use their [name]Binary version for binary classification on {-1,+1} labels, and/or mode(y) to retrieve a single class prediction for each record.
Utilities (BetaML.Utils)
- Added issortable(array) to check if an array is sortable, i.e. has methods issort defined
- Added partition() to partition (by rows) one or more matrices according to the predetermined shares, e.g. ((xtrain,xtest),(ytrain,ytest)) = partition([x,y],[0.7,0.3])
- Added colsWithMissings to check which columns in a matrix actually have missing values
- Expanded error() and accuracy() to work with any T categorical value, not just Int64
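The row-wise partitioning described for partition() can be sketched in a few lines: draw one permutation of the row indices, cut it according to the cumulative shares, and apply the same cuts to every array. The functions partition_rows and select_rows and the keyword shuffle_rows are hypothetical names chosen for this example; the sketch mirrors the call shape quoted above but is not BetaML's implementation.

```julia
using Random

# Row selection for vectors and matrices
select_rows(d::AbstractVector, rows) = d[rows]
select_rows(d::AbstractMatrix, rows) = d[rows, :]

# Hypothetical sketch: split every array in `data` by rows using the same shares
function partition_rows(data::AbstractVector, shares::AbstractVector{<:Real};
                        shuffle_rows::Bool=true, rng::AbstractRNG=Random.default_rng())
    sum(shares) ≈ 1 || throw(ArgumentError("shares must sum to 1"))
    n = size(first(data), 1)
    all(size(d, 1) == n for d in data) ||
        throw(DimensionMismatch("all arrays must have the same number of rows"))
    idx = shuffle_rows ? randperm(rng, n) : collect(1:n)
    stops = round.(Int, cumsum(shares) .* n)   # last row index of each part
    stops[end] = n                             # guard against rounding
    starts = [1; stops[1:end-1] .+ 1]
    return [[select_rows(d, idx[a:b]) for (a, b) in zip(starts, stops)] for d in data]
end

x = reshape(collect(1.0:20.0), 10, 2)
y = x[:, 1]
((xtrain, xtest), (ytrain, ytest)) = partition_rows([x, y], [0.7, 0.3]; rng=MersenneTwister(1))
```

Using a single index permutation for all arrays is what keeps features and labels aligned after the split.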
Clustering (BetaML.Clustering)
- Renamed the em algorithm to gmm
MLJ API
- Experimental initial integration with the MLJ API. For the time being the following models have been made available to the MLJ framework: PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeClassifier, DecisionTreeRegressor, RandomForestClassifier, RandomForestRegressor.
Other
- Moved Continuous Integration to GitHub actions
- Renamed all rShuffle and sequential parameters in the various algorithms to shuffle
- New package dependencies: CategoricalArrays and MLJModelInterface
- Several bugfixes, optimisations and updated dependencies (see the commit log for details)
- Updated documentation
- Added option to run partial testing, e.g. `using Pkg; Pkg.test("BetaML", test_args=["Trees","Clustering","all"])`
Closed issues:
- FYI: NNlib.jl; depend on it? (#11)
- Random Forest does not appear to work (#12)
Merged pull requests:
- CompatHelper: bump compat for "Distributions" to "0.24" (#13) (@github-actions[bot])
- CompatHelper: bump compat for "Zygote" to "0.6" (#14) (@github-actions[bot])
- CompatHelper: bump compat for "Reexport" to "1.0" (#15) (@github-actions[bot])
- CompatHelper: add new compat entry for "CategoricalArrays" at version "0.9" (#16) (@github-actions[bot])
- CompatHelper: bump compat for "PDMats" to "0.11" (#17) (@github-actions[bot])
Published by github-actions[bot] almost 5 years ago
BetaML - v0.2.2
BetaML v0.2.2
What's new (compared to v0.2.0):
PCA Analysis
You can now transform your data using PCA, specifying either the number of dimensions you want to keep or the maximum error (variance) you are willing to accept.
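The two ways of choosing the output size (a fixed number of dimensions, or a maximum share of variance one accepts to lose) can be sketched with the standard library alone. The function pca_sketch and its keywords k and max_error are hypothetical names for this illustration and do not correspond to BetaML's PCA function or its parameters.

```julia
using LinearAlgebra, Statistics

# Hypothetical sketch: PCA via SVD of the centred data.
# Pass `k` to fix the number of components, or leave it as `nothing`
# to keep the fewest components whose lost variance share is <= max_error.
function pca_sketch(X::AbstractMatrix; k::Union{Nothing,Integer}=nothing, max_error::Real=0.05)
    Xc = X .- mean(X, dims=1)
    F = svd(Xc)
    var_share = F.S .^ 2 ./ sum(F.S .^ 2)
    n_keep = k === nothing ?
        something(findfirst(>=(1 - max_error), cumsum(var_share)), length(var_share)) : k
    P = F.V[:, 1:n_keep]                      # projection matrix
    return (X_reduced = Xc * P, P = P, explained = sum(var_share[1:n_keep]))
end

# The second column is exactly twice the first, so one component is enough
X = [1.0 2.0; 2.0 4.0; 3.0 6.0; 4.0 8.0]
auto = pca_sketch(X; max_error=0.05)   # number of components chosen from the error bound
full = pca_sketch(X; k=2)              # number of components fixed by the user
```

Keeping all components makes the projection lossless, which gives a simple way to check the sketch: projecting back and re-adding the column means recovers the original data.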
kmeans init strategy for em clustering
The expectation-maximisation algorithm for fitting a Generative Mixture Model and clustering data/imputing missing data can now be automatically initialised with the output of a kmeans clustering (just pass the parameter initStrategy="kmeans").
ADAM optimisation algorithm for neural networks
In addition to the classical Stochastic Gradient Descent, we added the efficient, moment-based ADAM optimiser. The implementation is the same as in the paper where it was introduced, with the difference that the learning rate can be expressed as a (user-provided) function of the epoch rather than being a constant (but we kept t -> 0.001 as the default, as in the paper).
The solution we chose proved to be very flexible: adding an optimiser is just a matter of creating a struct that subtypes OptimisationAlgorithm and implementing singleUpdate!(θ,▽,optAlg::OptimisationAlgorithm;nEpoch,nBatch,nBatches,xbatch,ybatch) and, optionally, initOptAlg!(optAlg::OptimisationAlgorithm;θ,batchSize,x,y).
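The "struct plus update function" design can be illustrated with a self-contained ADAM sketch in which the learning rate is a function of the epoch. The names ToyOptimiser, ToyADAM and single_update! and the simplified signature are assumptions for this example; they only mimic the shape of the OptimisationAlgorithm / singleUpdate! interface and are not BetaML code.

```julia
# Hypothetical stand-in for an abstract optimiser type
abstract type ToyOptimiser end

mutable struct ToyADAM <: ToyOptimiser
    lr::Function          # learning rate as a function of the epoch
    beta1::Float64        # decay rate of the first moment
    beta2::Float64        # decay rate of the second moment
    eps::Float64          # numerical stabiliser
    m::Vector{Float64}    # first moment estimate
    v::Vector{Float64}    # second moment estimate
    t::Int                # number of updates done so far
end

# Defaults follow the ADAM paper; `n` is the number of parameters
ToyADAM(n::Integer; lr = epoch -> 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8) =
    ToyADAM(lr, beta1, beta2, eps, zeros(n), zeros(n), 0)

# One in-place ADAM step on parameters `theta` given gradient `grad`
function single_update!(theta, grad, opt::ToyADAM; epoch::Integer)
    opt.t += 1
    opt.m .= opt.beta1 .* opt.m .+ (1 - opt.beta1) .* grad
    opt.v .= opt.beta2 .* opt.v .+ (1 - opt.beta2) .* grad .^ 2
    m_hat = opt.m ./ (1 - opt.beta1^opt.t)   # bias-corrected first moment
    v_hat = opt.v ./ (1 - opt.beta2^opt.t)   # bias-corrected second moment
    theta .-= opt.lr(epoch) .* m_hat ./ (sqrt.(v_hat) .+ opt.eps)
    return theta
end

# Toy problem: minimise sum((theta .- 3).^2), whose gradient is 2(theta .- 3)
loss(theta) = sum((theta .- 3) .^ 2)
theta = [0.0]
opt = ToyADAM(1; lr = epoch -> 0.1)
single_update!(theta, 2 .* (theta .- 3), opt; epoch = 1)
theta_after_first_step = copy(theta)
for epoch in 2:200
    single_update!(theta, 2 .* (theta .- 3), opt; epoch = epoch)
end
```

On the first step the bias corrections make the update size approximately equal to the learning rate, whatever the gradient magnitude, which is why the parameter moves by about 0.1 here.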
Published by github-actions[bot] over 5 years ago
BetaML - v0.2.0
BetaML v0.2.0
- Added several Neural Networks activation functions (thanks to @PallHaraldsson)
- Generalised EM algorithm to diagonal and full Gaussian mixtures, with the possibility to use user-generated mixtures. Renamed collaborativeFiltering to predictMissing()
Closed issues:
- Trigger registration to Julia (#1)
- Rename Nn to NN (#4)
Merged pull requests:
- CompatHelper: bump compat for "PDMats" to "0.10" (#5) (@github-actions[bot])
- New and better implementation of activations (#6) (@PallHaraldsson)
Published by github-actions[bot] over 5 years ago