mlr

Machine Learning in R

https://github.com/mlr-org/mlr

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    24 of 105 committers (22.9%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.7%) to scientific vocabulary

Keywords

classification clustering cran data-science feature-selection hyperparameters-optimization imbalance-correction learners machine-learning mlr multilabel-classification predictive-modeling r r-package regression stacking statistics survival-analysis tuning tutorial

Keywords from Contributors

mlr3 reproducibility gbm automl resampling parallel-computing interface book temporal resampling-methods
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: mlr-org
  • License: other
  • Language: R
  • Default Branch: master
  • Homepage: https://mlr.mlr-org.com
  • Size: 604 MB
Statistics
  • Stars: 1,665
  • Watchers: 105
  • Forks: 405
  • Open Issues: 12
  • Releases: 20
Created over 12 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.md

mlr

Package website: release | dev

Machine learning in R.

Deprecated

{mlr} is considered retired by the mlr-org team. We will not add new features and will only fix severe bugs. We suggest using the new mlr3 framework from now on and for future projects.

Not all features of {mlr} are implemented in {mlr3} yet. If you are missing a crucial feature, please open an issue in the respective mlr3 extension package and do not hesitate to follow up on it.

Installation

Release

    install.packages("mlr")

Development

    remotes::install_github("mlr-org/mlr")

Citing {mlr} in publications

Please cite our JMLR paper [bibtex].

Some parts of the package were created as part of other publications. If you use these parts, please cite the relevant work appropriately. An overview of all {mlr} related publications can be found here.

Introduction

R does not define a standardized interface for its machine-learning algorithms. Therefore, any non-trivial experiment requires you to write lengthy, tedious and error-prone wrappers to call the different algorithms and unify their outputs.

Additionally, you need to implement infrastructure to

  • resample your models
  • optimize hyperparameters
  • select features
  • cope with pre- and post-processing of data
  • compare models in a statistically meaningful way.

As this becomes computationally expensive, you might want to parallelize your experiments as well. Time constraints or a lack of expert programming skills often force users into poor trade-offs in their experiments.

{mlr} provides this infrastructure so that you can focus on your experiments! The framework provides supervised methods like classification, regression and survival analysis along with their corresponding evaluation and optimization methods, as well as unsupervised methods like clustering. It is written in a way that you can extend it yourself or deviate from the implemented convenience methods and construct your own complex experiments or algorithms.
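The workflow described above can be sketched in a few lines. This is a minimal illustration, assuming {mlr} and the rpart package are installed; the iris data set, the decision-tree learner and the 5-fold setup are choices made here for the example and do not come from the README.

```r
library(mlr)

set.seed(1)

# Describe the problem: classify iris species from the four measurements.
task = makeClassifTask(id = "iris", data = iris, target = "Species")

# Select an algorithm through the unified learner interface
# (assumption for this sketch: the rpart package is available).
lrn = makeLearner("classif.rpart")

# 5-fold cross-validation, scored by accuracy and
# mean misclassification error.
rdesc = makeResampleDesc("CV", iters = 5)
res = resample(lrn, task, rdesc, measures = list(acc, mmce),
  show.info = FALSE)

# Performance aggregated over the five folds.
print(res$aggr)
```

Swapping the learner id, for example for another classification learner whose package is installed, leaves the rest of the code unchanged, which is the point of the unified interface.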

Furthermore, the package is connected to the OpenML R package and its online platform, which aims to support collaborative machine learning online and makes it easy to share datasets as well as machine learning tasks, algorithms and experiments in order to support reproducible research.

Features

  • Clear S3 interface to R classification, regression, clustering and survival analysis methods
  • Abstract description of learners and tasks by properties
  • Convenience methods and generic building blocks for your machine learning experiments
  • Resampling methods like bootstrapping, cross-validation and subsampling
  • Extensive visualizations (e.g. ROC curves, predictions and partial predictions)
  • Simplified benchmarking across data sets and learners
  • Easy hyperparameter tuning using different optimization strategies, including potent configurators like
    • iterated F-racing (irace)
    • sequential model-based optimization
  • Variable selection with filters and wrappers
  • Nested resampling of models with tuning and feature selection
  • Cost-sensitive learning, threshold tuning and imbalance correction
  • Wrapper mechanism to extend learner functionality in complex ways
  • Possibility to combine different processing steps into a complex data mining chain that can be jointly optimized
  • OpenML connector for the Open Machine Learning server
  • Built-in parallelization
  • Detailed tutorial
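As an illustration of the tuning and benchmarking features listed above, the following sketch tunes two rpart hyperparameters by random search and then compares the tuned learner with a featureless baseline. It assumes {mlr} and rpart are installed; the data set, the search ranges and the budget of 10 evaluations are arbitrary values picked for the example.

```r
library(mlr)

set.seed(1)

task = makeClassifTask(id = "iris", data = iris, target = "Species")
lrn = makeLearner("classif.rpart")
rdesc = makeResampleDesc("CV", iters = 3)

# Search space (illustrative ranges): complexity parameter and
# minimum number of observations required to attempt a split.
ps = makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 5, upper = 30)
)

# Random search with a small budget of 10 evaluations.
ctrl = makeTuneControlRandom(maxit = 10L)
tuned = tuneParams(lrn, task, resampling = rdesc, par.set = ps,
  control = ctrl, measures = list(acc), show.info = FALSE)
print(tuned$x)  # best hyperparameter setting found
print(tuned$y)  # its cross-validated accuracy

# Benchmark the tuned learner against a featureless baseline.
lrns = list(
  setHyperPars(lrn, par.vals = tuned$x),
  makeLearner("classif.featureless")
)
bmr = benchmark(lrns, task, rdesc, measures = list(acc),
  show.info = FALSE)
perf = getBMRAggrPerformances(bmr, as.df = TRUE)
print(perf)
```

Note that the accuracy reported by tuneParams is optimistically biased because the same folds were used to select the setting; the nested resampling feature listed above addresses this.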

Miscellaneous

Simple usage questions are better asked on Stack Overflow using the mlr tag.

Please note that all of us work in academia and put a lot of work into this project - simply because we like it, not because we are paid for it.

New development efforts should go into {mlr3}. We have our own style guide, which can easily be applied by using the mlr_style from the styler package. See our wiki for more information.

Talks, Workshops, etc.

mlr-outreach holds all outreach activities related to {mlr} and {mlr3}.

Owner

  • Name: mlr-org
  • Login: mlr-org
  • Kind: organization
  • Location: Munich, Germany

GitHub Events

Total
  • Watch event: 30
  • Delete event: 1
  • Push event: 1
  • Fork event: 4
  • Create event: 1
Last Year
  • Watch event: 30
  • Delete event: 1
  • Push event: 1
  • Fork event: 4
  • Create event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 4,308
  • Total Committers: 105
  • Avg Commits per committer: 41.029
  • Development Distribution Score (DDS): 0.717
Past Year
  • Commits: 3
  • Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Bernd Bischl b****l@g****t 1,218
pat-s p****z@g****m 567
Michel m****g@g****m 511
Jakob Richter c****e@j****e 403
Lars Kotthoff l****o@c****a 270
Travis t@n****m 138
Zachary M. Jones z****j@z****m 111
Bernd Bischl y****u@e****m 110
studerus e****s@g****m 98
Lars Kotthoff l****f 74
schiffner s****r@m****e 63
ja-thomas j****s 49
Lars Kotthoff l****o@4****e 48
mb706 m****6 47
kerschke k****e@u****e 30
GitHub n****y@g****m 28
Florian Fendt f****t@g****e 28
Philipp Probst p****t@g****e 28
Giuseppe g****o@g****m 27
Stefan Coors s****s@g****t 25
Jakob Bossek i****o@j****e 25
Maria Erdmann m****n@c****e 24
Travis t****s@n****m 23
hetong007 h****7@g****m 22
Mason Gallo M****o 20
Pascal Kerschke p****e@w****e 20
giuseppec g****o@s****e 19
tkuehn13 t****n@g****e 17
ljudt l****t@t****e 15
Karin Schork s****n@g****m 14
and 75 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 51
  • Total pull requests: 54
  • Average time to close issues: 6 months
  • Average time to close pull requests: 11 days
  • Total issue authors: 36
  • Total pull request authors: 9
  • Average comments per issue: 3.69
  • Average comments per pull request: 0.76
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 11
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • pat-s (8)
  • DrZhaoJie (5)
  • bommert (3)
  • Haridut (2)
  • maozdd (2)
  • decenwang (1)
  • TylerGrantSmith (1)
  • sckott (1)
  • vee3capp (1)
  • nikosGeography (1)
  • arielf (1)
  • SebastianHapp (1)
  • S-UP (1)
  • Sade154 (1)
  • ajing (1)
Pull Request Authors
  • pat-s (32)
  • pre-commit-ci[bot] (11)
  • jokokojote (3)
  • jakob-r (2)
  • mb706 (2)
  • lorenzwalthert (1)
  • bommert (1)
  • zhangzhixi0305 (1)
  • MichaelChirico (1)
Top Labels
Issue Labels
type-bug (24) type-question (22) prio-low (3) stale (3) prio-medium (3) type-documentation (2) prio-high (2) project - Learners (2) project - base (1) effort-simplefix (1) project - tutorial (1) pinned (1) project - Obscure / Potential close? (1) type-enhancement (1)
Pull Request Labels
type-bug (1) project - base (1) type-documentation (1) prio-medium (1)

Packages

  • Total packages: 3
  • Total downloads:
    • cran 47,503 last-month
  • Total docker downloads: 54,336
  • Total dependent packages: 42
    (may contain duplicates)
  • Total dependent repositories: 91
    (may contain duplicates)
  • Total versions: 48
  • Total maintainers: 1
cran.r-project.org: mlr

Machine Learning in R

  • Versions: 26
  • Dependent Packages: 39
  • Dependent Repositories: 90
  • Downloads: 47,503 Last month
  • Docker Downloads: 54,336
Rankings
Forks count: 0.1%
Stargazers count: 0.1%
Dependent packages count: 2.0%
Dependent repos count: 2.4%
Downloads: 3.5%
Average: 5.2%
Docker downloads count: 23.3%
Maintainers (1)
Last synced: 6 months ago
proxy.golang.org: github.com/mlr-org/mlr
  • Versions: 12
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.0%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 6 months ago
conda-forge.org: r-mlr
  • Versions: 10
  • Dependent Packages: 3
  • Dependent Repositories: 1
Rankings
Forks count: 8.4%
Stargazers count: 10.0%
Average: 14.6%
Dependent packages count: 15.6%
Dependent repos count: 24.4%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • ParamHelpers >= 1.10 depends
  • R >= 3.0.2 depends
  • BBmisc >= 1.11 imports
  • XML * imports
  • backports >= 1.1.0 imports
  • checkmate >= 1.8.2 imports
  • data.table >= 1.12.4 imports
  • ggplot2 * imports
  • methods * imports
  • parallelMap >= 1.3 imports
  • stats * imports
  • stringi * imports
  • survival * imports
  • utils * imports
  • C50 * suggests
  • ClusterR * suggests
  • Cubist * suggests
  • DiceKriging * suggests
  • FDboost * suggests
  • FNN * suggests
  • FSelector * suggests
  • FSelectorRcpp >= 0.3.5 suggests
  • GPfit * suggests
  • GenSA * suggests
  • Hmisc * suggests
  • LiblineaR * suggests
  • MASS * suggests
  • PMCMRplus * suggests
  • ROCR * suggests
  • RRF * suggests
  • RSNNS * suggests
  • RWeka * suggests
  • Rmpi * suggests
  • SwarmSVM * suggests
  • TH.data * suggests
  • ada * suggests
  • adabag * suggests
  • batchtools * suggests
  • bit64 * suggests
  • brnn * suggests
  • bst * suggests
  • care * suggests
  • caret >= 6.0 suggests
  • class * suggests
  • clue * suggests
  • cluster * suggests
  • clusterSim >= 0.44 suggests
  • cmaes * suggests
  • cowplot * suggests
  • crs * suggests
  • deepnet * suggests
  • e1071 * suggests
  • earth * suggests
  • elasticnet * suggests
  • emoa * suggests
  • evtree * suggests
  • fda.usc * suggests
  • forecast >= 8.3 suggests
  • fpc * suggests
  • frbs * suggests
  • gbm * suggests
  • ggpubr * suggests
  • glmnet * suggests
  • h2o >= 3.6.0.8 suggests
  • irace >= 2.0 suggests
  • kernlab * suggests
  • kknn * suggests
  • klaR * suggests
  • knitr * suggests
  • laGP * suggests
  • lintr >= 1.0.0.9001 suggests
  • mRMRe * suggests
  • mboost * suggests
  • mco * suggests
  • mda * suggests
  • memoise * suggests
  • mlbench * suggests
  • mldr * suggests
  • mlrMBO * suggests
  • mmpf * suggests
  • modeltools * suggests
  • neuralnet * suggests
  • nnet * suggests
  • numDeriv * suggests
  • pamr * suggests
  • pander * suggests
  • party * suggests
  • pec * suggests
  • penalized >= 0.9 suggests
  • pls * suggests
  • praznik >= 5.0.0 suggests
  • rFerns * suggests
  • randomForest * suggests
  • ranger >= 0.8.0 suggests
  • rappdirs * suggests
  • refund * suggests
  • rex * suggests
  • rgenoud * suggests
  • rmarkdown * suggests
  • rotationForest * suggests
  • rpart * suggests
  • rsm * suggests
  • rucrdtw * suggests
  • sda * suggests
  • sf * suggests
  • smoof * suggests
  • sparseLDA * suggests
  • stepPlr * suggests
  • survAUC * suggests
  • svglite * suggests
  • testthat * suggests
  • tgp * suggests
  • tidyr * suggests
  • tsfeatures * suggests
  • vdiffr * suggests
  • wavelets * suggests
  • xgboost >= 0.7 suggests