mlr3torch

Deep learning framework for the mlr3 ecosystem based on torch

https://github.com/mlr-org/mlr3torch

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.8%) to scientific vocabulary

Keywords

data-science deep-learning machine-learning mlr3 r r-package torch

Keywords from Contributors

stack bagging ensemble-learning pipelines autograding preprocessing dataflow-programming learners distribution interactive
Last synced: 6 months ago

Repository

Deep learning framework for the mlr3 ecosystem based on torch

Basic Info
Statistics
  • Stars: 52
  • Watchers: 7
  • Forks: 8
  • Open Issues: 52
  • Releases: 7
Topics
data-science deep-learning machine-learning mlr3 r r-package torch
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  cache = FALSE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

set.seed(123)
library("mlr3torch")
lgr::get_logger("mlr3")$set_threshold("warn")
```

# mlr3torch 

Package website: [release](https://mlr3torch.mlr-org.com/) | [dev](https://mlr3torch.mlr-org.com/dev/)

Deep Learning with torch and mlr3.


[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![r-cmd-check](https://github.com/mlr-org/mlr3torch/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/mlr-org/mlr3torch/actions/workflows/r-cmd-check.yml)
[![CRAN status](https://www.r-pkg.org/badges/version/mlr3torch)](https://CRAN.R-project.org/package=mlr3torch)
[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)


## Installation

```{r eval = FALSE}
# Install from CRAN
install.packages("mlr3torch")
# Install the development version from GitHub:
pak::pak("mlr-org/mlr3torch")
```

Afterwards, you also need to run the command below:

```{r, eval = FALSE}
torch::install_torch()
```

More information about installing `torch` can be found [here](https://torch.mlverse.org/docs/articles/installation.html).

## What is mlr3torch?

`mlr3torch` is a deep learning framework for the [`mlr3`](https://mlr-org.com) ecosystem built on top of [`torch`](https://torch.mlverse.org/).
It lets you build, train, and evaluate deep learning models in a few lines of code, without having to worry about low-level details.
Off-the-shelf learners are readily available, but custom architectures can be defined by connecting `PipeOpTorch` operators in an `mlr3pipelines::Graph`.

Using a predefined learner such as a simple multi-layer perceptron (MLP) works just like using any other mlr3 `Learner`.

```{r}
library(mlr3torch)
learner_mlp = lrn("classif.mlp",
  # defining network parameters
  activation     = nn_relu,
  neurons        = c(20, 20),
  # training parameters
  batch_size     = 16,
  epochs         = 50,
  device         = "cpu",
  # Proportion of data to use for validation
  validate = 0.3,
  # Defining the optimizer, loss, and callbacks
  optimizer      = t_opt("adam", lr = 0.1),
  loss           = t_loss("cross_entropy"),
  callbacks      = t_clbk("history"), # this saves the history in the learner
  # Measures to track
  measures_valid = msrs(c("classif.logloss", "classif.ce")),
  measures_train = msrs(c("classif.acc")),
  # predict type (required by logloss)
  predict_type = "prob"
)
```

Below, we train this learner on the sonar example task:

```{r}
learner_mlp$train(tsk("sonar"))
```
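
Once trained, the learner can be used for prediction like any other mlr3 `Learner`. A minimal sketch, predicting on the training task purely for illustration:

```{r, eval = FALSE}
# standard mlr3 prediction interface; scored with classification accuracy
prediction = learner_mlp$predict(tsk("sonar"))
prediction$score(msr("classif.acc"))
```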

Next, we construct the same architecture using `PipeOpTorch` objects.
The first pipeop -- a `PipeOpTorchIngress` -- defines the entry point of the network.
All subsequent pipeops define the neural network layers.

```{r}
architecture = po("torch_ingress_num") %>>%
  po("nn_linear", out_features = 20) %>>%
  po("nn_relu") %>>%
  po("nn_head")
```

To turn this architecture into a learner, we configure the loss, optimizer, and callbacks, as well as the training arguments.

```{r}
graph_mlp = architecture %>>%
  po("torch_loss", loss = t_loss("cross_entropy")) %>>%
  po("torch_optimizer", optimizer = t_opt("adam", lr = 0.1)) %>>%
  po("torch_callbacks", callbacks = t_clbk("history")) %>>%
  po("torch_model_classif",
    batch_size = 16, epochs = 50, device = "cpu")

graph_lrn = as_learner(graph_mlp)
```
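
The resulting `GraphLearner` can be used like any other learner, e.g. with mlr3's standard resampling interface. A minimal sketch:

```{r, eval = FALSE}
# 3-fold cross-validation of the graph learner on the sonar task
rr = resample(tsk("sonar"), graph_lrn, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.ce"))
```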

To work with generic tensors, the `lazy_tensor` type can be used.
It wraps a `torch::dataset` and allows the data to be preprocessed lazily using `PipeOp` objects.
Below, we flatten the MNIST task so that we can train a multi-layer perceptron on it.
Note that this does *not* transform the data in memory; the preprocessing is only applied when the data is actually loaded.

```{r}
# load the predefined mnist task
mnist = tsk("mnist")
mnist$head(3L)

# Flatten the images
flattener = po("trafo_reshape", shape = c(-1, 28 * 28))
mnist_flat = flattener$train(list(mnist))[[1L]]

mnist_flat$head(3L)
```

To actually access the tensors, we can call `materialize()`.
We only show a slice of the resulting tensor for readability:

```{r}
materialize(
  mnist_flat$data(1:2, cols = "image")[[1L]],
  rbind = TRUE
)[1:2, 1:4]
```

Below, we define a more complex architecture whose single input is a `lazy_tensor`.
For that, we first define a single residual block:

```{r}
layer = list(
  po("nop"),
  po("nn_linear", out_features = 50L) %>>%
    po("nn_dropout") %>>% po("nn_relu")
) %>>% po("nn_merge_sum")
```

Next, we create a neural network that takes as input a `lazy_tensor` (`po("torch_ingress_ltnsr")`).
It first applies a linear layer and then repeats the above layer using the special `PipeOpTorchBlock`, followed by the network's head.
After that, we configure the loss, optimizer and the training parameters.
Note that `po("nn_linear_0")` is equivalent to `po("nn_linear", id = "nn_linear_0")` and we need this here to avoid ID clashes with the linear layer from `po("nn_block")`.

```{r}
deep_network = po("torch_ingress_ltnsr") %>>%
  po("nn_linear", out_features = 50L) %>>%
  po("nn_block", layer, n_blocks = 5L) %>>%
  po("nn_head") %>>%
  po("torch_loss", loss = t_loss("cross_entropy")) %>>%
  po("torch_optimizer", optimizer = t_opt("adam")) %>>%
  po("torch_model_classif",
    epochs = 100L, batch_size = 32
  )
```
Next, we prepend the preprocessing step that flattens the images, so that we can apply this learner directly to the unflattened MNIST task.

```{r}
deep_learner = as_learner(
  flattener %>>% deep_network
)
deep_learner$id = "deep_network"
```

To keep track of the performance during training, we hold out 20% of the data for validation and evaluate it with classification accuracy.

```{r}
set_validate(deep_learner, 0.2)
deep_learner$param_set$set_values(
  torch_model_classif.measures_valid = msr("classif.acc")
)
```

```{r, include = FALSE}
# so it renders faster
deep_learner$param_set$values$torch_model_classif.epochs = 1L
mnist$filter(1:5)
```

All that is left is to train the learner:

```{r}
deep_learner$train(mnist)
```
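
After training, the tracked validation performance can be inspected. A minimal sketch, assuming mlr3's internal validation API:

```{r, eval = FALSE}
# validation measures configured above, evaluated during training
deep_learner$internal_valid_scores
```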

## Feature Overview

* Off-the-shelf architectures are readily available as `mlr3::Learner`s.
* Currently, supervised regression and classification are supported.
* Custom learners can be defined using the `Graph` language from `mlr3pipelines`.
* The package supports tabular data, as well as generic tensors via the `lazy_tensor` type.
* Multi-modal data can be handled conveniently, as `lazy_tensor` objects can be stored alongside tabular data.
* It is possible to customize the training process via (predefined or custom) callbacks.
* The package is fully integrated into the `mlr3` ecosystem.
* Neural network architectures, as well as their hyperparameters, can be easily tuned via `mlr3tuning` and friends (see the sketch below).
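
As an illustration of the last point, here is a minimal tuning sketch. It assumes `mlr3tuning` and mlr3torch's convention of exposing optimizer parameters with an `opt.` prefix (here `opt.lr` for the learning rate):

```{r, eval = FALSE}
library(mlr3tuning)

# mark the learning rate for tuning on a log scale
learner = lrn("classif.mlp",
  epochs = 10, batch_size = 16, neurons = c(20, 20),
  opt.lr = to_tune(1e-4, 1e-1, logscale = TRUE)
)

# random search over the learning rate, evaluated with 3-fold CV
instance = tune(
  tuner = tnr("random_search"),
  task = tsk("sonar"),
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 10
)
instance$result
```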

## Documentation

* Start by reading one of the vignettes on the package website!
* There is a [course on `(mlr3)torch`](https://mlr-org.github.io/mlr3torch-course/).
* You can check out our [presentation from UseR 2024](https://sebffischer.github.io/mlr3torch-UseR-2024/#/).

## Contributing

* To run the tests, set the environment variable `TEST_TORCH = 1`, e.g. by adding it to your `.Renviron` file or by setting it per session as sketched below.
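
A sketch of the per-session variant, assuming `testthat`:

```{r, eval = FALSE}
# enable the torch tests for this session only, then run the test suite
Sys.setenv(TEST_TORCH = "1")
testthat::test_local()
```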

## Acknowledgements

* Without the great R package `torch`, none of this would have been possible.
* The names for the callback stages are taken from [luz](https://mlverse.github.io/luz/), another high-level deep learning framework for R `torch`.
* Building neural networks using `PipeOpTorch` operators is inspired by [keras](https://keras.io/).
* This R package is developed as part of the [Mathematical Research Data Initiative](https://www.mardi4nfdi.de/about/mission).

## Bugs, Questions, Feedback

*mlr3torch* is a free and open source software project that
encourages participation and feedback. If you have any issues,
questions, suggestions or feedback, please do not hesitate to open an
“issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a
“minimum working example” that showcases the behaviour (but don’t
worry about this if the bug is obvious).

Please understand that the resources of the project are limited:
response may sometimes be delayed by a few days, and some feature
suggestions may be rejected if they are deemed too tangential to the
vision behind the project.

Owner

  • Name: mlr-org
  • Login: mlr-org
  • Kind: organization
  • Location: Munich, Germany

GitHub Events

Total
  • Create event: 59
  • Release event: 3
  • Issues event: 124
  • Watch event: 9
  • Delete event: 46
  • Issue comment event: 94
  • Push event: 701
  • Pull request event: 94
  • Pull request review comment event: 171
  • Pull request review event: 100
  • Fork event: 1
Last Year
  • Identical to the all-time counts above.

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 374
  • Total Committers: 9
  • Avg Commits per committer: 41.556
  • Development Distribution Score (DDS): 0.214
Past Year
  • Commits: 141
  • Committers: 6
  • Avg Commits per committer: 23.5
  • Development Distribution Score (DDS): 0.163
Top Committers (name · email · commits)
  • Sebastian Fischer · s****r@g****m · 294
  • Lukas Burk · b****k@l****e · 47
  • dependabot[bot] · 4****] · 12
  • cxzhang4 · c****4@g****m · 11
  • Sebastian Fischer · s****6@w****e · 5
  • mb706 · m****r@m****m · 2
  • Toby Dylan Hocking · t****g@r****g · 1
  • Maximilian Mücke · m****n@g****m · 1
  • Charlie Gao · 5****o · 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 257
  • Total pull requests: 178
  • Average time to close issues: 5 months
  • Average time to close pull requests: 17 days
  • Total issue authors: 11
  • Total pull request authors: 9
  • Average comments per issue: 0.58
  • Average comments per pull request: 0.24
  • Merged pull requests: 140
  • Bot issues: 0
  • Bot pull requests: 18
Past Year
  • Issues: 78
  • Pull requests: 112
  • Average time to close issues: 23 days
  • Average time to close pull requests: 7 days
  • Issue authors: 6
  • Pull request authors: 4
  • Average comments per issue: 0.63
  • Average comments per pull request: 0.28
  • Merged pull requests: 85
  • Bot issues: 0
  • Bot pull requests: 9
Top Authors
Issue Authors
  • sebffischer (224)
  • jemus42 (12)
  • tdhock (10)
  • iLivius (2)
  • pfistfl (2)
  • mb706 (2)
  • MislavSag (1)
  • jurbanhost (1)
  • lorenzwalthert (1)
  • Rud854 (1)
  • wmay (1)
Pull Request Authors
  • sebffischer (117)
  • cxzhang4 (30)
  • dependabot[bot] (18)
  • tdhock (5)
  • jemus42 (2)
  • shikokuchuo (2)
  • m-muecke (2)
  • HarutyunyanLiana (1)
  • mb706 (1)
Top Labels
Issue Labels
enhancement (14) good first issue (13) needs discussion (12) bug (11) workshop (10) Prio: Low (7) Modality: Vision (6) Prio: High (6) Modality: Tabular (4) Type: Feature (3) cloning (3) Status: Blocked (3) torch (3) documentation (3) convenience (3) pipelines (3) layer operations (3) Prio: Medium (2) type-maintenance (2) Transfer Learning (1) TasksAndBackends (1)
Pull Request Labels
dependencies (18) bug (2) github_actions (1)

Packages

  • Total packages: 1
  • Total downloads: 1,496 last month (CRAN)
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 7
  • Total maintainers: 1
cran.r-project.org: mlr3torch

Deep Learning with 'mlr3'

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,496 Last month
Rankings
  • Dependent packages count: 28.6%
  • Dependent repos count: 35.2%
  • Downloads: 86.7%
  • Average: 50.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R6 * imports
  • backports * imports
  • checkmate * imports
  • coro * imports
  • data.table * imports
  • fs * imports
  • magick * imports
  • methods * imports
  • mlr3 * imports
  • mlr3misc * imports
  • mlr3pipelines * imports
  • paradox * imports
  • progress * imports
  • rlang * imports
  • torch * imports
  • torchvision * imports
  • zeallot * imports
  • lgr * suggests
  • tabnet * suggests
  • testthat >= 3.0.0 suggests
  • zip * suggests
.github/workflows/pkgdown.yml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/r-cmd-check.yml actions
  • actions/checkout v3 composite
  • mxschmitt/action-tmate v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite