DeepRiver

DeepRiver: A Deep Learning Library for Data Streams - Published in JOSS (2025)

https://github.com/online-ml/deep-river

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 8 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

data-science deep-learning incremental-learning machine-learning neural-network online-deep-learning online-learning outlier-detection pytorch stream
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 154
  • Watchers: 5
  • Forks: 21
  • Open Issues: 5
  • Releases: 17
Topics
data-science deep-learning incremental-learning machine-learning neural-network online-deep-learning online-learning outlier-detection pytorch stream
Created about 4 years ago · Last pushed 5 months ago
Metadata Files
Readme · Contributing · License · Citation

README.md


deep-river is a Python library for online deep learning. Its goal is to enable online machine learning with neural networks by combining the river API with the flexibility of designing neural networks in PyTorch.

📚 Documentation

The documentation provides an overview of all features of this repository, together with examples that demonstrate their usage. As we are always looking for further use cases and examples, feel free to contribute to the documentation or to the repository itself via a pull request.

💈 Installation

```shell
pip install deep-river
```

or

```shell
pip install "river[deep]"
```

You can install the latest development version from GitHub as follows:

```shell
pip install https://github.com/online-ml/deep-river/archive/refs/heads/master.zip
```
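As a quick smoke test that the installation worked, you can try importing the package. This is only a sketch; the `__version__` lookup is an assumption and falls back gracefully if the attribute is not defined:

```python
# Minimal post-install smoke test.
# Assumption: the PyPI distribution installs the importable package `deep_river`.
import deep_river
import river

# `__version__` may not be defined on deep_river, hence the fallback.
print("deep-river:", getattr(deep_river, "__version__", "installed"))
print("river:", river.__version__)
```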

Development Environment

For contributing to deep-river, we recommend using uv for fast dependency management and environment setup:

```shell
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/online-ml/deep-river.git
cd deep-river

# Install all dependencies (including dev dependencies)
uv sync --extra dev

# Run tests
make test

# Format code
make format

# Build documentation
make doc
```

🍫 Quickstart

We build the development of neural networks on top of the river API and follow river's design principles. The following example creates a simple MLP architecture in PyTorch and incrementally predicts and trains on the website phishing dataset. For further examples, check out the documentation.

Classification

```python
from river import metrics, datasets, preprocessing, compose
from deep_river import classification
from torch import nn
from torch import optim
from torch import manual_seed

_ = manual_seed(42)


class MyModule(nn.Module):
    def __init__(self, n_features):
        super(MyModule, self).__init__()
        self.dense0 = nn.Linear(n_features, 5)
        self.nonlin = nn.ReLU()
        self.dense1 = nn.Linear(5, 2)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.nonlin(self.dense1(X))
        X = self.softmax(X)
        return X


model_pipeline = compose.Pipeline(
    preprocessing.StandardScaler(),
    classification.ClassifierInitialized(
        module=MyModule(10),
        loss_fn='binary_cross_entropy',
        optimizer_fn='adam',
    ),
)

dataset = datasets.Phishing()
metric = metrics.Accuracy()

for x, y in dataset:
    y_pred = model_pipeline.predict_one(x)  # make a prediction
    metric.update(y, y_pred)                # update the metric
    model_pipeline.learn_one(x, y)          # make the model learn

print(f"Accuracy: {metric.get():.4f}")
# Accuracy: 0.7264
```
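The loop above implements prequential (test-then-train) evaluation by hand: predict, update the metric, then learn. The same evaluation can be written more compactly with river's `evaluate.progressive_val_score` helper, which the multi-target regression example below also uses. A minimal sketch, assuming `MyModule` is defined as in the classification example above:

```python
# Compact prequential evaluation via river's helper.
# Assumes `MyModule` from the classification example above is in scope.
from river import compose, datasets, evaluate, metrics, preprocessing
from deep_river import classification
from torch import manual_seed

_ = manual_seed(42)

model_pipeline = compose.Pipeline(
    preprocessing.StandardScaler(),
    classification.ClassifierInitialized(
        module=MyModule(10),
        loss_fn='binary_cross_entropy',
        optimizer_fn='adam',
    ),
)

# progressive_val_score interleaves predict_one and learn_one internally.
evaluate.progressive_val_score(
    dataset=datasets.Phishing(),
    model=model_pipeline,
    metric=metrics.Accuracy(),
    print_every=500,  # report the running accuracy every 500 samples
)
```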

Multi Target Regression

```python
from river import evaluate, compose
from river import metrics
from river import preprocessing
from river import stream
from sklearn import datasets
from torch import nn
from deep_river.regression.multioutput import MultiTargetRegressorInitialized


class MyModule(nn.Module):
    def __init__(self, n_features):
        super(MyModule, self).__init__()
        self.dense0 = nn.Linear(n_features, 3)

    def forward(self, X, **kwargs):
        X = self.dense0(X)
        return X


dataset = stream.iter_sklearn_dataset(
    dataset=datasets.load_linnerud(),
    shuffle=True,
    seed=42,
)
model = compose.Pipeline(
    preprocessing.StandardScaler(),
    MultiTargetRegressorInitialized(
        module=MyModule(10),
        loss_fn='mse',
        lr=0.3,
        optimizer_fn='sgd',
    ),
)
metric = metrics.multioutput.MicroAverage(metrics.MAE())
ev = evaluate.progressive_val_score(dataset, model, metric)

print(f"MicroAverage(MAE): {metric.get():.2f}")
# MicroAverage(MAE): 34.31
```

Anomaly Detection

```python
from deep_river.anomaly import AutoencoderInitialized
from river import metrics
from river.datasets import CreditCard
from torch import nn
import math
from river.compose import Pipeline
from river.preprocessing import MinMaxScaler

dataset = CreditCard().take(5000)
metric = metrics.RollingROCAUC(window_size=5000)


class MyAutoEncoder(nn.Module):
    def __init__(self, n_features, latent_dim=3):
        super(MyAutoEncoder, self).__init__()
        self.linear1 = nn.Linear(n_features, latent_dim)
        self.nonlin = nn.LeakyReLU()
        self.linear2 = nn.Linear(latent_dim, n_features)
        self.sigmoid = nn.Sigmoid()

    def forward(self, X, **kwargs):
        X = self.linear1(X)
        X = self.nonlin(X)
        X = self.linear2(X)
        return self.sigmoid(X)


ae = AutoencoderInitialized(module=MyAutoEncoder(10), lr=0.005)
scaler = MinMaxScaler()
model = Pipeline(scaler, ae)

for x, y in dataset:
    score = model.score_one(x)
    model.learn_one(x=x)
    metric.update(y, score)

print(f"Rolling ROCAUC: {metric.get():.4f}")
# Rolling ROCAUC: 0.8901
```
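The autoencoder returns a continuous anomaly score (its reconstruction error), so downstream code usually needs a decision rule on top. A hedged sketch of one option, using river's `anomaly.QuantileFilter` to flag scores above a running quantile; this assumes deep-river's `AutoencoderInitialized` is compatible with river's anomaly-detector interface (`score_one`/`learn_one`), which is the library's stated design but is not shown in the example above:

```python
# Sketch: raising binary alarms on top of the continuous anomaly score.
# Assumptions: AutoencoderInitialized behaves as a river AnomalyDetector and
# MyAutoEncoder is defined as in the anomaly-detection example above.
from deep_river.anomaly import AutoencoderInitialized
from river import anomaly
from river.datasets import CreditCard
from river.preprocessing import MinMaxScaler

detector = anomaly.QuantileFilter(
    AutoencoderInitialized(module=MyAutoEncoder(10), lr=0.005),
    q=0.98,  # flag scores above the running 98th percentile
)
scaler = MinMaxScaler()

n_alarms = 0
for x, _ in CreditCard().take(5000):
    scaler.learn_one(x)
    x = scaler.transform_one(x)
    score = detector.score_one(x)
    if detector.classify(score):  # True if the score exceeds the quantile
        n_alarms += 1
    detector.learn_one(x)

print("alarms raised:", n_alarms)
```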

💬 Citation

To acknowledge the use of the DeepRiver library in your research, please cite our paper published in the Journal of Open Source Software (JOSS):

```bibtex
@article{Kulbach2025,
  doi = {10.21105/joss.07226},
  url = {https://doi.org/10.21105/joss.07226},
  year = {2025},
  publisher = {The Open Journal},
  volume = {10},
  number = {105},
  pages = {7226},
  author = {Cedric Kulbach and Lucas Cazzonelli and Hoang-Anh Ngo and Max Halford and Saulo Martiello Mastelini},
  title = {DeepRiver: A Deep Learning Library for Data Streams},
  journal = {Journal of Open Source Software}
}
```

🏫 Affiliations

FZI Research Center for Information Technology (logo)

Lieferbot.net (logo)

Owner

  • Name: The Fellowship of Online Machine Learning
  • Login: online-ml
  • Kind: organization

JOSS Publication

DeepRiver: A Deep Learning Library for Data Streams
Published
January 06, 2025
Volume 10, Issue 105, Page 7226
Authors
Cedric Kulbach ORCID
FZI Research Center for Information Technology, Karlsruhe, Germany
Lucas Cazzonelli ORCID
FZI Research Center for Information Technology, Karlsruhe, Germany
Hoang-Anh Ngo ORCID
AI Institute, University of Waikato, Hamilton, New Zealand
Max Halford ORCID
Carbonfact, Paris, France
Saulo Martiello Mastelini ORCID
Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, Brazil
Editor
Taher Chegini ORCID

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Kulbach
  given-names: Cedric
  orcid: "https://orcid.org/0000-0002-9363-4728"
- family-names: Cazzonelli
  given-names: Lucas
  orcid: "https://orcid.org/0000-0003-2886-1219"
- family-names: Ngo
  given-names: Hoang-Anh
  orcid: "https://orcid.org/0000-0002-7583-753X"
- family-names: Halford
  given-names: Max
  orcid: "https://orcid.org/0000-0003-1464-4520"
- family-names: Mastelini
  given-names: Saulo Martiello
  orcid: "https://orcid.org/0000-0002-0092-3572"
contact:
- family-names: Kulbach
  given-names: Cedric
  orcid: "https://orcid.org/0000-0002-9363-4728"
- family-names: Ngo
  given-names: Hoang-Anh
  orcid: "https://orcid.org/0000-0002-7583-753X"
doi: 10.5281/zenodo.14601979
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Kulbach
    given-names: Cedric
    orcid: "https://orcid.org/0000-0002-9363-4728"
  - family-names: Cazzonelli
    given-names: Lucas
    orcid: "https://orcid.org/0000-0003-2886-1219"
  - family-names: Ngo
    given-names: Hoang-Anh
    orcid: "https://orcid.org/0000-0002-7583-753X"
  - family-names: Halford
    given-names: Max
    orcid: "https://orcid.org/0000-0003-1464-4520"
  - family-names: Mastelini
    given-names: Saulo Martiello
    orcid: "https://orcid.org/0000-0002-0092-3572"
  date-published: 2025-01-06
  doi: 10.21105/joss.07226
  issn: 2475-9066
  issue: 105
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 7226
  title: "DeepRiver: A Deep Learning Library for Data Streams"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.07226"
  volume: 10
title: "DeepRiver: A Deep Learning Library for Data Streams"

GitHub Events

Total
  • Create event: 15
  • Release event: 6
  • Issues event: 17
  • Watch event: 33
  • Delete event: 7
  • Issue comment event: 24
  • Push event: 136
  • Pull request review comment event: 3
  • Pull request review event: 4
  • Pull request event: 30
  • Fork event: 9
Last Year
  • Create event: 15
  • Release event: 6
  • Issues event: 17
  • Watch event: 33
  • Delete event: 7
  • Issue comment event: 24
  • Push event: 136
  • Pull request review comment event: 3
  • Pull request review event: 4
  • Pull request event: 30
  • Fork event: 9

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 693
  • Total Committers: 8
  • Avg Commits per committer: 86.625
  • Development Distribution Score (DDS): 0.599
Past Year
  • Commits: 153
  • Committers: 5
  • Avg Commits per committer: 30.6
  • Development Distribution Score (DDS): 0.203
Top Committers
  • Cedric Kulbach (c****h@g****m): 278
  • Cedric Kulbach (c****c@g****m): 222
  • Lucas Cazzonelli (c****i@f****e): 93
  • kulbach (k****h@f****e): 64
  • Hoang-Anh Ngo (h****o@s****k): 32
  • Max Halford (m****5@g****m): 2
  • gobeumsu (g****u@g****m): 1
  • Jose Enrique (j****o@g****m): 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 54
  • Total pull requests: 72
  • Average time to close issues: 3 months
  • Average time to close pull requests: 1 day
  • Total issue authors: 13
  • Total pull request authors: 6
  • Average comments per issue: 1.28
  • Average comments per pull request: 0.71
  • Merged pull requests: 65
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 9
  • Pull requests: 25
  • Average time to close issues: 15 days
  • Average time to close pull requests: about 14 hours
  • Issue authors: 6
  • Pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.44
  • Merged pull requests: 19
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • kulbachcedric (24)
  • lucasczz (10)
  • ercangunbilek (8)
  • panchao12345 (3)
  • leal2020 (1)
  • alpcansoydas (1)
  • atanikan (1)
  • zeyaddeeb (1)
  • jabowery (1)
  • fox-ds (1)
  • jpbarddal (1)
  • joseEnrique (1)
  • Asuskf (1)
  • albertobotana (1)
Pull Request Authors
  • kulbachcedric (39)
  • lucasczz (22)
  • hoanganhngo610 (8)
  • MaxHalford (3)
  • joseEnrique (2)
  • GoBeromsu (2)
Top Labels
Issue Labels
enhancement (17) bug (9) documentation (2)
Pull Request Labels
enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 220 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
pypi.org: deep-river

Online Deep Learning for river

  • Versions: 9
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 220 Last month
Rankings
Dependent packages count: 2.9%
Stargazers count: 8.7%
Forks count: 11.6%
Average: 13.8%
Downloads: 14.9%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/mkdocs.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/pypi-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/unit-tests.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite