DeepRiver

DeepRiver: A Deep Learning Library for Data Streams - Published in JOSS (2025)

https://github.com/online-ml/deep-river

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 8 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

data-science deep-learning incremental-learning machine-learning neural-network online-deep-learning online-learning outlier-detection pytorch stream
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 154
  • Watchers: 5
  • Forks: 21
  • Open Issues: 5
  • Releases: 17
Topics
data-science deep-learning incremental-learning machine-learning neural-network online-deep-learning online-learning outlier-detection pytorch stream
Created about 4 years ago · Last pushed 5 months ago
Metadata Files
Readme · Contributing · License · Citation

README.md


deep-river is a Python library for online deep learning. Its goal is to enable online machine learning with neural networks by combining the river API with the flexibility of designing neural networks in PyTorch.

📚 Documentation

The documentation provides an overview of all features of this repository, together with examples that demonstrate their usage. As we are always looking for further use cases and examples, feel free to contribute to the documentation or to the repository itself via a pull request.

💈 Installation

```shell
pip install deep-river
```

or

```shell
pip install "river[deep]"
```

You can install the latest development version from GitHub as follows:

```shell
pip install https://github.com/online-ml/deep-river/archive/refs/heads/master.zip
```
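As a quick smoke test that the installation worked, you can try importing the package. This is only a sketch; the `__version__` lookup is an assumption and falls back gracefully if the attribute is not defined:

```python
# Minimal post-install smoke test.
# Assumption: the PyPI distribution installs the importable package `deep_river`.
import deep_river
import river

# `__version__` may not be defined on deep_river, hence the fallback.
print("deep-river:", getattr(deep_river, "__version__", "installed"))
print("river:", river.__version__)
```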

Development Environment

For contributing to deep-river, we recommend using uv for fast dependency management and environment setup:

```shell
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/online-ml/deep-river.git
cd deep-river

# Install all dependencies (including dev dependencies)
uv sync --extra dev

# Run tests
make test

# Format code
make format

# Build documentation
make doc
```

🍫 Quickstart

We build the development of neural networks on top of the river API and follow river's design principles. The following example creates a simple MLP architecture in PyTorch and incrementally predicts and trains on the website phishing dataset. For further examples, check out the documentation.

Classification

```python
from river import metrics, datasets, preprocessing, compose
from deep_river import classification
from torch import nn
from torch import optim
from torch import manual_seed

_ = manual_seed(42)


class MyModule(nn.Module):
    def __init__(self, n_features):
        super(MyModule, self).__init__()
        self.dense0 = nn.Linear(n_features, 5)
        self.nonlin = nn.ReLU()
        self.dense1 = nn.Linear(5, 2)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.nonlin(self.dense1(X))
        X = self.softmax(X)
        return X


model_pipeline = compose.Pipeline(
    preprocessing.StandardScaler(),
    classification.ClassifierInitialized(
        module=MyModule(10),
        loss_fn='binary_cross_entropy',
        optimizer_fn='adam',
    ),
)

dataset = datasets.Phishing()
metric = metrics.Accuracy()

for x, y in dataset:
    y_pred = model_pipeline.predict_one(x)  # make a prediction
    metric.update(y, y_pred)                # update the metric
    model_pipeline.learn_one(x, y)          # make the model learn

print(f"Accuracy: {metric.get():.4f}")
# Accuracy: 0.7264
```
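The loop above implements prequential (test-then-train) evaluation by hand: predict, update the metric, then learn. The same evaluation can be written more compactly with river's `evaluate.progressive_val_score` helper, which the multi-target regression example below also uses. A minimal sketch, assuming `MyModule` is defined as in the classification example above:

```python
# Compact prequential evaluation via river's helper.
# Assumes `MyModule` from the classification example above is in scope.
from river import compose, datasets, evaluate, metrics, preprocessing
from deep_river import classification
from torch import manual_seed

_ = manual_seed(42)

model_pipeline = compose.Pipeline(
    preprocessing.StandardScaler(),
    classification.ClassifierInitialized(
        module=MyModule(10),
        loss_fn='binary_cross_entropy',
        optimizer_fn='adam',
    ),
)

# progressive_val_score interleaves predict_one and learn_one internally.
evaluate.progressive_val_score(
    dataset=datasets.Phishing(),
    model=model_pipeline,
    metric=metrics.Accuracy(),
    print_every=500,  # report the running accuracy every 500 samples
)
```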

Multi Target Regression

```python
from river import evaluate, compose
from river import metrics
from river import preprocessing
from river import stream
from sklearn import datasets
from torch import nn
from deep_river.regression.multioutput import MultiTargetRegressorInitialized


class MyModule(nn.Module):
    def __init__(self, n_features):
        super(MyModule, self).__init__()
        self.dense0 = nn.Linear(n_features, 3)

    def forward(self, X, **kwargs):
        X = self.dense0(X)
        return X


dataset = stream.iter_sklearn_dataset(
    dataset=datasets.load_linnerud(),
    shuffle=True,
    seed=42,
)
model = compose.Pipeline(
    preprocessing.StandardScaler(),
    MultiTargetRegressorInitialized(
        module=MyModule(10),
        loss_fn='mse',
        lr=0.3,
        optimizer_fn='sgd',
    ),
)
metric = metrics.multioutput.MicroAverage(metrics.MAE())
ev = evaluate.progressive_val_score(dataset, model, metric)

print(f"MicroAverage(MAE): {metric.get():.2f}")
# MicroAverage(MAE): 34.31
```

Anomaly Detection

```python
from deep_river.anomaly import AutoencoderInitialized
from river import metrics
from river.datasets import CreditCard
from torch import nn
import math
from river.compose import Pipeline
from river.preprocessing import MinMaxScaler

dataset = CreditCard().take(5000)
metric = metrics.RollingROCAUC(window_size=5000)


class MyAutoEncoder(nn.Module):
    def __init__(self, n_features, latent_dim=3):
        super(MyAutoEncoder, self).__init__()
        self.linear1 = nn.Linear(n_features, latent_dim)
        self.nonlin = nn.LeakyReLU()
        self.linear2 = nn.Linear(latent_dim, n_features)
        self.sigmoid = nn.Sigmoid()

    def forward(self, X, **kwargs):
        X = self.linear1(X)
        X = self.nonlin(X)
        X = self.linear2(X)
        return self.sigmoid(X)


ae = AutoencoderInitialized(module=MyAutoEncoder(10), lr=0.005)
scaler = MinMaxScaler()
model = Pipeline(scaler, ae)

for x, y in dataset:
    score = model.score_one(x)
    model.learn_one(x=x)
    metric.update(y, score)

print(f"Rolling ROCAUC: {metric.get():.4f}")
# Rolling ROCAUC: 0.8901
```
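The autoencoder returns a continuous anomaly score (its reconstruction error), so downstream code usually needs a decision rule on top. A hedged sketch of one option, using river's `anomaly.QuantileFilter` to flag scores above a running quantile; this assumes deep-river's `AutoencoderInitialized` is compatible with river's anomaly-detector interface (`score_one`/`learn_one`), which is the library's stated design but is not shown in the example above:

```python
# Sketch: raising binary alarms on top of the continuous anomaly score.
# Assumptions: AutoencoderInitialized behaves as a river AnomalyDetector and
# MyAutoEncoder is defined as in the anomaly-detection example above.
from deep_river.anomaly import AutoencoderInitialized
from river import anomaly
from river.datasets import CreditCard
from river.preprocessing import MinMaxScaler

detector = anomaly.QuantileFilter(
    AutoencoderInitialized(module=MyAutoEncoder(10), lr=0.005),
    q=0.98,  # flag scores above the running 98th percentile
)
scaler = MinMaxScaler()

n_alarms = 0
for x, _ in CreditCard().take(5000):
    scaler.learn_one(x)
    x = scaler.transform_one(x)
    score = detector.score_one(x)
    if detector.classify(score):  # True if the score exceeds the quantile
        n_alarms += 1
    detector.learn_one(x)

print("alarms raised:", n_alarms)
```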

💬 Citation

To acknowledge the use of the DeepRiver library in your research, please cite our paper published in the Journal of Open Source Software (JOSS):

```bibtex
@article{Kulbach2025,
  doi = {10.21105/joss.07226},
  url = {https://doi.org/10.21105/joss.07226},
  year = {2025},
  publisher = {The Open Journal},
  volume = {10},
  number = {105},
  pages = {7226},
  author = {Cedric Kulbach and Lucas Cazzonelli and Hoang-Anh Ngo and Max Halford and Saulo Martiello Mastelini},
  title = {DeepRiver: A Deep Learning Library for Data Streams},
  journal = {Journal of Open Source Software}
}
```

🏫 Affiliations

FZI Research Center for Information Technology (logo)

Lieferbot.net (logo)

Owner

  • Name: The Fellowship of Online Machine Learning
  • Login: online-ml
  • Kind: organization

JOSS Publication

DeepRiver: A Deep Learning Library for Data Streams
Published
January 06, 2025
Volume 10, Issue 105, Page 7226
Authors
Cedric Kulbach ORCID
FZI Research Center for Information Technology, Karlsruhe, Germany
Lucas Cazzonelli ORCID
FZI Research Center for Information Technology, Karlsruhe, Germany
Hoang-Anh Ngo ORCID
AI Institute, University of Waikato, Hamilton, New Zealand
Max Halford ORCID
Carbonfact, Paris, France
Saulo Martiello Mastelini ORCID
Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, Brazil
Editor
Taher Chegini ORCID

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Kulbach
  given-names: Cedric
  orcid: "https://orcid.org/0000-0002-9363-4728"
- family-names: Cazzonelli
  given-names: Lucas
  orcid: "https://orcid.org/0000-0003-2886-1219"
- family-names: Ngo
  given-names: Hoang-Anh
  orcid: "https://orcid.org/0000-0002-7583-753X"
- family-names: Halford
  given-names: Max
  orcid: "https://orcid.org/0000-0003-1464-4520"
- family-names: Mastelini
  given-names: Saulo Martiello
  orcid: "https://orcid.org/0000-0002-0092-3572"
contact:
- family-names: Kulbach
  given-names: Cedric
  orcid: "https://orcid.org/0000-0002-9363-4728"
- family-names: Ngo
  given-names: Hoang-Anh
  orcid: "https://orcid.org/0000-0002-7583-753X"
doi: 10.5281/zenodo.14601979
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Kulbach
    given-names: Cedric
    orcid: "https://orcid.org/0000-0002-9363-4728"
  - family-names: Cazzonelli
    given-names: Lucas
    orcid: "https://orcid.org/0000-0003-2886-1219"
  - family-names: Ngo
    given-names: Hoang-Anh
    orcid: "https://orcid.org/0000-0002-7583-753X"
  - family-names: Halford
    given-names: Max
    orcid: "https://orcid.org/0000-0003-1464-4520"
  - family-names: Mastelini
    given-names: Saulo Martiello
    orcid: "https://orcid.org/0000-0002-0092-3572"
  date-published: 2025-01-06
  doi: 10.21105/joss.07226
  issn: 2475-9066
  issue: 105
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 7226
  title: "DeepRiver: A Deep Learning Library for Data Streams"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.07226"
  volume: 10
title: "DeepRiver: A Deep Learning Library for Data Streams"

GitHub Events

Total
  • Create event: 15
  • Release event: 6
  • Issues event: 17
  • Watch event: 33
  • Delete event: 7
  • Issue comment event: 24
  • Push event: 136
  • Pull request review comment event: 3
  • Pull request review event: 4
  • Pull request event: 30
  • Fork event: 9
Last Year
  • Create event: 15
  • Release event: 6
  • Issues event: 17
  • Watch event: 33
  • Delete event: 7
  • Issue comment event: 24
  • Push event: 136
  • Pull request review comment event: 3
  • Pull request review event: 4
  • Pull request event: 30
  • Fork event: 9

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 693
  • Total Committers: 8
  • Avg Commits per committer: 86.625
  • Development Distribution Score (DDS): 0.599
Past Year
  • Commits: 153
  • Committers: 5
  • Avg Commits per committer: 30.6
  • Development Distribution Score (DDS): 0.203
Top Committers
  • Cedric Kulbach (c****h@g****m): 278
  • Cedric Kulbach (c****c@g****m): 222
  • Lucas Cazzonelli (c****i@f****e): 93
  • kulbach (k****h@f****e): 64
  • Hoang-Anh Ngo (h****o@s****k): 32
  • Max Halford (m****5@g****m): 2
  • gobeumsu (g****u@g****m): 1
  • Jose Enrique (j****o@g****m): 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 54
  • Total pull requests: 72
  • Average time to close issues: 3 months
  • Average time to close pull requests: 1 day
  • Total issue authors: 13
  • Total pull request authors: 6
  • Average comments per issue: 1.28
  • Average comments per pull request: 0.71
  • Merged pull requests: 65
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 9
  • Pull requests: 25
  • Average time to close issues: 15 days
  • Average time to close pull requests: about 14 hours
  • Issue authors: 6
  • Pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.44
  • Merged pull requests: 19
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • kulbachcedric (24)
  • lucasczz (10)
  • ercangunbilek (8)
  • panchao12345 (3)
  • leal2020 (1)
  • alpcansoydas (1)
  • atanikan (1)
  • zeyaddeeb (1)
  • jabowery (1)
  • fox-ds (1)
  • jpbarddal (1)
  • joseEnrique (1)
  • Asuskf (1)
  • albertobotana (1)
Pull Request Authors
  • kulbachcedric (39)
  • lucasczz (22)
  • hoanganhngo610 (8)
  • MaxHalford (3)
  • joseEnrique (2)
  • GoBeromsu (2)
Top Labels
Issue Labels
enhancement (17) bug (9) documentation (2)
Pull Request Labels
enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 220 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
pypi.org: deep-river

Online Deep Learning for river

  • Versions: 9
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 220 Last month
Rankings
Dependent packages count: 2.9%
Stargazers count: 8.7%
Forks count: 11.6%
Average: 13.8%
Downloads: 14.9%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/mkdocs.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/pypi-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/unit-tests.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite