OnlineNMF.jl: A Julia Package for Out-of-core and Sparse Non-negative Matrix Factorization

OnlineNMF.jl: A Julia Package for Out-of-core and Sparse Non-negative Matrix Factorization - Published in JOSS (2026)

https://github.com/rikenbit/onlinenmf.jl

Science Score: 87.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: sciencedirect.com, ieee.org, joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bioinformatics dimensionality-reduction julia nmf out-of-core-processing sparse-matrix

Keywords from Contributors

projections control archival generic sequences profiles genomics surrogate ida hybrid-differential-equations
Last synced: 22 days ago

Repository

Online Non-negative Matrix Factorization

Basic Info
Statistics
  • Stars: 2
  • Watchers: 8
  • Forks: 1
  • Open Issues: 1
  • Releases: 10
Created about 7 years ago · Last pushed about 1 month ago
Metadata Files
Readme Contributing License Code of conduct

README.md

OnlineNMF.jl

Online Non-negative Matrix Factorization

📚 Documentation

Description

OnlineNMF.jl provides online NMF functions for extremely large-scale matrices.

Note: The input matrix must be non-negative.
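Because the factorization assumes non-negative input, it can help to validate a matrix before writing it to disk. The helper below is a minimal sketch; `assert_nonnegative` is a hypothetical name and is not part of OnlineNMF.jl or OnlinePCA.jl.

```julia
# Hypothetical helper (not part of OnlineNMF.jl): fail early on negative entries.
function assert_nonnegative(X::AbstractMatrix{<:Real})
    # findfirst returns a CartesianIndex for matrices, or nothing if no match
    bad = findfirst(<(0), X)
    if bad !== nothing
        throw(ArgumentError("negative entry $(X[bad]) at index $(Tuple(bad))"))
    end
    return X
end
```

Note that `NaN` entries pass this check, since `NaN < 0` is false; a stricter check would test for them separately.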

Algorithms

  • Multiplicative Update (MU)
    • Alpha-divergence: Cichocki, A. et al., 2008
      • Alpha=2 : Pearson divergence-based NMF
      • Alpha=0 or 1 : Kullback–Leibler (KL) divergence-based NMF
      • Alpha=0.5 : Hellinger divergence-based NMF
    • Beta-divergence: Févotte, C. et al., 2011, Nakano, M. et al., 2010
      • Beta=2 : Euclidean distance-based NMF with Gaussian distribution
      • Beta=1 : Kullback–Leibler divergence-based NMF with Poisson distribution
      • Beta=0 : Itakura-Saito divergence-based NMF with Gamma distribution
  • Discretized Non-negative Matrix Factorization (DNMF): Koki Tsuyuzaki, 2023
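
For orientation, the multiplicative update for the Beta=2 (Euclidean) case can be sketched in a few lines of plain Julia. This is a minimal in-memory illustration of the classic update rule, assuming the whole matrix fits in memory; it is not OnlineNMF.jl's out-of-core implementation, and the function name `mu_nmf_euclidean` and its arguments are hypothetical.

```julia
using LinearAlgebra
using Random

# In-memory sketch of multiplicative updates for V ≈ W * H under the
# Euclidean (Beta=2) objective. Hypothetical helper, not the package API.
function mu_nmf_euclidean(V::AbstractMatrix{<:Real}, dim::Integer;
                          numepoch::Integer=30, rng=Random.default_rng())
    n, m = size(V)
    W = rand(rng, n, dim)      # non-negative random initialization
    H = rand(rng, dim, m)
    tiny = 1e-10               # guards against division by zero
    errors = Float64[]
    for _ in 1:numepoch
        # Each factor is rescaled elementwise by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H .*= (W' * V) ./ (W' * W * H .+ tiny)
        W .*= (V * H') ./ (W * (H * H') .+ tiny)
        push!(errors, norm(V - W * H))   # Frobenius reconstruction error
    end
    return W, H, errors
end

# Usage on a small synthetic count matrix
V = rand(MersenneTwister(1), 0:5, 30, 12)
W, H, errors = mu_nmf_euclidean(V, 3; numepoch=20, rng=MersenneTwister(2))
```

The multiplicative form is what keeps the factors non-negative without an explicit projection step; the other divergences in the list change the ratio used in each update but keep the same structure.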

Installation

```julia
# Press "]" to enter Pkg REPL mode, then run the following commands.
(@julia) pkg> add https://github.com/rikenbit/OnlineNMF.jl
(@julia) pkg> add PlotlyJS
# Afterwards, press Ctrl + C to leave Pkg REPL mode.
```

Basic API usage

Preprocess of CSV

```julia
using OnlinePCA
using OnlinePCA: write_csv
using OnlineNMF
using Distributions
using DelimitedFiles
using SparseArrays
using MatrixMarket

# CSV (input data is assumed to be non-negative integers)
tmp = mktempdir()
data = rand(Binomial(10, 0.05), 300, 99)
data[1:50, 1:33] .= 100 * data[1:50, 1:33]
data[51:100, 34:66] .= 100 * data[51:100, 34:66]
data[101:150, 67:99] .= 100 * data[101:150, 67:99]
write_csv(joinpath(tmp, "Data.csv"), data)

# Matrix Market (MM)
mmwrite(joinpath(tmp, "Data.mtx"), sparse(data))

# Binary COO (BinCOO): one "row col" pair per line for each non-zero entry
data2 = zeros(Int, 300, 99)
data2[1:50, 1:33] .= 1
data2[51:100, 34:66] .= 1
data2[101:150, 67:99] .= 1
data2[151:300, :] .= 1

bincoofile = joinpath(tmp, "Data.bincoo")
open(bincoofile, "w") do io
    for i in 1:size(data2, 1)
        for j in 1:size(data2, 2)
            if data2[i, j] != 0
                println(io, "$i $j")
            end
        end
    end
end

# Binarization (Zstandard)
csv2bin(csvfile=joinpath(tmp, "Data.csv"), binfile=joinpath(tmp, "Data.zst"))

# Sparsification (Zstandard + MM format)
mm2bin(mmfile=joinpath(tmp, "Data.mtx"), binfile=joinpath(tmp, "Data.mtx.zst"))

# Binarization (BinCOO + Zstandard)
bincoo2bin(bincoofile=bincoofile, binfile=joinpath(tmp, "Data.bincoo.zst"))
```
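
The BinCOO text file written above holds one `row col` pair per line for every non-zero entry. As a sanity check before conversion, such a file can be read back into a sparse binary matrix. The sketch below uses only `SparseArrays`; `read_bincoo` is a hypothetical helper, not a function of OnlinePCA.jl or OnlineNMF.jl, and it assumes the matrix dimensions are known in advance.

```julia
using SparseArrays

# Hypothetical helper: read "row col" pairs into a sparse 0/1 matrix.
function read_bincoo(path::AbstractString, nrow::Integer, ncol::Integer)
    rows = Int[]
    cols = Int[]
    for line in eachline(path)
        isempty(strip(line)) && continue     # skip blank lines
        i, j = parse.(Int, split(line))      # whitespace-separated indices
        push!(rows, i)
        push!(cols, j)
    end
    # Every listed coordinate gets the value 1
    return sparse(rows, cols, ones(Int, length(rows)), nrow, ncol)
end

# Round trip on a tiny example file
demo_path = joinpath(mktempdir(), "Demo.bincoo")
open(demo_path, "w") do io
    println(io, "1 2")
    println(io, "3 1")
end
demo = read_bincoo(demo_path, 3, 2)
```

Comparing `Matrix(read_bincoo(bincoofile, 300, 99))` with `data2` from the block above is one way to confirm that the file was written as intended.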

Setting for plot

```julia
using DataFrames
using PlotlyJS

function subplots(out_nmf, group)
    # data frame
    data_left = DataFrame(nmf1=out_nmf[1][:,1], nmf2=out_nmf[1][:,2], group=group)
    data_right = DataFrame(nmf2=out_nmf[1][:,2], nmf3=out_nmf[1][:,3], group=group)
    # plot
    p_left = Plot(data_left, x=:nmf1, y=:nmf2, mode="markers", marker_size=10, group=:group)
    p_right = Plot(data_right, x=:nmf2, y=:nmf3, mode="markers", marker_size=10, group=:group, showlegend=false)
    p_left.data[1]["marker_color"] = "red"
    p_left.data[2]["marker_color"] = "blue"
    p_left.data[3]["marker_color"] = "green"
    p_right.data[1]["marker_color"] = "red"
    p_right.data[2]["marker_color"] = "blue"
    p_right.data[3]["marker_color"] = "green"
    p_left.data[1]["name"] = "group1"
    p_left.data[2]["name"] = "group2"
    p_left.data[3]["name"] = "group3"
    p_left.layout["title"] = "Component 1 vs Component 2"
    p_right.layout["title"] = "Component 2 vs Component 3"
    p_left.layout["xaxis_title"] = "nmf-1"
    p_left.layout["yaxis_title"] = "nmf-2"
    p_right.layout["xaxis_title"] = "nmf-2"
    p_right.layout["yaxis_title"] = "nmf-3"
    plot([p_left p_right])
end

group = vcat(repeat(["group1"], inner=100), repeat(["group2"], inner=100), repeat(["group3"], inner=100))
```

NMF based on Alpha-Divergence

```julia
out_nmf_alpha = nmf(input=joinpath(tmp, "Data.zst"), dim=3, alpha=2, numepoch=30, algorithm="alpha")

subplots(out_nmf_alpha, group)
```

NMF based on Beta-Divergence

```julia
out_nmf_beta = nmf(input=joinpath(tmp, "Data.zst"), dim=3, beta=2, numepoch=30, algorithm="beta")

subplots(out_nmf_beta, group)
```

Semi-Binary MF based on Beta-Divergence

```julia
out_dnmf_beta = dnmf(input=joinpath(tmp, "Data.zst"), dim=3, beta=2, numepoch=30, binu=10^4)
minimum(out_dnmf_beta[1])
maximum(out_dnmf_beta[1])

subplots(out_dnmf_beta, group)
```

Sparse-NMF based on Alpha-Divergence

```julia
out_sparse_nmf_alpha = sparse_nmf(input=joinpath(tmp, "Data.mtx.zst"), dim=3, alpha=2, numepoch=30, algorithm="alpha")

subplots(out_sparse_nmf_alpha, group)
```

Sparse-NMF based on Beta-Divergence

```julia
out_sparse_nmf_beta = sparse_nmf(input=joinpath(tmp, "Data.mtx.zst"), dim=3, beta=1, numepoch=30, algorithm="beta")

subplots(out_sparse_nmf_beta, group)
```

Sparse-DNMF based on Beta-Divergence

```julia
out_sparse_dnmf_beta = sparse_dnmf(input=joinpath(tmp, "Data.mtx.zst"), dim=3, beta=2, numepoch=30, binu=10^2)
minimum(out_sparse_dnmf_beta[1])
maximum(out_sparse_dnmf_beta[1])

subplots(out_sparse_dnmf_beta, group)
```

BinCOO-NMF based on Alpha-Divergence

```julia
out_bincoo_nmf_alpha = bincoo_nmf(input=joinpath(tmp, "Data.bincoo.zst"), dim=3, alpha=2, numepoch=10, algorithm="alpha")

subplots(out_bincoo_nmf_alpha, group)
```

BinCOO-NMF based on Beta-Divergence

```julia
out_bincoo_nmf_beta = bincoo_nmf(input=joinpath(tmp, "Data.bincoo.zst"), dim=3, beta=2, numepoch=5, algorithm="beta")

subplots(out_bincoo_nmf_beta, group)
```

BinCOO-DNMF based on Beta-Divergence

```julia
out_bincoo_dnmf_beta = bincoo_dnmf(input=joinpath(tmp, "Data.bincoo.zst"), dim=3, beta=2, numepoch=5, binu=10^1)
minimum(out_bincoo_dnmf_beta[1])
maximum(out_bincoo_dnmf_beta[1])

subplots(out_bincoo_dnmf_beta, group)
```

Command line usage

Input files are assumed to be in CSV, MM, or BinCOO format and are first converted to binary files with csv2bin, mm2bin, or bincoo2bin from the OnlinePCA package. The resulting binary file is then passed as the input to the NMF functions in the OnlineNMF package. The NMF functions can also be run as command-line tools with the same parameter names, as shown below.

```bash
# CSV → Julia Binary
julia YOURHOMEDIR/.julia/v0.x/OnlinePCA/bin/csv2bin \
    --csvfile Data.csv --binfile Data.zst

# MM → Julia Binary
julia YOURHOMEDIR/.julia/v0.x/OnlinePCA/bin/mm2bin \
    --mmfile Data.mtx --binfile Data.mtx.zst

# BinCOO → Julia Binary
julia YOURHOMEDIR/.julia/v0.x/OnlinePCA/bin/bincoo2bin \
    --bincoofile Data.bincoo --binfile Data.bincoo.zst

# NMF based on Alpha-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/nmf \
    --input Data.zst --dim 3 \
    --numepoch 10 --alpha 1

# NMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/nmf \
    --input Data.zst --dim 3 \
    --numepoch 10 --beta 2

# DNMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/dnmf \
    --input Data.zst --dim 3 \
    --numepoch 10 --beta 2

# Sparse-NMF based on Alpha-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/sparse_nmf \
    --input Data.mtx.zst --dim 3 \
    --numepoch 10 --alpha 1

# Sparse-NMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/sparse_nmf \
    --input Data.mtx.zst --dim 3 \
    --numepoch 10 --beta 2

# Sparse-DNMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/sparse_dnmf \
    --input Data.mtx.zst --dim 3 \
    --numepoch 10 --beta 2

# BinCOO-NMF based on Alpha-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/bincoo_nmf \
    --input Data.bincoo.zst --dim 3 \
    --numepoch 10 --alpha 1

# BinCOO-NMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/bincoo_nmf \
    --input Data.bincoo.zst --dim 3 \
    --numepoch 10 --beta 2

# BinCOO-DNMF based on Beta-Divergence
julia YOURHOMEDIR/.julia/v0.x/OnlineNMF/bin/bincoo_dnmf \
    --input Data.bincoo.zst --dim 3 \
    --numepoch 10 --beta 2
```

Contributing

If you have suggestions for how OnlineNMF.jl could be improved, or want to report a bug, open an issue! We'd love any and all contributions.

For more, check out the Contributing Guide.

Author

  • Koki Tsuyuzaki

Owner

  • Name: RIKEN BiT
  • Login: rikenbit
  • Kind: organization
  • Location: Japan

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

JOSS Publication

OnlineNMF.jl: A Julia Package for Out-of-core and Sparse Non-negative Matrix Factorization
Published
January 29, 2026
Volume 11, Issue 117, Page 9293
Authors
Koki Tsuyuzaki ORCID
Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Japan, Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research, Japan
Editor
Chris Vernon ORCID
Tags
Non-negative Matrix Factorization Out-of-Core Sparse dimensionality reduction

GitHub Events

Total
  • Release event: 9
  • Delete event: 3
  • Pull request event: 24
  • Issues event: 3
  • Issue comment event: 9
  • Push event: 63
  • Create event: 21
Last Year
  • Release event: 9
  • Delete event: 3
  • Pull request event: 24
  • Issues event: 3
  • Issue comment event: 9
  • Push event: 63
  • Create event: 21

Committers

Last synced: about 2 months ago

All Time
  • Total Commits: 51
  • Total Committers: 4
  • Avg Commits per committer: 12.75
  • Development Distribution Score (DDS): 0.294
Past Year
  • Commits: 46
  • Committers: 4
  • Avg Commits per committer: 11.5
  • Development Distribution Score (DDS): 0.326
Top Committers
Name Email Commits
kokitsuyuzaki k****r@h****p 36
CompatHelper Julia c****y@j****g 9
dependabot[bot] 4****] 3
Rahul Khorana R****0@g****m 3

Issues and Pull Requests

Last synced: 2 months ago

All Time
  • Total issues: 3
  • Total pull requests: 17
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 29 days
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 0.67
  • Average comments per pull request: 0.18
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 14
Past Year
  • Issues: 3
  • Pull requests: 17
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 29 days
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.67
  • Average comments per pull request: 0.18
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 14
Top Authors
Issue Authors
  • kokitsuyuzaki (3)
Pull Request Authors
  • github-actions[bot] (9)
  • dependabot[bot] (5)
  • rahulkhorana (3)
Top Labels
Issue Labels
Pull Request Labels
dependencies (5) github_actions (5)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
juliahub.com: OnlineNMF

Online Non-negative Matrix Factorization

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 8.3%
Average: 21.9%
Dependent packages count: 35.6%
Last synced: about 1 month ago