https://github.com/cvxgrp/mlr_fitting
Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.6%) to scientific vocabulary
Repository
Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices
Basic Info
- Host: GitHub
- Owner: cvxgrp
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 9.98 MB
Statistics
- Stars: 7
- Watchers: 3
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices
This repository accompanies the manuscript.
A multilevel low rank (MLR) matrix is a row and column permutation of sum of matrices, each one a block diagonal refinement of the previous one, with all blocks low rank given in factored form. MLR matrices extend low rank matrices but share many of their properties, such as the total storage required and complexity of matrix-vector multiplication.
In this repository we provide solutions to three problems that arise in fitting a given matrix by an MLR matrix in the Frobenius norm 1. factor fitting, where we adjust the factors of the MLR matrix 1. rank allocation, where we choose the ranks of the blocks in each level, subject to the total rank having a given value 1. hierarchical partitioning of rows and columns, along with the ranks and factors.
Installation
To install mlrfit 1) activate virtual environment, 2) clone the repo, 3) from inside the directory run
python3
python setup.py install
Requirements
* python >= 3.9
* numpy >= 1.22.2, scipy >= 1.10.0, pandas >= 1.5.2
* PyTorch >= 1.13
* numba >= 0.55.1
* osmnx >= 1.3.0
* networkx >= 2.7.1
Getting started
Given data matrix $A$, we fit the data using MLR matrix model.
Define MLR matrix object $\hat A$ using
hat_A = mlrfit.MLRMatrix()Define hierarchical partitioning of type
HpartDict- if there is an existing hierarchy, specify it in
hpart - if there is no existing hierarchy, create new
hpartusing spectral partitioning with initial rank allocationranks- by calling
mlrfit.MLRMatrix.hpartition_topdown(A, ranks)
- by calling
- if there is an existing hierarchy, specify it in
Fit $\hat A$ to given $A$
- if rank allocation is given, specify it in
ranksand run factor fit- by calling
mlrfit.MLRMatrix.factor_fit(A, ranks, hpart)
- by calling
- if rank allocation is not given, use rank allocation method with initial rank allocation
ranks- by calling
mlrfit.MLRMatrix.rank_alloc(A, ranks, hpart)
- by calling
- if rank allocation is given, specify it in
Once the $\hat A$ model has been fitted, we can use it for fast linear algebra:
1. Matrix-vector multiplication $\hat A x$ by calling
python3
b = hat_A.matvec(x)
2. Linear system solve $\hat A x = b$ (when $m=n$ and $\hat A$ is invertible)
python3
x = hat_A.solve(b)
3. Least squares $| \hat A x - b|2^2$
```python3
x = hatA.lstsq(b)
```
Hello world
We provide a guideline on how to use our methods using the hello world example.
Example notebooks
We have example notebooks
that show how to use our method on a number of different problems
* asset covariance matrix, see notebook
* distance matrix, see notebook
* DGT matrix, see notebook
Please consult our manuscript for the details of mentioned problems.
Linear algebra
We implement fast linear algebra operations that use sparse form of MLR model, see notebook * matrix vector multiplication * linear system solve: comparison of storage efficiency vs time efficiency between dense and sparse representations of $\hat A$
| $(m \times n) / (mr+nr)$ | $t{\mathrm{dense}} / t{\mathrm{sparse}}$ | | :------------ | :----------- | 10 | 11.5 | 20 | 39.3 | 50 | 200.8 | 80 | 462.9 | 100 | 701.1 | 160 | 1623.2 |
- least squares: comparison of storage efficiency vs time efficiency between dense and sparse representations of $\hat A$
| $(m \times n) / (mr+nr)$ | $t{\mathrm{dense}} / t{\mathrm{sparse}}$ | | :------------ | :----------- | 12.0 | 94.5 | 22.2 | 391.1 | 52.4 | 2435.2 | 72.4 | 5899.3 |
Linear system solve and least squares are based on the parallel direct sparse
solver (PARDISO) implemented with pypardiso.
This results in a speed improvement factor ranging from 10x to 1000x when
compared to solving problems of equivalent size using dense matrices.
Extra
We also provide additional experiments * ALS vs BCD factor fit comparison, see notebook * ALS vs BCD rank allocation comparison, see notebook
Owner
- Name: Stanford University Convex Optimization Group
- Login: cvxgrp
- Kind: organization
- Location: Stanford, CA
- Website: www.stanford.edu/~boyd
- Repositories: 102
- Profile: https://github.com/cvxgrp
GitHub Events
Total
- Watch event: 3
Last Year
- Watch event: 3
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0