Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: rbordoloi
- License: gpl-3.0
- Language: Jupyter Notebook
- Default Branch: master
- Size: 12.7 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Source Code for the Paper "Multivariate Functional Linear Discriminant Analysis (MUDRA) for the Classification of Short Time Series with Missing Data"
Installation
We advise using a virtual environment, either with Conda or virtualenv. Then run the following commands:
```bash
git clone https://github.com/rbordoloi/MUDRA.git
cd MUDRA/

# If using pip
python -m pip install --upgrade setuptools
python -m pip install -r pip/requirements.txt

# If using conda
conda env create --name mudra --file=environment.yml
python setup.py install --user
```
The sample dataset is included in the datasets directory. To regenerate the dataset from the original source, run datasetGeneration.py.
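After installation, it can be useful to confirm that the required packages are visible in the active environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_versions` and the `REQUIRED` list are illustrative and not part of MUDRA. The distribution names are copied from this repository's dependency list, and the sketch only reports versions without checking them against the minimum requirements.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the repository's dependency list.
REQUIRED = [
    "numpy", "pandas", "scikit-base", "scikit-fda", "scikit-learn",
    "scipy", "sktime", "tensorly", "tqdm",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per dependency so that missing packages stand out.
for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```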
Example
The MUDRA class follows the scikit-learn estimator interface:
- To import the MUDRA class:
```python
from MUDRA import MUDRA
```
The model accepts input X as a pandas DataFrame of shape (n_samples, n_features) and y as a list of class labels. Each cell of the DataFrame holds a pandas Series object containing the time series for one feature of one sample. Each Series is indexed by the time points at which observations were recorded. Missing features are denoted by np.nan.
- To fit the model on training data (X, y), here with r=8, b=9 (passed as nBasis) and 300 iterations for the last optimization step:
```python
model = MUDRA(r=8, n_iter=300, nBasis=9).fit(X, y)
```
- To perform dimension reduction on new data (X):
```python
x = model.transform(X)
```
- To predict labels on new data (X):
```python
y = model.predict(X)
```
- To predict scores on new data (X):
```python
y = model.predict_proba(X)
```
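To make the input format described above concrete, here is a minimal sketch that builds a small synthetic X and y with pandas and NumPy. The helper `make_series`, the feature names, and the sine-shaped values are illustrative assumptions and do not come from the MUDRA code base or the paper's datasets. Each cell holds a Series indexed by its own irregular time points, and one cell is set to np.nan to mark a missing feature.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)


def make_series(n_points):
    """Return one short time series indexed by its own observation times."""
    times = np.sort(rng.uniform(0.0, 1.0, size=n_points))
    values = np.sin(2.0 * np.pi * times) + 0.1 * rng.normal(size=n_points)
    return pd.Series(values, index=times)


n_samples = 4
feature_names = ["feature_0", "feature_1"]  # hypothetical feature names

# One list of Series per feature; series lengths differ between cells.
columns = {
    name: [make_series(int(rng.integers(3, 8))) for _ in range(n_samples)]
    for name in feature_names
}

# Mark the second feature of the last sample as missing.
columns["feature_1"][-1] = np.nan

X = pd.DataFrame(columns)  # shape (n_samples, n_features), object dtype
y = [0, 0, 1, 1]           # one class label per sample

print(X.shape)
print(X.iloc[0, 0])
```

An X and y built this way have the layout that the `fit`, `transform`, `predict`, and `predict_proba` calls above expect. Whether such a tiny toy dataset is enough for the model to fit successfully is not something this sketch checks.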
Reproduce the results shown in the paper
Please check out the interactive Jupyter notebooks "synthetic.ipynb" and "real.ipynb". After installing Jupyter Notebook, run the following commands:
```bash
jupyter notebook real.ipynb
jupyter notebook synthetic.ipynb
```
Citations
If you use MUDRA in academic research, please cite it as follows:
```
@article{bordoloi2025multivariate,
  title={Multivariate functional linear discriminant analysis for partially-observed time series},
  author={Bordoloi, Rahul and R{\'e}da, Cl{\'e}mence and Trautmann, Orell and Bej, Saptarshi and Wolkenhauer, Olaf},
  journal={Machine Learning},
  volume={114},
  number={3},
  pages={80},
  year={2025},
  publisher={Springer}
}
```
The citation for the "Articulary Word Recognition" data set (available in folder "datasets/") is:
```
@article{ruizgreat2021,
  title = {The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances},
  volume = {35},
  issn = {1573-756X},
  doi = {10.1007/s10618-020-00727-3},
  number = {2},
  journal = {Data Min Knowl Disc},
  author = {Ruiz, Alejandro Pasos and Flynn, Michael and Large, James and Middlehurst, Matthew and Bagnall, Anthony},
  month = mar,
  year = {2021},
  pages = {401--449},
}
```
Original link to the freely available dataset is here.
Owner
- Name: Rahul Bordoloi
- Login: rbordoloi
- Kind: user
- Location: Rostock
- Company: University of Rostock
- Repositories: 1
- Profile: https://github.com/rbordoloi
Citation (CITATION.cff)
cff-version: 1.2.0
message: Please cite the following works when using this software.
preferred-citation:
abstract: >-
The more extensive access to time-series data, especially for biomedical
purposes, raises new methodological challenges, particularly regarding
missing values. Functional linear discriminant analysis (FLDA) extends
Linear Discriminant Analysis (LDA)-mediated multiclass classification and
dimension reduction to data in the form of fragmented observations of a
univariate function. For large multivariate and partially-observed data,
there are two challenges: (i) statistical dependencies between different
components of a multivariate function and (ii) heterogeneous sampling times
with missing features. We here develop a multivariate version of FLDA,
called MUDRA, to tackle these challenges and describe a computationally
efficient expectation/conditional-maximisation (ECM) algorithm to infer its
parameters without any tensor inversions. We assess its predictive power on
the “Articulary Words” dataset and show its improvement over the
state-of-the-art, especially in the case of missing data. This advancement
in dimension reduction of multivariate functional data holds promise for
enhancing classification accuracy in scenarios like partially observed short
multivariate time series analysis.
authors:
- family-names: Bordoloi
given-names: Rahul
- family-names: Réda
given-names: Clémence
- family-names: Trautmann
given-names: Orell
- family-names: Bej
given-names: Saptarshi
- family-names: Wolkenhauer
given-names: Olaf
doi: 10.1007/s10994-025-06741-0
identifiers:
- type: doi
value: 10.1007/s10994-025-06741-0
- type: url
value: https://doi.org/10.1007/s10994-025-06741-0
- type: other
value: urn:issn:1573-0565
title: >-
Multivariate functional linear discriminant analysis for partially-observed
time series
url: https://doi.org/10.1007/s10994-025-06741-0
date-published: 2025-02-11
year: 2025
month: 2
issn: 1573-0565
issue: '3'
journal: Machine Learning
start: '80'
type: article
volume: '114'
GitHub Events
Total
- Push event: 3
Last Year
- Push event: 3
Dependencies
- numpy >=1.24.0
- scikit-base >=0.6.0
- scikit-fda >=0.9
- scikit-learn >=1.4.1
- scipy >=1.11.2
- sktime >=0.26.0
- tensorly >=0.8.1
- tqdm ==4.40.0
- pandas >=2.1.4
- scikit-base >=0.7.5
- sktime >=0.27.0