mudra

https://github.com/rbordoloi/mudra

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (16.7%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: rbordoloi
License: gpl-3.0
Language: Jupyter Notebook
Default Branch: master
Size: 12.7 MB

Statistics

Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 0

Created over 2 years ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

Source Code for the Paper "Multivariate Functional Linear Discriminant Analysis (MUDRA) for the Classification of Short Time Series with Missing Data"

Installation

We advise using a virtual environment, created with either Conda or virtualenv. Then run the following commands:

```bash
git clone https://github.com/rbordoloi/MUDRA.git
cd MUDRA/

# If using pip
python -m pip install --upgrade setuptools
python -m pip install -r pip/requirements.txt

# If using conda
conda env create --name mudra --file=environment.yml
python setup.py install --user
```

The sample dataset is included in the datasets directory. To regenerate the dataset from the original source, run datasetGeneration.py.

Example

The class MUDRA follows the scikit-learn estimator interface, that is, it provides fit, transform, predict and predict_proba methods.

To import the MUDRA class:

```python
from MUDRA import MUDRA
```

The model accepts input X as a pandas DataFrame of shape (n_samples, n_features) and y as a list of class labels. Each cell of the DataFrame holds a pandas Series corresponding to the time series for one feature of one sample. Each Series is indexed by the time points at which observations were recorded. Missing features are denoted by np.nan.

To fit the model on training data (X, y) (with r=8, b=9 (nBasis) and 300 iterations (n_iter) for the last optimization step):

```python
model = MUDRA(r=8, n_iter=300, nBasis=9).fit(X, y)
```
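The input format described above can be sketched as follows. This is a minimal illustration with made-up values, not data from the repository: the column names, class labels, time points and the helper `make_series` are all hypothetical, and it assumes that a cell set to `np.nan` is how a missing feature should be encoded.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)


def make_series(time_points):
    """Hypothetical helper: random observations indexed by their recording times."""
    time_points = np.asarray(time_points, dtype=float)
    return pd.Series(rng.normal(size=time_points.size), index=time_points)


# Three samples, two features. Each cell is a pandas Series whose index holds
# the time points; the sampling times may differ from cell to cell.
X = pd.DataFrame({
    "feature_0": [
        make_series([0.0, 0.5, 1.0]),
        make_series([0.1, 0.9]),
        make_series([0.0, 0.3, 0.6, 1.0]),
    ],
    "feature_1": [
        make_series([0.2, 0.8]),
        np.nan,  # assumption: feature_1 is missing for the second sample
        make_series([0.0, 1.0]),
    ],
})
y = ["class_a", "class_b", "class_a"]  # one label per sample (row of X)

print(X.shape)
print(X.loc[0, "feature_0"])
```

A DataFrame built this way has one row per sample and one column per feature, which matches the (n_samples, n_features) shape the model expects.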

To perform dimension reduction on new data (X):

```python
x = model.transform(X)
```

To predict labels on new data (X):

```python
y = model.predict(X)
```

To predict scores on new data (X):

```python
y = model.predict_proba(X)
```
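To show how the fit and predict calls fit together, here is a small hold-out evaluation sketch. It only assumes a scikit-learn-style object with `fit` and `predict`. The function `holdout_accuracy` and the class `MajorityClassifier` are hypothetical and not part of MUDRA; the stand-in classifier is there so the snippet runs without the package installed.

```python
import numpy as np


def holdout_accuracy(model, X_train, y_train, X_test, y_test):
    """Fit a scikit-learn-style classifier and return its accuracy on held-out data."""
    model.fit(X_train, y_train)
    y_pred = np.asarray(model.predict(X_test))
    return float(np.mean(y_pred == np.asarray(y_test)))


class MajorityClassifier:
    """Hypothetical stand-in (not MUDRA): always predicts the most frequent training label."""

    def fit(self, X, y):
        labels, counts = np.unique(np.asarray(y), return_counts=True)
        self.majority_ = labels[np.argmax(counts)]
        return self

    def predict(self, X):
        return np.full(len(X), self.majority_)


# Toy data: the majority training label is "a", so one of two test labels matches.
acc = holdout_accuracy(
    MajorityClassifier(),
    X_train=[[0], [1], [2]], y_train=["a", "a", "b"],
    X_test=[[3], [4]], y_test=["a", "b"],
)
print(acc)
```

With the package installed, an instance such as MUDRA(r=8, n_iter=300, nBasis=9) could be passed in place of the stand-in, together with DataFrames in the input format described earlier.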

Reproduce the results shown in the paper

Please check out the interactive Jupyter notebooks "synthetic.ipynb" and "real.ipynb". After installing Jupyter Notebook, please run the following commands:

```bash
jupyter notebook real.ipynb
jupyter notebook synthetic.ipynb
```

Citations

If you use MUDRA in academic research, please cite it as follows:

```
@article{bordoloi2025multivariate,
  title={Multivariate functional linear discriminant analysis for partially-observed time series},
  author={Bordoloi, Rahul and R{\'e}da, Cl{\'e}mence and Trautmann, Orell and Bej, Saptarshi and Wolkenhauer, Olaf},
  journal={Machine Learning},
  volume={114},
  number={3},
  pages={80},
  year={2025},
  publisher={Springer}
}
```

The citation for the "Articulary Word Recognition" data set (available in folder "datasets/") is:

```
@article{ruizgreat2021,
  title = {The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances},
  volume = {35},
  issn = {1573-756X},
  doi = {10.1007/s10618-020-00727-3},
  number = {2},
  journal = {Data Min Knowl Disc},
  author = {Ruiz, Alejandro Pasos and Flynn, Michael and Large, James and Middlehurst, Matthew and Bagnall, Anthony},
  month = mar,
  year = {2021},
  pages = {401--449},
}
```

Original link to the freely available dataset is here.

Owner

Name: Rahul Bordoloi
Login: rbordoloi
Kind: user
Location: Rostock
Company: University of Rostock

Repositories: 1
Profile: https://github.com/rbordoloi

Citation (CITATION.cff)

cff-version: 1.2.0
message: Please cite the following works when using this software.
preferred-citation:
  abstract: >-
    The more extensive access to time-series data, especially for biomedical
    purposes, raises new methodological challenges, particularly regarding
    missing values. Functional linear discriminant analysis (FLDA) extends
    Linear Discriminant Analysis (LDA)-mediated multiclass classification and
    dimension reduction to data in the form of fragmented observations of a
    univariate function. For large multivariate and partially-observed data,
    there are two challenges: (i) statistical dependencies between different
    components of a multivariate function and (ii) heterogeneous sampling times
    with missing features. We here develop a multivariate version of FLDA,
    called MUDRA, to tackle these challenges and describe a computationally
    efficient expectation/conditional-maximisation (ECM) algorithm to infer its
    parameters without any tensor inversions. We assess its predictive power on
    the “Articulary Words” dataset and show its improvement over the
    state-of-the-art, especially in the case of missing data. This advancement
    in dimension reduction of multivariate functional data holds promise for
    enhancing classification accuracy in scenarios like partially observed short
    multivariate time series analysis.
  authors:
    - family-names: Bordoloi
      given-names: Rahul
    - family-names: Réda
      given-names: Clémence
    - family-names: Trautmann
      given-names: Orell
    - family-names: Bej
      given-names: Saptarshi
    - family-names: Wolkenhauer
      given-names: Olaf
  doi: 10.1007/s10994-025-06741-0
  identifiers:
    - type: doi
      value: 10.1007/s10994-025-06741-0
    - type: url
      value: https://doi.org/10.1007/s10994-025-06741-0
    - type: other
      value: urn:issn:1573-0565
  title: >-
    Multivariate functional linear discriminant analysis for partially-observed
    time series
  url: https://doi.org/10.1007/s10994-025-06741-0
  date-published: 2025-02-11
  year: 2025
  month: 2
  issn: 1573-0565
  issue: '3'
  journal: Machine Learning
  start: '80'
  type: article
  volume: '114'

GitHub Events

Total

Push event: 3

Last Year

Push event: 3

Dependencies

pip/requirements.txt pypi

numpy >=1.24.0
scikit-base >=0.6.0
scikit-fda >=0.9
scikit-learn >=1.4.1
scipy >=1.11.2
sktime >=0.26.0
tensorly >=0.8.1
tqdm ==4.40.0

pyproject.toml pypi

setup.py pypi

numpy >=1.24.0
scikit-base >=0.6.0
scikit-fda >=0.9
scikit-learn >=1.4.1
scipy >=1.11.2
sktime >=0.26.0
tensorly >=0.8.1
tqdm ==4.40.0

environment.yml pypi

pandas >=2.1.4
scikit-base >=0.7.5
sktime >=0.27.0
