https://github.com/smonto2/ptfa

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary

Last synced: 11 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: smonto2
License: other
Language: Jupyter Notebook
Default Branch: main
Size: 12.2 MB

Statistics

Stars: 0
Watchers: 2
Forks: 2
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed 11 months ago

Metadata Files

Readme License

Probabilistic Targeted Factor Analysis (PTFA)

ptfa provides an implementation of Probabilistic Targeted Factor Analysis, a probabilistic extension of Partial Least Squares (PLS) designed to extract latent factors from features $(X)$ to optimally predict a set of pre-specified target variables $(Y)$. It leverages an Expectation-Maximization (EM) algorithm for robust parameter estimation, accommodating challenges such as missing data, stochastic volatility, and dynamic factors.

The framework balances flexibility and efficiency, providing an alternative to traditional methods like principal component analysis (PCA) and standard PLS by incorporating probabilistic foundations.

Features

- Joint estimation of latent factors and parameters.
- Robust against noise, missing data, and model uncertainty.
- Extensible to stochastic volatility, mixed-frequency data and dynamic factor models.
- Competitive performance in high-dimensional forecasting tasks.

Installation

You can install ptfa from PyPI:

```bash
pip install ptfa
```

Routines

The ptfa module includes several classes aimed at implementing PTFA in a variety of real-world data settings:

- ProbabilisticTFA: main workhorse class, providing factor extraction from features X to predict targets Y by extracting n_components common latent factors.
- ProbabilisticTFA_MixedFrequency: adapts to situations where X is naturally measured at a higher frequency than Y (e.g., using monthly information to predict quarterly variables).
- ProbabilisticTFA_StochasticVolatility: adapts the main class to deal with stochastic volatility (variance changing with time) in features and targets.
- ProbabilisticTFA_DynamicFactors: for factors that exhibit time-series persistence, fits a first-order vector autoregressive, VAR(1), process on the latent factors.
- ProbabilisticPCA: an implementation of the original Probabilistic PCA (Tipping and Bishop, 1999, JRSS-B) to compare with PTFA.
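To make the dynamic-factor setting concrete, the following is a minimal simulation sketch of data driven by persistent latent factors. It is not taken from the package: the coefficient matrix `A`, the loadings `P` and `Q`, the noise scale and all dimensions are hypothetical choices made for illustration, and only NumPy is used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: periods, factors, features, targets
T, n_factors, n_features, n_targets = 200, 2, 10, 2

# Hypothetical VAR(1) coefficient matrix; its eigenvalues (0.7 and 0.5)
# lie inside the unit circle, so the simulated factors are stationary
A = np.array([[0.7, 0.1],
              [0.0, 0.5]])

# Simulate persistent factors: f_t = A f_{t-1} + v_t with Gaussian shocks
factors = np.zeros((T, n_factors))
for t in range(1, T):
    factors[t] = A @ factors[t - 1] + rng.standard_normal(n_factors)

# Hypothetical loadings mapping the shared factors to features and targets
P = rng.standard_normal((n_features, n_factors))
Q = rng.standard_normal((n_targets, n_factors))

# Features and targets are noisy linear functions of the same factors
X = factors @ P.T + 0.5 * rng.standard_normal((T, n_features))
Y = factors @ Q.T + 0.5 * rng.standard_normal((T, n_targets))
```

Arrays shaped like `X` and `Y` here, with observations in rows, match the layout used in the usage example further down.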

All classes have the following methods in common:

- __init__(self, n_components): creates the class instance with the specified number of latent components.
- fit(self, X, Y, ...): fits the PTFA model to the given data using a tailored EM algorithm for each class and extracts latent factors.
- fitted(self, ...): computes the in-sample fitted values for the targets.
- predict(self, X): computes out-of-sample predicted values of the targets using new features X.
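Because the classes share this interface, evaluation code can be written once against the method names alone. The sketch below assumes only that a model exposes `fit(X, Y)` and `predict(X)` as listed above; `MeanModel` and `out_of_sample_rmse` are hypothetical names introduced for illustration and are not part of ptfa. The stand-in model keeps the example runnable without the package installed.

```python
import numpy as np


class MeanModel:
    """Hypothetical stand-in with the same fit/predict method names."""

    def __init__(self, n_components):
        self.n_components = n_components  # unused by this toy model

    def fit(self, X, Y):
        # Remember the column means of the targets, ignoring NaN entries
        self.mean_ = np.nanmean(Y, axis=0)

    def predict(self, X):
        # Predict the stored mean for every new observation
        return np.tile(self.mean_, (X.shape[0], 1))


def out_of_sample_rmse(model, X_train, Y_train, X_test, Y_test):
    """Fit on the training split and return the per-target test RMSE."""
    model.fit(X_train, Y_train)
    errors = Y_test - model.predict(X_test)
    return np.sqrt(np.mean(errors ** 2, axis=0))


rng = np.random.default_rng(0)
X = rng.random((100, 10))
Y = rng.random((100, 2))

# Hold out the last 20 observations for evaluation
rmse = out_of_sample_rmse(MeanModel(n_components=1),
                          X[:80], Y[:80], X[80:], Y[80:])
print(rmse)
```

In principle any object with the same two methods could be passed in place of `MeanModel`, which makes it easy to compare several model classes on an identical split.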

In addition, each class comes equipped with specific functions to handle the respective data-generating processes. More details on the routines, and on the additional arguments (shown as ... above) that each method accepts, can be found in the documentation for each class in the GitHub repository.

Finally, all classes can handle missing-at-random data in the form of numpy.nan entries in the data arrays X and Y. Alternatively, these arrays can be passed directly as numpy.ma.MaskedArray objects.
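The two representations of missing data can be built with NumPy alone. The sketch below is illustrative only: the 10% missingness rate and the variable names are assumptions, and it shows how the arrays are constructed rather than how ptfa processes them internally.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 10))  # complete data: 100 observations, 10 features

# Hypothetical missingness pattern: drop roughly 10% of entries at random
missing = rng.random(X.shape) < 0.1

# Representation 1: mark missing entries with NaN
X_nan = X.copy()
X_nan[missing] = np.nan

# Representation 2: a masked array whose mask flags the same entries
X_masked = np.ma.masked_invalid(X_nan)

print(int(missing.sum()), "missing entries out of", X.size)
```

`np.ma.masked_invalid` masks every NaN or infinite entry, so the mask of `X_masked` coincides with `missing` here.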

Usage

A large example showcasing the capabilities of PTFA is provided in the package repository: Example Notebook.

Here is a quick example of how to use the main class for factor extraction and forecasting, called ProbabilisticTFA:

```python
import numpy as np
from ptfa import ProbabilisticTFA

# Example data: predictors (X) and targets (Y)
X = np.random.rand(100, 10)  # 100 observations, 10 predictors
Y = np.random.rand(100, 2)   # 100 observations, 2 targets

# Initialize PTFA model with desired number of components
model = ProbabilisticTFA(n_components=1)

# Fit the model to data X and Y using EM algorithm
model.fit(X, Y)

# Calculate in-sample fitted values
Y_fitted = model.fitted()

# Calculate out-of-sample forecasts
X = np.random.rand(100, 10)
Y_predicted = model.predict(X)

print("Fitted targets:")
print(Y_fitted)

print("Predicted targets:")
print(Y_predicted)

# Running .fit() method saves to model object the
# extracted common factors from features and targets
print("Recovered factors:")
print(model.factors)
```

Contributing

Feel free to open issues or contribute to the repository through pull requests. We welcome suggestions and improvements to the package!

BibTeX Citation

If you use PTFA, we would appreciate it if you cited our work as:

```bibtex
@misc{herculano_2024_probabilistic,
  title = {Probabilistic Targeted Factor Analysis},
  author = {Herculano, Miguel C. and Montoya-Blandón, Santiago},
  year = {2024},
  eprint = {2412.06688},
  archivePrefix = {arXiv},
  primaryClass = {econ.EM},
  url = {https://arxiv.org/abs/2412.06688},
}
```

Licence

This project is licensed under the MIT License.

Owner

Name: Santiago Montoya-Blandón
Login: smonto2
Kind: user
Location: Glasgow, UK
Company: University of Glasgow

Website: https://www.smontoyablandon.com/
Twitter: SMontoyaBlandon
Repositories: 1
Profile: https://github.com/smonto2

Lecturer in Econometrics for the Adam Smith Business School at the University of Glasgow

GitHub Events

Total

Public event: 1
Push event: 17
Fork event: 1

Last Year

Public event: 1
Push event: 17
Fork event: 1

Packages

Total packages: 1
Total downloads:
- pypi 363 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 22
Total maintainers: 1

pypi.org: ptfa

Probabilistic Targeted Factor Analysis

Homepage: https://github.com/smonto2/PTFA
Documentation: https://ptfa.readthedocs.io/
License: MIT License
Latest release: 0.3.3
published 11 months ago

Versions: 22
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 363 Last month

Rankings

Dependent packages count: 10.0%

Average: 33.2%

Dependent repos count: 56.3%

Maintainers (1)

smontoya

Last synced: 11 months ago

Dependencies

pyproject.toml pypi

numpy *
scikit-learn *

src/ptfa.egg-info/requires.txt pypi

numpy *
scikit-learn *
