dfms

Dynamic Factor Models for R

https://github.com/sebkrantz/dfms

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.7%) to scientific vocabulary

Keywords

dynamic-factor-models rstats time-series
Last synced: 6 months ago

Repository

Dynamic Factor Models for R

Basic Info
Statistics
  • Stars: 39
  • Watchers: 2
  • Forks: 10
  • Open Issues: 0
  • Releases: 7
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Codemeta

README.md

dfms: Dynamic Factor Models for R


dfms provides efficient estimation of Dynamic Factor Models via the EM algorithm. Factors are assumed to follow a stationary VAR process of order p. Three estimation approaches are supported, following:

  • Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. Journal of Econometrics, 164(1), 188-205. doi:10.1016/j.jeconom.2011.02.012

  • Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. Review of Economics and Statistics, 94(4), 1014-1024. doi:10.1162/REST_a_00225

  • Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. Journal of Applied Econometrics, 29(1), 133-160. doi:10.1002/jae.2306
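For orientation, the linear Gaussian model with VAR(p) factors described above is commonly written in state-space form as sketched below. The symbols $C$ (factor loadings) and $R$ (observation error covariance) are notation assumed here for illustration; $A_1, \dots, A_p$ and $Q$ correspond to the factor transition matrix [A] and the transition error covariance [Q] reported by summary().

$$
x_t = C f_t + e_t, \qquad e_t \sim N(0, R)
$$

$$
f_t = \sum_{j=1}^{p} A_j f_{t-j} + u_t, \qquad u_t \sim N(0, Q)
$$

Here $x_t$ holds the n observed series at time $t$ and $f_t$ the r latent factors.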

The default is em.method = "auto", which selects "BM" following Banbura & Modugno (2014) when the data contain missing values or mixed frequencies, and "DGR" following Doz, Giannone & Reichlin (2012) otherwise. Setting em.method = "none" generates Two-Step estimates following Doz, Giannone & Reichlin (2011), which is extremely efficient on larger datasets. When EM estimation is used, PCA and Two-Step estimates are reported as well. All methods accept data with missing values, but em.method = "DGR" does not model them in the EM iterations.
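The choice of estimator can be sketched as follows. This is a minimal illustration, assuming the BM14_M dataset used in the usage example of this README; the object names mod_2s and mod_bm are hypothetical.

```r
library(dfms)

# Monthly series shipped with dfms (differenced, as in the usage example);
# the data contain missing values
X <- diff(BM14_M)

# Two-Step estimates only, following Doz, Giannone & Reichlin (2011)
mod_2s <- DFM(X, r = 6, p = 3, em.method = "none")

# EM estimation following Banbura & Modugno (2014),
# which models missing data in the EM iterations
mod_bm <- DFM(X, r = 6, p = 3, em.method = "BM")

print(mod_2s)
print(mod_bm)
```

Because the data contain missing values, the default em.method = "auto" should resolve to "BM" here, according to the description above.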

The package is currently stable, but functionality may expand in the future. In particular, mixed-frequency estimation with autoregressive errors is planned for the near future, and generation of the 'news' may be added later.

Comparison with Other R Packages

dfms is intended to provide a simple, numerically robust, and computationally efficient baseline implementation of (linear Gaussian) Dynamic Factor Models for R, allowing straightforward application to various contexts such as time series dimensionality reduction and forecasting. The implementation is based on efficient C++ code, making dfms orders of magnitude faster than packages such as MARSS, which can be used to fit dynamic factor models, or nowcasting and nowcastDFM, which are geared to mixed-frequency nowcasting applications. The latter two packages additionally support blocking of variables into different groups for which factors are to be estimated, and evaluation of news content. For large-scale nowcasting models, the DynamicFactorMQ class in the statsmodels Python library is probably the most robust implementation; see the example by Chad Fulton. Unlike MARSS, dfms is not intended to fit more general forms of the state space model.

Installation

```r
# CRAN
install.packages("dfms")

# Development Version
install.packages('dfms', repos = c('https://sebkrantz.r-universe.dev', 'https://cloud.r-project.org'))
```

Usage Example

```r
library(dfms)

# Fit DFM with 6 factors and 3 lags in the transition equation
mod <- DFM(diff(BM14_M), r = 6, p = 3)
```

```
Converged after 32 iterations.
```

```r
# 'dfm' methods
summary(mod)
```

```
Dynamic Factor Model: n = 92, T = 356, r = 6, p = 3, %NA = 25.8366

Call: DFM(X = diff(BM14_M), r = 6, p = 3)

Summary Statistics of Factors [F]
      N    Mean  Median     SD      Min     Max
f1  356 -0.1189  0.4409 4.0228 -22.9164  7.8513
f2  356 -0.4615 -0.3476 2.9201  -9.0973 10.7003
f3  356 -0.0173  0.0377 2.2719  -8.5067  7.3009
f4  356  -0.007 -0.1338 1.9378  -9.5052  9.3673
f5  356   0.237  0.1091 2.0857  -8.7252  9.6715
f6  356 -0.8361  -0.304 3.1406 -11.6611 15.4897

Factor Transition Matrix [A]
      L1.f1    L1.f2     L1.f3    L1.f4    L1.f5    L1.f6    L2.f1    L2.f2    L2.f3
f1  0.53029 -0.53009  0.367302  0.04607 -0.06351  0.10310  0.02457  0.11673 -0.12638
f2 -0.28380  0.07421 -0.032292  0.29741 -0.10094  0.21989  0.09958 -0.09149  0.06708
f3  0.17607  0.12979  0.378798 -0.06662 -0.12236  0.06685 -0.08068  0.09101 -0.22232
f4  0.02711  0.08936  0.004643  0.37159  0.12100 -0.02763  0.01234 -0.05147  0.02195
f5 -0.26227 -0.03469 -0.046294  0.12712  0.26847  0.03141  0.06400  0.01971  0.04806
f6  0.08251  0.17619 -0.013374 -0.08731 -0.03875  0.27812 -0.01662  0.04877  0.02279
      L2.f4     L2.f5    L2.f6    L3.f1    L3.f2    L3.f3    L3.f4    L3.f5    L3.f6
f1  0.23135  0.117184  0.21941  0.18478  0.02259 -0.03719 -0.07236 -0.03026 -0.12606
f2 -0.09768 -0.043057  0.08489  0.21107  0.16261  0.03057  0.04835  0.12249  0.13357
f3  0.09799 -0.060666 -0.18028 -0.02773  0.01798  0.10143 -0.12420  0.04207 -0.07011
f4  0.01266  0.050912  0.05144 -0.05601  0.04665  0.05710 -0.11412 -0.05680 -0.01609
f5 -0.03965 -0.009952 -0.18471  0.08332 -0.04640 -0.02047  0.02458  0.16397  0.07820
f6  0.01163 -0.100859  0.07152  0.00792  0.06071  0.11381  0.02520 -0.17897  0.30328

Factor Covariance Matrix [cov(F)]
        f1       f2       f3       f4       f5       f6
f1 16.1832  -0.4329   0.2483  -0.8224* -1.7708*  0.7702
f2 -0.4329   8.5272   0.0051   0.2954  -0.2114   4.2080*
f3  0.2483   0.0051   5.1614  -0.1851  -0.3979   0.2979
f4 -0.8224*  0.2954  -0.1851   3.7550   0.4344*  0.2211
f5 -1.7708* -0.2114  -0.3979   0.4344*  4.3503  -1.9785*
f6  0.7702   4.2080*  0.2979   0.2211  -1.9785*  9.8634

Factor Transition Error Covariance Matrix [Q]
        u1      u2      u3      u4      u5      u6
u1  7.2142  0.1151 -0.8208 -0.4379  0.4110 -0.1206
u2  0.1151  4.8724  0.1076 -0.1438  0.1418  0.1759
u3 -0.8208  0.1076  4.0584 -0.0788  0.0163  0.0038
u4 -0.4379 -0.1438 -0.0788  3.0003  0.2562  0.0243
u5  0.4110  0.1418  0.0163  0.2562  2.8410 -0.1031
u6 -0.1206  0.1759  0.0038  0.0243 -0.1031  2.9284

Summary of Residual AR(1) Serial Correlations
   N     Mean   Median      SD      Min     Max
  92  -0.0644  -0.1024  0.2702  -0.5113  0.6674

Summary of Individual R-Squared's
   N    Mean  Median      SD     Min     Max
  92  0.4556  0.4069  0.3041  0.0112  0.9989
```
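The model above fixes r = 6 factors by hand. The package description also mentions information criteria following Bai and Ng (2002) for choosing the number of factors. A minimal sketch of that step, assuming the ICr() function exported by dfms and its print, plot and screeplot methods (the object name ic is hypothetical):

```r
library(dfms)

# Information criteria for the number of factors, computed on the same
# differenced monthly data as the model above
ic <- ICr(diff(BM14_M))

print(ic)      # criteria values per candidate number of factors
plot(ic)       # criteria plotted against the number of factors
screeplot(ic)  # variance explained by the principal components
```

The criteria are only a guide, so the suggested number of factors is worth checking against the screeplot before refitting the model.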

```r
plot(mod)
```

```r
as.data.frame(mod) |> head()
```

```
  Method Factor Time      Value
1    PCA     f1    1  0.8445713
2    PCA     f1    2  0.5259228
3    PCA     f1    3 -1.2107116
4    PCA     f1    4 -1.5399532
5    PCA     f1    5 -0.4631786
6    PCA     f1    6  0.2399304
```

```r
# Forecasting 12 periods ahead
fc <- predict(mod, h = 12)

# 'dfm_forecast' methods
plot(fc, xlim = c(320, 370))
```

```r
as.data.frame(fc) |> head()
```

```
  Variable Time Forecast      Value
1       f1    1    FALSE   4.179331
2       f1    2    FALSE  -1.368577
3       f1    3    FALSE -12.845157
4       f1    4    FALSE -14.562265
5       f1    5    FALSE  -7.791254
6       f1    6    FALSE  -1.254970
```
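The first rows of this data frame have Forecast equal to FALSE, which suggests that the long format stacks the in-sample values ahead of the forecasts. A small sketch of isolating the forecast rows, using only the columns shown above (the names fc_df and fc_only are hypothetical):

```r
library(dfms)

mod <- DFM(diff(BM14_M), r = 6, p = 3)
fc <- predict(mod, h = 12)

# Long-format data frame with columns Variable, Time, Forecast, Value
fc_df <- as.data.frame(fc)

# Keep only the rows flagged as forecasts
fc_only <- subset(fc_df, Forecast)
head(fc_only)
```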

Owner

  • Name: Sebastian Krantz
  • Login: SebKrantz
  • Kind: user
  • Company: Kiel Institute for the World Economy

Economist/data scientist/programmer interested in econometrics, time series, geospatial analysis, machine learning, and high-performance computing.

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "dfms",
  "description": "Efficient estimation of Dynamic Factor Models using the Expectation Maximization (EM) algorithm or Two-Step (2S) estimation, on datasets with missing data. The implementation follows advances in the econometric literature: estimation can be done either by running the Kalman Filter and Smoother once with initial values from PCA - following Doz, Giannone and Reichlin (2011) (2S) - or via iterated Kalman Filtering and Smoothing until EM convergence - following Doz, Giannone and Reichlin (2012) - or using the adapted EM algorithm of Banbura and Modugno (2014), allowing estimation with arbitrary patterns of missing data. The implementation makes heavy use of the Armadillo C++ library and the collapse package, providing for particularly speedy estimation. A comprehensive set of methods supports interpretation/visualization of the model and forecasting. Information criteria to choose the number of factors are also provided - following Bai and Ng (2002). --- Key References: --- Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. Journal of Econometrics, 164(1), 188-205. Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. Review of economics and statistics, 94(4), 1014-1024. Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. Journal of Applied Econometrics, 29(1), 133-160.",
  "name": "dfms: Dynamic Factor Models",
  "codeRepository": "https://github.com/SebKrantz/dfms",
  "issueTracker": "https://github.com/SebKrantz/dfms/issues",
  "license": "https://spdx.org/licenses/GPL-3.0",
  "version": "0.1.1",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.1.1 (2021-08-10)",
  "author": [
    {
      "@type": "Person",
      "givenName": "Sebastian",
      "familyName": "Krantz",
      "email": "sebastian.krantz@graduateinstitute.ch"
    },
    {
      "@type": "Person",
      "givenName": "Rytis",
      "familyName": "Bagdziunas"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Sebastian",
      "familyName": "Krantz",
      "email": "sebastian.krantz@graduateinstitute.ch"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "xts",
      "name": "xts",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=xts"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "vars",
      "name": "vars",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=vars"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "magrittr",
      "name": "magrittr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=magrittr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 3.0.0"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "Rcpp",
      "name": "Rcpp",
      "version": ">= 1.0.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rcpp"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "collapse",
      "name": "collapse",
      "version": ">= 1.8.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=collapse"
    },
    "SystemRequirements": null
  },
  "fileSize": "7022.375KB",
  "relatedLink": "https://sebkrantz.github.io/dfms/",
  "releaseNotes": "https://github.com/SebKrantz/dfms/blob/master/NEWS.md",
  "readme": "https://github.com/SebKrantz/dfms/blob/main/README.md",
  "contIntegration": "https://github.com/SebKrantz/dfms/actions",
  "keywords": [
    "dynamic-factor-models",
    "rstats",
    "time-series"
  ]
}

GitHub Events

Total
  • Create event: 3
  • Release event: 3
  • Issues event: 3
  • Watch event: 8
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 96
  • Pull request event: 31
  • Fork event: 1
Last Year
  • Create event: 3
  • Release event: 3
  • Issues event: 3
  • Watch event: 8
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 96
  • Pull request event: 31
  • Fork event: 1

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 321
  • Total Committers: 1
  • Avg Commits per committer: 321.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 32
  • Committers: 1
  • Avg Commits per committer: 32.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Sebastian Krantz s****z@g****h 321

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 4
  • Total pull requests: 61
  • Average time to close issues: 18 days
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 4
  • Total pull request authors: 1
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 61
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 8
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mjng93 (1)
  • clukewatson (1)
  • ffabio-econ (1)
  • apoorvalal (1)
  • SantiagoD999 (1)
Pull Request Authors
  • SebKrantz (78)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Dependencies

DESCRIPTION cran
  • R >= 3.0.0 depends
  • Rcpp * imports
  • collapse * imports
.github/workflows/test-coverage.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v1 composite
  • r-lib/actions/setup-r-dependencies v1 composite
.github/workflows/check-standard.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite