ldats

Latent Dirichlet Allocation coupled with Bayesian Time Series analyses

https://github.com/weecology/ldats

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 13 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.2%) to scientific vocabulary

Keywords

changepoint lda parallel-tempering portal softmax

Keywords from Contributors

ecology community-ecology small-mammal-trapping english lesson shiny data-retrieval hacktobefest carpentries data-carpentry
Last synced: 6 months ago

Repository

Latent Dirichlet Allocation coupled with Bayesian Time Series analyses

Basic Info
Statistics
  • Stars: 25
  • Watchers: 5
  • Forks: 5
  • Open Issues: 22
  • Releases: 8
Topics
changepoint lda parallel-tempering portal softmax
Created about 8 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License Code of conduct

README.md

Latent Dirichlet Allocation coupled with Bayesian Time Series analyses


Overview

The LDATS package provides functionality for analyzing time series of high-dimensional data using a two-stage approach that combines Latent Dirichlet Allocation (LDA) with Bayesian time series (TS) analyses.

For a full description of the math underlying the LDATS package, see the technical document.

Status: Stable Version Available, Continuing Development

A stable version of LDATS is available on CRAN, but the package is actively being developed by the Weecology Team. The API is well defined at this point and should not change substantially.

Installation

You can install the stable version of LDATS from CRAN with:

    install.packages("LDATS")

To obtain the current development version of LDATS from GitHub, install the devtools package and then use it to install LDATS:

    install.packages("devtools")
    devtools::install_github("weecology/LDATS")

Usage

Here is an example of a full LDA-TS analysis using the Portal rodent data:

    library(LDATS)
    data(rodents)
    r_LDATS <- LDA_TS(rodents, topics = 2:5, nseeds = 2, formulas = ~1,
                      nchangepoints = 0:1, timename = "newmoon")

This call conducts two replicates (nseeds) for each of two to five topics in an LDA model using the document term table and selects the best of those by AIC. It then fits two time series models to the selected LDA (an intercept-only model with 0 and with 1 changepoints), selects the best of those by AIC, and packages all the models together. The document term table is also used to weight the samples by their sizes (number of words), and timename = "newmoon" instructs the function to use the column named "newmoon" in the document covariates table as the time variable.
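The weighting and AIC-based selection steps described above can be sketched in base R. This is an illustration only: the count table, the candidate log-likelihoods, and the parameter counts are hypothetical values, and the weighting rule (sample size scaled to a mean of 1) is an assumption about how "weight the samples by their sizes" might be implemented, not a copy of the package's code.

```r
# Hypothetical document term table: 3 samples (rows) by 4 terms (columns).
dtt <- matrix(c(10, 0,  5, 5,
                 2, 3,  0, 5,
                20, 5, 10, 5),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("sp1", "sp2", "sp3", "sp4")))

# Sample size is the total word count per document.
# Assumption: weights are sample sizes rescaled so that their mean is 1.
sample_sizes <- rowSums(dtt)
doc_weights <- sample_sizes / mean(sample_sizes)

# Hypothetical candidate models (topic count x seed) with made-up
# log-likelihoods and numbers of free parameters.
candidates <- data.frame(
  model   = c("k2_seed1", "k2_seed2", "k3_seed1", "k3_seed2"),
  log_lik = c(-1250.4, -1248.9, -1231.7, -1233.0),
  n_par   = c(40, 40, 61, 61),
  stringsAsFactors = FALSE
)

# AIC = 2 * (number of parameters) - 2 * (log-likelihood); lower is better.
candidates$AIC <- 2 * candidates$n_par - 2 * candidates$log_lik
best <- candidates[which.min(candidates$AIC), ]

print(doc_weights)
print(best)
```

In this toy set the extra parameters of the three-topic fits outweigh their higher log-likelihoods, so a two-topic fit has the lowest AIC.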

The resulting object is of class LDA_TS, which has a few basic routines available:

    print(r_LDATS)

prints the selected LDA and TS models, and

    plot(r_LDATS)

produces a four-panel figure of them, in the style of Figure 1 of Christensen et al. 2018.
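The single LDA_TS() call in the usage example can also be unpacked into its stages. The sketch below assumes that the LDATS functions LDA_set(), select_LDA(), TS_on_LDA(), select_TS(), and document_weights() behave as described in the package documentation, and that data(rodents) provides a list with document_term_table and document_covariate_table elements. The short chain length (nit = 100) is an arbitrary choice to keep the demonstration quick, not a recommended setting.

```r
library(LDATS)
data(rodents)

# Stage 1: fit LDA models for 2 to 5 topics with two seeds each,
# then keep the best-supported fit.
lda_models <- LDA_set(document_term_table = rodents$document_term_table,
                      topics = 2:5, nseeds = 2)
selected_lda <- select_LDA(lda_models)

# Stage 2: fit intercept-only time series models with 0 and 1 changepoints
# to the selected LDA, weighting samples by their sizes, then keep the best.
ts_models <- TS_on_LDA(
  LDA_models = selected_lda,
  document_covariate_table = rodents$document_covariate_table,
  formulas = ~1,
  nchangepoints = 0:1,
  timename = "newmoon",
  weights = document_weights(rodents$document_term_table),
  control = list(nit = 100)  # short chain, for demonstration only
)
selected_ts <- select_TS(ts_models)

print(selected_ts)
```

Running the stages separately makes it possible to inspect the full set of candidate models before selection, which the wrapper otherwise handles internally.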

More Information

LDATS is based on initial work using LDA to analyze time-series data at Portal by Erica M. Christensen, David J. Harris, and S. K. Morgan Ernest, which has been published in Ecology (Christensen et al. 2018).

Acknowledgements

The motivating study, the Portal Project, has been funded nearly continuously since 1977 by the National Science Foundation, most recently by DEB-1622425 to S. K. M. Ernest, which also supported (in part) E. Christensen’s time. Much of the computational work (including time of J. Simonis, D. Harris, and H. Ye) was supported by the Gordon and Betty Moore Foundation’s Data-Driven Discovery Initiative through Grant GBMF4563 to E. P. White. R. Diaz was supported in part by a National Science Foundation Graduate Research Fellowship (Nos. DGE-1315138 and DGE-1842473).

Author Contributions

J. L. Simonis provided insight on LDA applications and feedback on technical writing during development of the first version of the LDATS model and application, led the coding and mathematical development of the model into an R package, and led writing on the technical model document. E. M. Christensen led the project during development of the first version of the LDATS model and its application to the Portal data, specifically conceiving the project, coding the pipeline wrappers of the analysis, and writing and editing the first description of the model and its application (Christensen et al. 2018). D. J. Harris was involved in developing and applying the first version of the LDATS model, specifically suggesting the LDA and change point approaches, coding the first version of the change point model, and writing and editing the first description of the model (Christensen et al. 2018). R. Diaz contributed code to the LDATS package, wrote vignettes, provided insight into model development, and conducted extensive end-user code application testing. H. Ye contributed code to the LDATS package, insight into data structures and LDA algorithms, and significant feedback on vignettes. E. P. White helped design, troubleshoot, and supervise initial methods development; provided big-picture feedback on development of the R package; contributed end-user application testing; and gave substantial editing feedback on the technical document. S. K. Morgan Ernest provided managerial oversight and feedback on the project in both the initial and second stages of LDATS development, tested applications of the code to data sets, and assisted with writing and editing of the first description of the model and its application (Christensen et al. 2018) as well as the technical model document.

Owner

  • Name: Weecology
  • Login: weecology
  • Kind: organization

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 669
  • Total Committers: 8
  • Avg Commits per committer: 83.625
  • Development Distribution Score (DDS): 0.543
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name               Email           Commits
DAPPERstats        d****s@g****m   306
Renata Diaz        d****m@g****m   195
Erica Christensen  e****n@w****g   120
Hao Ye             l****d@g****m   22
David J. Harris    h****1@g****m   12
Hao Ye             h****e@w****g   6
David J. Harris    d****s          5
Ethan White        e****n@w****g   3
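The all-time summary figures can be reproduced from the commit counts in the table above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, so treat it as an assumption that happens to match the reported value.

```r
# Commit counts per committer identity, copied from the Top Committers table.
commits <- c(306, 195, 120, 22, 12, 6, 5, 3)

total_commits <- sum(commits)
n_committers <- length(commits)
avg_commits <- total_commits / n_committers

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds <- 1 - max(commits) / total_commits

print(total_commits)
print(avg_commits)
print(round(dds, 3))
```

Note that the table counts committer identities rather than people: two contributors appear twice under different email addresses.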

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 52
  • Total pull requests: 49
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 11 days
  • Total issue authors: 4
  • Total pull request authors: 3
  • Average comments per issue: 1.42
  • Average comments per pull request: 1.08
  • Merged pull requests: 46
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • juniperlsimonis (39)
  • ethanwhite (6)
  • ha0ye (5)
  • diazrenata (2)
Pull Request Authors
  • juniperlsimonis (44)
  • diazrenata (3)
  • ha0ye (2)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 678 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 4
  • Total maintainers: 1
cran.r-project.org: LDATS

Latent Dirichlet Allocation Coupled with Time Series Analyses

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 678 Last month
Rankings
Stargazers count: 10.4%
Forks count: 10.9%
Downloads: 15.4%
Average: 17.8%
Dependent repos count: 24.3%
Dependent packages count: 27.9%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.2.3 depends
  • coda * imports
  • digest * imports
  • extraDistr * imports
  • grDevices * imports
  • graphics * imports
  • lubridate * imports
  • magrittr * imports
  • memoise * imports
  • methods * imports
  • mvtnorm * imports
  • nnet * imports
  • progress * imports
  • stats * imports
  • topicmodels * imports
  • viridis * imports
  • knitr * suggests
  • pkgdown * suggests
  • rmarkdown * suggests
  • testthat * suggests
  • vdiffr * suggests