imputeTS

CRAN R Package: Time Series Missing Value Imputation

https://github.com/steffenmoritz/imputets

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.0%) to scientific vocabulary

Keywords

cran data-visualization imputation imputation-algorithm imputets missing-data time-series

Keywords from Contributors

hack bruteforce
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 168
  • Watchers: 9
  • Forks: 26
  • Open Issues: 9
  • Releases: 0
Created over 9 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

imputeTS: Time Series Missing Value Imputation

The imputeTS package specializes in (univariate) time series imputation. It offers several imputation algorithm implementations. Beyond the imputation algorithms, the package also provides functions for plotting and printing missing data statistics of time series. Additionally, three time series datasets for imputation experiments are included.

Installation

The imputeTS package can be found on CRAN. For installation execute in R:

install.packages("imputeTS")

If you want to install the latest version from GitHub (can be unstable) run:

library(devtools)
install_github("SteffenMoritz/imputeTS")

Usage

  • #### Imputation

To impute (fill all missing values in) a time series x, run the following command:

na_interpolation(x)

The output is the time series x with all NAs replaced by reasonable values.

This is just one example of an imputation algorithm. In this case, interpolation was the algorithm of choice for calculating the NA replacements. There are several other algorithms (see also under caption Imputation Algorithms). All imputation functions are named alike, starting with na_ followed by an algorithm label, e.g. na_mean, na_kalman, ...
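As a minimal sketch of this naming scheme, the following applies three of the na_ functions to the bundled tsAirgap series. The variable names are illustrative only, and each function is called with its default settings.

```r
library(imputeTS)

# tsAirgap ships with the package and contains NAs
x <- tsAirgap

# Every algorithm follows the same calling pattern: na_<label>(x)
x_interp <- na_interpolation(x)  # interpolation between observed neighbours
x_mean   <- na_mean(x)           # replace NAs with the mean of the observed values
x_locf   <- na_locf(x)           # last observation carried forward

# Check that no NAs remain in any of the results
anyNA(x_interp)
anyNA(x_mean)
anyNA(x_locf)
```

Because the calling pattern is identical, swapping one algorithm for another usually only requires changing the function name.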

  • #### Plotting

To plot missing data statistics for a time series x, run the following command:

ggplot_na_distribution(x)

 

Example ggplot_na_distribution plot

This is just one example plot. Overall, there are five different types of missing data plots (see also under caption Missing Data Plots). An additional tutorial dedicated to plots is available: the Gallery of Visualizations.
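The following sketch shows how a few of the plot functions might be combined on the bundled tsAirgap data. It assumes that the functions return ggplot objects that can be stored and printed later; the variable names p1 to p3 are illustrative.

```r
library(imputeTS)

x <- tsAirgap

# Where in the series do the NAs occur?
p1 <- ggplot_na_distribution(x)

# How often do NA gaps of each length occur?
p2 <- ggplot_na_gapsize(x)

# Compare imputed values against the known ground truth
x_imp <- na_interpolation(x)
p3 <- ggplot_na_imputations(x, x_imp, tsAirgapComplete)

# Render one of the plots
print(p1)
```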

  • #### Printing

To print descriptive statistics about the missing data in a time series x, run the following command:

statsNA(x)
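A short sketch of both ways to use statsNA on the bundled tsAirgap series follows. It assumes the print_only argument described in the reference manual, which switches from printing to returning a list; the base R lines are only a cross-check and not part of the package.

```r
library(imputeTS)

x <- tsAirgap

# Default behaviour: print a summary of the missing data
statsNA(x)

# Assumption: with print_only = FALSE the statistics are returned as a list
stats <- statsNA(x, print_only = FALSE)
names(stats)

# Base R cross-check of the NA count and NA share
n_missing <- sum(is.na(x))
share_missing <- n_missing / length(x)
```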

  • #### Example Datasets

To load the 'heating' time series (with missing values) into a variable y and the 'heating' time series (without missing values) into a variable z, run:

y <- tsHeating
z <- tsHeatingComplete

There are three datasets provided with the package, the 'tsHeating', the 'tsAirgap' and the 'tsNH4' time series (see also under caption Datasets).

Imputation Algorithms

Here is a table with available algorithms to choose from:

| Function | Description |
| :--- | :--- |
| na_interpolation | Missing Value Imputation by Interpolation |
| na_kalman | Missing Value Imputation by Kalman Smoothing |
| na_locf | Missing Value Imputation by Last Observation Carried Forward |
| na_ma | Missing Value Imputation by Weighted Moving Average |
| na_mean | Missing Value Imputation by Mean Value |
| na_random | Missing Value Imputation by Random Sample |
| na_remove | Remove Missing Values |
| na_replace | Replace Missing Values by a Defined Value |
| na_seadec | Seasonally Decomposed Missing Value Imputation |
| na_seasplit | Seasonally Splitted Missing Value Imputation |

This is a rather broad overview. Most of the functions offer more than one algorithm. For example, na_interpolation can be set to linear or spline interpolation.
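As an illustration of switching algorithms within one function, the sketch below runs na_interpolation with both settings on the bundled tsAirgap series. It assumes the option argument documented in the reference manual; the variable names are illustrative.

```r
library(imputeTS)

x <- tsAirgap

# Same function, two different interpolation algorithms
x_linear <- na_interpolation(x, option = "linear")
x_spline <- na_interpolation(x, option = "spline")

# Only the former NA positions should differ between the two results
observed <- !is.na(x)
unchanged <- isTRUE(all.equal(
  as.numeric(x_linear)[observed],
  as.numeric(x)[observed]
))
unchanged
```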

More detailed information about the algorithms and their options can be found in the imputeTS reference manual.

Missing Data Plots

Here is a table with available plots to choose from:

| Function | Description |
| :--- | :--- |
| ggplot_na_distribution | Visualize Distribution of Missing Values |
| ggplot_na_distribution2 | Missing Values Summarized in Time Intervals |
| ggplot_na_gapsize | Visualize Distribution of NA Gapsizes |
| ggplot_na_gapsize2 | Visualize Total NAs of Different NA Gapsizes |
| ggplot_na_imputations | Visualize Imputed Values |

More detailed information about the plots can be found in the imputeTS reference manual and in the Gallery of Visualizations.

Datasets

There are three datasets (each in two versions) available:

| Dataset | Description |
| :--- | :--- |
| tsAirgap | Time series of monthly airline passengers (with NAs) |
| tsAirgapComplete | Time series of monthly airline passengers (complete) |
| tsHeating | Time series of a heating system's supply temperature (with NAs) |
| tsHeatingComplete | Time series of a heating system's supply temperature (complete) |
| tsNH4 | Time series of NH4 concentration in a wastewater system (with NAs) |
| tsNH4Complete | Time series of NH4 concentration in a wastewater system (complete) |

The tsAirgap, tsHeating and tsNH4 time series contain NAs. Their complete versions contain none. Apart from the missing values, the two versions of each dataset are identical. The NAs were artificially inserted by simulating the missing data pattern observed in similar incomplete time series from the same domain. Having a complete and an incomplete version of the same dataset is useful for running experiments with imputation functions.
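A minimal sketch of such an experiment: impute tsAirgap with several algorithms and compare the results against tsAirgapComplete at the positions that were missing. The rmse helper and the choice of root mean squared error as the metric are assumptions made for illustration, not something the package prescribes.

```r
library(imputeTS)

# Hypothetical helper: RMSE evaluated only at the formerly missing positions
rmse <- function(imputed, truth, missing) {
  imputed <- as.numeric(imputed)
  truth <- as.numeric(truth)
  sqrt(mean((imputed[missing] - truth[missing])^2))
}

missing <- is.na(tsAirgap)

# Run each algorithm with default settings and score it against the truth
results <- c(
  interpolation = rmse(na_interpolation(tsAirgap), tsAirgapComplete, missing),
  mean          = rmse(na_mean(tsAirgap), tsAirgapComplete, missing),
  locf          = rmse(na_locf(tsAirgap), tsAirgapComplete, missing),
  ma            = rmse(na_ma(tsAirgap), tsAirgapComplete, missing)
)

# Lower values mean the imputations are closer to the true values
sort(results)
```

Restricting the error to the missing positions avoids diluting the score with observed values, which every algorithm leaves unchanged.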

More detailed information about the datasets can be found in the imputeTS reference manual.

Reference

You can cite imputeTS as follows:

Moritz, Steffen, and Bartz-Beielstein, Thomas. "imputeTS: Time Series Missing Value Imputation in R." R Journal 9.1 (2017). doi: 10.32614/RJ-2017-009.

Need Help?

If you have general programming problems or need help using the package, please ask your question on StackOverflow. That way, all users can benefit from your question in the future.

Don't forget to tag your question with imputets on StackOverflow so that I get notified.

Support

If you found a bug or have suggestions, feel free to get in contact via steffen.moritz10 at gmail.com.

All feedback is welcome

Version

3.4

License

GPL-3

Owner

  • Name: Steffen Moritz
  • Login: SteffenMoritz
  • Kind: user
  • Location: Germany

GitHub Events

Total
  • Issues event: 5
  • Watch event: 8
  • Issue comment event: 2
  • Push event: 5
Last Year
  • Issues event: 5
  • Watch event: 8
  • Issue comment event: 2
  • Push event: 5

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 273
  • Total Committers: 9
  • Avg Commits per committer: 30.333
  • Development Distribution Score (DDS): 0.103
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
SteffenMoritz s****0@g****m 245
Sebastian Gatscha g****a@t****u 11
Michael Chirico c****m@g****m 7
ImgBotApp I****p@g****m 4
YsoSirius k****1@g****t 2
earowang e****g@g****m 1
Teru Watanabe w****0@g****m 1
Ron Hause r****e@g****m 1
RicardaP r****t@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 55
  • Total pull requests: 15
  • Average time to close issues: 6 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 36
  • Total pull request authors: 8
  • Average comments per issue: 2.25
  • Average comments per pull request: 3.4
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • SteffenMoritz (20)
  • bjarkeA (1)
  • cmohamma (1)
  • AndrewCunliffe (1)
  • yuchenw (1)
  • hariskr (1)
  • Breza (1)
  • ntthung (1)
  • englianhu (1)
  • MehrdadVaredi (1)
  • Lakminikw (1)
  • trafficonese (1)
  • SURmcapp (1)
  • renelikestacos (1)
  • eeedasc (1)
Pull Request Authors
  • trafficonese (5)
  • imgbot[bot] (4)
  • earowang (1)
  • watanabe8760 (1)
  • ronaldhause (1)
  • RicardaP (1)
  • MichaelChirico (1)
  • YsoSirius (1)
Top Labels
Issue Labels
ToDo List (15) enhancement (11) issue (2)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 14,010 last-month
  • Total docker downloads: 43,602
  • Total dependent packages: 27
  • Total dependent repositories: 52
  • Total versions: 21
  • Total maintainers: 1
cran.r-project.org: imputeTS

Time Series Missing Value Imputation

  • Versions: 21
  • Dependent Packages: 27
  • Dependent Repositories: 52
  • Downloads: 14,010 Last month
  • Docker Downloads: 43,602
Rankings
Dependent packages count: 2.6%
Stargazers count: 2.7%
Forks count: 3.1%
Dependent repos count: 3.4%
Downloads: 3.9%
Average: 6.8%
Docker downloads count: 24.9%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.6 depends
  • Rcpp * imports
  • forecast * imports
  • ggplot2 >= 3.3.0 imports
  • ggtext * imports
  • grDevices * imports
  • magrittr * imports
  • stats * imports
  • stinepack * imports
  • R.rsp * suggests
  • covr * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat * suggests
  • tibble * suggests
  • timeSeries * suggests
  • tis * suggests
  • tsibble * suggests
  • xts * suggests
  • zoo * suggests