https://github.com/markvanderloo/simputation

Making imputation easy

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

data-science imputation officialstatistics r rstats
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: markvanderloo
  • License: gpl-3.0
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 738 KB
Statistics
  • Stars: 91
  • Watchers: 3
  • Forks: 10
  • Open Issues: 13
  • Releases: 0
Topics
data-science imputation officialstatistics r rstats
Created over 9 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

simputation

An R package to make imputation simple. Currently supported methods include

  • Model based (optionally add [non-]parametric random residual)
    • linear regression
    • robust linear regression (M-estimation)
    • ridge/elasticnet/lasso regression (from version >= 0.2.1)
    • CART models
    • Random forest
  • Model based, multivariate
    • Imputation based on EM-estimated parameters (from version >= 0.2.1)
    • missForest (from version >= 0.2.1)
  • Donor imputation (including various donor pool specifications)
    • k-nearest neighbour (based on Gower's distance)
    • sequential hotdeck (LOCF, NOCB)
    • random hotdeck
    • Predictive mean matching
  • Other
    • (groupwise) median imputation (optional random residual)
    • Proxy imputation (copy from other variable)

Installation

To install simputation and all packages needed to support the various imputation models, run the following.

```r
install.packages("simputation", dependencies=TRUE)
```

To install the development version:

```bash
git clone https://github.com/markvanderloo/simputation
cd simputation
make install
```

Example usage

Create some data with missing values.

```r
library(simputation) # current package

dat <- iris
# empty a few fields
dat[1:3,1] <- dat[3:7,2] <- dat[8:10,5] <- NA
head(dat,10)
```

Now impute `Sepal.Length` and `Sepal.Width` by regression on `Petal.Length` and `Species`, and impute `Species` using a CART model that uses all other variables (including the imputed variables in this case).

```r
dat |>
  impute_lm(Sepal.Length + Sepal.Width ~ Petal.Length + Species) |>
  impute_cart(Species ~ .) |> # use all variables except 'Species' as predictor
  head(10)
```
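Other methods from the list of supported methods follow the same pattern of data first, then a formula. The sketch below is illustrative only: it assumes the `impute_median()`, `impute_lm()` and `impute_knn()` functions with their `add_residual` and `k` arguments as documented in the package, and the choice of predictor variables is arbitrary.

```r
library(simputation)

dat <- iris
# empty a few numeric fields
dat[1:3, 1] <- dat[3:7, 2] <- NA

# Groupwise median imputation: the right-hand side names the grouping variable
out_median <- impute_median(dat, Sepal.Length ~ Species)

# Linear regression with a random residual drawn from a normal distribution,
# so repeated runs give different imputed values
out_lm <- impute_lm(dat, Sepal.Width ~ Petal.Length + Species,
                    add_residual = "normal")

# Donor imputation: k-nearest neighbours based on Gower's distance
# (k = 5 is an arbitrary choice for illustration)
out_knn <- impute_knn(dat, Sepal.Length ~ Petal.Length + Petal.Width, k = 5)

head(out_median, 10)
```

Because every function takes a data frame and returns a data frame, these calls can also be chained with `|>` in the same way as the regression and CART example.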

Materials

Owner

  • Name: Mark van der Loo
  • Login: markvanderloo
  • Kind: user
  • Location: Netherlands
  • Company: Statistics Netherlands | Tridata

math, programming, data

GitHub Events

Total
  • Watch event: 4
Last Year
  • Watch event: 4

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 174
  • Total Committers: 3
  • Avg Commits per committer: 58.0
  • Development Distribution Score (DDS): 0.011
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Mark van der Loo m****o@g****m 172
Edwin de Jonge e****e@g****m 1
Karl Dunkle Werner k****w 1

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 37
  • Total pull requests: 5
  • Average time to close issues: 9 months
  • Average time to close pull requests: 22 days
  • Total issue authors: 17
  • Total pull request authors: 4
  • Average comments per issue: 1.19
  • Average comments per pull request: 0.6
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • markvanderloo (19)
  • reijeridema (2)
  • sfallahpour (2)
  • jennybc (1)
  • MichaelLeviValensin (1)
  • Ranjeet-S (1)
  • tatsianapek (1)
  • bcjaeger (1)
  • karldw (1)
  • MilesMcBain (1)
  • topepo (1)
  • heejongkim (1)
  • jakobbossek (1)
  • alejosv (1)
  • mbac (1)
Pull Request Authors
  • reijeridema (4)
  • sfallahpour (1)
  • karldw (1)
  • edwindj (1)
Top Labels
Issue Labels
  • enhancement (14)
  • bug (7)
  • question (4)
  • wontfix (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 1,071 last-month
  • Total docker downloads: 21,917
  • Total dependent packages: 2
  • Total dependent repositories: 3
  • Total versions: 11
  • Total maintainers: 1
cran.r-project.org: simputation

Simple Imputation

  • Versions: 11
  • Dependent Packages: 2
  • Dependent Repositories: 3
  • Downloads: 1,071 Last month
  • Docker Downloads: 21,917
Rankings
Stargazers count: 4.3%
Forks count: 5.8%
Downloads: 10.5%
Average: 12.7%
Dependent packages count: 13.7%
Dependent repos count: 16.5%
Docker downloads count: 25.7%
Maintainers (1)
Last synced: 7 months ago

Dependencies

pkg/DESCRIPTION cran
  • R >= 4.0.0 depends
  • MASS * imports
  • VIM * imports
  • glmnet * imports
  • gower * imports
  • missForest * imports
  • norm * imports
  • randomForest * imports
  • rpart * imports
  • stats * imports
  • utils * imports
  • dplyr * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • tinytest * suggests