localFDA

Localization processes for functional data analysis. Software companion for the paper “Localization processes for functional data analysis” by Elías, A., Jiménez, R., and Yukich, J. (2020)

https://github.com/aefdz/localfda


Keywords

classification functional-data-analysis imputation outliers-detection

Repository

Basic Info
  • Host: GitHub
  • Owner: aefdz
  • License: gpl-3.0
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 2.25 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 5 years ago · Last pushed about 5 years ago

README.Rmd

---
title: "Localization processes for Functional Data Analysis"
author: "Antonio Elías"
date: "22/07/2020"
output:
  md_document:
    variant: markdown_github
---

```{r setup, include=FALSE, message = FALSE, warning = FALSE, fig.align = 'center'}
knitr::opts_chunk$set(echo = TRUE)

library(ggplot2)
library(patchwork)
library(dplyr)
```

localFDA
=======


[![License](https://img.shields.io/badge/license-GPL%20v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Travis build status](https://travis-ci.com/aefdz/localFDA.svg?branch=master)](https://travis-ci.com/aefdz/localFDA)


## Overview

Software companion for the paper "Localization processes for functional data analysis" by Antonio Elías, Raúl Jiménez, and Joe Yukich (2020). It provides code for computing localization processes and localization distances, and for applying them to classification and outlier detection problems.

## Installation 

```{r, message = FALSE}
#install the package
devtools::install_github("aefdz/localFDA")
```

```{r}
#load the package
library(localFDA)
```

## Test usage
Load the example data and plot it.

```{r}
X <- exampleData
n <- ncol(X)
p <- nrow(X)
t <- as.numeric(rownames(X))

#plot the data set
df_functions <- data.frame(ids = rep(colnames(X), each = p),
                           y = c(X),
                           x = rep(t, n)
                           )

functions_plot <- ggplot(df_functions) + 
                  geom_line(aes(x = x, y = y, group = ids, color = ids), 
                            color = "black", alpha = 0.25) + 
                  xlab("t") + theme(legend.position = "none")


functions_plot
```

### Compute *kth empirical localization processes*
This is the empirical version of Equation (1) of the paper. For a single focal curve:

```{r}
focal <- "1"

localizationProcesses_focal <- localizationProcesses(X, focal)$lc
```

Plot the localization processes of order $1, 50, 100$ and $200$:

```{r}
df_lc <- data.frame(k = rep(colnames(localizationProcesses_focal), each = p),
                           y = c(localizationProcesses_focal),
                           x = rep(t, n-1)
                           )

lc_plots <- list()
ks <- c(1, 50, 100, 200)

# one panel per order k: localization process in blue, focal curve in dashed red
for(i in seq_along(ks)){
  lc_plots[[i]] <- functions_plot + 
                   geom_line(data = filter(df_lc, k == paste0("k=", ks[i])), 
                             aes(x = x, y = y, group = k), 
                             color = "blue", size = 1) +
                   geom_line(data = filter(df_functions, ids == focal), 
                             aes(x = x, y = y, group = ids), 
                             color = "red", linetype = "dashed", size = 1) +
                   ggtitle(paste("k =", ks[i]))
}

wrap_plots(lc_plots)

```
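To make the construction behind these plots concrete, here is a minimal base R sketch on simulated curves. It assumes that the kth localization process takes, at each grid point, the value of the sample curve that is kth closest to the focal curve at that point. The names `X_sim` and `kth_localization_process` are hypothetical and are not part of localFDA; the package implementation may differ.

```r
# Simulated stand-in for the example data: p grid points (rows), n curves (columns)
set.seed(1)
p <- 50
n <- 20
grid <- seq(0, 1, length.out = p)
X_sim <- sapply(seq_len(n), function(j) {
  sin(2 * pi * grid) + rnorm(1) + rnorm(p, sd = 0.1)
})
colnames(X_sim) <- as.character(seq_len(n))

# Hypothetical helper (an assumption, not the package code): at each grid point,
# return the value of the curve that is kth closest to the focal curve there.
kth_localization_process <- function(X, focal, k) {
  y0 <- X[, focal]
  others <- X[, colnames(X) != focal, drop = FALSE]
  vapply(seq_len(nrow(X)), function(i) {
    ord <- order(abs(others[i, ] - y0[i]))
    unname(others[i, ord[k]])
  }, numeric(1))
}

lp1 <- kth_localization_process(X_sim, focal = "1", k = 1)
lp5 <- kth_localization_process(X_sim, focal = "1", k = 5)
```

Under this reading, the process of order 1 is never farther from the focal curve than the process of order 5 at any grid point, which matches how the blue curves move away from the red one as k grows.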

### Compute *kth empirical localization distances*
These correspond to Equation (18) of the paper. For a single focal curve:

```{r}
localizationDistances_focal <- localizationDistances(X, focal)

head(localizationDistances_focal)
```

Plot the localization distances:

```{r}
df_ld <- data.frame(k = names(localizationDistances_focal),
                           y = localizationDistances_focal,
                           x = seq_len(n - 1)
                           )


ldistances_plot <- ggplot(df_ld, aes(x = x, y = y)) + 
                   geom_point() + 
                   ggtitle("Localization distances for one focal") + 
                   xlab("kth") + ylab("L")

ldistances_plot
```
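The distances plotted above can be approximated by hand. The sketch below assumes an L1-type distance, that is, the integral over t of the absolute gap between the focal curve and its kth localization process, evaluated with the trapezoidal rule. `X_sim` and `localization_distances_sketch` are hypothetical names used only for illustration; this is not the package code and may not match Equation (18) exactly.

```r
# Simulated curves: p grid points (rows), n curves (columns)
set.seed(2)
p <- 50
n <- 20
grid <- seq(0, 1, length.out = p)
X_sim <- sapply(seq_len(n), function(j) {
  cos(2 * pi * grid) + rnorm(1) + rnorm(p, sd = 0.1)
})
colnames(X_sim) <- as.character(seq_len(n))

# Hypothetical helper: distances of every order k = 1, ..., n - 1 for one focal
localization_distances_sketch <- function(X, focal, grid) {
  y0 <- X[, focal]
  others <- X[, colnames(X) != focal, drop = FALSE]
  # Row i holds the sorted pointwise gaps |x_j(t_i) - x_0(t_i)|, so column k is
  # the gap between the focal curve and its kth localization process.
  gaps <- t(apply(unname(abs(others - y0)), 1, sort))
  # Trapezoidal rule over the grid, one integral per order k
  w <- diff(grid)
  apply(gaps, 2, function(g) sum(w * (head(g, -1) + tail(g, -1)) / 2))
}

ld <- localization_distances_sketch(X_sim, focal = "1", grid = grid)
```

Because the pointwise gaps are sorted before integrating, the resulting distances are nondecreasing in k.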

### Sample $\mu$ and $\sigma$ 

```{r}
localizationStatistics_full <- localizationStatistics(X, robustify = TRUE)

# trimmed mean and sd estimates for k = 1, 100, 200, 400, 600

localizationStatistics_full$trim_mean[c(1, 100, 200, 400, 600)]
localizationStatistics_full$trim_sd[c(1, 100, 200, 400, 600)]
```
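As a rough illustration of what such summaries look like, the sketch below computes plain and trimmed statistics column by column on a simulated matrix of distances, with one row per focal curve and one column per order k. The matrix `D`, the 10% trimming level, and the quantile-based trimmed standard deviation are all assumptions made for this example; they are not taken from the package, whose `robustify = TRUE` option may work differently.

```r
set.seed(3)
# Hypothetical distances: one row per focal curve, one column per order k.
# Each row is sorted so that distances grow with k.
n_focal <- 30
n_orders <- 10
D <- t(replicate(n_focal, sort(rexp(n_orders))))

# Plain summaries across focal curves
plain_mean <- colMeans(D)
plain_sd <- apply(D, 2, sd)

# Trimmed summaries: drop the most extreme 10% on each side (assumed level)
trimmed_mean <- apply(D, 2, mean, trim = 0.1)
trimmed_sd <- apply(D, 2, function(d) {
  keep <- d >= quantile(d, 0.1) & d <= quantile(d, 0.9)
  sd(d[keep])
})
```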

### Classification

```{r}
X <- classificationData

ids_training <- sample(colnames(X), 90)
ids_testing <- setdiff(colnames(X), ids_training)

trainingSample <- X[, ids_training]
testSample <- X[, ids_testing]
colnames(testSample) <- NULL # blind the test sample (drop its column names)
classNames <- c("G1", "G2")

classification_results <- localizationClassifier(trainingSample, testSample, classNames, k_opt = 3)

checking <- data.frame(real_class = ids_testing, 
                       predicted_class = classification_results$test$predicted_class)

checking
```
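A table like `checking` can be summarized with a confusion matrix and an accuracy rate. The sketch below uses two short, made-up label vectors rather than the package output, and assumes that the true class of each test curve is available as a plain "G1"/"G2" label.

```r
# Hypothetical labels for ten test curves (not real results)
classes <- c("G1", "G2")
real_class <- factor(c("G1", "G1", "G2", "G2", "G1", "G2", "G2", "G1", "G1", "G2"),
                     levels = classes)
predicted_class <- factor(c("G1", "G2", "G2", "G2", "G1", "G2", "G1", "G1", "G1", "G2"),
                          levels = classes)

# Rows are the true classes, columns the predicted ones
confusion <- table(real = real_class, predicted = predicted_class)

# Share of test curves on the diagonal, i.e. correctly classified
accuracy <- sum(diag(confusion)) / sum(confusion)
```

Fixing the factor levels keeps the table square even if one class is never predicted.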

### Outlier detection

```{r}
X <- outlierData

outliers <- outlierLocalizationDistance(X, localrule = 0.95, whiskerrule = 1.5)

outliers$outliers_ld_rule
```
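The `whiskerrule = 1.5` argument suggests a boxplot-style rule. As a sketch under that assumption only, the code below flags curves whose distance exceeds the upper quartile by more than 1.5 times the interquartile range. The vector `d` and the helper `whisker_rule` are hypothetical, the `localrule` argument is not modelled, and the actual rule in `outlierLocalizationDistance` may differ.

```r
set.seed(4)
# Hypothetical localization distances of a single order for 50 curves;
# the last two values are deliberately inflated.
d <- c(rexp(48, rate = 5), 10, 12)
names(d) <- as.character(seq_along(d))

# Hypothetical helper: names of the curves above the upper boxplot whisker
whisker_rule <- function(d, whisker = 1.5) {
  q <- quantile(d, c(0.25, 0.75), names = FALSE)
  upper <- q[2] + whisker * (q[2] - q[1])
  names(d)[d > upper]
}

flagged <- whisker_rule(d)
```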

Plot the results:

```{r}
df_functions <- data.frame(ids = rep(colnames(X), each = nrow(X)),
                           y = c(X),
                           x = rep(seq(from = 0, to = 1, length.out = nrow(X)), ncol(X)))

functions_plot <- ggplot(df_functions) + 
                  geom_line(aes(x = x, y = y, group = ids), 
                            color = "black") + 
                  xlab("t") + 
                  theme(legend.position = "bottom") +
                  geom_line(data = df_functions[df_functions$ids %in% outliers$outliers_ld_rule, ],
                            aes(x = x, y = y, group = ids, color = ids), size = 1) +
                  guides(color = guide_legend(title = "Detected outliers"))

functions_plot 

```

## References

Elías, Antonio, Jiménez, Raúl, and Yukich, Joe (2020). Localization processes for functional data analysis. arXiv:2007.16059. <https://arxiv.org/abs/2007.16059>

Owner

  • Name: Antonio Elías
  • Login: aefdz
  • Kind: user
  • Location: Málaga, Spain
  • Company: OASYS, Universidad de Málaga

PhD in Statistics


Committers


All Time
  • Total Commits: 30
  • Total Committers: 2
  • Avg Commits per committer: 15.0
  • Development Distribution Score (DDS): 0.067
Top Committers
Name Email Commits
Antonio a****s@e****s 28
Antonio Elías a****s@u****s 2

Issues and Pull Requests


All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • cran 164 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: localFDA

Localization Processes for Functional Data Analysis

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 164 Last month
Rankings
Forks count: 21.9%
Dependent packages count: 29.8%
Stargazers count: 35.2%
Dependent repos count: 35.5%
Average: 40.6%
Downloads: 80.8%
Maintainers (1)

Dependencies

DESCRIPTION cran
  • R >= 2.10 depends
  • graphics * imports
  • stats * imports