Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: JornLotsch
  • Language: HTML
  • Default Branch: main
  • Size: 3.52 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme

README.md

detectXOR: XOR pattern detection and visualization in R

Provides tools for detecting XOR-like patterns in variable pairs. Includes visualizations for pattern exploration.

Overview

Traditional feature selection methods often miss complex non-linear relationships where variables interact to produce class differences. The detectXOR package specifically targets XOR patterns: relationships in which class discrimination emerges only through the interaction of variables, not from any individual variable alone.
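To make the idea concrete, the following base R sketch simulates a textbook XOR pattern. It does not use detectXOR itself; the variable names (`x`, `y`, `cls`) and the simulated data are assumptions made for illustration only.

```r
# Simulate two uniformly distributed variables and an XOR class rule
set.seed(42)
n <- 400
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)

# Class 1 when the two variables have opposite signs, class 0 otherwise
cls <- as.integer((x > 0) != (y > 0))

# Univariate view: rank-sum tests on each variable taken alone
p_x <- wilcox.test(x ~ cls)$p.value
p_y <- wilcox.test(y ~ cls)$p.value

# Interaction view: the sign of the product recovers the class rule
interaction_term <- x * y
accuracy <- mean((interaction_term < 0) == (cls == 1))

cat("Wilcoxon p (x alone):", signif(p_x, 3), "\n")
cat("Wilcoxon p (y alone):", signif(p_y, 3), "\n")
cat("Accuracy from the sign of x * y:", accuracy, "\n")
```

In data like this, each variable has the same marginal distribution in both classes, so a univariate filter has nothing to pick up, while the product term encodes the class rule directly.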

Key capabilities

🔍 XOR pattern detection - Statistical identification using χ² and Wilcoxon tests
📈 Correlation analysis - Class-wise Kendall τ coefficients
📊 Visualization - Spaghetti plots and decision boundary visualizations
Parallel processing - Multi-core acceleration for large datasets
🔬 Robust statistics - Winsorization and scaling options for outlier handling

Installation

The R library is available on the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/package=detectXOR

Install the library from CRAN:

```r
# Install detectXOR
install.packages("detectXOR")
```

Dependencies

The package requires R ≥ 3.5.0 and depends on:

- dplyr, tibble (data manipulation)
- ggplot2, ggh4x, scales (visualization)
- future, future.apply, pbmcapply, parallel (parallel processing)
- reshape2, glue (data processing and string manipulation)
- DescTools (statistical tools)
- Base R packages: stats, utils, methods, grDevices

Optional packages (suggested):

- testthat, knitr, rmarkdown (development and documentation)
- doParallel, foreach (additional parallel processing options)

Quick start

Basic XOR detection

```r
library(detectXOR)

# Load example data
data(XOR_data)

# Detect XOR patterns with default settings
results <- detectXOR(XOR_data, class_col = "class")

# View summary
print(results$results_df)
```

Usage with custom parameters

```r
# Detection with custom thresholds and parallel processing
results <- detectXOR(
  data = XOR_data,
  class_col = "class",
  p_threshold = 0.01,
  tau_threshold = 0.4,
  max_cores = 4,
  extreme_handling = "winsorize",
  scale_data = TRUE
)
```

Function parameters

detectXOR() - Main detection function

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| data | data.frame | required | Input dataset with variables and class column |
| class_col | character | "class" | Name of the class/target variable column |
| check_tau | logical | TRUE | Compute class-wise Kendall τ correlations |
| compute_axes_parallel_significance | logical | TRUE | Perform group-wise Wilcoxon tests |
| p_threshold | numeric | 0.05 | Significance threshold for statistical tests |
| tau_threshold | numeric | 0.3 | Minimum absolute τ for "strong" correlation |
| abs_diff_threshold | numeric | 20 | Minimum absolute difference for practical significance |
| split_method | character | "quantile" | Tile splitting method: "quantile" or "range" |
| max_cores | integer | NULL | Maximum cores for parallel processing (auto-detect if NULL) |
| extreme_handling | character | "winsorize" | Outlier handling: "winsorize", "remove", or "none" |
| winsor_limits | numeric vector | c(0.05, 0.95) | Winsorization percentiles |
| scale_data | logical | TRUE | Standardize variables before analysis |
| use_complete | logical | TRUE | Use only complete cases (remove NA values) |
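As a rough illustration of what `extreme_handling = "winsorize"`, `winsor_limits = c(0.05, 0.95)` and `scale_data = TRUE` imply, here is a minimal base R sketch. The helper `winsorize_and_scale()` is hypothetical and not part of detectXOR; the package lists DescTools among its dependencies, so its internal winsorization may differ in detail.

```r
# Hypothetical helper: clamp values to the given quantiles, then z-standardize
winsorize_and_scale <- function(v, limits = c(0.05, 0.95), scale = TRUE) {
  # Lower and upper bounds taken from the empirical quantiles
  bounds <- quantile(v, probs = limits, na.rm = TRUE, names = FALSE)
  # Values outside the bounds are pulled back to the nearest bound
  v <- pmin(pmax(v, bounds[1]), bounds[2])
  # Optional standardization to mean 0 and standard deviation 1
  if (scale) {
    v <- (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)
  }
  v
}

# Simulated variable with two gross outliers
set.seed(1)
raw <- c(rnorm(98), 25, -30)
clean <- winsorize_and_scale(raw)

range(raw)    # dominated by the two outliers
range(clean)  # outliers clamped before scaling
```

Clamping before scaling keeps a few extreme values from inflating the standard deviation and compressing the rest of the data.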

Output structure

The detectXOR() function returns a list with two components:

results_df - Summary data frame

| Column | Description |
| --- | --- |
| var1, var2 | Variable pair names |
| xor_shape_detected | Logical: XOR pattern identified |
| chi_sq_p_value | χ² test p-value for tile independence |
| practical_significance | Logical: meets practical significance threshold |
| tau_class_0, tau_class_1 | Class-wise Kendall τ coefficients |
| tau_difference | Absolute difference between class τ values |
| wilcox_p_x, wilcox_p_y | Wilcoxon test p-values for each axis |
| significant_wilcox | Logical: significant group differences detected |

pair_list - Detailed results

Contains comprehensive analysis for each variable pair including:

- Tile pattern analysis results
- Statistical test outputs
- Processed data subsets
- Intermediate calculations
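Because the summary is an ordinary data frame, it can be filtered and sorted with base R. The sketch below works on a hand-built mock that reuses the documented column names; the values are invented for illustration and are not output from detectXOR.

```r
# Mock of results_df with a subset of the documented columns (invented values)
results_df <- data.frame(
  var1 = c("A", "A", "B"),
  var2 = c("B", "C", "C"),
  xor_shape_detected = c(TRUE, FALSE, TRUE),
  chi_sq_p_value = c(0.001, 0.40, 0.03),
  tau_class_0 = c(0.52, 0.05, 0.31),
  tau_class_1 = c(-0.48, 0.02, -0.12),
  stringsAsFactors = FALSE
)
results_df$tau_difference <- abs(results_df$tau_class_0 - results_df$tau_class_1)

# Keep pairs flagged as XOR with a significant chi-squared test
detected <- subset(results_df, xor_shape_detected & chi_sq_p_value < 0.05)

# Rank them by how strongly the class-wise correlations diverge
detected <- detected[order(-detected$tau_difference),
                     c("var1", "var2", "tau_difference")]
print(detected)
```

With a real result object, the same filter would be applied to `results$results_df`.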

Visualization functions

| Function | Description | Key Parameters |
| --- | --- | --- |
| generate_spaghetti_plot_from_results() | Creates connected line plots showing variable trajectories for XOR-detected pairs | results, data, class_col, scale_data = TRUE |
| generate_xy_plot_from_results() | Generates scatter plots with decision boundary lines for detected XOR patterns | results, data, class_col, scale_data = TRUE, quantile_lines = c(1/3, 2/3), line_method = "quantile" |

Both functions return ggplot objects that can be displayed or saved manually.

```r
# Generate plots
generate_spaghetti_plot_from_results(results, XOR_data)
generate_xy_plot_from_results(results, XOR_data)
```

Example plots

Reporting functions

| Function | Description | Key Parameters |
| --- | --- | --- |
| generate_xor_reportConsole() | Creates console-friendly formatted report with optional plots | results, data, class_col, scale_data = TRUE, show_plots = TRUE |
| generate_xor_reportHTML() | Generates comprehensive HTML report with interactive elements | results, data, class_col, output_file, open_browser = TRUE |

Example report

```r
# Generate formatted report
generate_xor_reportHTML(results, XOR_data, class_col = "class")
```

The report will be opened automatically in the system's default web browser.

Methodology

XOR detection pipeline

  1. Pairwise dataset creation - Extract all variable pairs with preprocessing
  2. Tile pattern analysis - Divide variable space into 2×2 tiles and test for XOR-like distributions
  3. Statistical validation - Apply χ² tests for independence and Wilcoxon tests for group differences
  4. Correlation analysis - Compute class-wise Kendall τ to quantify relationship strength
  5. Result aggregation - Combine findings into interpretable summary format
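Step 2 of the pipeline can be sketched in base R as follows. This is a simplified illustration under stated assumptions (simulated data, a median split per axis, a χ² test of tile against class); the package's own tile construction and XOR criteria are not reproduced here and may differ.

```r
# Simulated XOR data (illustrative only)
set.seed(123)
n <- 400
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
cls <- as.integer((x > 0) != (y > 0))

# Assign each observation to one of four tiles using a median split per axis
tile <- interaction(
  ifelse(x > median(x), "x_high", "x_low"),
  ifelse(y > median(y), "y_high", "y_low")
)

# Contingency table of tile by class, and a chi-squared test of independence
tile_by_class <- table(tile, cls)
chi_sq <- chisq.test(tile_by_class)

# Share of class 1 within each tile
class1_share <- prop.table(tile_by_class, margin = 1)[, "1"]

print(tile_by_class)
print(round(class1_share, 2))
cat("Chi-squared p-value:", signif(chi_sq$p.value, 3), "\n")
```

An XOR-like layout shows up as a checkerboard: one class dominates the two diagonal tiles and the other class dominates the two off-diagonal tiles.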

Statistical tests

  • χ² Test: Tests independence of tile patterns vs. random distribution
  • Wilcoxon rank sum: Evaluates group differences along variable axes
  • Kendall τ: Measures monotonic correlation within each class separately
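The class-wise Kendall τ idea can be illustrated with `stats::cor()`. The sketch uses simulated XOR data and borrows the column names `tau_class_0`, `tau_class_1` and `tau_difference` from the output table; it is not the package's internal implementation.

```r
# Simulated XOR data (illustrative only)
set.seed(7)
n <- 300
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
cls <- as.integer((x > 0) != (y > 0))

# Correlation over all observations, ignoring class
tau_overall <- cor(x, y, method = "kendall")

# Correlation computed separately within each class
tau_class_0 <- cor(x[cls == 0], y[cls == 0], method = "kendall")
tau_class_1 <- cor(x[cls == 1], y[cls == 1], method = "kendall")
tau_difference <- abs(tau_class_0 - tau_class_1)

round(c(overall = tau_overall,
        class_0 = tau_class_0,
        class_1 = tau_class_1,
        difference = tau_difference), 2)
```

In an XOR pattern the two classes tend to show correlations of opposite sign that cancel in the pooled data, which is why τ is computed per class.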

Use cases

Machine learning

  • Feature selection enhancement - Identify interaction features that complement traditional univariate methods
  • Variable interaction discovery - Find synergistic variable pairs where class separation emerges only through combined effects
  • Preprocessing for ensemble methods - Generate interaction features for boosting algorithms and neural networks
  • Dimensionality reduction guidance - Preserve important variable interactions when reducing feature space

Technical details

Cross-platform compatibility

  • Windows: Uses future::multisession for parallel processing
  • Unix/Linux/macOS: Uses pbmcapply::pbmclapply with fork-based parallelism
  • Memory management: Automatic chunk-based processing for large datasets
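The platform split described above can be sketched with the base `parallel` package. This swaps in `parallel::parLapply()` and `parallel::mclapply()` as stand-ins for `future::multisession` and `pbmcapply::pbmclapply()`; the helper `run_in_parallel()` is hypothetical and not part of detectXOR.

```r
library(parallel)

# Hypothetical helper: socket cluster on Windows, forking elsewhere
run_in_parallel <- function(items, fun, max_cores = 2) {
  if (.Platform$OS.type == "windows") {
    # Windows cannot fork, so start separate worker sessions
    cl <- makeCluster(max_cores)
    on.exit(stopCluster(cl), add = TRUE)
    parLapply(cl, items, fun)
  } else {
    # Unix-alikes can fork the current R process
    mclapply(items, fun, mc.cores = max_cores)
  }
}

# Enumerate all variable pairs and process them in parallel
pairs <- combn(c("A", "B", "C", "D"), 2, simplify = FALSE)
labels <- run_in_parallel(pairs, function(p) paste(p, collapse = "-"))
unlist(labels)
```

Pairwise analyses are independent of each other, so distributing variable pairs across workers needs no communication between them.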

Package structure

detectXOR/
├── R/          # Package source code
├── man/        # Package documentation
├── data/       # Example dataset
├── issues/     # Problem reporting
└── analyses/   # Files used to generate or plot publication data sets (not in library)

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0)

Citation

Lötsch, J, Kringel D & Ultsch A. (2025). Generalized extended OR (XOR) pattern discovery: Accurate predictors without statistical significance in multivariate data. [submitted]

Owner

  • Name: Jörn Lötsch
  • Login: JornLotsch
  • Kind: user
  • Company: Goethe - Universität

GitHub Events

Total
  • Push event: 14
Last Year
  • Push event: 14

Packages

  • Total packages: 1
  • Total downloads:
    • cran 276 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: detectXOR

XOR Pattern Detection and Visualization

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 276 Last month
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.2%
Average: 48.3%
Downloads: 86.4%
Last synced: 10 months ago

Dependencies

DESCRIPTION cran
  • DescTools >= 0.99.50 imports
  • dplyr >= 1.1.0 imports
  • future >= 1.28.0 imports
  • future.apply >= 1.10.0 imports
  • ggh4x >= 0.2.3 imports
  • ggplot2 >= 3.4.0 imports
  • glue >= 1.6.0 imports
  • methods * imports
  • parallel >= 4.2.0 imports
  • pbmcapply >= 1.5.0 imports
  • reshape2 >= 1.4.4 imports
  • scales >= 1.2.0 imports
  • stats * imports
  • tibble >= 3.1.8 imports
  • utils * imports
  • doParallel * suggests
  • foreach * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests