kerntools
Kernel Functions and Tools for Machine Learning Applications
Science Score: 36.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: links to biorxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (19.0%) to scientific vocabulary
Keywords
kernel-methods
pca
r-package
Last synced: 5 months ago
Repository
Basic Info
- Host: GitHub
- Owner: elies-ramon
- License: gpl-3.0
- Language: R
- Default Branch: master
- Homepage: https://elies-ramon.github.io/kerntools/
- Size: 4.45 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "65%"
)
```
# kerntools
The goal of `kerntools` is to provide R tools for working with a family of Machine Learning methods called kernel methods. It can be used to complement other R packages like `kernlab`. Right now, `kerntools` implements several kernel functions for non-negative and real vectors, real matrices, categorical and ordinal variables, sets, and strings. Several tools for studying a kernel matrix, or for comparing two kernel matrices, are available. These diagnostic tools may be used to assess how suitable a kernel matrix (or several) is for training models. The package also provides functions for computing the feature importance of Support Vector Machine (SVM) models and for displaying customizable kernel Principal Component Analysis (PCA) plots. For convenience, widely used performance measures and feature importance barplots are also available.
If you want to see a real-life application of `kerntools`, you can check the following paper:
* Ramon, Elies. *Unraveling HIV protease drug resistance and genetic diversity with kernel methods.* bioRxiv 2025.03.26.644092; doi: [https://doi.org/10.1101/2025.03.26.644092](https://www.biorxiv.org/content/10.1101/2025.03.26.644092v1).
## Installation
### Installation and loading
Installing `kerntools` is easy. In the R console:
```{r,eval=FALSE}
install.packages("kerntools")
```
Once the package is installed, it can be loaded at any time by typing:
```{r loading}
library(kerntools)
```
### Dependencies
`kerntools` requires R (>= 2.10). Currently, it also relies on the following packages:
* `dplyr`
* `ggplot2`
* `kernlab`
* `methods`
* `reshape2`
* `stringi`
Usually, if any of these packages are missing from your library, they will be installed automatically when `kerntools` is installed.
## A quick example: kernel PCA
Imagine that you want to produce a (kernel) PCA plot but your dataset consists of categorical variables. This can be done very easily with `kerntools`! First, you choose an appropriate kernel for your data (in this example, the Dirac kernel for categorical variables), and then you pass the output of the `Dirac()` function to the `kPCA()` function.
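```{r example}
#| fig.alt: >
#| Dirac kernel PCA.
# Peek at the categorical toy dataset shipped with kerntools
head(showdata)
# Dirac kernel matrix computed from the first four columns
KD <- Dirac(showdata[, 1:4])
# Kernel PCA showing PC1 vs PC2, with `Liked.new.show` as the variable of interest
dirac_kpca <- kPCA(KD, plot = c(1, 2), title = "Survey", name_leg = "Liked the show?",
                   y = showdata$Liked.new.show, ellipse = 0.66)
# Display the resulting plot
dirac_kpca$plot
```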
You can customize your kernel PCA plot: apart from picking which principal components to display (in the example, PC1 and PC2), you may want to add a title or a legend, or use different colors to represent an additional variable of interest, so you can look for patterns in your data. To see in detail how to customize a `kPCA()` plot, please refer to the documentation. The projection matrix is also returned (`dirac_kpca$projection`), so you can use it for further analyses or to create your own plot.
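To give an intuition of what happens under the hood, here is a minimal base-R sketch of a Dirac (overlap) kernel followed by a kernel PCA. It is not the `kerntools` implementation: the helper names `dirac_sketch()` and `kpca_sketch()`, the toy data, and the choice of averaging the matches over variables are all assumptions made for illustration.

```r
# Hypothetical toy data (not the `showdata` dataset).
toy <- data.frame(
  colour = c("red", "red", "blue", "blue", "green"),
  size   = c("S", "M", "M", "L", "L"),
  stringsAsFactors = FALSE
)

# Dirac (overlap) kernel: proportion of variables on which two rows agree.
# Averaging over variables is an assumption of this sketch.
dirac_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  K <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      K[i, j] <- mean(X[i, ] == X[j, ])
    }
  }
  K
}

# Kernel PCA: double-centre K, eigendecompose it, and scale the
# eigenvectors by the square root of their eigenvalues.
kpca_sketch <- function(K, tol = 1e-10) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)   # centring matrix
  Kc <- H %*% K %*% H
  eig <- eigen(Kc, symmetric = TRUE)
  keep <- eig$values > tol             # drop null directions
  proj <- eig$vectors[, keep, drop = FALSE] %*%
    diag(sqrt(eig$values[keep]), nrow = sum(keep))
  colnames(proj) <- paste0("PC", seq_len(ncol(proj)))
  proj
}

K_toy <- dirac_sketch(toy)
proj_toy <- kpca_sketch(K_toy)
round(proj_toy[, 1:2], 3)              # coordinates on PC1 and PC2
```

The rows of `proj_toy` play the same role as the projection matrix mentioned above: they are the coordinates of each sample on the principal components, ready for plotting or further analysis.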
## Main kerntools features
Right now, `kerntools` provides kernels for the following kinds of data:
- Real vectors: Linear, RBF and Laplacian kernels.
- Real matrices: Frobenius kernel.
- Counts or frequencies (non-negative numbers): Bray-Curtis and Ruzicka (quantitative Jaccard) kernels.
- Compositional data (relative frequencies or proportions): Compositional-Linear and Aitchison kernels.
- Categorical data: Overlap / Dirac kernel.
- Sets: Intersect and Jaccard kernels.
- Ordinal data and rankings: Kendall's tau kernel.
- Strings, sequences and short texts: Spectrum kernel.
- Bag-of-words (text documents, sometimes images): Chi-squared kernel.
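As an illustration of how such kernels turn a dataset into a kernel matrix, the following base-R sketch computes an RBF kernel and a Ruzicka (quantitative Jaccard) kernel by hand. The function names `rbf_sketch()` and `ruzicka_sketch()`, the `gamma` parameter and the toy counts are assumptions for this example only; the functions shipped with `kerntools` may use different names, arguments and parametrizations.

```r
# Hypothetical toy count data: 3 samples (rows) x 3 features (columns).
counts <- matrix(c(1, 2, 3,
                   2, 2, 2,
                   0, 5, 1), nrow = 3, byrow = TRUE)

# RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
rbf_sketch <- function(X, gamma = 1) {
  D2 <- as.matrix(dist(X))^2           # squared Euclidean distances
  exp(-gamma * D2)
}

# Ruzicka kernel for non-negative data:
# k(x, y) = sum(min(x_i, y_i)) / sum(max(x_i, y_i))
ruzicka_sketch <- function(X) {
  n <- nrow(X)
  K <- diag(n)                         # k(x, x) = 1
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i < j) {
        K[i, j] <- sum(pmin(X[i, ], X[j, ])) / sum(pmax(X[i, ], X[j, ]))
        K[j, i] <- K[i, j]             # kernel matrices are symmetric
      }
    }
  }
  K
}

K_rbf <- rbf_sketch(counts, gamma = 0.5)
K_ruz <- ruzicka_sketch(counts)
K_ruz
```

Both results are symmetric matrices with one row and one column per sample, which is the kind of object the rest of the tooling (kernel PCA, SVMs, comparison of kernel matrices) operates on.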
Several tools for visualizing and comparing kernel matrices are provided.
Regarding kernel PCA, `kerntools` allows the user to:
- Compute a kernel PCA from any kernel matrix, be it computed with `kerntools` or provided by the user.
- Display customizable PCA plots.
- (When possible) Compute and display the contribution of variables to each principal component.
- Compare two or more PCAs generated from the same set of samples using Co-inertia and Procrustes analysis.
When using some specific kernels, `kerntools` computes the importance of each variable or feature in a Support Vector Machine (SVM) model. `kerntools` does not train SVMs or other prediction models, but it can recover the feature importance of models fitted with other packages (for instance `kernlab`). These importances can be sorted and summarized in a customizable barplot.
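For the simplest case, a linear kernel, the idea can be sketched with `kernlab` alone: the weight of each feature is recovered from the support vectors and their dual coefficients. This is only an illustrative sketch, not the `kerntools` routine; using the absolute weight as the importance measure, and the `iris` subset as data, are assumptions made here.

```r
library(kernlab)

# Two-class toy problem: the first 100 rows of iris (setosa vs versicolor).
x <- as.matrix(iris[1:100, 1:4])
y <- droplevels(iris$Species[1:100])

# Linear-kernel SVM fitted with kernlab; no scaling, so that the weights
# refer to the original features.
model <- ksvm(x, y, type = "C-svc", kernel = "vanilladot", C = 1, scaled = FALSE)

# For a two-class model, coef(model)[[1]] holds alpha_i * y_i for each
# support vector, so w = sum_i alpha_i * y_i * x_i.
w <- colSums(coef(model)[[1]] * x[SVindex(model), , drop = FALSE])

# Assumption of this sketch: rank features by absolute weight.
importance <- sort(abs(w), decreasing = TRUE)
importance
```

With non-linear kernels the weights live in the feature space induced by the kernel, which is why this kind of recovery is only possible for some specific kernels.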
Finally, the following performance measures for regression, binary and multi-class classification are implemented:
- Regression: Normalized Mean Squared Error
- Classification: accuracy, specificity, sensitivity, precision and F1 with (optional) confidence intervals, computed using normal approximation or bootstrapping.
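To make these measures concrete, here is a small base-R sketch that computes them by hand on made-up predictions. The helper `nmse_sketch()` and all the data are hypothetical, and normalizing the mean squared error by the variance of the target is an assumption of this sketch; please check the package documentation for the exact definitions used by `kerntools`.

```r
# Regression: mean squared error normalized by the target's variance
# (assumed definition for this sketch).
nmse_sketch <- function(target, pred) {
  mean((target - pred)^2) / mean((target - mean(target))^2)
}
nmse_toy <- nmse_sketch(target = c(1, 2, 3, 4), pred = c(1, 2, 3, 5))

# Binary classification on made-up labels; "yes" is the positive class.
truth <- factor(c("yes", "yes", "yes", "no", "no", "no", "no", "yes"),
                levels = c("no", "yes"))
pred  <- factor(c("yes", "yes", "no", "no", "no", "yes", "no", "yes"),
                levels = c("no", "yes"))
ct <- table(truth = truth, pred = pred)   # confusion matrix

tp <- ct["yes", "yes"]; tn <- ct["no", "no"]
fp <- ct["no", "yes"];  fn <- ct["yes", "no"]

accuracy    <- unname((tp + tn) / sum(ct))
sensitivity <- unname(tp / (tp + fn))     # also called recall
specificity <- unname(tn / (tn + fp))
precision   <- unname(tp / (tp + fp))
f1          <- 2 * precision * sensitivity / (precision + sensitivity)

# 95% confidence interval for the accuracy, normal approximation.
n <- sum(ct)
half_width <- qnorm(0.975) * sqrt(accuracy * (1 - accuracy) / n)
acc_ci <- c(lower = accuracy - half_width, upper = accuracy + half_width)

c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity,
  precision = precision, F1 = f1)
```

With so few samples the normal-approximation interval is very wide and can exceed 1, which is one reason a bootstrap interval may be preferable for small test sets.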
## Example data
`kerntools` contains a categorical toy dataset called `showdata` and a real-world count dataset called `soil`.
## Documentation
### Vignette
For detailed, step-by-step examples that illustrate the main use cases of `kerntools`, please have a look at the vignettes:
```{r,eval=FALSE}
browseVignettes("kerntools")
```
The basic vignette covers the typical `kerntools` workflow. Thorough documentation of the kernel functions implemented in this package is in the "Kernel functions" vignette. If you instead want to know more about kernel PCA and Co-inertia analysis, you can refer to the corresponding vignette.
### Additional help
Remember that detailed, argument-by-argument documentation is available for each function:
```{r,eval=FALSE}
help(kPCA) ## or the specific name of the function
?kPCA
```
The documentation of the example datasets is available in the same way, by typing:
```{r,eval=FALSE}
help(showdata)
?showdata
```
### More about kernels
To learn more about kernel functions, matrices and methods, you can consult the following reference materials:
- Bishop, C. M. (2006). *Pattern recognition and machine learning*. Chapter 6, pp. 291-323. New York: Springer.
- Müller, K. R., Mika, S., Tsuda, K., & Schölkopf, B. (2018). *An introduction to kernel-based learning algorithms*. In Handbook of neural network signal processing (pp. 4-1). CRC Press.
- Shawe-Taylor, J., & Cristianini, N. (2004). *Kernel methods for pattern analysis*. Cambridge University Press.
Owner
- Login: elies-ramon
- Kind: user
- Repositories: 1
- Profile: https://github.com/elies-ramon
GitHub Events
Total
- Issues event: 2
- Watch event: 1
- Issue comment event: 1
- Push event: 6
Last Year
- Issues event: 2
- Watch event: 1
- Issue comment event: 1
- Push event: 6
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: 1 day
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 1 day
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Beliavsky (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 155 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
cran.r-project.org: kerntools
Kernel Functions and Tools for Machine Learning Applications
- Homepage: https://github.com/elies-ramon/kerntools
- Documentation: http://cran.r-project.org/web/packages/kerntools/kerntools.pdf
- License: GPL (≥ 3)
- Latest release: 1.2.0 (published about 1 year ago)
Rankings
Dependent packages count: 28.3%
Dependent repos count: 35.0%
Average: 50.0%
Downloads: 86.7%
Maintainers (1)
Last synced: 6 months ago
Dependencies
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.5.0 composite
- actions/checkout v4 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- R >= 2.10 depends
- dplyr * imports
- ggplot2 * imports
- kernlab * imports
- methods * imports
- reshape2 * imports
- stats * imports
- stringi * imports
- knitr * suggests
- rmarkdown * suggests
- spelling * suggests
- testthat >= 3.0.0 suggests