knrscore

KNRScore is a Python package for computing K-Nearest-Rank Similarity, a metric that quantifies local structural similarity between two maps or embeddings.

https://github.com/erdogant/knrscore

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: springer.com, nature.com, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Keywords

dimensionality-reduction embeddings high-dimensional pca python quantification tsne umap
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 8
Topics
dimensionality-reduction embeddings high-dimensional pca python quantification tsne umap
Created about 6 years ago · Last pushed 10 months ago
Metadata Files
Readme Funding License Citation

README.md


⭐️ Star this repo if you like it ⭐️

KNRscore - K-Nearest-Rank Similarity

Medium Blog

Also check out the Medium blog for a structured overview of KNRscore and its usage.

Documentation pages

The documentation pages describe in detail how KNRscore works, with examples.

Method

To compare the embedding of samples in two different maps, we propose a scale-dependent similarity measure. For a pair of maps X and Y, we compare, for each sample, the set of its kx nearest neighbours in X with the set of its ky nearest neighbours in Y. We first define r^x_ij as the rank of the distance of sample j among all samples with respect to sample i in map X. The nearest neighbour of sample i has rank 1, the second nearest neighbour rank 2, and so on. Analogously, r^y_ij is the rank of sample j with respect to sample i in map Y. Now we define a score on the interval [0, 1], as (eq. 1)

where the variable n is the total number of samples, and the indicator function is given by (eq. 2)

The score Sx,y(kx, ky) equals 1 if, for each sample, all kx nearest neighbours in map X are also among the ky nearest neighbours in map Y, or vice versa. Note that the size of the local neighbourhood can be set to the minimum number of samples in a class. Alternatively, kx and ky can be set to the average class size.
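The two equations referenced above (eq. 1 and eq. 2) are not reproduced in this text. One form that is consistent with the surrounding definitions is sketched below. It is a reconstruction from the description, not a quotation, and the normalisation by min(kx, ky) in particular is an assumption inferred from the statement that the score reaches 1 when all kx neighbours in X fall within the ky neighbours in Y, or vice versa.

```latex
% Assumed form of eq. 1: average, over all n samples, of the number of
% shared neighbours, normalised by the smaller neighbourhood size.
S_{x,y}(k_x, k_y) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{\min(k_x, k_y)}
    \sum_{j \neq i} \mathbb{1}\!\left(r^{x}_{ij} \le k_x,\; r^{y}_{ij} \le k_y\right)

% Assumed form of eq. 2: the indicator is 1 only when sample j is a
% neighbour of sample i in both maps.
\mathbb{1}\!\left(r^{x}_{ij} \le k_x,\; r^{y}_{ij} \le k_y\right) =
\begin{cases}
  1 & \text{if } r^{x}_{ij} \le k_x \text{ and } r^{y}_{ij} \le k_y, \\
  0 & \text{otherwise.}
\end{cases}
```

Under this form, each sample contributes at most min(kx, ky) shared neighbours, so the score stays within [0, 1].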

Schematic overview

Schematic overview of how local and global differences between two sample projections are compared systematically. For illustration, we compare two input maps (x and y), each containing n samples (step 1). In step 2, the samples are ranked by Euclidean distance. The ranks in map x are then compared to the ranks in map y for the kx and ky nearest neighbours (step 3). The overlap between ranks (step 4) is summarized in the score Sx,y(kx, ky).
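The four steps above can be sketched in plain Python. This is a minimal illustration and not the KNRscore implementation: the function names `rank_matrix` and `knr_score` are hypothetical, the toy maps are made up, and the normalisation by min(kx, ky) is an assumption inferred from the description of when the score equals 1.

```python
import math


def rank_matrix(points):
    """Return ranks[i][j]: the rank of sample j by Euclidean distance from sample i.

    The nearest neighbour of i gets rank 1; the diagonal (i itself) stays 0.
    """
    n = len(points)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        # Step 2: order all other samples by their distance to sample i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for rank, j in enumerate(others, start=1):
            ranks[i][j] = rank
    return ranks


def knr_score(map_x, map_y, kx, ky):
    """Share of neighbours common to both maps (assumed min(kx, ky) normalisation)."""
    n = len(map_x)
    rx, ry = rank_matrix(map_x), rank_matrix(map_y)
    shared = 0
    for i in range(n):
        # Steps 3-4: count samples within the kx neighbours in x AND the ky in y.
        shared += sum(1 for j in range(n)
                      if j != i and rx[i][j] <= kx and ry[i][j] <= ky)
    return shared / (n * min(kx, ky))


# Step 1: two toy maps with the same five samples (illustrative values only).
map_x = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0), (15.0, 0.0)]
map_y = [(2 * x, 2 * y) for x, y in map_x]  # uniform scaling keeps every rank

print(knr_score(map_x, map_y, kx=2, ky=2))  # 1.0: identical neighbourhoods
```

Because only ranks are compared, the score ignores the absolute scale of each map, which is what makes embeddings of different dimensionality comparable.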


Install KNRscore from PyPI

```bash
pip install KNRscore
```

Import KNRscore package

```python
import KNRscore as knrs
```

Functions in KNRscore

```python
import KNRscore as knrs

scores = knrs.compare(map1, map2)
fig = knrs.plot(scores)
fig = knrs.scatter(Xcoord, Ycoord)
```

Example

```python
# Import libraries
import KNRscore as knrs
from sklearn import decomposition, manifold

# Load data
X, y = knrs.import_example()

# Compute embeddings
embed_pca = decomposition.TruncatedSVD(n_components=50).fit_transform(X)
embed_tsne = manifold.TSNE(n_components=2, init='pca').fit_transform(X)

# Compare PCA vs. tSNE
scores = knrs.compare(embed_pca, embed_tsne, n_steps=25)

# plot PCA vs. tSNE
fig, ax = knrs.plot(scores, xlabel='PCA', ylabel='tSNE')
fig, ax = knrs.scatter(embed_tsne[:, 0], embed_tsne[:, 1], labels=y, cmap='Set1', title='tSNE Scatter Plot')
fig, ax = knrs.scatter(embed_pca[:, 0], embed_pca[:, 1], labels=y, cmap='Set1', title='PCA Scatter Plot')

```
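The comparison in the example is evaluated over a range of neighbourhood sizes. How `knrs.compare` arranges this internally, and what `scores` contains, is not described here, so the following standalone sketch does not call KNRscore. It assumes a score normalised by min(kx, ky) and shows one way to sweep a grid of (kx, ky) pairs using neighbour sets. The names `neighbour_order` and `score_grid`, the grid of k values, and the random data are all hypothetical.

```python
import math
import random


def neighbour_order(points):
    """For each sample, list the indices of all other samples, nearest first."""
    n = len(points)
    return [sorted((j for j in range(n) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))
            for i in range(n)]


def score_grid(map_x, map_y, k_values):
    """Return {(kx, ky): score} for every pair of neighbourhood sizes."""
    order_x, order_y = neighbour_order(map_x), neighbour_order(map_y)
    n = len(map_x)
    grid = {}
    for kx in k_values:
        for ky in k_values:
            # Shared neighbours: overlap of the kx nearest in x and the ky nearest in y.
            shared = sum(len(set(order_x[i][:kx]) & set(order_y[i][:ky]))
                         for i in range(n))
            grid[(kx, ky)] = shared / (n * min(kx, ky))
    return grid


random.seed(0)
# Hypothetical data: a random 2D map and a slightly perturbed copy of it.
map_a = [(random.random(), random.random()) for _ in range(40)]
map_b = [(x + random.gauss(0, 0.05), y + random.gauss(0, 0.05)) for x, y in map_a]

k_values = [1, 5, 10, 20, 39]
grid = score_grid(map_a, map_b, k_values)
for kx in k_values:
    # One row per kx, one column per ky.
    print(kx, [round(grid[(kx, ky)], 2) for ky in k_values])
```

Small values of k probe local structure, while values close to the number of samples probe global structure. At k = n - 1 every sample is a neighbour of every other, so that corner of the grid is always 1.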

Example figures



References

  • Taskesen, E. et al. Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics. Sci. Rep. 6, 24949
  • https://static-content.springer.com/esm/art%3A10.1038%2Fsrep24949/MediaObjects/415982016BFsrep24949MOESM12ESM.pdf
  • https://www.nature.com/articles/srep24949

Owner

  • Name: Erdogan
  • Login: erdogant
  • Kind: user
  • Location: Den Haag

Machine Learning | Statistics | Bayesian | D3js | Visualizations

Citation (CITATION.cff)

# YAML 1.2
---
authors: 
  -
    family-names: Taskesen
    given-names: Erdogan
    orcid: "https://orcid.org/0000-0002-3430-9618"
cff-version: "1.1.0"
date-released: 2020-01-19
keywords: 
  - "python"
  - "embeddings"
  - "pca"
  - "tsne"
  - "dimensionality-reduction"
  - "umap"
  - "high-dimensional"
license: "MIT"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://erdogant.github.io/KNRscore"
title: "KNRScore is a python package to compute the K-Nearest-Rank Similarity for the quantification of local similarities across two maps or embeddings."
version: "1.0.0"
...

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Push event: 22
  • Create event: 1
Last Year
  • Issues event: 1
  • Watch event: 1
  • Push event: 22
  • Create event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 122
  • Total Committers: 1
  • Avg Commits per committer: 122.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 26
  • Committers: 1
  • Avg Commits per committer: 26.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
erdogant e****t@g****m 122

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: over 1 year
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • BradKML (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: knrscore


  • Homepage: https://erdogant.github.io/KNRscore
  • Documentation: https://knrscore.readthedocs.io/
  • License: MIT License Copyright (c) 2020 Erdogan Taskesen KNRscore - Python package Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  • Latest release: 2.0.0
    published 10 months ago
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14 Last month
Rankings
Dependent packages count: 9.3%
Average: 30.7%
Dependent repos count: 52.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/source/requirements.txt pypi
  • sphinx_rtd_theme *
requirements.txt pypi
  • imagesc *
  • matplotlib *
  • numpy *
  • scatterd *
  • scipy *
  • tqdm *
pyproject.toml pypi
  • imagesc *
  • matplotlib *
  • numpy *
  • requests *
  • scatterd *
  • scipy *
  • tqdm *
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v2 composite
  • github/codeql-action/analyze v1 composite
  • github/codeql-action/autobuild v1 composite
  • github/codeql-action/init v1 composite
.github/workflows/pytest.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite