Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file found
- ✓ codemeta.json file found
- ✓ .zenodo.json file found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (18.4%) to scientific vocabulary
Keywords
Repository
TorchDR - PyTorch Dimensionality Reduction
Basic Info
- Host: GitHub
- Owner: TorchDR
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://torchdr.github.io
- Size: 6.2 MB
Statistics
- Stars: 151
- Watchers: 4
- Forks: 11
- Open Issues: 16
- Releases: 4
Topics
Metadata Files
README.md
Torch Dimensionality Reduction
TorchDR is an open-source library for dimensionality reduction (DR) built on PyTorch. DR constructs low-dimensional representations (or embeddings) that best preserve the intrinsic geometry of an input dataset encoded via a pairwise affinity matrix. TorchDR provides GPU-accelerated implementations of popular DR algorithms in a unified framework, ensuring high performance by leveraging the latest advances of the PyTorch ecosystem.
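As a concrete illustration of such a pairwise affinity matrix, here is a minimal NumPy sketch of a plain Gaussian kernel (illustrative only, not TorchDR's API; TorchDR's affinity classes are richer and GPU-ready):

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Symmetric pairwise affinity matrix from a Gaussian kernel.

    A[i, j] encodes how similar points i and j are; DR methods seek
    embeddings whose geometry preserves these affinities.
    """
    # Squared Euclidean distances between all pairs of rows.
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative values caused by floating-point error.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3)).astype("float32")
A = gaussian_affinity(X)
```

Popular DR methods then normalize such a kernel in different ways (per row, doubly stochastic, entropically) before optimizing the embedding.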
Key Features
🚀 Blazing Fast: engineered for speed with GPU acceleration, torch.compile support, and optimized algorithms leveraging sparsity and negative sampling.
🧩 Modular by Design: every component is designed to be easily customized, extended, or replaced to fit your specific needs.
🪶 Memory-Efficient: natively handles sparsity and memory-efficient symbolic operations to process massive datasets without memory overflows.
🤝 Seamless Integration: Fully compatible with the scikit-learn and PyTorch ecosystems. Use familiar APIs and integrate effortlessly into your existing workflows.
📦 Minimal Dependencies: requires only PyTorch, NumPy, and scikit‑learn; optionally add Faiss for fast k‑NN or KeOps for symbolic computation.
Getting Started
TorchDR offers a user-friendly API similar to scikit-learn where dimensionality reduction modules can be called with the fit_transform method. It seamlessly accepts both NumPy arrays and PyTorch tensors as input, ensuring that the output matches the type and backend of the input.
```python
from sklearn.datasets import fetch_openml

from torchdr import UMAP

x = fetch_openml("mnist_784").data.astype("float32")
z = UMAP(n_neighbors=30).fit_transform(x)
```
🚀 GPU Acceleration
TorchDR is fully GPU compatible, enabling significant speed-ups when a GPU is available. To run computations on the GPU, simply set device="cuda" as shown in the example below:
```python
z_gpu = UMAP(n_neighbors=30, device="cuda").fit_transform(x)
```
Device Management: By default (device="auto"), computations use the input data's device. For optimal memory management, you can keep input data on CPU while specifying device="cuda" to perform computations on GPU - TorchDR will handle transfers automatically.
🔥 PyTorch 2.0+ torch.compile Support
TorchDR supports torch.compile for an additional performance boost on modern PyTorch versions. Just add the compile=True flag as follows:
```python
z_gpu_compile = UMAP(n_neighbors=30, device="cuda", compile=True).fit_transform(x)
```
⚙️ Backends
The backend keyword specifies which tool to use for handling kNN computations and memory-efficient symbolic computations.
- Set `backend="faiss"` to rely on Faiss for fast kNN computations (recommended).
- Set `backend="keops"` to leverage the KeOps library for exact symbolic tensor computations on the GPU without memory limitations. KeOps can also compute kNN graphs.
- Set `backend=None` to use raw PyTorch for all computations.
Methods
Neighbor Embedding (optimal for data visualization)
TorchDR provides a suite of neighbor embedding methods.
Linear-time (Negative Sampling). State-of-the-art speed on large datasets: UMAP, LargeVis, InfoTSNE, PACMAP.
Quadratic-time (Exact Repulsion). Compute the full pairwise repulsion: SNE, TSNE, TSNEkhorn, COSNE.
Remark. For quadratic-time algorithms, TorchDR provides exact implementations that scale linearly in memory using `backend="keops"`. For TSNE specifically, one can also explore fast approximations, such as FIt-SNE implemented in tsne-cuda, which bypass full pairwise repulsion.
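To see why negative sampling makes the linear-time methods fast, the following illustrative NumPy sketch (not TorchDR internals) estimates an average repulsion term from a few sampled pairs per point instead of all n² pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 2))  # a toy 2D embedding

# Full O(n^2) repulsion: average a kernel over all pairs of points.
d2 = ((Z[:, None] - Z[None, :]) ** 2).sum(-1)
full = (1.0 / (1.0 + d2)).sum() / (len(Z) ** 2)

# Negative sampling: estimate the same average from 20 random
# "negative" partners per point instead of all n partners.
neg = rng.integers(0, len(Z), size=(len(Z), 20))
d2_neg = ((Z[:, None, :] - Z[neg]) ** 2).sum(-1)
est = (1.0 / (1.0 + d2_neg)).mean()
```

The sampled estimate tracks the full sum closely while touching only a constant number of pairs per point, which is the core trick behind the UMAP/LargeVis/InfoTSNE/PACMAP family.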
Spectral Embedding
TorchDR provides various spectral embedding methods: PCA, IncrementalPCA, ExactIncrementalPCA, KernelPCA, PHATE.
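As a refresher on what such spectral methods compute, here is an illustrative NumPy sketch of plain PCA via SVD (TorchDR's modules expose the same operation through the fit_transform API, with GPU support):

```python
import numpy as np

def pca_embed(X, n_components=2):
    """Project X onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)            # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T    # coordinates in the top subspace

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Z = pca_embed(X, n_components=2)
```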
Benchmarks
Relying on TorchDR enables an orders-of-magnitude improvement in runtime performance compared to CPU-based implementations. See the code.
Examples
See the examples folder for all examples.
MNIST. (Code) A comparison of various neighbor embedding methods on the MNIST digits dataset.
CIFAR100. (Code) Visualizing the CIFAR100 dataset using DINO features and TSNE.
Advanced Features
Affinities
TorchDR features a wide range of affinities which can then be used as a building block for DR algorithms. It includes:
- Affinities based on k-NN normalizations: SelfTuningAffinity, MAGICAffinity, UMAPAffinity, PHATEAffinity, PACMAPAffinity.
- Doubly stochastic affinities: SinkhornAffinity, DoublyStochasticQuadraticAffinity.
- Adaptive affinities with entropy control: EntropicAffinity, SymmetricEntropicAffinity.
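For intuition on the doubly stochastic family (e.g. SinkhornAffinity), here is an illustrative NumPy sketch of Sinkhorn normalization, which alternately rescales rows and columns of a Gaussian kernel until both sum to one (the actual TorchDR classes operate on torch tensors with GPU support):

```python
import numpy as np

def sinkhorn_normalize(K, n_iter=200):
    """Alternately rescale rows and columns until K is doubly stochastic."""
    K = K.copy()
    for _ in range(n_iter):
        K /= K.sum(axis=1, keepdims=True)  # rows sum to 1
        K /= K.sum(axis=0, keepdims=True)  # columns sum to 1
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
P = sinkhorn_normalize(np.exp(-d2))
```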
Evaluation Metric
TorchDR provides efficient GPU-compatible evaluation metrics: silhouette_score.
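As background on what silhouette_score measures, here is an illustrative NumPy sketch: for each sample, compare its mean intra-cluster distance a with its mean distance b to the nearest other cluster (TorchDR's silhouette_score is the GPU-compatible equivalent of this computation):

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette coefficient: (b - a) / max(a, b) per sample."""
    D = np.sqrt(((X[:, None] - X[None, :]) ** 2).sum(-1))
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False                      # exclude the sample itself
        if not same.any():
            scores.append(0.0)               # singleton cluster convention
            continue
        a = D[i, same].mean()                # mean intra-cluster distance
        b = min(D[i, labels == c].mean()     # nearest other cluster
                for c in set(labels) - {labels[i]})
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two well-separated blobs should score close to 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
score = silhouette(X, labels)
```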
Installation
Install the core torchdr library from PyPI:
```bash
pip install torchdr
```
:warning: torchdr does not install faiss-gpu or pykeops by default. You need to install them separately to use the corresponding backends.
Faiss (recommended): for the fastest k-NN computations, install Faiss. Please follow their official installation guide. A common method is using conda:

```bash
conda install -c pytorch -c nvidia faiss-gpu
```

KeOps: for memory-efficient symbolic computations, install PyKeOps:

```bash
pip install pykeops
```
Installation from Source
If you want to use the latest, unreleased version of torchdr, you can install it directly from GitHub:
```bash
pip install git+https://github.com/torchdr/torchdr
```
Finding Help
If you have any questions or suggestions, feel free to open an issue on the issue tracker or contact Hugues Van Assel directly.
Owner
- Name: TorchDR
- Login: TorchDR
- Kind: organization
- Repositories: 1
- Profile: https://github.com/TorchDR
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: TorchDR
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Hugues
family-names: Van Assel
email: vanasselhugues@gmail.com
affiliation: ENS Lyon
- given-names: Nicolas
family-names: Courty
email: ncourty@irisa.fr
affiliation: Université Bretagne Sud
- given-names: Rémi
family-names: Flamary
email: remi.flamary@polytechnique.edu
affiliation: École Polytechnique
- given-names: Aurélien
family-names: Garivier
email: aurelien.garivier@ens-lyon.fr
affiliation: ENS Lyon
- given-names: Mathurin
family-names: Massias
email: mathurin.massias@ens-lyon.fr
affiliation: ENS Lyon
- given-names: Titouan
family-names: Vayer
email: titouan.vayer@ens-lyon.fr
affiliation: ENS Lyon
- given-names: Cédric
family-names: Vincent-Cuaz
email: cedric.vincent-cuaz@inria.fr
affiliation: EPFL
repository-code: 'https://github.com/TorchDR/TorchDR'
url: 'https://torchdr.github.io/'
abstract: Pytorch Dimensionality Reduction toolbox.
keywords:
- machine learning
- dimensionality reduction
- manifold learning
- clustering
- GPU acceleration
license: BSD-3-Clause
GitHub Events
Total
- Create event: 3
- Release event: 3
- Issues event: 27
- Watch event: 68
- Delete event: 2
- Issue comment event: 27
- Push event: 48
- Pull request event: 106
- Fork event: 4
Last Year
- Create event: 3
- Release event: 3
- Issues event: 27
- Watch event: 68
- Delete event: 2
- Issue comment event: 27
- Push event: 48
- Pull request event: 106
- Fork event: 4
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 45
- Total pull requests: 280
- Average time to close issues: about 1 month
- Average time to close pull requests: 5 days
- Total issue authors: 9
- Total pull request authors: 10
- Average comments per issue: 0.6
- Average comments per pull request: 0.25
- Merged pull requests: 227
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 21
- Pull requests: 131
- Average time to close issues: about 1 month
- Average time to close pull requests: 4 days
- Issue authors: 6
- Pull request authors: 7
- Average comments per issue: 0.86
- Average comments per pull request: 0.24
- Merged pull requests: 104
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- huguesva (33)
- mathurinm (4)
- e-pet (2)
- jacobgil (1)
- sirluk (1)
- simon-burke (1)
- qgallouedec (1)
- KnSun99 (1)
- rflamary (1)
Pull Request Authors
- huguesva (209)
- mathurinm (40)
- rflamary (9)
- cedricvincentcuaz (7)
- ncourty (4)
- guillaumehu (4)
- sirluk (2)
- Danqi7 (2)
- tvayer (2)
- dependabot[bot] (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 440 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: torchdr
Torch Dimensionality Reduction Library
- Documentation: https://torchdr.readthedocs.io/
- License: BSD (3-Clause)
- Latest release: 0.3 (published 7 months ago)