fast_pytorch_kmeans

A PyTorch implementation of the k-means clustering algorithm.

https://github.com/demoriarty/fast_pytorch_kmeans

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.5%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: DeMoriarty
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 881 KB
Statistics
  • Stars: 323
  • Watchers: 8
  • Forks: 42
  • Open Issues: 4
  • Releases: 8
Created over 5 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Fast Pytorch Kmeans

This is a PyTorch implementation of the k-means clustering algorithm.

Installation

pip install fast-pytorch-kmeans

Quick Start

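```python
from fast_pytorch_kmeans import KMeans
import torch

# Cluster 100,000 random 64-dimensional points into 8 groups on the GPU.
kmeans = KMeans(n_clusters=8, mode='euclidean', verbose=1)
x = torch.randn(100000, 64, device='cuda')
labels = kmeans.fit_predict(x)  # one cluster index per row of x
```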
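For readers unfamiliar with the algorithm, the following is a minimal sketch of plain Lloyd's k-means with Euclidean distance, written directly in PyTorch. It is not the library's implementation; the function name `kmeans_sketch`, its parameters, and the choice to initialise centroids from random data points are assumptions made for illustration. It runs on whichever device `x` lives on, so a GPU is not required.

```python
import torch


def kmeans_sketch(x, n_clusters, max_iter=100, seed=0):
    """Plain Lloyd's k-means with Euclidean distance (illustrative only)."""
    gen = torch.Generator().manual_seed(seed)
    # Assumption: initialise centroids from randomly chosen, distinct data points.
    idx = torch.randperm(x.shape[0], generator=gen)[:n_clusters].to(x.device)
    centroids = x[idx].clone()
    labels = torch.zeros(x.shape[0], dtype=torch.long, device=x.device)

    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every sample.
        labels = torch.cdist(x, centroids).argmin(dim=1)

        # Update step: each centroid moves to the mean of its members.
        sums = torch.zeros_like(centroids).index_add_(0, labels, x)
        counts = torch.bincount(labels, minlength=n_clusters).unsqueeze(1)
        # Clusters with no members keep their previous centroid.
        new_centroids = torch.where(
            counts > 0, sums / counts.clamp(min=1), centroids
        )

        # Stop early once the centroids no longer move.
        if torch.equal(new_centroids, centroids):
            break
        centroids = new_centroids

    return labels, centroids
```

Calling `kmeans_sketch(x, n_clusters=8)` on a 2-D float tensor returns one label per row and an `(8, n_features)` centroid tensor. The pairwise distance matrix is what makes this step well suited to a GPU, at the cost of `n_samples * n_clusters` memory per iteration.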

Speed Comparison

Tested on Google Colab with an Intel(R) Xeon(R) CPU @ 2.00GHz and an Nvidia Tesla T4 GPU.

sklearn: sklearn.cluster.KMeans

  • n_init = 1
  • max_iter = 100
  • tol = -1 (to force 100 iterations)

faiss: faiss.Clustering

  • nredo = 1
  • niter = 100
  • max_points_per_centroid = 10**9 (to prevent subsampling of the dataset)

Note: the time spent transferring data from CPU to GPU is also included.

fast-pytorch: fast_pytorch_kmeans.KMeans

  • max_iter = 100
  • tol = -1 (to force 100 iterations)
  • minibatch = None
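The settings above describe what was measured but not how. As a rough sketch of one way to time a single clustering call so that the CPU-to-GPU transfer is counted, consider the helper below. The name `time_clustering` and its signature are hypothetical and are not part of any of the three libraries; the original benchmark script may have been structured differently.

```python
import time

import torch


def time_clustering(fit_fn, x_cpu, device=None):
    """Return (seconds, result) for a device transfer plus one call to fit_fn."""
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"

    start = time.perf_counter()
    x = x_cpu.to(device)  # the transfer happens inside the timed region
    result = fit_fn(x)
    if x.is_cuda:
        # CUDA kernels run asynchronously; wait for them before stopping the clock.
        torch.cuda.synchronize()
    return time.perf_counter() - start, result
```

Assuming the constructor accepts the parameters listed above, the fast-pytorch configuration could be timed with `time_clustering(KMeans(n_clusters=256, max_iter=100, tol=-1, minibatch=None).fit_predict, x_cpu)`. Without the explicit synchronisation, a GPU measurement may stop before the work has actually finished.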

1. n_samples=100,000, n_features=256, time spent for 100 iterations

2. n_samples=100,000, n_clusters=256, time spent for 100 iterations

3. n_features=256, n_clusters=256, time spent for 100 iterations

4. n_features=32, n_clusters=1024, time spent for 100 iterations

5. n_features=1024, n_clusters=32, time spent for 100 iterations

Owner

  • Login: DeMoriarty
  • Kind: user

Beep boop

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Omer"
  given-names: "Sehban"
  orcid: "https://orcid.org/0000-0002-5465-5841"
title: "fast-pytorch-kmeans"
version: 0.16.1
doi: 10.5281/zenodo.7115601
date-released: 2020-09-14
url: "https://github.com/DeMoriarty/fast_pytorch_kmeans"

GitHub Events

Total
  • Issues event: 3
  • Watch event: 41
  • Issue comment event: 2
  • Push event: 2
  • Pull request review event: 1
  • Pull request event: 2
  • Fork event: 5
Last Year
  • Issues event: 3
  • Watch event: 41
  • Issue comment event: 2
  • Push event: 2
  • Pull request review event: 1
  • Pull request event: 2
  • Fork event: 5

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 83
  • Total Committers: 6
  • Avg Commits per committer: 13.833
  • Development Distribution Score (DDS): 0.193
Past Year
  • Commits: 2
  • Committers: 2
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.5
Top Committers
Name Email Commits
DeMoriarty 4****y 67
ancestor-mithril s****9@g****m 11
Fangrui Liu f****l@m****i 2
Steven Braun s****z@g****m 1
Sina Hajimiri s****i@g****m 1
Avneesh Mishra 1****h@g****m 1
Committer Domains (Top 20 + Academic)
moqi.ai: 1
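
The All Time figures in this section can be reproduced from the per-committer commit counts. The short check below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; the page does not state the formula, but this definition is consistent with the numbers shown.

```python
# Per-committer commit counts from the Top Committers table.
commits = [67, 11, 2, 1, 1, 1]

total = sum(commits)
avg = total / len(commits)
# Assumed definition: 1 - (commits by the top committer / total commits).
dds = 1 - max(commits) / total

print(total, round(avg, 3), round(dds, 3))  # 83 13.833 0.193
```

The same assumed formula gives the Past Year value: 2 commits split evenly between 2 committers yields 1 - 1/2 = 0.5.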

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 18
  • Total pull requests: 10
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 3 months
  • Total issue authors: 13
  • Total pull request authors: 8
  • Average comments per issue: 2.94
  • Average comments per pull request: 0.9
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 1
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 9 days
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hammad2008 (6)
  • Burton123456 (1)
  • Data-reindeer (1)
  • juskuz (1)
  • unrealgeometry (1)
  • Yunski (1)
  • NatureGeorge (1)
  • monaldoj (1)
  • mhamilton723 (1)
  • cvm-a (1)
  • hardyho (1)
  • gcwang916 (1)
Pull Request Authors
  • ancestor-mithril (3)
  • sinahmr (2)
  • braun-steven (2)
  • mpskex (1)
  • DeMoriarty (1)
  • MoetaYuko (1)
  • TheProjectsGuy (1)
  • HaoKang-Timmy (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 28,614 last-month
  • Total dependent packages: 5
  • Total dependent repositories: 2
  • Total versions: 13
  • Total maintainers: 1
pypi.org: fast-pytorch-kmeans

a fast kmeans clustering algorithm implemented in pytorch

  • Versions: 13
  • Dependent Packages: 5
  • Dependent Repositories: 2
  • Downloads: 28,614 Last month
Rankings
Dependent packages count: 2.4%
Downloads: 4.4%
Stargazers count: 4.9%
Average: 6.1%
Forks count: 7.4%
Dependent repos count: 11.6%
Maintainers (1)
Last synced: 7 months ago

Dependencies

setup.py pypi
  • get *
  • numpy *
  • pynvml *
  • torch *