reservoir_sampling

Reservoir Sampling

https://github.com/samuellarkin/reservoir_sampling

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.1%) to scientific vocabulary

Repository

Reservoir Sampling

Basic Info
  • Host: GitHub
  • Owner: SamuelLarkin
  • Language: Python
  • Default Branch: master
  • Size: 31.3 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme Citation

README.md

Reservoir Sampling

A Python implementation of reservoir sampling, a family of randomized algorithms for choosing a simple random sample, without replacement, of k items from a population of unknown size n in a single pass over the items. The population size n is not known to the algorithm in advance and is typically too large for all n items to fit into main memory. The population is revealed to the algorithm over time, and the algorithm cannot look back at previous items. At any point, the current state of the algorithm must permit extraction of a simple random sample, without replacement, of size k from the part of the population seen so far.
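The description above can be illustrated with a minimal sketch of the classic "Algorithm R" variant of reservoir sampling. This is not taken from this repository's source code: the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration, and the package may use a different variant or interface.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items sampled uniformly without replacement.

    Hypothetical helper (Algorithm R), not this package's API.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


# Sample 5 lines from a stream whose length is not known in advance.
stream = (f"line {n}" for n in range(10_000))
sample = reservoir_sample(stream, 5, random.Random(42))
```

The reservoir never holds more than k items, so memory use is independent of n, and after each item the reservoir is a valid sample of everything seen so far. If the stream contains fewer than k items, this sketch returns all of them.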

Install

    python3 -m pip install git+https://github.com/SamuelLarkin/reservoir_sampling.git

or, from a local clone of the repository:

    python -m pip install .

One file

Install reservoir-sampling as a single binary file using PyInstaller (see the PyInstaller Manual).

    python -m venv venv
    source venv/bin/activate ""
    python -m pip install .[install]
    pyinstaller --onefile venv/bin/reservoir-sampling
    install dist/reservoir-sampling ~/.local/bin/

Owner

  • Name: Samuel Larkin
  • Login: SamuelLarkin
  • Kind: user
  • Location: Ottawa, ON
  • Company: National Research Council Canada

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: A Python Implementation of Reservoir Sampling.
message: A simple reimplementation of reservoir sampling in Python.
type: software
authors:
  - given-names: Samuel
    family-names: Larkin
repository-code: 'https://github.com/SamuelLarkin/reservoir_sampling'
url: 'https://github.com/SamuelLarkin/reservoir_sampling'
keywords:
  - Natural Language Processing
  - NLP
  - Sampling
  - Python
license: MIT
commit: 093af38
version: '0.1'
date-released: '2024-06-21'
