Repository
Improved Random Number Generator Seeding
Basic Info
- Host: GitHub
- Owner: reedacartwright
- License: other
- Language: R
- Default Branch: main
- Size: 319 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 2
- Open Issues: 1
- Releases: 1
Created 9 months ago
· Last pushed 7 months ago
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# Ironseed
[](https://github.com/reedacartwright/ironseed/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/gh/reedacartwright/ironseed)
[](https://CRAN.R-project.org/package=ironseed)
## Overview
Ironseed is an R package that improves seeding for R's built-in random number
generators. An ironseed is a finite-entropy (or fixed-entropy) hash digest that
can be used to generate an unlimited sequence of seeds for initializing the
state of a random number generator. It is inspired by the work of M.E. O'Neill
and others
[[1](https://www.pcg-random.org/posts/developing-a-seed_seq-alternative.html),
[2](https://www.pcg-random.org/posts/simple-portable-cpp-seed-entropy.html),
[3](https://gist.github.com/imneme/540829265469e673d045)].
An ironseed is a 256-bit hash digest constructed from a variable-length sequence
of 32-bit inputs. Each ironseed consists of eight 32-bit sub-digests. The
sub-digests are values of 32-bit multilinear hashes
[[4](https://arxiv.org/pdf/1202.4961)] that accumulate entropy from the input
sequence. Each input is included in every sub-digest. The coefficients for the
multilinear hashes are generated by a [Weyl
sequence](https://en.wikipedia.org/wiki/Weyl_sequence).
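
For readers unfamiliar with Weyl sequences, the following is a minimal sketch in
base R. It is not Ironseed's implementation: the function name `weyl_sequence`
and the increment `2654435769` (a commonly used odd constant derived from the
golden ratio) are assumptions chosen for illustration, and doubles stand in for
unsigned 32-bit integers, which R does not provide.

```r
# Illustrative Weyl sequence modulo 2^32 (not Ironseed's actual constants).
# Each term is the previous term plus a fixed odd increment, wrapped to 32 bits.
weyl_sequence <- function(n, increment = 2654435769, start = 0) {
  modulus <- 2^32
  out <- numeric(n)
  state <- start
  for (i in seq_len(n)) {
    # Sums stay below 2^33, so double arithmetic is exact here.
    state <- (state + increment) %% modulus
    out[i] <- state
  }
  out
}

w <- weyl_sequence(4)
w
```

Because the increment is odd, such a sequence does not repeat until all 2^32
values have been visited, which makes it a cheap source of distinct
coefficients.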
Multilinear hashes are also used to generate an output seed sequence from an
ironseed. Each 32-bit output value is generated by uniquely hashing the
sub-digests. The coefficients for the output are generated by a second Weyl
sequence.
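
The idea behind a multilinear hash can be sketched at reduced width in base R.
This toy version is an assumption-laden illustration, not the package's code:
it uses 12-bit inputs and a 24-bit accumulator so that every product is exact in
double arithmetic, whereas Ironseed works with 32-bit inputs. The function name
`multilinear_hash` and the coefficients are hypothetical.

```r
# Toy multilinear hash: acc = coef[1] + sum(coef[i + 1] * x[i]) mod 2^(2 * in_bits),
# and the hash value is the upper half of the accumulator bits.
multilinear_hash <- function(x, coef, in_bits = 12) {
  stopifnot(length(coef) == length(x) + 1)
  acc_mod <- 2^(2 * in_bits)
  acc <- coef[1]
  for (i in seq_along(x)) {
    # Reduce after every step so intermediate values stay exactly representable.
    acc <- (acc + coef[i + 1] * x[i]) %% acc_mod
  }
  acc %/% 2^in_bits
}

# Hypothetical coefficients; a real hash would draw them from a Weyl sequence
# or another source of well-distributed values.
coef <- c(5, 4096, 8192)
h1 <- multilinear_hash(c(1, 2), coef) # 5
h2 <- multilinear_hash(c(1, 3), coef) # 7
```

Every input contributes to the accumulator through its own coefficient, which
is why each input can influence every sub-digest.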
To improve the observed randomness of each hash output, bits are mixed using
a finalizer adapted from SplitMix64
[[5](https://doi.org/10.1145/2714064.2660195)]. With the additional mixing from
the finalizer, the output seed sequence passes PractRand tests
[[6](https://pracrand.sourceforge.net/)].
## Installation
``` r
# Install the released version of the package from CRAN as usual:
install.packages("ironseed")
# Or the development version from GitHub:
# install.packages("pak")
pak::pak("reedacartwright/ironseed")
```
## Examples
### User Seeding
Ironseed can be used at the top of a script to robustly initialize R's built-in
random number generator. The resulting ironseed is returned invisibly, and a
message is generated notifying the user that initialization has occurred. This
message can be logged and later used to reproduce the run.
```{r}
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed("Experiment", 20251031, 1)
runif(10)
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
If your script is intended to be called multiple times as part of a large study,
you can also seed based on the command-line arguments.
```{r}
#!/usr/bin/env -S Rscript --vanilla
args <- commandArgs(trailingOnly = TRUE)
ironseed::ironseed("A Simulation Script 1", args)
runif(10)
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
Specific command-line arguments can also be selected, for example only those
that contain `--seed=`. For large, nested studies, it is useful for scripts to
support seeding from multiple seeds, and Ironseed makes this easy to accomplish.
```{r}
#!/usr/bin/env -S Rscript --vanilla
args <- commandArgs(trailingOnly = TRUE)
ironseed::ironseed("A Simulation Script 2", args[grepl("--seed=", args)])
runif(10)
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
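
To see which values such a filter selects, here is a small base R sketch. The
`args` vector is a hypothetical stand-in for what `commandArgs(trailingOnly =
TRUE)` might return, and `startsWith()` is used as a slightly stricter
alternative to `grepl()` so that only arguments beginning with `--seed=` match.

```r
# Hypothetical command-line arguments for illustration only.
args <- c("--reps=100", "--seed=study-A", "--seed=42", "out.csv")

# Keep only the arguments that begin with the --seed= prefix.
seed_args <- args[startsWith(args, "--seed=")]

# Optionally strip the prefix to inspect the raw seed values.
seed_values <- sub("^--seed=", "", seed_args)
seed_values
```

Either `seed_args` or `seed_values` could then be passed on as seeding data;
the two choices produce different seeds, so a script should pick one and keep
it consistent.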
### Automatic Seeding
Ironseed can also automatically initialize the random number generator using an
ironseed constructed from multiple sources of entropy. This occurs if no data
is passed to `ironseed()`.
```{r}
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed()
runif(10)
# Since RNG initialization has already occurred, the next call will simply
# return the ironseed used in the previous seeding.
fe <- ironseed::ironseed()
fe
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
The same result can be achieved with a single call. Note that the automatically
generated seed differs from the one in the previous run.
```{r}
#!/usr/bin/env -S Rscript --vanilla
fe <- ironseed::ironseed()
runif(10)
fe
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
### Reproducible Code
An ironseed can also be used directly to reproduce a previous initialization.
This is most useful when automatic seeding has been used, and the previously
generated seed has been logged.
```{r}
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed("RW7vjwjeiHF-QG7RYPvrntR-6tGPoi65sVc-N1n5SQi5RH4")
runif(10)
```
```{r, include = FALSE}
ironseed:::rm_random_seed()
```
## Analysis
### Avalanche
A good hash function has good avalanche properties. If we change one
bit of information in the input, our goal is to change 50% of the bits
in the output. To test this, we will first build a function to
construct a random pair of ironseeds that differ by a single input
bit.
```{r}
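rand_fe_pair <- function(w) {
  # Draw w random input bits and pick one position to flip.
  x <- sample(0:1, w, replace = TRUE)
  n <- sample(seq_along(x), 1)
  y <- x
  y[n] <- if (y[n] == 1) 0L else 1L
  # Pack the bits into 32-bit integers (w must be a multiple of 32).
  x <- packBits(x, "integer")
  y <- packBits(y, "integer")
  # Hash both inputs without touching the RNG state.
  x <- ironseed::ironseed(x, set_seed = FALSE)
  y <- ironseed::ironseed(y, set_seed = FALSE)
  list(x = x, y = y)
}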
```
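
As a sanity check on the input construction, the base R sketch below confirms
that flipping one element of a 0/1 vector changes exactly one bit after packing.
The 64-bit width and the flipped position are arbitrary choices for
illustration.

```r
# Two 64-bit inputs that differ in a single position.
x <- rep(0L, 64)
y <- x
y[40] <- 1L

# packBits() turns each run of 32 bits into one integer.
px <- packBits(x, "integer")
py <- packBits(y, "integer")

# Count differing bits between the packed inputs.
n_diff <- sum(intToBits(px) != intToBits(py))
n_diff
```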
Next we will generate 100,000 pairs using 32-bit inputs. We will use
R's built-in seeding algorithm so that the results are independent of
Ironseed's seeding algorithm. For each pair, we will count how many of the 256
hash bits changed in response to the single flipped input bit.
```{r}
set.seed(20251220)
z <- replicate(100000, rand_fe_pair(32), simplify = FALSE)
dat <- sapply(z, \(a) sum(intToBits(a$x) != intToBits(a$y)))
```
```{r analysis_32}
mean(dat) # expectation: 128
sd(dat) # expectation: 8
hist(dat, breaks = 86:170, main = NULL)
```
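
The expected values in the comments above follow from modeling each of the 256
output bits as an independent fair coin flip, so the number of flipped bits is
binomially distributed. The short calculation below is a sketch under that
idealized assumption; the variable names are illustrative.

```r
# Idealized avalanche model: each output bit flips with probability 0.5.
n_bits <- 256
p_flip <- 0.5

# Mean and standard deviation of a Binomial(n_bits, p_flip) count.
expected_mean <- n_bits * p_flip
expected_sd <- sqrt(n_bits * p_flip * (1 - p_flip))

c(mean = expected_mean, sd = expected_sd)
```

An observed mean or standard deviation far from these values would suggest
that some output bits are biased or correlated.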
We will repeat the same analysis for 256-bit inputs.
```{r analysis_256}
set.seed(20251221)
z <- replicate(100000, rand_fe_pair(256), simplify = FALSE)
dat <- sapply(z, \(a) sum(intToBits(a$x) != intToBits(a$y)))
mean(dat) # expectation: 128
sd(dat) # expectation: 8
hist(dat, breaks = 86:170, main = NULL)
```
As one can see, the avalanche behavior of the input hash is excellent.
Owner
- Name: Reed A. Cartwright
- Login: reedacartwright
- Kind: user
- Location: Tempe, AZ
- Company: Arizona State University
- Website: http://cartwrig.ht/
- Repositories: 26
- Profile: https://github.com/reedacartwright
GitHub Events
Total
- Create event: 3
- Release event: 1
- Issues event: 4
- Watch event: 2
- Issue comment event: 7
- Push event: 63
- Pull request event: 3
- Fork event: 1
Last Year
- Create event: 3
- Release event: 1
- Issues event: 4
- Watch event: 2
- Issue comment event: 7
- Push event: 63
- Pull request event: 3
- Fork event: 1
Packages
- Total packages: 1
-
Total downloads:
- cran 248 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: ironseed
Improved Random Number Generator Seeding
- Homepage: https://github.com/reedacartwright/ironseed
- Documentation: http://cran.r-project.org/web/packages/ironseed/ironseed.pdf
- License: MIT + file LICENSE
-
Latest release: 0.2.0
published 7 months ago
Rankings
Dependent packages count: 26.0%
Dependent repos count: 32.0%
Average: 48.0%
Downloads: 85.9%
Maintainers (1)
Last synced:
7 months ago
Dependencies
DESCRIPTION
cran
- tinytest * suggests
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/rhub.yaml
actions
- r-hub/actions/checkout v1 composite
- r-hub/actions/platform-info v1 composite
- r-hub/actions/run-check v1 composite
- r-hub/actions/setup v1 composite
- r-hub/actions/setup-deps v1 composite
- r-hub/actions/setup-r v1 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- codecov/codecov-action v5 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite