https://github.com/dallaylaen/stats-logscale-js

Memory efficient, fast approximate statistical analysis tool

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary

Keywords

approximate math statistics univariate

Last synced: 6 months ago · JSON representation

Repository

Memory efficient, fast approximate statistical analysis tool

Basic Info

Host: GitHub
Owner: dallaylaen
Language: JavaScript
Default Branch: main
Homepage:
Size: 841 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Topics

approximate math statistics univariate

Created almost 4 years ago · Last pushed almost 2 years ago

Metadata Files

Readme Changelog

stats-logscale

A memory-efficient approximate statistical analysis tool using logarithmic binning.

Example: repeated setTimeout(0) execution times

Description

- data is split into bins (aka buckets), linear close to zero and logarithmic for large numbers (hence the name), thus maintaining desired absolute and relative precision;
- can calculate mean, variance, median, moments, percentiles, cumulative distribution function (i.e. probability that a value is less than x), and expected values of arbitrary functions over the sample;
- can generate histograms for plotting the data;
- all calculated values are cached. Cache is reset upon adding new data;
- (almost) every function has a "neat" counterpart which rounds the result to the shortest possible number within the precision bounds. E.g. `foo.mean() // 1.0100047`, but `foo.neat.mean() // 1.01`;
- is (de)serializable;
- can split out partial data or combine multiple samples into one.
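To make the binning idea concrete, here is a minimal sketch of how a value could be mapped to a bin. It is not the library's code: the names `binIndex` and `binBounds` are hypothetical, and the layout (one zero-centered bin of half-width `precision`, then bins whose edges grow by a factor of `base`) is an assumption chosen for simplicity.

```javascript
// Hypothetical sketch of log-scale binning; not taken from stats-logscale.
// Assumption: values with |x| <= precision share a single bin around zero,
// and every further bin is `base` times wider than the previous one.
function binIndex(x, { base = 1.001, precision = 1e-9 } = {}) {
    const abs = Math.abs(x);
    if (abs <= precision) return 0;
    const idx = Math.floor(Math.log(abs / precision) / Math.log(base)) + 1;
    return x < 0 ? -idx : idx;
}

// Inverse mapping: the [lower, upper] bounds of a bin with the given index.
function binBounds(idx, { base = 1.001, precision = 1e-9 } = {}) {
    if (idx === 0) return [-precision, precision];
    const n = Math.abs(idx);
    const lo = precision * Math.pow(base, n - 1);
    const hi = precision * Math.pow(base, n);
    return idx > 0 ? [lo, hi] : [-hi, -lo];
}

// With base 2 and precision 1 the positive bins are [1,2], [2,4], [4,8], ...
const opts = { base: 2, precision: 1 };
console.log(binIndex(5, opts), binBounds(binIndex(5, opts), opts));
```

Because only a counter per bin has to be stored, memory use depends on the number of occupied bins rather than on the number of data points.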

Usage

Creating the sample container:

```javascript
const { Univariate } = require( 'stats-logscale' );
const stat = new Univariate();
```

Specifying absolute and relative precision. The defaults are 10^-9 and 1.001, respectively. Less precision means less memory usage and faster data querying (but not faster insertion):

```javascript
const stat = new Univariate({base: 1.01, precision: 0.001});
```
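As a rough illustration of the memory trade-off, the number of logarithmic bins needed to cover a range can be estimated from `base` and `precision`. The helper `estimateLogBins` below is hypothetical, and it assumes bin edges grow geometrically by `base` starting from `precision`; the library's actual layout may differ.

```javascript
// Hypothetical helper, not part of stats-logscale.
// Assumption: bin edges grow geometrically by `base`, starting at `precision`,
// so covering [precision, max] takes about log(max / precision) / log(base) bins.
function estimateLogBins({ base, precision, max }) {
    return Math.ceil(Math.log(max / precision) / Math.log(base));
}

const coarse = estimateLogBins({ base: 1.01,  precision: 0.001, max: 1000 });
const fine   = estimateLogBins({ base: 1.001, precision: 0.001, max: 1000 });
console.log({ coarse, fine }); // `fine` is roughly ten times `coarse`
```

Under that assumption, tightening the relative precision by a factor of ten multiplies the worst-case bin count by about ten.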

Use the `flat` switch to avoid logarithmic binning altogether:

```javascript
// this assumes the data is just integer numbers
const stat = new Univariate({precision: 1, flat: true});
```

Adding data points, either one by one or as (value, frequency) pairs. Strings are OK (e.g. after parsing user input), but non-numeric values will cause an exception:

```javascript
stat.add(3.14);
stat.add("Foo"); // Nope! Throws an exception
stat.add("3.14 3.15 3.16".split(" "));
stat.addWeighted([[0.5, 1], [1.5, 3], [2.5, 5]]);
```

Querying data:

```javascript
stat.count();          // number of data points
stat.mean();           // average
stat.stdev();          // standard deviation
stat.median();         // half of data is lower than this value
stat.percentile(90);   // 90% of data below this point
stat.quantile(0.9);    // ditto
stat.cdf(0.5);         // cumulative distribution function, which means
                       // the probability that a data point is less than 0.5
stat.moment(power);    // central moment of an integer power
stat.momentAbs(power); // < |x-<x>| ** power >, power may be fractional
stat.E( x => x*x );    // expected value of an arbitrary function
```
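For intuition, the same kinds of queries can be answered from nothing but (value, frequency) pairs. The sketch below is independent of the library: `weightedStats` is a hypothetical name, the variance is the population variance (dividing by the total weight), and the quantile is the first value whose cumulative weight reaches the target. Both conventions are assumptions and may not match what stats-logscale does.

```javascript
// Hypothetical sketch, not the library's implementation.
// `pairs` is a non-empty array of [value, weight] entries.
function weightedStats(pairs) {
    const sorted = [...pairs].sort((a, b) => a[0] - b[0]);
    let count = 0, sum = 0;
    for (const [x, w] of sorted) { count += w; sum += x * w; }
    const mean = sum / count;

    let sq = 0;
    for (const [x, w] of sorted) sq += w * (x - mean) ** 2;

    // Smallest value whose cumulative weight reaches p * count.
    const quantile = p => {
        let acc = 0;
        for (const [x, w] of sorted) {
            acc += w;
            if (acc >= p * count) return x;
        }
        return sorted[sorted.length - 1][0];
    };

    return { count, mean, variance: sq / count, quantile };
}

const summary = weightedStats([[0.5, 1], [1.5, 3], [2.5, 5]]);
console.log(summary.count, summary.mean, summary.quantile(0.5));
```

Each query is a single pass over the bins, so its cost scales with the number of bins rather than with the number of data points added.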

Each querying primitive has a "neat" counterpart that rounds its output to the shortest possible decimal number in the respective bin:

```javascript
stat.neat.mean();
stat.neat.stdev();
stat.neat.median();
```
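One simple way to find a short decimal inside an interval is to round the midpoint to an increasing number of significant digits until the result falls within the bounds. The function `shortestWithin` below is a hypothetical illustration of that heuristic, not the library's own rounding code (the histogram example in this README calls `stat.shorten(lo, hi)`, which appears to serve a similar purpose).

```javascript
// Hypothetical helper, not the library's implementation.
// Rounds the midpoint of [lo, hi] to 1, 2, 3, ... significant digits
// and returns the first candidate that still lies inside the interval.
function shortestWithin(lo, hi) {
    const mid = (lo + hi) / 2;
    for (let digits = 1; digits <= 17; digits++) {
        const candidate = Number(mid.toPrecision(digits));
        if (candidate >= lo && candidate <= hi) return candidate;
    }
    return mid; // nothing shorter found; fall back to the midpoint
}

console.log(shortestWithin(1.0099, 1.0101)); // 1.01
console.log(shortestWithin(0.9, 1.2));       // 1
```

This is a heuristic: it works digit by digit around the midpoint and makes no claim to always find the globally shortest representation.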

Extract partial samples:

```javascript
stat.clone( { min: 0.5, max: 0.7 } );
stat.clone( { ltrim: 1, rtrim: 1 } ); // cut off outer 1% of data
stat.clone( { ltrim: 1, rtrim: 1, winsorize: true } ); // ditto but truncate outliers instead of discarding
```
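The difference between trimming and winsorizing is easiest to see on a plain sorted array. The sketch below is not how the library works internally (it operates on bins, not raw values); `trimOrWinsorize` is a hypothetical helper that borrows the option names from the example above and treats `ltrim` and `rtrim` as percentages.

```javascript
// Hypothetical helper operating on raw values, for illustration only.
// ltrim / rtrim are percentages of the sample to cut from each side.
function trimOrWinsorize(values, { ltrim = 0, rtrim = 0, winsorize = false } = {}) {
    const sorted = [...values].sort((a, b) => a - b);
    const n = sorted.length;
    const lo = Math.floor(n * ltrim / 100);     // how many items to cut on the left
    const hi = n - Math.floor(n * rtrim / 100); // end index of the kept part
    const kept = sorted.slice(lo, hi);
    if (!winsorize || kept.length === 0) return kept;

    // Winsorizing: replace each cut-off value with the nearest kept value.
    const first = kept[0];
    const last = kept[kept.length - 1];
    return [
        ...Array(lo).fill(first),
        ...kept,
        ...Array(n - hi).fill(last),
    ];
}

const raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000];
const trimmed = trimOrWinsorize(raw, { ltrim: 10, rtrim: 10 });
const winsorized = trimOrWinsorize(raw, { ltrim: 10, rtrim: 10, winsorize: true });
console.log(trimmed, winsorized);
```

Trimming shrinks the sample, while winsorizing keeps its size and only pulls the outliers in to the nearest surviving value.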

Serialize, deserialize, and combine data from multiple sources:

```javascript
const str = JSON.stringify(stat);
// send over the network here
const copy = new Univariate (JSON.parse(str));

main.addWeighted( partialStat.getBins() );
main.addWeighted( JSON.parse(str).bins ); // ditto
```
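Combining samples works because bins are just (value, frequency) pairs whose frequencies can be added. The following sketch shows that idea with a plain `Map`; `mergeBins` is a hypothetical name, and the assumption that all sources share the same bin boundaries is essential, since otherwise values that belong together would end up under different keys.

```javascript
// Hypothetical sketch of merging binned samples; not the library's code.
// Each source is an array of [value, weight] pairs with identical binning.
function mergeBins(...sources) {
    const total = new Map();
    for (const bins of sources) {
        for (const [value, weight] of bins) {
            total.set(value, (total.get(value) ?? 0) + weight);
        }
    }
    // Return the pairs sorted by value so the result is deterministic.
    return [...total.entries()].sort((a, b) => a[0] - b[0]);
}

const merged = mergeBins(
    [[1, 2], [2, 1]],
    [[2, 3], [3, 1]],
);
console.log(JSON.stringify(merged));
```

Because addition is commutative and associative, partial samples can be merged in any order.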

Create histograms and plot data:

```javascript
stat.histogram({scale: 768, count: 1024});
// this produces 1024 bars of the form
// [ bar_height, lower_boundary, upper_boundary ]
// The intervals are consecutive.
// The bar heights are limited to 768.

stat.histogram({scale: 70, count: 20})
    .map( x => stat.shorten(x[1], x[2]) + '\t' + '+'.repeat(x[0]) )
    .join('\n')
// "Draw" a vertical histogram for text console
// You'll use PNG in production instead, right? Right?
```
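To show what such `[ bar_height, lower_boundary, upper_boundary ]` triples look like without installing anything, here is a small stand-alone sketch that builds them from raw values using equal-width bars. `simpleHistogram` is a hypothetical function; the equal-width bars and the scaling of the tallest bar to `scale` are assumptions, not a description of the library's algorithm.

```javascript
// Hypothetical stand-alone histogram builder, for illustration only.
// Returns `count` consecutive bars as [height, lower, upper],
// with the tallest bar scaled to `scale`.
function simpleHistogram(values, { count = 5, scale = 20 } = {}) {
    const min = Math.min(...values);
    const max = Math.max(...values);
    const width = (max - min) / count || 1; // avoid zero width for constant data
    const hits = Array(count).fill(0);
    for (const v of values) {
        // The maximum value would land one past the end, so clamp it.
        const i = Math.min(count - 1, Math.floor((v - min) / width));
        hits[i]++;
    }
    const top = Math.max(...hits);
    return hits.map((h, i) => [
        Math.round(h * scale / top),
        min + i * width,
        min + (i + 1) * width,
    ]);
}

const bars = simpleHistogram([0, 1, 1, 2, 2, 2, 3, 3, 4], { count: 5, scale: 10 });
console.log(
    bars.map(([h, lo, hi]) => `${lo.toFixed(1)}..${hi.toFixed(1)}\t${'+'.repeat(h)}`)
        .join('\n')
);
```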

See the playground.

Performance

Data inserts are optimized for speed, and querying is cached where possible. The script example/speed.js can be used to benchmark the module on your system.

Memory usage for a dense sample spanning 6 orders of magnitude was around 1.6MB in Chromium, ~230KB for the data itself + ~1.2MB for the cache.

Bugs

Please report bugs and request features via the GitHub bug tracker.

Copyright and license

This is free software, available under the MIT license.

Owner

Name: Konstantin S. Uvarin
Login: dallaylaen
Kind: user

Twitter: KhedinTheMonk
Repositories: 14
Profile: https://github.com/dallaylaen

I'm a humble software developer. I love to sing, cycle, and make jokes & puns about my job.

Committers

Last synced: 8 months ago

All Time

Total Commits: 190
Total Committers: 1
Avg Commits per committer: 190.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Konstantin S. Uvarin	k**n@g**m	190

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 0
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: less than a minute
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Pull Request Authors

dallaylaen (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- npm 597 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 10
Total maintainers: 1

npmjs.org: stats-logscale

Approximate statistical analysis using logarithmic bins

Homepage: https://dallaylaen.github.io/stats-logscale-js/
License: MIT
Latest release: 1.0.9
published almost 2 years ago

Versions: 10
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 597 Last month

Rankings

Downloads: 10.0%

Dependent repos count: 10.3%

Forks count: 15.4%

Stargazers count: 20.9%

Average: 21.7%

Dependent packages count: 51.9%

Maintainers (1)

dallaylaen

Last synced: 7 months ago

Dependencies

package.json npm

chai ^4.3.4 development
eslint ^7.32.0 development
eslint-config-standard ^16.0.3 development
eslint-plugin-align-assignments ^1.1.2 development
eslint-plugin-import ^2.25.4 development
eslint-plugin-node ^11.1.0 development
eslint-plugin-promise ^5.2.0 development
mocha ^9.1.3 development
nyc ^15.1.0 development
webpack ^5.65.0 development
