https://github.com/dallaylaen/stats-logscale-js
Memory efficient, fast approximate statistical analysis tool
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
README.md
stats-logscale
A memory-efficient approximate statistical analysis tool using logarithmic binning.
Example: repeated setTimeout(0) execution times
Description
- data is split into bins (aka buckets), linear close to zero and logarithmic for large numbers (hence the name), thus maintaining the desired absolute and relative precision;
- can calculate mean, variance, median, moments, percentiles, cumulative distribution function (i.e. the probability that a value is less than x), and expected values of arbitrary functions over the sample;
- can generate histograms for plotting the data;
- all calculated values are cached. The cache is reset upon adding new data;
- (almost) every function has a "neat" counterpart which rounds the result to the shortest possible number within the precision bounds. E.g. `foo.mean() // 1.0100047`, but `foo.neat.mean() // 1.01`;
- is (de)serializable;
- can split out partial data or combine multiple samples into one.
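To make the binning idea concrete, here is a minimal sketch of rounding a value to a bin that is linear near zero and logarithmic further out. The function name `roundToBin` and the exact switch-over point are assumptions for illustration only; the library's real bin boundaries may differ.

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// Rounds x to the center of its bin: bins have a fixed width of `precision`
// near zero and grow geometrically by a factor of `base` for larger values.
function roundToBin (x, { base = 1.001, precision = 1e-9 } = {}) {
  const abs = Math.abs(x);
  // Assumed switch-over point: where a logarithmic bin becomes as wide as a linear one.
  const threshold = precision / (base - 1);
  let rounded;
  if (abs < threshold) {
    // Linear region: the absolute error is bounded by precision / 2.
    rounded = Math.round(abs / precision) * precision;
  } else {
    // Logarithmic region: the relative error is bounded by roughly (base - 1) / 2.
    const step = Math.log(base);
    rounded = Math.exp(Math.round(Math.log(abs) / step) * step);
  }
  return Math.sign(x) * rounded;
}

const opts = { base: 1.01, precision: 0.001 };
console.log(roundToBin(0.0123, opts)); // falls in the linear region
console.log(roundToBin(1000, opts));   // falls in the logarithmic region
```

Storing only a counter per bin, rather than every data point, is what keeps memory usage bounded in a scheme like this.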
Usage
Creating the sample container:
```javascript
const { Univariate } = require( 'stats-logscale' );
const stat = new Univariate();
```
Specifying absolute and relative precision (`precision` and `base`).
The defaults are 1e-9 and 1.001, respectively.
Less precision means less memory usage
and faster data querying (but not faster insertion).
```javascript
const stat = new Univariate({base: 1.01, precision: 0.001});
```
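As a rough illustration of why a coarser `base` saves memory, the sketch below estimates how many logarithmic bins are needed to cover a given range. The helper `estimateLogBins` is hypothetical and counts idealized bins only; it says nothing about how the library actually allocates storage.

```javascript
// Back-of-the-envelope estimate, NOT taken from the library:
// number of bins needed to cover [lo, hi] when each bin's upper bound
// is `base` times its lower bound.
function estimateLogBins (lo, hi, base) {
  return Math.ceil(Math.log(hi / lo) / Math.log(base));
}

// Six orders of magnitude with the default base and with a coarser one
const fine   = estimateLogBins(1, 1e6, 1.001);
const coarse = estimateLogBins(1, 1e6, 1.01);
console.log({ fine, coarse }); // the coarser base needs roughly ten times fewer bins
```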
Use the `flat` switch to avoid logarithmic binning altogether:
```javascript
// this assumes the data is just integer numbers
const stat = new Univariate({precision: 1, flat: true});
```
Adding data points, either one by one
or as (value, frequency) pairs.
Numeric strings are OK (e.g. after parsing user input),
but non-numeric values will cause an exception:
```javascript
stat.add (3.14);
stat.add ("Foo"); // Nope!
stat.add ("3.14 3.15 3.16".split(" "));
stat.addWeighted([[0.5, 1], [1.5, 3], [2.5, 5]]);
```
Querying data:
```javascript
stat.count(); // number of data points
stat.mean(); // average
stat.stdev(); // standard deviation
stat.median(); // half of data is lower than this value
stat.percentile(90); // 90% of data below this point
stat.quantile(0.9); // ditto
stat.cdf(0.5); // Cumulative distribution function, which means
               // the probability that a data point is less than 0.5
stat.moment(power); // central moment of an integer power
stat.momentAbs(power); // < |x-<x>| ** power >, power may be fractional
stat.E( x => x*x ); // expected value of an arbitrary function
```
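For intuition, the queries above can all be derived from (value, frequency) pairs alone. The following is a simplified sketch, not the library's implementation: `summarize` is a hypothetical name, bins are assumed to be sorted by value, and no interpolation within a bin is attempted (the real module may well be more refined).

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// `bins` is an array of [value, count] pairs sorted by value.
function summarize (bins) {
  const count = bins.reduce((sum, [, n]) => sum + n, 0);
  // Expected value of an arbitrary function over the binned sample
  const expect = fn => bins.reduce((sum, [x, n]) => sum + fn(x) * n, 0) / count;
  // Smallest bin value such that at least a fraction p of the data is at or below it
  const quantile = p => {
    let seen = 0;
    for (const [x, n] of bins) {
      seen += n;
      if (seen >= p * count) return x;
    }
    return bins[bins.length - 1][0];
  };
  // Fraction of data points strictly below x
  const cdf = x => bins.reduce((sum, [v, n]) => sum + (v < x ? n : 0), 0) / count;
  return { count, mean: expect(x => x), expect, quantile, cdf };
}

const summary = summarize([[0.5, 1], [1.5, 3], [2.5, 5]]);
console.log(summary.count, summary.mean, summary.quantile(0.5), summary.cdf(2));
```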
Each querying primitive has a "neat" counterpart that rounds its output to the shortest possible decimal number in the respective bin:
```javascript
stat.neat.mean();
stat.neat.stdev();
stat.neat.median();
```
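One way to picture "shortest possible decimal number" is to look for the coarsest power-of-ten grid that has a point inside the bin's interval. The sketch below does exactly that; `shortestBetween` is a hypothetical name, the approach is an assumption rather than the library's actual algorithm, and floating-point edge cases are not handled rigorously.

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// Returns a number with few significant digits lying within [lo, hi].
function shortestBetween (lo, hi) {
  if (lo > hi) [lo, hi] = [hi, lo];
  if (lo <= 0 && hi >= 0) return 0;                  // zero is the shortest of all
  if (hi < 0) return -shortestBetween(-hi, -lo);     // mirror negative intervals
  // Try grids 10^e from coarse to fine; stop at the first one that hits the interval.
  for (let e = Math.ceil(Math.log10(hi)); e >= -20; e--) {
    const step = 10 ** e;
    // toPrecision(12) removes floating-point noise such as 1.1000000000000001
    const candidate = Number((Math.ceil(lo / step) * step).toPrecision(12));
    if (candidate <= hi) return candidate;
  }
  return lo; // interval too narrow for the grids tried
}

console.log(shortestBetween(1.0095, 1.0105));
console.log(shortestBetween(950, 1100));
```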
Extract partial samples:
```javascript
stat.clone( { min: 0.5, max: 0.7 } );
stat.clone( { ltrim: 1, rtrim: 1 } );
// cut off outer 1% of data
stat.clone( { ltrim: 1, rtrim: 1, winsorize: true } );
// ditto but truncate outliers instead of discarding
```
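The difference between trimming and winsorizing is easiest to see on a plain array. This is an illustrative sketch only: `trimOrWinsorize` is a hypothetical function that mirrors the option names above but works on raw values rather than on the library's bins.

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// ltrim / rtrim are percentages of data points to cut from each end.
function trimOrWinsorize (data, { ltrim = 0, rtrim = 0, winsorize = false } = {}) {
  const sorted = [...data].sort((a, b) => a - b);
  const n = sorted.length;
  const from = Math.floor(n * ltrim / 100);
  const to = n - Math.floor(n * rtrim / 100);
  const kept = sorted.slice(from, to);
  if (!winsorize || kept.length === 0) return kept;  // trimming: outliers are discarded
  // Winsorizing: outliers are clamped to the nearest surviving value instead
  const low = kept[0];
  const high = kept[kept.length - 1];
  return sorted.map(x => Math.min(Math.max(x, low), high));
}

const sample = Array.from({ length: 100 }, (_, i) => i + 1); // 1..100
console.log(trimOrWinsorize(sample, { ltrim: 1, rtrim: 1 }).length);
console.log(trimOrWinsorize(sample, { ltrim: 1, rtrim: 1, winsorize: true }).length);
```

Trimming changes the sample size, whereas winsorizing keeps it and only limits the influence of the extremes.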
Serialize, deserialize, and combine data from multiple sources:
```javascript
const str = JSON.stringify(stat);
// send over the network here
const copy = new Univariate (JSON.parse(str));

main.addWeighted( partialStat.getBins() );
main.addWeighted( JSON.parse(str).bins ); // ditto
```
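Combining samples works because binned data is just a set of counters that can be added up. A minimal sketch of that idea, assuming all sources were binned with the same parameters so that bin values line up exactly (`mergeBins` is a hypothetical helper, not a library function):

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// Each source is an array of [value, count] pairs; counts of equal values are summed.
function mergeBins (...sources) {
  const merged = new Map();
  for (const bins of sources) {
    for (const [value, count] of bins) {
      merged.set(value, (merged.get(value) ?? 0) + count);
    }
  }
  return [...merged.entries()].sort((a, b) => a[0] - b[0]);
}

// Pairs survive a JSON round trip, so partial results can be shipped between processes
const remote = JSON.parse(JSON.stringify([[2, 3], [3, 1]]));
console.log(mergeBins([[1, 2], [2, 1]], remote));
```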
Create histograms and plot data:
```javascript
stat.histogram({scale: 768, count:1024});
// this produces 1024 bars of the form
// [ bar_height, lower_boundary, upper_boundary ]
// The intervals are consecutive.
// The bar heights are limited to 768.

stat.histogram({scale: 70, count:20})
    .map( x => stat.shorten(x[1], x[2]) + '\t' + '+'.repeat(x[0]) )
    .join('\n')
// "Draw" a vertical histogram for text console
// You'll use PNG in production instead, right? Right?
```
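To clarify the shape of the histogram output, here is a simplified sketch that builds the same kind of `[ height, lower, upper ]` triples from (value, count) pairs. The function `toBars` is hypothetical and uses equal-width intervals with heights scaled to a maximum; it is an assumption about the general idea, not a description of how the library computes its bars.

```javascript
// Hypothetical helper, NOT part of stats-logscale.
// `bins` is a non-empty array of [value, count] pairs sorted by value.
function toBars (bins, { count = 10, scale = 50 } = {}) {
  const min = bins[0][0];
  const max = bins[bins.length - 1][0];
  const width = (max - min) / count || 1;            // avoid zero width for a single value
  // Consecutive intervals: [ height, lower_boundary, upper_boundary ]
  const bars = Array.from({ length: count },
    (_, i) => [0, min + i * width, min + (i + 1) * width]);
  for (const [x, n] of bins) {
    const i = Math.min(count - 1, Math.floor((x - min) / width));
    bars[i][0] += n;
  }
  // Rescale so that the tallest bar has height `scale`
  const top = Math.max(...bars.map(bar => bar[0]));
  for (const bar of bars) bar[0] = Math.round(bar[0] * scale / top);
  return bars;
}

const bars = toBars([[0, 1], [1, 2], [2, 4], [3, 1]], { count: 2, scale: 10 });
console.log(bars.map(([h, lo, hi]) => `${lo}..${hi}\t${'+'.repeat(h)}`).join('\n'));
```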
See the playground.
See also full documentation.
Performance
Data inserts are optimized for speed, and querying is cached where possible. The script example/speed.js can be used to benchmark the module on your system.
Memory usage for a dense sample spanning 6 orders of magnitude was around 1.6MB in Chromium, of which ~230KB was the data itself and ~1.2MB the cache.
Bugs
Please report bugs and request features via the GitHub bug tracker.
Copyright and license
Copyright (c) 2022-2023 Konstantin Uvarin
This is free software available under the MIT license.
Owner
- Name: Konstantin S. Uvarin
- Login: dallaylaen
- Kind: user
- Twitter: KhedinTheMonk
- Repositories: 14
- Profile: https://github.com/dallaylaen
I'm a humble software developer. I love to sing, cycle, and make jokes & puns about my job.
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Konstantin S. Uvarin | k****n@g****m | 190 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- dallaylaen (1)
Packages
- Total packages: 1
- Total downloads: 597 last month (npm)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 10
- Total maintainers: 1
npmjs.org: stats-logscale
Approximate statistical analysis using logarithmic bins
- Homepage: https://dallaylaen.github.io/stats-logscale-js/
- License: MIT
- Latest release: 1.0.9 (published almost 2 years ago)
Dependencies
- chai ^4.3.4 development
- eslint ^7.32.0 development
- eslint-config-standard ^16.0.3 development
- eslint-plugin-align-assignments ^1.1.2 development
- eslint-plugin-import ^2.25.4 development
- eslint-plugin-node ^11.1.0 development
- eslint-plugin-promise ^5.2.0 development
- mocha ^9.1.3 development
- nyc ^15.1.0 development
- webpack ^5.65.0 development