bientropy

BiEntropy Randomness Metrics for Python

https://github.com/sandialabs/bientropy

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.0%) to scientific vocabulary

Keywords

scr-2297 snl-data-analysis snl-science-libs
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: sandialabs
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 191 KB
Statistics
  • Stars: 5
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
scr-2297 snl-data-analysis snl-science-libs
Created about 8 years ago · Last pushed almost 7 years ago
Metadata Files
Readme License

README.md

BiEntropy Randomness Metrics for Python

This Python package provides high-performance implementations of the functions and examples presented in "BiEntropy - The Approximate Entropy of a Finite Binary String" by Grenville J. Croll, presented at ANPA 34 in 2013. https://arxiv.org/abs/1305.0954

According to the paper, BiEntropy is "a simple algorithm which computes the approximate entropy of a finite binary string of arbitrary length" using "a weighted average of the Shannon Entropies of the string and all but the last binary derivative of the string." In other words, these metrics can be used to help assess the disorder or randomness of binary or byte strings, particularly those that are too short for other randomness tests.
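The weighted average quoted above can be sketched in a few lines of pure Python. This is a minimal illustration and not the package's implementation: the function names (`bien_sketch`, `tbien_sketch`, and the helpers) are hypothetical, and the weights (2^k for BiEn, log2(k + 2) for TBiEn, where k is the derivative order) are an assumption based on a reading of the paper.

```python
from math import log2


def shannon_entropy(p):
    """Binary Shannon entropy, in bits, of a string whose fraction of 1s is p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def binary_derivative(bits):
    """XOR each pair of adjacent bits; the result is one bit shorter."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def weighted_entropy(bits, weight):
    """Weighted mean entropy of the string and all but its last derivative."""
    n = len(bits)
    if n < 2:
        raise ValueError("need at least two bits")
    total = 0.0
    norm = 0.0
    current = list(bits)
    for k in range(n - 1):  # k = 0 .. n-2, so the last derivative is skipped
        p = sum(current) / len(current)
        w = weight(k)
        total += w * shannon_entropy(p)
        norm += w
        current = binary_derivative(current)
    return total / norm


def bien_sketch(bits):
    # Assumed BiEn weights: 2**k (floats, so only suitable for short strings)
    return weighted_entropy(bits, lambda k: 2.0 ** k)


def tbien_sketch(bits):
    # Assumed TBiEn weights: log2(k + 2)
    return weighted_entropy(bits, lambda k: log2(k + 2))


print(bien_sketch([1, 0, 1, 1]), tbien_sketch([1, 0, 1, 1]))
```

For the 4-bit string 1011, the sketch gives approximately 0.9497 and 0.9306, which agrees with the values this README reports for `Bits('0b1011')` under Basic Usage.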

This module includes both a Python C extension and a pure Python module implementing the BiEn and TBiEn metrics from the paper, as well as a suite of tests that verify their correctness. These implementations are available under the submodules 'cbientropy' and 'pybientropy'.

Aliases of C versions of BiEn and TBiEn are included at the top level of this module for convenience.
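If the C extension might be unavailable on a given platform, a caller could fall back to the pure Python module. The helper below is a hedged sketch: `load_bien` is a hypothetical name, and it assumes that both submodules expose a function called `bien`, which this README does not state explicitly.

```python
import importlib


def load_bien(candidates=("bientropy.cbientropy", "bientropy.pybientropy")):
    """Return (module_name, function) for the first importable candidate.

    Assumes each candidate module has an attribute named 'bien'.
    Returns (None, None) if no candidate can be used.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed or not compiled; try the next one
        func = getattr(module, "bien", None)
        if func is not None:
            return name, func
    return None, None


source, bien_func = load_bien()
print("using:", source)
```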

Basic Usage

The bien and tbien functions accept both byte strings (i.e., not unicode strings) and objects, such as those provided by the bitstring package, that have a tobytes() method returning a byte string and that support len(), returning the length in bits.
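To illustrate the interface described above, here is a minimal sketch of a custom container that offers `tobytes()` and a bit-length `len()`. The class name `BitBuffer` is hypothetical. Padding the final byte with trailing zero bits follows what bitstring's `tobytes()` does; whether the package accepts such an object unchanged is an assumption that has not been verified here.

```python
class BitBuffer:
    """Hypothetical minimal bit container: tobytes() plus len() in bits."""

    def __init__(self, bitstring):
        if not bitstring or set(bitstring) - {"0", "1"}:
            raise ValueError("expected a non-empty string of '0' and '1'")
        self._bits = bitstring

    def __len__(self):
        # Length in bits, not bytes
        return len(self._bits)

    def tobytes(self):
        # Pad on the right with zero bits up to the next byte boundary
        padded = self._bits + "0" * (-len(self._bits) % 8)
        return int(padded, 2).to_bytes(len(padded) // 8, "big")


buf = BitBuffer("1011")
print(len(buf), buf.tobytes())
```

If the package relies only on these two methods, `bien(BitBuffer("1011"))` would be expected to behave like `bien(Bits('0b1011'))`.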

```
In [1]: from bientropy import bien, tbien

In [2]: from bitstring import Bits

In [3]: bien(Bits('0b1011')), tbien(Bits('0b1011'))
Out[3]: (0.9496956846525874, 0.9305948708049089)

In [4]: bien(Bits('0xfa1afe1')), tbien(Bits('0xfa1afe1'))
Out[4]: (0.05957853232204588, 0.7189075024152897)

In [5]: bien(b'\xde\xad\xbe\xef'), tbien(b'\xde\xad\xbe\xef')
Out[5]: (0.060189286721883305, 0.7898265151674035)
```

See demo.py for more examples.

Performance

According to the paper, the "BiEntropy algorithm evaluates the order and disorder of a binary string of length n in O(n^2) time using O(n) memory." In other words, the run time has quadratic growth and the memory requirement has linear growth with respect to the string length.
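The quadratic growth can be made concrete with a simplified cost model. The sketch below assumes that the string and each of its derivatives up to order n-2 is scanned once, bit by bit; real implementations work on machine words or big integers, so this is only an illustration, and the function names are hypothetical.

```python
def bits_examined(n):
    """Bits read when the string (length n) and derivatives 1..n-2 are each scanned once."""
    # The k-th derivative has n - k bits, for k = 0 .. n-2
    return sum(n - k for k in range(n - 1))


def closed_form(n):
    """Same count in closed form: n + (n-1) + ... + 2 = n(n+1)/2 - 1."""
    return n * (n + 1) // 2 - 1


for n in (8, 64, 1024):
    assert bits_examined(n) == closed_form(n)

# Doubling the input length roughly quadruples the work
print(bits_examined(2048) / bits_examined(1024))
```

Only the current derivative has to be kept at any time, which is consistent with the linear memory requirement stated above.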

The metrics are implemented in Python using the 'bitstring' package for handling arbitrary length binary strings and in native C using the GNU Multiple Precision (GMP) arithmetic library.

The following is a table of speed-ups from the Python to the C implementation for various string byte lengths:

| Bytes | BiEn | TBiEn |
|-------|------|-------|
| 16    | 229  | 155   |
| 32    | 217  | 149   |
| 48    | 212  | 150   |
| 64    | 221  | 161   |
| 128   | 267  | 196   |
| 256   | 340  | 257   |
| 512   | 502  | 370   |
| 1024  | 802  | 537   |

The following log-log plot shows the average time to compute BiEntropy with each implementation on a 2.40GHz Intel(R) Xeon(R) E5645 CPU versus the length of the input in bytes.

[Figure: Run Times]

Requirements

This package is tested with Python versions 2.7, 3.4, 3.5 and 3.6.

Installation:

* Python http://python.org/ (>= 2.7 or >= 3.4)
* bitstring http://pythonhosted.org/bitstring/
* NumPy http://numpy.org/

Compiling:

* GCC http://gcc.gnu.org/ on Linux
* MSVC 9 if using Python 2.7 on Windows
  * https://www.microsoft.com/EN-US/DOWNLOAD/confirmation.aspx?id=44266
* MSVC 14 if using Python 3.x on Windows
  * http://landinghub.visualstudio.com/visual-cpp-build-tools
* GMP http://gmplib.org/ or MPIR http://mpir.org/ on Windows

For running tests:

* mock https://pypi.org/project/mock/ if using Python 2.7

To check which version you may already have installed, run the command:

    python -c "import pkg_resources; print('BiEntropy version: '+pkg_resources.get_distribution('bientropy').version)"
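On newer interpreters, the same check can be done with the standard library. The sketch below uses `importlib.metadata`, which requires Python 3.8 or later and is therefore outside the versions this package is tested with; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name="BiEntropy"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("BiEntropy version:", installed_version())
```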

Install from pip

This package includes a C extension which has to be compiled for each platform. Python wheels include compiled binary code and allow the extension to be installed without requiring a compiler.

pip >= 1.4 with setuptools >= 0.8 will use a wheel if there is one available for the target platform:

    pip install --user BiEntropy

Once installed, the tests should be run with the command:

    python -m bientropy.tests

The available wheel files are listed at: https://pypi.org/project/BiEntropy/#files

Install from Source

The source code for the bientropy package can be cloned or downloaded from:

* GitHub: https://github.com/sandialabs/bientropy
* PyPI: https://pypi.org/project/BiEntropy

The GMP library and headers need to be installed before compiling.

On Debian/Ubuntu:

    apt-get install libgmp-dev

On RedHat:

    yum install gmp-devel

Then, use setup.py to compile and install the package:

    python setup.py install --user

Once installed, the tests should be run with the command:

    python -m bientropy.tests

Compiling on Windows

Compiling GMP on Microsoft Windows is only supported under Cygwin, MinGW or DJGPP. However, this package can be compiled with MPIR, a fork of GMP, on Windows. The source for MPIR is available at http://mpir.org/. The setup.py script expects the header files, library files and DLL to be available under mpir/dll/x64/Release.

A compiled distribution of the MPIR library was also available at:

    http://www.holoborodko.com/pavel/mpfr/#download

To use it, download the MPFR-MPIR-x86-x64-MSVC2010.zip file and extract mpir from the ZIP file to this directory.

Once MPIR is ready, proceed as usual:

    python setup.py install --user

After installing, the tests should be run with the command:

    python -m bientropy.tests

See https://github.com/cython/cython/wiki/CythonExtensionsOnWindows for more information.

Included Scripts

After installing, a demonstration can be run with this command:

    python -m bientropy.demo

This runs demo.py, which also serves as an example for using the package.

The same benchmark script used to generate the data shown in the table and plot above is also included. It can be run with:

    python -m bientropy.benchmark
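For a quick comparison outside the included script, a speed-up factor like those in the table can be estimated with `timeit`. This is a generic sketch and not the package's benchmark: the helper name `speedup` is hypothetical, and the commented-out usage assumes that the pure Python function can be imported as `bientropy.pybientropy.bien`.

```python
import timeit


def speedup(slow, fast, data, repeat=5, number=100):
    """Ratio of the best observed run time of slow(data) to that of fast(data)."""
    t_slow = min(timeit.repeat(lambda: slow(data), repeat=repeat, number=number))
    t_fast = min(timeit.repeat(lambda: fast(data), repeat=repeat, number=number))
    return t_slow / t_fast


# Stand-in callables so the sketch runs without the package installed
print(speedup(sorted, len, list(range(1000))))

# Assumed usage with the package (import path for the Python version is a guess):
# import os
# from bientropy import bien
# from bientropy.pybientropy import bien as pybien
# print(speedup(pybien, bien, os.urandom(64)))
```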

Development

To compile with debug symbols and with extra output, use:

    python setup.py build_ext --force --debug --define DEBUG

To also disable compiler optimizations, use:

    CFLAGS=-O0 python setup.py build_ext --force --debug --define DEBUG

To debug the extension with GDB:

    $ gdb python
    (gdb) run setup.py test

To run the Valgrind memcheck tool to check for memory corruption and leaks:

    valgrind --xml=yes --xml-file=valgrind.xml ${python} setup.py test

Authors

This package, consisting of the C implementations, Python implementations and Python bindings, was written by Ryan Helinski rhelins@sandia.gov.

License

Copyright 2018 National Technology & Engineering Solutions of Sandia, LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the U.S. Government retains certain rights in this software.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Owner

  • Name: Sandia National Laboratories
  • Login: sandialabs
  • Kind: organization
  • Location: United States

Exceptional service in the national interest.

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 42
  • Total Committers: 2
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.429
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Ryan Helinski r****s@s****v 24
Ryan Helinski r****i@g****m 18
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 2
  • Total pull requests: 1
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 4 days
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rlhelinski (2)
Pull Request Authors
  • rlhelinski (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 1,999 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: bientropy

High-performance implementations of BiEntropy metrics proposed by Grenville J. Croll

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,999 Last month
Rankings
Dependent packages count: 10.1%
Downloads: 13.0%
Stargazers count: 21.5%
Forks count: 22.6%
Average: 26.9%
Dependent repos count: 67.4%
Maintainers (1)
Last synced: over 1 year ago