byteparsing

byteparsing: a functional parser combinator for mixed ASCII/binary data - Published in JOSS (2023)

https://github.com/parallelwindfarms/byteparsing

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software
Last synced: 6 months ago

Repository

Parser for mixed binary/ASCII data in Python

Statistics
  • Stars: 3
  • Watchers: 4
  • Forks: 0
  • Open Issues: 4
  • Releases: 4
Created over 6 years ago · Last pushed almost 3 years ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation Zenodo

README.md

Byteparsing


Byteparsing is a package for mixed text and binary parsing in Python. The main driver for developing this package was to write a parser for binary OpenFOAM files. The binary file format in OpenFOAM is "special": it is the same as the ASCII-based text format, except where large blocks of floating-point data are concerned.

When not to use byteparsing:

  • You just need to parse some text: use pyparsing, it is the industry standard.

Do use byteparsing if:

  • You need to tinker with large binary OpenFOAM files directly from Python.
  • You depend on another package that does not adhere to data standards and hacked together its own mixed ASCII/binary file format, so you have to roll your own parser. Byteparsing can make this easier.

Coolest feature:

  • Works with mmap and NumPy! This means you can open a file without reading it entirely into memory, change the NumPy array data, and have the changes saved to disk automatically.
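The mechanism behind this feature can be sketched with the standard library alone. The example below is not byteparsing's implementation, and the helper name `scale_floats_in_place` and the file layout (an 8-byte ASCII header followed by native-endian 32-bit floats) are assumptions made up for illustration. It memory-maps a file, views part of it as floats, and modifies the values in place, so the change lands in the file without the file ever being read into memory as a whole.

```python
import mmap
import os
import struct
import tempfile


def scale_floats_in_place(path, offset, count, factor):
    """Multiply `count` native 32-bit floats starting at byte `offset` by `factor`.

    Hypothetical helper for illustration: the file is memory-mapped, so only
    the pages that are touched get loaded, and writes go back to the file.
    """
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            # Views must be released before the map is closed,
            # which the nested `with` blocks take care of.
            with memoryview(mm) as view:
                with view[offset:offset + 4 * count].cast("f") as floats:
                    for i in range(count):
                        floats[i] = floats[i] * factor
            mm.flush()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Made-up mixed file: ASCII header, then four binary floats.
        with open(path, "wb") as f:
            f.write(b"HEADER:\n" + struct.pack("4f", 1.0, 2.0, 3.0, 4.0))
        scale_floats_in_place(path, offset=8, count=4, factor=2.0)
        with open(path, "rb") as f:
            print(struct.unpack("4f", f.read()[8:]))
    finally:
        os.remove(path)
```

With NumPy available, `np.frombuffer` applied to the same `mmap` object (or `np.memmap`) gives an array in place of the `memoryview` used here, which is the combination the bullet above refers to.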

Documentation

Our documentation explains the architecture behind byteparsing and shows some examples of parsing mixed binary and ASCII data.

Example: PPM files

Byteparsing is a functional parser combinator (recursive descent) library. To show how we can mix ASCII and binary data, we have an example where we parse Portable PixMap files (PPM). These files have a small ASCII header and the image itself in binary. The header looks something like this:

    P6   # this marks the file type in the Netpbm family
    640 480
    256
    <<binary rgb values: 3*w*h bytes>>

The implementation of the parser:

```python
import numpy as np
from dataclasses import dataclass
from byteparsing import parse_bytes
from byteparsing.parsers import (
    text_literal, integer, eol, named_sequence, sequence, construct,
    tokenize, item, array, fmap, text_end_by, optional)

# A comment runs from "#" to the end of the line.
comment = sequence(text_literal("#"), text_end_by("\n"))

@dataclass
class Header:
    width: int
    height: int
    maxint: int

# `_1` and `_2` match the magic number and an optional comment;
# only width, height and maxint are used to construct the Header.
header = named_sequence(
    _1 = tokenize(text_literal("P6")),
    _2 = optional(comment),
    width = tokenize(integer),
    height = tokenize(integer),
    maxint = tokenize(integer)) >> construct(Header)

def image_bytes(header: Header):
    # The header determines how many bytes of pixel data follow.
    shape = (header.height, header.width, 3)
    size = header.height * header.width * 3
    return array(np.uint8, size) >> fmap(lambda a: a.reshape(shape))

ppm_image = header >> image_bytes
```
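To make the combinator idea behind the example above concrete, here is a minimal, self-contained sketch in plain Python. It does not use byteparsing and does not claim to mirror its internals: the convention that a parser is a function `(data, pos) -> (value, new_pos)`, and the names `literal`, `token`, `take`, `then` and `one_whitespace`, are assumptions chosen for illustration. The point it shows is the same one as above: the ASCII header is parsed first, and its values decide how many binary bytes the next parser consumes.

```python
from dataclasses import dataclass

# Convention for this sketch only: a parser is a function
# (data: bytes, pos: int) -> (value, new_pos) that raises ValueError on failure.


def literal(expected: bytes):
    """Parser that matches an exact byte string."""
    def parse(data, pos):
        end = pos + len(expected)
        if data[pos:end] != expected:
            raise ValueError(f"expected {expected!r} at offset {pos}")
        return expected, end
    return parse


def integer(data, pos):
    """Parser for an unsigned decimal integer written in ASCII."""
    end = pos
    while data[end:end + 1].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a digit at offset {pos}")
    return int(data[pos:end]), end


def skip_blanks(data, pos):
    """Skip ASCII whitespace and '#' comments (up to the end of the line)."""
    while True:
        if data[pos:pos + 1].isspace():
            pos += 1
        elif data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"", b"\n"):
                pos += 1
        else:
            return None, pos


def one_whitespace(data, pos):
    """Consume exactly one whitespace byte (the separator before the pixels)."""
    if not data[pos:pos + 1].isspace():
        raise ValueError(f"expected whitespace at offset {pos}")
    return None, pos + 1


def token(parser):
    """Run `parser`, then skip any blanks and comments that follow it."""
    def parse(data, pos):
        value, pos = parser(data, pos)
        _, pos = skip_blanks(data, pos)
        return value, pos
    return parse


def sequence(*parsers):
    """Run parsers one after another and collect their values in a list."""
    def parse(data, pos):
        values = []
        for parser in parsers:
            value, pos = parser(data, pos)
            values.append(value)
        return values, pos
    return parse


def take(n):
    """Parser that returns the next `n` raw bytes."""
    def parse(data, pos):
        if pos + n > len(data):
            raise ValueError(f"need {n} bytes at offset {pos}")
        return data[pos:pos + n], pos + n
    return parse


def then(parser, make_next):
    """Run `parser`, then run the parser that `make_next` builds from its value."""
    def parse(data, pos):
        value, pos = parser(data, pos)
        return make_next(value)(data, pos)
    return parse


@dataclass
class Header:
    width: int
    height: int
    maxval: int


def header(data, pos):
    # The last integer is not tokenized: only one whitespace byte may follow it,
    # otherwise pixel bytes that happen to be whitespace would be skipped.
    (_, width, height, maxval, _), pos = sequence(
        token(literal(b"P6")), token(integer), token(integer),
        integer, one_whitespace)(data, pos)
    return Header(width, height, maxval), pos


ppm_pixels = then(header, lambda h: take(h.width * h.height * 3))

if __name__ == "__main__":
    sample = b"P6 # tiny image\n2 1\n255\n" + bytes([255, 0, 0, 0, 255, 0])
    pixels, end = ppm_pixels(sample, 0)
    print(header(sample, 0)[0], pixels, end == len(sample))
```

Each small parser is a plain function, and `then` plays the role that `>>` plays in the byteparsing example: it chains a parser to a function that builds the next parser from the previous result.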

For more, check out the documentation!

Requirements

This package requires Python (>=3.9), and optionally NumPy.

Installation

With pip

To install the latest release of byteparsing, do:

    pip install byteparsing

Development with Poetry

This project uses Poetry to maintain pyproject.toml.

    git clone https://github.com/parallelwindfarms/byteparsing.git
    cd byteparsing
    poetry install

Run tests (including coverage) with:

    poetry run pytest

Contributing

If you want to contribute to the development of byteparsing, have a look at the contribution guidelines.

License

Copyright (c) 2019, Netherlands eScience Center, University of Groningen

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Credits

This package was created with Cookiecutter and the NLeSC/python-template.

Owner

  • Name: parallelwindfarms
  • Login: parallelwindfarms
  • Kind: organization

JOSS Publication

byteparsing: a functional parser combinator for mixed ASCII/binary data
Published
April 25, 2023
Volume 8, Issue 84, Page 5293
Authors
Johan Hidding ORCID
Netherlands eScience Center
Pablo Rodríguez-Sánchez ORCID
Netherlands eScience Center
Editor
Aoife Hughes ORCID
Tags
parsing binary ascii functional-programming

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Hidding
  given-names: Johan
  orcid: "https://orcid.org/0000-0002-7550-1796"
- family-names: Rodríguez-Sánchez
  given-names: Pablo
  orcid: "https://orcid.org/0000-0002-2855-940X"
contact:
- family-names: Hidding
  given-names: Johan
  orcid: "https://orcid.org/0000-0002-7550-1796"
doi: 10.5281/zenodo.7839894
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Hidding
    given-names: Johan
    orcid: "https://orcid.org/0000-0002-7550-1796"
  - family-names: Rodríguez-Sánchez
    given-names: Pablo
    orcid: "https://orcid.org/0000-0002-2855-940X"
  date-published: 2023-04-25
  doi: 10.21105/joss.05293
  issn: 2475-9066
  issue: 84
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5293
  title: "byteparsing: a functional parser combinator for mixed
    ASCII/binary data"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05293"
  volume: 8
title: "byteparsing: a functional parser combinator for mixed
  ASCII/binary data"

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 170
  • Total Committers: 3
  • Avg Commits per committer: 56.667
  • Development Distribution Score (DDS): 0.476
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Johan Hidding j****g@e****l 89
Pablo Rodríguez Sánchez p****z@g****m 79
Daniel S. Katz d****z@i****g 2
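
The All Time figures can be reproduced from the commit counts in the table above. The page does not define the Development Distribution Score, so the formula used here (one minus the top committer's share of all commits) is an assumption; it does match the reported value.

```python
# Commit counts copied from the Top Committers table.
commits = {
    "Johan Hidding": 89,
    "Pablo Rodríguez Sánchez": 79,
    "Daniel S. Katz": 2,
}

total = sum(commits.values())
average = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print(f"total={total} average={average:.3f} dds={dds:.3f}")
# prints "total=170 average=56.667 dds=0.476"
```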

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 16
  • Total pull requests: 7
  • Average time to close issues: 2 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 4
  • Total pull request authors: 2
  • Average comments per issue: 1.13
  • Average comments per pull request: 1.29
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 6
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PabRod (8)
  • dvberkel (6)
  • jhidding (1)
  • inakleinbottle (1)
Pull Request Authors
  • dependabot[bot] (6)
  • danielskatz (1)
Top Labels
Issue Labels
enhancement (4) generalization (4) documentation (2) question (1)
Pull Request Labels
dependencies (6)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 22 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 3
  • Total maintainers: 1
pypi.org: byteparsing

Parser for mixed ASCII/binary data formats

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 22 Last month
Rankings
Dependent packages count: 10.0%
Dependent repos count: 21.7%
Forks count: 22.6%
Average: 26.7%
Stargazers count: 38.8%
Downloads: 40.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

pyproject.toml pypi
  • ipympl ^0.8.8 develop
  • jupyterlab ^3.2.9 develop
  • matplotlib ^3.5.1 develop
  • pytest ^6.2.4 develop
  • pytest-cov ^2.12.1 develop
  • pytest-flake8 ^1.0.7 develop
  • pytest-mypy ^0.8.1 develop
  • numpy ^1.21.2
  • python >=3.9,<3.11
.github/workflows/cffconvert.yml actions
  • actions/checkout v2 composite
  • citation-file-format/cffconvert-github-action main composite
.github/workflows/python-package.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v1 composite
.github/workflows/draft-pdf.yml actions
  • actions/checkout v2 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite