pe

Fastest general-purpose parsing library for Python with a familiar API

https://github.com/goodmami/pe

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.4%) to scientific vocabulary

Keywords

parser parser-generator parsing parsing-expression-grammar parsing-expressions parsing-library peg python
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: goodmami
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 485 KB
Statistics
  • Stars: 44
  • Watchers: 4
  • Forks: 4
  • Open Issues: 12
  • Releases: 12
Created about 6 years ago · Last pushed 8 months ago
Metadata Files
Readme Changelog License

README.md

Parsing Expressions


pe is a library for parsing expressions, including parsing expression grammars. It aims to join the expressive power of parsing expressions with the familiarity of regular expressions. For example:

```python
>>> import pe
>>> pe.match(r'"-"? [0-9]+', '-38')  # match an integer
```

A grammar can be used for more complicated or recursive patterns:

```python
>>> floatparser = pe.compile(r'''
...   Start    <- INTEGER FRACTION? EXPONENT?
...   INTEGER  <- "-"? ("0" / [1-9] [0-9]*)
...   FRACTION <- "." [0-9]+
...   EXPONENT <- [Ee] [-+]? [0-9]+
... ''')
>>> floatparser.match('6.02e23')
```
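For readers coming from regular expressions, the same float pattern can be approximated with the standard library's re module. This is only a sketch for comparison: the name `FLOAT_RE` is hypothetical and nothing here reflects how pe works internally.

```python
import re

# Rough regex counterpart of the grammar above (an illustration, not pe's API).
FLOAT_RE = re.compile(
    r'''
    -?(?:0|[1-9][0-9]*)   # INTEGER
    (?:\.[0-9]+)?         # FRACTION?
    (?:[Ee][-+]?[0-9]+)?  # EXPONENT?
    ''',
    re.VERBOSE,
)

m = FLOAT_RE.match('6.02e23')
print(m.group())  # 6.02e23
```

The grammar version names each part, which is what makes larger or recursive patterns easier to maintain than a single regular expression.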

Features and Goals

  • Grammar notation is backward-compatible with standard PEG with few extensions
  • A specification describes the semantic effect of parsing (e.g., for mapping expressions to function calls)
  • Parsers are often faster than other parsing libraries, sometimes by a lot; see the benchmarks
  • The API is intuitive and familiar; it's modeled on the standard library's re module
  • Grammar definitions and parser implementations are separate

Syntax Overview

pe is backward compatible with standard PEG syntax and it is conservative with extensions.

```regex
# terminals
.            # any single character
"abc"        # string literal
'abc'        # string literal
[abc]        # character class

# repeating expressions
e            # exactly one
e?           # zero or one (optional)
e*           # zero or more
e+           # one or more
e{5}         # exactly 5
e{3,5}       # three to five

# combining expressions
e1 e2        # sequence of e1 and e2
e1 / e2      # ordered choice of e1 and e2
(e)          # subexpression

# lookahead
&e           # positive lookahead
!e           # negative lookahead

# (extension) capture substring
~e           # result of e is matched substring

# (extension) binding
name:e       # bind result of e to 'name'

# grammars
Name <- ...  # define a rule named 'Name'
... <- Name  # refer to rule named 'Name'

# (extension) auto-ignore
X < e1 e2    # define a rule 'X' with auto-ignore
```
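To make the semantics of a few of these operators concrete, here is a minimal sketch of PEG-style combinators in plain Python. All names (`lit`, `seq`, `choice`, `star`, `not_`) are hypothetical helpers written for this illustration; this is not pe's implementation or API.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """String literal: "abc"."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*ps: Parser) -> Parser:
    """Sequence: e1 e2."""
    def parse(text, pos):
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*ps: Parser) -> Parser:
    """Ordered choice: e1 / e2 (the first alternative that succeeds wins)."""
    def parse(text, pos):
        for p in ps:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def star(p: Parser) -> Parser:
    """Zero or more: e* (greedy, never gives characters back)."""
    def parse(text, pos):
        while True:
            end = p(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return parse

def not_(p: Parser) -> Parser:
    """Negative lookahead: !e (succeeds without consuming input)."""
    return lambda text, pos: pos if p(text, pos) is None else None

ab_then_c = seq(choice(lit('a'), lit('ab')), lit('c'))
print(ab_then_c('abc', 0))  # None: 'a' wins the choice, then 'c' fails against 'b'
print(ab_then_c('ac', 0))   # 2
```

The first result shows how ordered choice differs from regex alternation: the regular expression `(a|ab)c` would backtrack and match `abc`, while the parsing expression commits to the first successful alternative.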

Matching Inputs with Parsing Expressions

When a parsing expression matches an input, it returns a Match object similar to those of Python's re module. By default, nothing is captured, but the capture operator (~) emits the substring matched by its expression, much like a capturing group in a regular expression:

```python
>>> e = pe.compile(r'[0-9] [.] [0-9]')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
()
>>> e = pe.compile(r'~([0-9] [.] [0-9])')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
('1.4',)
```
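For comparison, the standard library's re module behaves analogously: without a capturing group the whole match is available but `groups()` is empty, and adding a group plays roughly the role of `~` above. This sketch uses only re, not pe.

```python
import re

# Without a capturing group: the whole match is available, but groups() is empty.
m = re.match(r'[0-9][.][0-9]', '1.4')
print(m.group())   # 1.4
print(m.groups())  # ()

# With a capturing group, roughly the role ~ plays in a parsing expression.
m2 = re.match(r'([0-9][.][0-9])', '1.4')
print(m2.group())   # 1.4
print(m2.groups())  # ('1.4',)
```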

Value Bindings

A value binding extracts the emitted values of a match and associates them with a name that is made available in the Match.groupdict() dictionary. This is similar to named-capture groups in regular expressions, except that it extracts the emitted values and not the substring of the bound expression.

```python
>>> e = pe.compile(r'~[0-9] x:(~[.]) ~[0-9]')
>>> m = e.match('1.4')
>>> m.groups()
('1', '4')
>>> m.groupdict()
{'x': '.'}
```
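The contrast with regular expressions can be seen with a roughly corresponding re pattern. In re, a named group's substring also appears in `groups()`, whereas in the example above the bound value `'.'` is absent from `m.groups()`. The sketch below uses only the standard library.

```python
import re

m = re.match(r'([0-9])(?P<x>[.])([0-9])', '1.4')
print(m.groups())     # ('1', '.', '4')  (the named group is included here)
print(m.groupdict())  # {'x': '.'}
```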

Actions

Actions (also called "semantic actions") are callables that transform parse results. When an arbitrary function is given, it is called as follows:

```python
func(*match.groups(), **match.groupdict())
```

The result of this function call becomes the only emitted value going forward and all bound values are cleared.

For more control, pe provides the Action class and a number of subclasses for various use-cases. These actions have access to more information about a parse result and more control over the match. For example, the Pack class takes a function and calls it with the emitted values packed into a list:

```python
func(match.groups())
```

And the Join class joins all emitted strings with a separator:

```python
func(sep.join(match.groups()), **match.groupdict())
```
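The three calling conventions described in this section can be imitated with ordinary Python functions. The helpers `call_plain`, `call_pack`, `call_join`, and `rebuild` are hypothetical names used only for this sketch, and the tuple and dictionary stand in for `match.groups()` and `match.groupdict()`; pe's own Action classes are not used here.

```python
def call_plain(func, groups, groupdict):
    """Plain callable: emitted values become positional args, bound values keyword args."""
    return func(*groups, **groupdict)

def call_pack(func, groups):
    """Pack-style: emitted values are passed together as a single sequence."""
    return func(groups)

def call_join(func, sep, groups, groupdict):
    """Join-style: emitted strings are joined with sep before the call."""
    return func(sep.join(groups), **groupdict)

groups = ('1', '4')     # stand-in for match.groups()
groupdict = {'x': '.'}  # stand-in for match.groupdict()

def rebuild(a, b, x):
    # Reassemble the original text from two emitted values and one bound value.
    return a + x + b

print(call_plain(rebuild, groups, groupdict))  # 1.4
print(call_pack(list, groups))                 # ['1', '4']
print(call_join(int, '', groups, {}))          # 14
```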

Auto-ignore

The grammar can be defined such that some rules ignore occurrences of a pattern between sequence items. Most commonly, this is used to ignore whitespace, so the default ignore pattern is simple whitespace.

```python
>>> pe.match("X <- 'a' 'b'", "a b")  # regular rule does not match
>>> pe.match("X < 'a' 'b'", "a b")   # auto-ignore rule matches
```

This feature can help to make grammars more readable.
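As a loose analogy in regular-expression terms, auto-ignore resembles allowing optional whitespace between the items of a sequence. The sketch below uses only the standard library's re module and assumes `\s*` as the ignored pattern; the exact positions where pe applies its ignore pattern are defined by its own specification.

```python
import re

# 'a' immediately followed by 'b': fails on "a b", like the regular rule.
print(re.match(r'ab', 'a b'))             # None

# Whitespace allowed between the items, roughly what auto-ignore adds.
print(re.match(r'a\s*b', 'a b').group())  # a b
```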

Example

Here is one way to parse a list of comma-separated integers:

```python
>>> from pe.actions import Pack
>>> p = pe.compile(
...   r'''
...     Start  <- "[" Values? "]"
...     Values <- Int ("," Int)*
...     Int    <  ~( "-"? ("0" / [1-9] [0-9]*) )
...   ''',
...   actions={'Values': Pack(list), 'Int': int})
>>> m = p.match('[5, 10, -15]')
>>> m.value()
[5, 10, -15]
```
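To show what the grammar and its actions save you from writing, here is a rough hand-written equivalent using only the standard library. The function `parse_int_list` and its helpers are hypothetical; the sketch approximates the grammar above (whitespace handling in particular is simplified) and is not generated by or taken from pe.

```python
import re

INT = re.compile(r'-?(?:0|[1-9][0-9]*)')
WS = re.compile(r'\s*')

def parse_int_list(text: str) -> list:
    """Parse text like '[5, 10, -15]' into a list of ints; raise ValueError on bad input."""
    pos = 0

    def skip_ws():
        nonlocal pos
        pos = WS.match(text, pos).end()

    def expect(ch):
        nonlocal pos
        if not text.startswith(ch, pos):
            raise ValueError(f'expected {ch!r} at position {pos}')
        pos += len(ch)

    def parse_int():
        # Counterpart of the Int rule: ignore surrounding whitespace, convert with int().
        nonlocal pos
        skip_ws()
        m = INT.match(text, pos)
        if m is None:
            raise ValueError(f'expected an integer at position {pos}')
        pos = m.end()
        skip_ws()
        return int(m.group())

    expect('[')
    values = []
    skip_ws()
    if not text.startswith(']', pos):
        # Counterpart of the Values rule: Int ("," Int)*
        values.append(parse_int())
        while text.startswith(',', pos):
            expect(',')
            values.append(parse_int())
    expect(']')
    return values

print(parse_int_list('[5, 10, -15]'))  # [5, 10, -15]
```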

Owner

  • Name: Michael Wayne Goodman
  • Login: goodmami
  • Kind: user
  • Location: Oregon

Computational Linguist, Data Scientist

GitHub Events

Total
  • Create event: 2
  • Release event: 1
  • Issues event: 12
  • Watch event: 5
  • Delete event: 5
  • Issue comment event: 9
  • Push event: 5
  • Pull request event: 2
  • Fork event: 1
Last Year
  • Create event: 2
  • Release event: 1
  • Issues event: 12
  • Watch event: 5
  • Delete event: 5
  • Issue comment event: 9
  • Push event: 5
  • Pull request event: 2
  • Fork event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 234
  • Total Committers: 2
  • Avg Commits per committer: 117.0
  • Development Distribution Score (DDS): 0.004
Past Year
  • Commits: 8
  • Committers: 1
  • Avg Commits per committer: 8.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Michael Wayne Goodman g****w@g****m 233
Tom t****n@e****t 1

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 36
  • Total pull requests: 16
  • Average time to close issues: 6 months
  • Average time to close pull requests: 3 days
  • Total issue authors: 4
  • Total pull request authors: 3
  • Average comments per issue: 0.69
  • Average comments per pull request: 0.31
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 1
  • Pull requests: 4
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.75
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • goodmami (35)
  • Krzmbrzl (2)
  • JesseTG (2)
  • grizlupo (1)
  • TomHodson (1)
  • kirberich (1)
Pull Request Authors
  • goodmami (18)
  • dependabot[bot] (2)
  • TomHodson (1)
Top Labels
Issue Labels
enhancement (17) bug (11) maintenance (7) documentation (2) question (2) help wanted (1) duplicate (1)
Pull Request Labels
dependencies (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 735 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 8
  • Total versions: 11
  • Total maintainers: 1
pypi.org: pe

Library for Parsing Expression Grammars (PEG)

  • Homepage: https://github.com/goodmami/pe
  • Documentation: https://github.com/goodmami/pe/blob/main/docs/README.md
  • License: MIT License, Copyright (c) 2020 Michael Wayne Goodman
  • Latest release: 0.6.0
    published 8 months ago
  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 8
  • Downloads: 735 Last month
Rankings
Dependent repos count: 5.2%
Dependent packages count: 10.1%
Average: 10.9%
Stargazers count: 11.0%
Downloads: 11.2%
Forks count: 16.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/build-publish.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • pypa/cibuildwheel v2.13.0 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/pythonpackage.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
setup.py pypi