pe
Fastest general-purpose parsing library for Python with a familiar API
Statistics
- Stars: 44
- Watchers: 4
- Forks: 4
- Open Issues: 12
- Releases: 12
README.md
pe is a library for parsing expressions, including parsing expression grammars. It aims to join the expressive power of parsing expressions with the familiarity of regular expressions. For example:
```python
>>> import pe
>>> pe.match(r'"-"? [0-9]+', '-38')  # match an integer
```
A grammar can be used for more complicated or recursive patterns:
```python
>>> floatparser = pe.compile(r'''
...   Start    <- INTEGER FRACTION? EXPONENT?
...   INTEGER  <- "-"? ("0" / [1-9] [0-9]*)
...   FRACTION <- "." [0-9]+
...   EXPONENT <- [Ee] [-+]? [0-9]+
... ''')
>>> floatparser.match('6.02e23')
```
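To make the grammar above more concrete, here is a hand-written recognizer in plain Python that mirrors each rule with one small function. It is only an illustrative sketch and is not how pe works internally; the function name `match_float` and its return convention (end index, or -1 on failure) are assumptions made for this example.

```python
def match_float(s: str) -> int:
    """Return the end index of the prefix of s matching Start, or -1."""

    def digits(i: int) -> int:
        # [0-9]+ : one or more digits
        j = i
        while j < len(s) and s[j] in '0123456789':
            j += 1
        return j if j > i else -1

    def integer(i: int) -> int:
        # INTEGER <- "-"? ("0" / [1-9] [0-9]*)
        if s.startswith('-', i):
            i += 1
        if s.startswith('0', i):
            return i + 1
        if i < len(s) and s[i] in '123456789':
            i += 1
            while i < len(s) and s[i] in '0123456789':
                i += 1
            return i
        return -1

    def fraction(i: int) -> int:
        # FRACTION <- "." [0-9]+
        return digits(i + 1) if s.startswith('.', i) else -1

    def exponent(i: int) -> int:
        # EXPONENT <- [Ee] [-+]? [0-9]+
        if i < len(s) and s[i] in 'Ee':
            i += 1
            if i < len(s) and s[i] in '-+':
                i += 1
            return digits(i)
        return -1

    # Start <- INTEGER FRACTION? EXPONENT?
    i = integer(0)
    if i < 0:
        return -1
    for optional_rule in (fraction, exponent):
        j = optional_rule(i)
        if j >= 0:  # an optional rule that fails consumes nothing
            i = j
    return i


print(match_float('6.02e23'))  # 7
print(match_float('abc'))      # -1
```

Each grammar rule becomes a function from a position to a new position, which is the general shape of a recursive-descent parser; the grammar notation expresses the same thing in four lines.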
Features and Goals
- Grammar notation is backward-compatible with standard PEG, with a few extensions
- A specification describes the semantic effect of parsing (e.g., for mapping expressions to function calls)
- Parsers are often faster than other parsing libraries, sometimes by a lot; see the benchmarks
- The API is intuitive and familiar; it's modeled on the standard library's re module
- Grammar definitions and parser implementations are separate
- Optimizations target the abstract grammar definitions
- Multiple parsers are available: currently packrat for recursive descent and machine for an iterative "parsing machine" as described by Medeiros and Ierusalimschy (2008) and implemented in LPeg
Syntax Overview
pe is backward compatible with standard PEG syntax and it is conservative with extensions.
```regex
# terminals
.            # any single character
"abc"        # string literal
'abc'        # string literal
[abc]        # character class

# repeating expressions
e            # exactly one
e?           # zero or one (optional)
e*           # zero or more
e+           # one or more
e{5}         # exactly 5
e{3,5}       # three to five

# combining expressions
e1 e2        # sequence of e1 and e2
e1 / e2      # ordered choice of e1 and e2
(e)          # subexpression

# lookahead
&e           # positive lookahead
!e           # negative lookahead

# (extension) capture substring
~e           # result of e is matched substring

# (extension) binding
name:e       # bind result of e to 'name'

# grammars
Name <- ...  # define a rule named 'Name'
... <- Name  # refer to rule named 'Name'

# (extension) auto-ignore
X <  e1 e2   # define a rule 'X' with auto-ignore
```
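Two operators in the overview differ from their closest regular-expression counterparts: ordered choice (`e1 / e2`) commits to the first alternative that succeeds, and negative lookahead (`!e`) succeeds without consuming input. The following plain-Python sketch models those semantics with small combinator functions. The names `literal`, `sequence`, `choice`, and `not_followed_by` are hypothetical helpers for this illustration and are not part of pe.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def literal(text: str) -> Parser:
    def parse(s: str, i: int) -> Optional[int]:
        return i + len(text) if s.startswith(text, i) else None
    return parse


def sequence(*parsers: Parser) -> Parser:
    # e1 e2 : every item must match, one after another
    def parse(s: str, i: int) -> Optional[int]:
        pos: Optional[int] = i
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers: Parser) -> Parser:
    # e1 / e2 : the first alternative that matches wins; later ones are not tried
    def parse(s: str, i: int) -> Optional[int]:
        for p in parsers:
            j = p(s, i)
            if j is not None:
                return j
        return None
    return parse


def not_followed_by(parser: Parser) -> Parser:
    # !e : succeed without consuming input only if e fails here
    def parse(s: str, i: int) -> Optional[int]:
        return i if parser(s, i) is None else None
    return parse


short_first = choice(literal('a'), literal('ab'))
long_first = choice(literal('ab'), literal('a'))
print(short_first('ab', 0))  # 1
print(long_first('ab', 0))   # 2

a_not_b = sequence(literal('a'), not_followed_by(literal('b')))
print(a_not_b('ac', 0))  # 1
print(a_not_b('ab', 0))  # None
```

Because the order of alternatives matters, longer alternatives generally need to come before their own prefixes.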
Matching Inputs with Parsing Expressions
When a parsing expression matches an input, it returns a Match
object, which is similar to those of Python's
re module for regular
expressions. By default, nothing is captured, but the capture operator
(~) emits the substring of the matching expression, similar to
regular expression's capturing groups:
```python
>>> e = pe.compile(r'[0-9] [.] [0-9]')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
()
>>> e = pe.compile(r'~([0-9] [.] [0-9])')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
('1.4',)
```
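For readers coming from regular expressions, the same distinction between the whole match and captured values can be seen with the standard library's re module, where parentheses play the role that the capture operator plays above. This is a comparison sketch only; it uses re, not pe.

```python
import re

# Without a capturing group, only the whole match is available.
plain = re.match(r'[0-9][.][0-9]', '1.4')
print(plain.group())   # '1.4'
print(plain.groups())  # ()

# With a capturing group, the substring also appears in groups().
grouped = re.match(r'([0-9][.][0-9])', '1.4')
print(grouped.group())   # '1.4'
print(grouped.groups())  # ('1.4',)
```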
Value Bindings
A value binding extracts the emitted values of a match and associates
them with a name that is made available in the Match.groupdict()
dictionary. This is similar to named-capture groups in regular
expressions, except that it extracts the emitted values and not the
substring of the bound expression.
```python
>>> e = pe.compile(r'~[0-9] x:(~[.]) ~[0-9]')
>>> m = e.match('1.4')
>>> m.groups()
('1', '4')
>>> m.groupdict()
{'x': '.'}
```
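As a point of comparison, here is the nearest equivalent with named groups in the standard library's re module. Note the difference from the pe example above: in re, a named group still appears in groups(), whereas the bound value in the pe example is reported only through groupdict(). This sketch uses re only.

```python
import re

m = re.match(r'([0-9])(?P<x>[.])([0-9])', '1.4')
print(m.groups())     # ('1', '.', '4')  (the named group is included)
print(m.groupdict())  # {'x': '.'}
```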
Actions
Actions (also called "semantic actions") are callables that transform parse results. When an arbitrary function is given, it is called as follows:
```python
func(*match.groups(), **match.groupdict())
```
The result of this function call becomes the only emitted value going forward and all bound values are cleared.
For more control, pe provides the Action class and a number of subclasses for various use-cases. These actions have access to more information about a parse result and more control over the match. For example, the Pack class takes a function and calls it with the emitted values packed into a list:
```python
func(match.groups())
```
And the Join class joins all emitted strings with a separator:
```python
func(sep.join(match.groups()), **match.groupdict())
```
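The three calling conventions described above can be compared side by side with a small plain-Python model. The helper names `apply_function`, `apply_pack`, and `apply_join` are hypothetical and exist only for this illustration; they stand in for what the text says happens to the emitted values (`groups`) and bound values (`groupdict`), and are not pe's implementation.

```python
def apply_function(func, groups, groupdict):
    # Arbitrary function: emitted values become positional arguments,
    # bound values become keyword arguments.
    return func(*groups, **groupdict)


def apply_pack(func, groups):
    # Pack-style: all emitted values are passed as a single argument.
    return func(groups)


def apply_join(func, sep, groups, groupdict):
    # Join-style: emitted strings are joined first, then passed on.
    return func(sep.join(groups), **groupdict)


def describe(a, b, x=None):
    return f'{a}{x}{b}'


print(apply_function(describe, ('1', '4'), {'x': '.'}))  # 1.4
print(apply_pack(list, (5, 10, -15)))                    # [5, 10, -15]
print(apply_join(int, '', ('1', '2', '3'), {}))          # 123
```

The choice between them depends on whether the target function expects separate arguments, one collection, or one string.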
Auto-ignore
The grammar can be defined such that some rules ignore occurrences of a pattern between sequence items. Most commonly, this is used to ignore whitespace, so the default ignore pattern is simple whitespace.
```python
>>> pe.match("X <- 'a' 'b'", "a b")  # regular rule does not match
>>> pe.match("X <  'a' 'b'", "a b")  # auto-ignore rule matches
```
This feature can help to make grammars more readable.
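One way to picture auto-ignore is as an ignore pattern interleaved between the items of a sequence. The sketch below models only that idea using the standard library's re module. The helper `sequence_pattern` is hypothetical, and `\s*` is an assumption standing in for "simple whitespace"; pe's actual ignore pattern and its exact placement may differ.

```python
import re


def sequence_pattern(items, ignore=None):
    """Build a regex for a sequence of literal items.

    If `ignore` is given, it is placed between consecutive items.
    """
    parts = [re.escape(item) for item in items]
    joiner = f'(?:{ignore})' if ignore is not None else ''
    return re.compile(joiner.join(parts))


regular = sequence_pattern(['a', 'b'])                # like X <- 'a' 'b'
auto_ignore = sequence_pattern(['a', 'b'], r'\s*')    # like X <  'a' 'b'

print(regular.match('a b'))              # None
print(auto_ignore.match('a b').group())  # 'a b'
print(auto_ignore.match('ab').group())   # 'ab'
```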
Example
Here is one way to parse a list of comma-separated integers:
```python
>>> from pe.actions import Pack
>>> p = pe.compile(
...   r'''
...     Start  <- "[" Values? "]"
...     Values <- Int ("," Int)*
...     Int    <  ~( "-"? ("0" / [1-9] [0-9]*) )
...   ''',
...   actions={'Values': Pack(list), 'Int': int})
>>> m = p.match('[5, 10, -15]')
>>> m.value()
[5, 10, -15]
```
Owner
- Name: Michael Wayne Goodman
- Login: goodmami
- Kind: user
- Location: Oregon
- Website: http://www.goodmami.org
- Repositories: 49
- Profile: https://github.com/goodmami
Computational Linguist, Data Scientist
GitHub Events
Total
- Create event: 2
- Release event: 1
- Issues event: 12
- Watch event: 5
- Delete event: 5
- Issue comment event: 9
- Push event: 5
- Pull request event: 2
- Fork event: 1
Last Year
- Create event: 2
- Release event: 1
- Issues event: 12
- Watch event: 5
- Delete event: 5
- Issue comment event: 9
- Push event: 5
- Pull request event: 2
- Fork event: 1
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Michael Wayne Goodman | g****w@g****m | 233 |
| Tom | t****n@e****t | 1 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 36
- Total pull requests: 16
- Average time to close issues: 6 months
- Average time to close pull requests: 3 days
- Total issue authors: 4
- Total pull request authors: 3
- Average comments per issue: 0.69
- Average comments per pull request: 0.31
- Merged pull requests: 15
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 1
- Pull requests: 4
- Average time to close issues: N/A
- Average time to close pull requests: 1 day
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 2.0
- Average comments per pull request: 0.75
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- goodmami (35)
- Krzmbrzl (2)
- JesseTG (2)
- grizlupo (1)
- TomHodson (1)
- kirberich (1)
Pull Request Authors
- goodmami (18)
- dependabot[bot] (2)
- TomHodson (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 735 last month
- Total dependent packages: 0
- Total dependent repositories: 8
- Total versions: 11
- Total maintainers: 1
pypi.org: pe
Library for Parsing Expression Grammars (PEG)
- Homepage: https://github.com/goodmami/pe
- Documentation: https://github.com/goodmami/pe/blob/main/docs/README.md
- License: MIT License, Copyright (c) 2020 Michael Wayne Goodman
- Latest release: 0.6.0 (published 8 months ago)
Dependencies
- actions/checkout v3 composite
- actions/download-artifact v2 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- pypa/cibuildwheel v2.13.0 composite
- pypa/gh-action-pypi-publish release/v1 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite