rfc3987-syntax
Helper functions to syntactically validate strings according to RFC 3987.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.6%) to scientific vocabulary
Repository
Statistics
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 3
- Releases: 1
Metadata Files
README.md
rfc3987-syntax
Helper functions to parse and validate the syntax of terms defined in RFC 3987 — the IETF standard for Internationalized Resource Identifiers (IRIs).
🎯 Purpose
The goal of rfc3987-syntax is to provide a lightweight, permissively licensed Python module for validating that strings conform to the ABNF grammar defined in RFC 3987. These helpers are:
- ✅ Strictly aligned with the syntax rules of RFC 3987
- ✅ Built using a permissive MIT license
- ✅ Designed for both open source and proprietary use
- ✅ Powered by Lark, a fast, EBNF-based parser
🧠 Note: This project focuses on syntax validation only. RFC 3987 specifies additional semantic rules (e.g., Unicode normalization, BiDi constraints, percent-encoding requirements) that must be enforced separately.
📄 License, Attribution, and Citation
rfc3987-syntax is licensed under the MIT License, which allows reuse in both open source and commercial software.
This project:
- ❌ Does not depend on the `rfc3987` Python package (GPL-licensed)
- ✅ Uses `lark`, licensed under MIT
- ✅ Implements grammar from RFC 3987, using RFC 3986 where RFC 3987 delegates syntax
⚠️ This project is not affiliated with or endorsed by the authors of RFC 3987 or the `rfc3987` Python package.
Please cite this software in accordance with the enclosed CITATION.cff file.
⚠️ Limitations
The grammar and parser enforce only the ABNF syntax defined in RFC 3987. The following are not validated and must be handled separately for full compliance:
- ✅ Unicode Normalization Form C (NFC)
- ✅ Bidirectional text (BiDi) constraints (RFC 3987 §4.1)
- ✅ Port number ranges (must be 0–65535)
- ✅ Valid IPv6 compression (only one `::`, max segments)
- ✅ Context-aware percent-encoding requirements
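Some of the checks listed above can be layered on top of syntax validation with the Python standard library alone. The following is a minimal sketch, not part of rfc3987-syntax: the helper names `is_nfc`, `is_valid_port`, and `is_valid_ipv6` are hypothetical, and the caller is assumed to have already extracted the port and the bracketed IPv6 text from the IRI. BiDi constraints and context-aware percent-encoding are not covered here.

```python
import ipaddress
import unicodedata


def is_nfc(value: str) -> bool:
    """Return True if the string is already in Unicode Normalization Form C."""
    # unicodedata.is_normalized is available from Python 3.8.
    return unicodedata.is_normalized("NFC", value)


def is_valid_port(port: str) -> bool:
    """Return True for an empty port or an ASCII decimal value in 0-65535."""
    if port == "":
        return True  # RFC 3986 defines port as *DIGIT, so it may be empty
    return port.isascii() and port.isdigit() and int(port) <= 65535


def is_valid_ipv6(address: str) -> bool:
    """Return True if the text between '[' and ']' parses as an IPv6 address."""
    try:
        # Rejects a repeated '::' and addresses with too many groups.
        ipaddress.IPv6Address(address)
    except ValueError:
        return False
    return True
```

Note that `ipaddress.IPv6Address` is more lenient than the RFC 3986 grammar in some respects (recent Python versions accept a zone identifier such as `%eth0`), so it complements the syntax check rather than replacing it.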
ChatGPT 4o was used during the original development process. Errors may exist due to this assistance. Additional review, testing, and bug fixes by human experts are welcome.
📦 Installation
```bash
pip install rfc3987-syntax
```
🛠 Usage
List all supported "terms" (i.e., non-terminals and terminals within ABNF production rules) used to validate the syntax of an IRI according to RFC 3987
```python
from rfc3987_syntax import RFC3987_SYNTAX_TERMS

# Print every grammar term that can be passed to the validators.
print("Supported terms:")
for term in RFC3987_SYNTAX_TERMS:
    print(term)
```
Syntactically validate a string using the general-purpose validator
```python
from rfc3987_syntax import is_valid_syntax

if is_valid_syntax(term='iri', value='http://github.com'):
    print("✓ Valid IRI syntax")

# 'bob' has no scheme, so it is not a valid IRI...
if not is_valid_syntax(term='iri', value='bob'):
    print("✗ Invalid IRI syntax")

# ...but it is a valid (relative) IRI-reference.
if is_valid_syntax(term='iri_reference', value='bob'):
    print("✓ Valid IRI-reference syntax")
```
Alternatively, use term-specific helpers to validate RFC 3987 syntax.
```python
from rfc3987_syntax import is_valid_syntax_iri
from rfc3987_syntax import is_valid_syntax_iri_reference

if is_valid_syntax_iri('http://github.com'):
    print("✓ Valid IRI syntax")

if not is_valid_syntax_iri('bob'):
    print("✗ Invalid IRI syntax")

if is_valid_syntax_iri_reference('bob'):
    print("✓ Valid IRI-reference syntax")
```
Get the Lark parse tree for a syntax validation (useful for additional semantic validation)
```python
from lark import ParseTree  # type used in the annotation below
from rfc3987_syntax import parse

ptree: ParseTree = parse(term="iri", value="http://github.com")

print(ptree)
```
📚 Sources
This grammar was derived from:
[RFC 3987 – Internationalized Resource Identifiers (IRIs)]
→ Defines IRI syntax and extensions to URI (e.g. Unicode characters, `ucschar`)
→ https://datatracker.ietf.org/doc/html/rfc3987
[RFC 3986 – Uniform Resource Identifier (URI): Generic Syntax]
→ Provides reusable components like `scheme`, `authority`, `ipv4address`, etc.
→ https://datatracker.ietf.org/doc/html/rfc3986
📝 When `RFC 3986` is listed as the source, it is used in accordance with RFC 3987, which explicitly references it for foundational elements.
Rule-to-Source Mapping
| Rule/Component | Source | Notes |
|----------------------|------------|-------|
| iri | RFC 3987 | Top-level IRI rule |
| iri_reference | RFC 3987 | Top-level IRI Reference rule |
| absolute_iri | RFC 3987 | Top-level Absolute IRI rule |
| scheme | RFC 3986 | Referenced by RFC 3987 §2.2 |
| ihier_part | RFC 3987 | IRI-specific hierarchy |
| irelative_ref | RFC 3987 | IRI-specific relative ref |
| irelative_part | RFC 3987 | IRI-specific relative part |
| iauthority | RFC 3986 | Standard URI authority |
| ipath_abempty | RFC 3986 | Path format variant |
| ipath_absolute | RFC 3986 | Absolute path |
| ipath_noscheme | RFC 3986 | Path disallowing scheme prefix |
| ipath_rootless | RFC 3986 | Used in non-scheme contexts |
| iquery | RFC 3987 | Query extension to URI |
| ifragment | RFC 3987 | Fragment extension to URI |
| ipchar, isegment | RFC 3986 | Path characters and segments |
| isegment_nz_nc | RFC 3987 | IRI-specific path constraint |
| iunreserved | RFC 3987 | Includes ucschar |
| ucschar, iprivate | RFC 3987 | Unicode support |
| sub_delims | RFC 3986 | Reserved characters |
| ip_literal | RFC 3986 | IPv6 or IPvFuture in [] |
| ipv6address | RFC 3986 | Expanded forms only |
| ipvfuture | RFC 3986 | Forward-compatible |
| ipv4address | RFC 3986 | Dotted-decimal IPv4 |
| ls32 | RFC 3986 | Final 32 bits of IPv6 |
| h16, dec_octet | RFC 3986 | Hex and decimal chunks |
| port | RFC 3986 | Optional numeric |
| pct_encoded | RFC 3986 | Percent encoding (e.g. %20) |
| alpha, digit, hexdig | RFC 3986 | Character classes |
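
To make a few rows of the table concrete, here is a small sketch of how `pct_encoded`, `dec_octet`, and `ipv4address` could be written as regular expressions derived from the RFC 3986 ABNF. This is illustrative only: the project itself uses a Lark grammar, and the names `PCT_ENCODED`, `DEC_OCTET`, `IPV4ADDRESS`, and `matches` are assumptions made for this example.

```python
import re

# RFC 3986: pct-encoded = "%" HEXDIG HEXDIG
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")

# RFC 3986: dec-octet = DIGIT / %x31-39 DIGIT / "1" 2DIGIT
#                     / "2" %x30-34 DIGIT / "25" %x30-35
# Longer alternatives come first so that 0-255 is matched without leading zeros.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"

# RFC 3986: IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
IPV4ADDRESS = re.compile(rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}")


def matches(pattern: re.Pattern, value: str) -> bool:
    """Return True only if the whole string matches the pattern."""
    return pattern.fullmatch(value) is not None


print(matches(PCT_ENCODED, "%20"))          # percent-encoded space
print(matches(IPV4ADDRESS, "192.168.0.1"))  # dotted-decimal IPv4
print(matches(IPV4ADDRESS, "256.0.0.1"))    # first octet exceeds 255
```

Using `fullmatch` rather than `match` matters here: a prefix match would accept strings such as `%20xyz`.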
Owner
- Name: Will Riley
- Login: willynilly
- Kind: user
- Location: Arnhem, The Netherlands
- Company: Wageningen University & Research
- Website: http://willriley.net
- Repositories: 36
- Profile: https://github.com/willynilly
Ph.D. in Educational Psychology (Applied Cognition and Development) from University of Georgia
Citation (CITATION.cff)
cff-version: 1.2.0
title: rfc3987-syntax
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Will
    family-names: Riley
    email: wanderingwill@gmail.com
    orcid: "https://orcid.org/0000-0003-1822-6756"
  - given-names: Jan
    family-names: Kowalleck
repository-code: >-
  https://github.com/willynilly/rfc3987-syntax
abstract: >-
  Helper functions to syntactically validate strings according to RFC 3987
keywords:
  - RFC 3987
  - RFC3987
  - validator
  - syntax
  - parser
license: MIT
version: "1.1.0"
date-released: "2025-07-18"
references:
  - title: "abnf-to-regexp"
    type: software
    version: "1.1.3"
    license: MIT
    authors:
      - given-names: Marko
        family-names: Ristin
        email: marko@ristin.ch
        orcid: ""
      - given-names: Oliver Steensen-Bech
        family-names: Haagh
        email: oliver@dmc.international
        orcid: ""
      - given-names: Sebastian
        family-names: Heppner
        email: s.heppner@iat.rwth-aachen.de
        orcid: ""
    repository-code: https://github.com/aas-core-works/abnf-to-regexp
  - title: "lark"
    type: software
    version: 1.2.2
    license: MIT
    authors:
      - family-names: Shinan
        given-names: Erez
        email: erezshin@gmail.com
    repository-code: https://github.com/lark-parser/lark
  - title: "Internationalized Resource Identifiers (IRIs)"
    authors:
      - family-names: Dürst
        given-names: Martin
      - family-names: Suignard
        given-names: Michel
    date-released: 2005-01-01
    doi: "10.17487/RFC3987"
    url: "https://www.rfc-editor.org/info/rfc3987"
    type: standard
  - title: "ChatGPT"
    authors:
      - name: OpenAI
    type: software
    version: "GPT-4o"
    url: "https://chat.openai.com/chat"
GitHub Events
Total
- Create event: 2
- Issues event: 2
- Release event: 2
- Watch event: 1
- Delete event: 1
- Issue comment event: 3
- Push event: 7
- Pull request review event: 1
- Pull request event: 5
Last Year
- Create event: 2
- Issues event: 2
- Release event: 2
- Watch event: 1
- Delete event: 1
- Issue comment event: 3
- Push event: 7
- Pull request review event: 1
- Pull request event: 5
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jkowalleck (1)
Pull Request Authors
- jkowalleck (4)
Packages
- Total packages: 1
- Total downloads:
  - pypi 20,880,293 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
pypi.org: rfc3987-syntax
Helper functions to syntactically validate strings according to RFC 3987.
- Homepage: https://github.com/willynilly/rfc3987-syntax
- Documentation: https://github.com/willynilly/rfc3987-syntax#readme
- License: Apache Software License
- Latest release: 1.1.0 (published 7 months ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
- lark >=1.2.2