https://github.com/alexeyev/apertium2ud

Tag parser and converter between two tagsets: Apertium (enhanced Leipzig?) and the one used in Universal Dependencies (UD).

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary

Keywords

apertium morphology natural-language-processing universal-dependencies
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: alexeyev
  • License: gpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 72.3 KB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created almost 3 years ago · Last pushed 12 months ago

README.md

apertium2ud

A tool that builds the mapping between the Apertium and Universal Dependencies (UD) tagsets from the information on the Apertium Wiki and converts tags between the two.

Loosely based on this code, hence the GPLv3 license.

To install, run

    python -m pip install apertium2ud

The latest uploaded version is 0.0.8.
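To confirm which version ended up in the current environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is made up for illustration and is not part of apertium2ud.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if apertium2ud is installed, otherwise None.
    print(installed_version("apertium2ud"))
```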

NB!

  1. The tool is far from perfect.
  2. It was originally developed for working with apertium-kir, i.e. with the Kyrgyz language.
  3. The latest version on PyPI ships with the rules from the apertium-kir .udx file. For other languages, you may need to update the rules.

To build the machine-readable mapping, run

    python apertium_wiki_parser.py

Apertium to Universal tags

```python
>>> from apertium2ud.convert import a2ud
>>> tags = ["n", "pl", "acc"]
>>> a2ud(tags)
(['NOUN'], ['Number=Plur', 'Case=Acc'])
>>> tags_sophisticated = ["v", "tv", "ger", "nom", "cop", "aor", "p3", "pl"]
>>> a2ud(tags_sophisticated)
(['VERB', 'AUX'], ['Subcat=Tran', 'VerbForm=Vnoun', 'Case=Nom', 'Tense=Past', 'Person=3', 'Number=Plur'])
```
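The result shown above is a pair of lists (part-of-speech tags, features). If the goal is a CoNLL-U row, the pair could be turned into UPOS and FEATS column values roughly as follows. This is a sketch under assumptions: `to_conllu_fields` is a hypothetical helper, not part of apertium2ud, and taking the first POS tag when several are returned is an arbitrary choice made only for illustration. The feature ordering follows the CoNLL-U convention of sorting by feature name, case-insensitively.

```python
from typing import List, Tuple


def to_conllu_fields(converted: Tuple[List[str], List[str]]) -> Tuple[str, str]:
    """Turn a (pos_tags, features) pair into CoNLL-U UPOS and FEATS strings."""
    pos_tags, feats = converted
    # Assumption: keep only the first POS tag; "_" marks an empty column.
    upos = pos_tags[0] if pos_tags else "_"
    # Sort features by name (the part before "="), ignoring case.
    ordered = sorted(feats, key=lambda f: f.split("=", 1)[0].lower())
    feats_column = "|".join(ordered) if ordered else "_"
    return upos, feats_column


if __name__ == "__main__":
    # Input copied from the README example output.
    print(to_conllu_fields((["NOUN"], ["Number=Plur", "Case=Acc"])))
    # ('NOUN', 'Case=Acc|Number=Plur')
```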

Universal tags to Apertium

So far the conversion is far from perfect:

```
Кыз NOUN {'Number[psor]=Sing', 'Number=Sing', 'Case=Nom', 'Person[psor]=3', 'Person=3'} ->
досуна NOUN {'Number[psor]=Sing', 'Number=Sing', 'Person[psor]=3', 'Case=Dat', 'Person=3'} ->
кат NOUN {'Case=Nom', 'Person=3', 'Number=Sing'} ->
жазган VERB {'Aspect=Perf', 'Polarity=Pos', 'Number=Sing', 'Tense=Past', 'Person=3', 'Evident=Fh'} ->
. PUNCT set() ->
```
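The listing above pairs each word form with its UPOS tag and a set of `Name=Value` features. One way such triples could be read from a CoNLL-U token line using only the standard library is sketched below. `read_token` is a hypothetical helper, not part of apertium2ud, and the sample line leaves every column other than ID, FORM, UPOS and FEATS as `_` placeholders.

```python
from typing import Set, Tuple


def read_token(line: str) -> Tuple[str, str, Set[str]]:
    """Extract (form, upos, feature set) from one CoNLL-U token line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
    # CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    form, upos, feats = cols[1], cols[3], cols[5]
    # "_" marks an empty FEATS column, which becomes an empty set.
    feature_set = set() if feats == "_" else set(feats.split("|"))
    return form, upos, feature_set


if __name__ == "__main__":
    sample = "3\tкат\t_\tNOUN\t_\tCase=Nom|Number=Sing|Person=3\t_\t_\t_\t_"
    form, upos, feats = read_token(sample)
    print(form, upos, feats)
```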

TODO

  • Should sections chunks and XML tags be added? No.
  • Round-trip tests: Apertium -> UD -> Apertium and UD -> Apertium -> UD (some information loss is inevitable)
  • Allow rules to be loaded from a .udx file, which usually describes custom tags
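The round-trip tests in the list above could measure which tags fail to survive a conversion there and back. The sketch below is not the project's test suite: `round_trip_loss` and the toy mapping tables are invented for illustration, and the toy converters merely stand in for real Apertium -> UD and UD -> Apertium functions.

```python
from typing import Callable, Iterable, List, Set

Converter = Callable[[Iterable[str]], List[str]]


def round_trip_loss(tags: Iterable[str], forward: Converter, backward: Converter) -> Set[str]:
    """Return the tags that are missing after forward-then-backward conversion."""
    original = list(tags)
    restored = backward(forward(original))
    return set(original) - set(restored)


# Hypothetical toy mapping, used only to make the sketch runnable.
TOY_A2UD = {"n": "NOUN", "pl": "Number=Plur", "acc": "Case=Acc"}
TOY_UD2A = {ud: apertium for apertium, ud in TOY_A2UD.items()}


def toy_forward(tags: Iterable[str]) -> List[str]:
    # Tags without a mapping are silently dropped, which is where losses come from.
    return [TOY_A2UD[t] for t in tags if t in TOY_A2UD]


def toy_backward(ud_tags: Iterable[str]) -> List[str]:
    return [TOY_UD2A[t] for t in ud_tags if t in TOY_UD2A]


if __name__ == "__main__":
    # "tv" has no entry in the toy mapping, so it is reported as lost.
    print(round_trip_loss(["n", "pl", "acc", "tv"], toy_forward, toy_backward))
    # {'tv'}
```

Reporting the lost tags as a set, rather than a pass/fail flag, makes it possible to allow known, unavoidable losses while still catching new ones.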

How to cite

A citation is greatly appreciated if you use this work.

    @misc{apertium2ud2023alekseev,
      title = {{alexeyev/apertium2ud: mapping tagsets}},
      year  = {2023},
      url   = {https://github.com/alexeyev/apertium2ud}
    }

Owner

  • Name: Anton Alekseev
  • Login: alexeyev
  • Kind: user

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • alexeyev (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 49 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 8
  • Total maintainers: 1
pypi.org: apertium2ud

Converting universal tags to Apertium tags.

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 49 Last month
Rankings
Dependent packages count: 7.3%
Average: 27.8%
Forks count: 30.5%
Stargazers count: 32.5%
Dependent repos count: 40.9%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • apertium-streamparser >=5.0.2
  • conllu >=4.5.2
  • numpy >=1.24.3