rhoknp

Yet another Python binding for Juman++/KNP/KWJA

https://github.com/ku-nlp/rhoknp

Science Score: 52.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
    Organization ku-nlp has institutional domain (nlp.ist.i.kyoto-u.ac.jp)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

jumanpp knp kwja natural-language-processing nlp
Last synced: 6 months ago

Repository

Yet another Python binding for Juman++/KNP/KWJA

Basic Info
Statistics
  • Stars: 33
  • Watchers: 4
  • Forks: 4
  • Open Issues: 2
  • Releases: 33
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Citation Authors

README.md

rhoknp: Yet another Python binding for Juman++/KNP/KWJA


Documentation: https://rhoknp.readthedocs.io/en/latest/

Source Code: https://github.com/ku-nlp/rhoknp


rhoknp is a Python binding for Juman++, KNP, and KWJA.[^1]

[^1]: The logo was generated by OpenAI DALL·E 2.

```python
import rhoknp

# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence(
    "電気抵抗率は電気の通しにくさを表す物性値である。"
)

# Access the result
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...

# Save the result
with open("result.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the result
with open("result.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```

Requirements

  • Python 3.9+
  • (Optional) Juman++ v2.0.0-rc3+
  • (Optional) KNP 5.0+
  • (Optional) KWJA 1.0.0+
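
Because the three analyzers are optional, it can help to check which of them are reachable before calling them. The following is a minimal sketch using only the standard library. The executable names (`jumanpp`, `knp`, `kwja`) and the helper `check_environment` are assumptions made for illustration, and the sketch does not verify the minimum versions listed above.

```python
import shutil
import sys

# Executable names are assumptions; adjust them if your installation differs.
OPTIONAL_TOOLS = {"Juman++": "jumanpp", "KNP": "knp", "KWJA": "kwja"}


def check_environment(tools=OPTIONAL_TOOLS):
    """Report whether Python is 3.9+ and which optional tools are on PATH."""
    report = {"python_ok": sys.version_info >= (3, 9)}
    for label, executable in tools.items():
        # shutil.which returns None when the executable is not found on PATH.
        report[label] = shutil.which(executable) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

A missing tool only matters when you want to run that analyzer; loading previously saved analysis results does not require it.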

Installation

```shell
pip install rhoknp
```

Quick tour

Let's begin by using Juman++ with rhoknp. Here, we present a simple example demonstrating how Juman++ can be used to analyze a sentence.

```python
# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```

You can easily access the individual morphemes that make up the sentence.

```python
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```

Sentence objects can be saved in the JUMAN format.

```python
# Save the sentence in the JUMAN format
with open("sentence.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the sentence
with open("sentence.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```
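
To give a feel for what such a file contains, here is a small standalone sketch that reads JUMAN-format text without rhoknp. It assumes the usual layout of one morpheme per line with twelve space-separated fields and a closing `EOS` line. The sample text is hand-written for illustration, and `MorphemeLine` and `parse_juman_lines` are hypothetical names, not part of rhoknp. Edge cases such as a literal `#` token or escaped spaces are ignored.

```python
from typing import NamedTuple


class MorphemeLine(NamedTuple):
    surface: str
    reading: str
    lemma: str
    pos: str
    subpos: str
    conjtype: str
    conjform: str
    semantics: str


def parse_juman_lines(text: str) -> list[MorphemeLine]:
    """Parse the morpheme lines of one sentence, stopping at EOS."""
    morphemes = []
    for line in text.splitlines():
        if line == "EOS":
            break
        if not line or line.startswith("#") or line.startswith("@ "):
            continue  # skip blanks, comments, and alternative analyses
        f = line.split(" ", 11)
        # Fields 4, 6, 8, and 10 are numeric IDs; this sketch drops them.
        morphemes.append(MorphemeLine(f[0], f[1], f[2], f[3], f[5], f[7], f[9], f[11]))
    return morphemes


# Hand-written sample in the style of Juman++ output (illustrative only).
SAMPLE = """\
# S-ID:1
電気 でんき 電気 名詞 6 普通名詞 1 * 0 * 0 "代表表記:電気/でんき カテゴリ:抽象物"
の の の 助詞 9 接続助詞 3 * 0 * 0 NIL
EOS
"""

for m in parse_juman_lines(SAMPLE):
    print(m.surface, m.pos, m.subpos)
```

In practice `rhoknp.Sentence.from_jumanpp` does this work for you; the sketch is only meant to show the shape of the data.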

Nearly the same API is available for KNP.

```python
# Perform language analysis by KNP
knp = rhoknp.KNP()
sentence = knp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```

KNP performs language analysis at multiple levels.

```python
for clause in sentence.clauses:  # a.k.a. setsu
    ...
for phrase in sentence.phrases:  # a.k.a. bunsetsu
    ...
for base_phrase in sentence.base_phrases:  # a.k.a. kihon-ku
    ...
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```

Sentence objects can be saved in the KNP format.

```python
# Save the sentence in the KNP format
with open("sentence.knp", "wt") as f:
    f.write(sentence.to_knp())

# Load the sentence
with open("sentence.knp", "rt") as f:
    sentence = rhoknp.Sentence.from_knp(f.read())
```
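
The KNP format layers the units above onto the morpheme lines: a line beginning with `*` opens a phrase (bunsetsu), a line beginning with `+` opens a base phrase, and `EOS` closes the sentence. The sketch below, independent of rhoknp, groups morpheme surfaces by phrase to illustrate that layout. The sample is hand-written with heavily abbreviated features, and `group_by_phrase` is a hypothetical helper, not a rhoknp function.

```python
import re

# Header lines look like "* 1D <...>" (phrase) or "+ 1D <...>" (base phrase).
PHRASE_RE = re.compile(r"^\* -?\d+[DPIA]")
BASE_PHRASE_RE = re.compile(r"^\+ -?\d+[DPIA]")


def group_by_phrase(knp_text: str) -> list[str]:
    """Return the surface string of each phrase in one KNP-format sentence."""
    phrases: list[list[str]] = []
    for line in knp_text.splitlines():
        if line == "EOS":
            break
        if line.startswith("#") or BASE_PHRASE_RE.match(line):
            continue  # skip comments and base-phrase headers in this sketch
        if PHRASE_RE.match(line):
            phrases.append([])  # a new phrase starts here
        elif phrases:
            phrases[-1].append(line.split(" ", 1)[0])  # first field is the surface
    return ["".join(p) for p in phrases]


# Hand-written sample in the style of KNP output (illustrative only).
SAMPLE = """\
# S-ID:1
* 1D <体言>
+ 1D <体言>
電気 でんき 電気 名詞 6 普通名詞 1 * 0 * 0 NIL
を を を 助詞 9 格助詞 1 * 0 * 0 NIL
* -1D <用言:動>
+ -1D <用言:動>
通す とおす 通す 動詞 2 * 0 子音動詞サ行 5 基本形 2 NIL
。 。 。 特殊 1 句点 1 * 0 * 0 NIL
EOS
"""

print(group_by_phrase(SAMPLE))
```

The number after `*` or `+` is the index of the unit's head, with `-1` marking the root; with rhoknp you would read the same structure from `sentence.phrases` rather than parsing it by hand.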

Furthermore, rhoknp provides convenient APIs for document-level language analysis.

```python
document = rhoknp.Document.from_raw_text(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)

# If you know sentence boundaries, you can use `Document.from_sentences` instead.
document = rhoknp.Document.from_sentences(
    [
        "電気抵抗率は電気の通しにくさを表す物性値である。",
        "単に抵抗率とも呼ばれる。",
    ]
)
```
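
If your text does not come with sentence boundaries, one simple way to obtain them is to split after sentence-final punctuation. The following sketch is an assumption for illustration, not the segmentation rhoknp itself applies: `split_sentences` is a hypothetical helper, and it does not handle quotations or parentheses.

```python
import re

# Zero-width split point right after a Japanese full stop, "！", or "？".
SENTENCE_END = re.compile(r"(?<=[。！？])")


def split_sentences(text: str) -> list[str]:
    """Naively split raw Japanese text into sentences, keeping the punctuation."""
    return [s for s in SENTENCE_END.split(text) if s.strip()]


sentences = split_sentences(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
print(sentences)
```

The resulting list of strings has the shape that `rhoknp.Document.from_sentences` accepts in the example above.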

Document objects can be handled in the same manner as Sentence objects.

```python
# Perform morphological analysis by Juman++
document = jumanpp.apply_to_document(document)

# Access language units in the document
for sentence in document.sentences:
    ...
for morpheme in document.morphemes:
    ...

# Save language analysis by Juman++
with open("document.jumanpp", "wt") as f:
    f.write(document.to_jumanpp())

# Load language analysis by Juman++
with open("document.jumanpp", "rt") as f:
    document = rhoknp.Document.from_jumanpp(f.read())
```

For more information, please refer to the examples and the documentation (https://rhoknp.readthedocs.io/en/latest/).

Main differences from pyknp

pyknp serves as the official Python binding for Juman++ and KNP. In the development of rhoknp, we redesigned the API, considering the current use cases of pyknp. The key differences between the two are as follows:

  • Support for document-level language analysis: rhoknp allows you to load and instantiate the results of document-level language analysis, including cohesion analysis and discourse relation analysis.
  • Strict type-awareness: rhoknp has been thoroughly annotated with type annotations, ensuring strict type checking and improved code clarity.
  • Comprehensive test suite: rhoknp is covered by an extensive test suite. You can view the code coverage report on Codecov.

License

MIT

Contributing

We warmly welcome contributions to rhoknp. You can get started by reading the contribution guide.

Owner

  • Name: Language Media Processing Lab, Kyoto University
  • Login: ku-nlp
  • Kind: organization
  • Location: Kyoto, Japan

We are working on making NLP better

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "rhoknp: Yet another Python binding for Juman++/KNP/KWJA"
authors:
  - family-names: Kiyomaru
    given-names: Hirokazu
  - family-names: Ueda
    given-names: Nobuhiro
version: 1.6.0
repository-code: "https://github.com/ku-nlp/rhoknp"
date-released: 2023-11-08

GitHub Events

Total
  • Create event: 9
  • Issues event: 1
  • Release event: 1
  • Watch event: 2
  • Delete event: 12
  • Issue comment event: 6
  • Push event: 53
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 36
  • Fork event: 1
Last Year
  • Create event: 9
  • Issues event: 1
  • Release event: 1
  • Watch event: 2
  • Delete event: 12
  • Issue comment event: 6
  • Push event: 53
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 36
  • Fork event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 35
  • Total pull requests: 190
  • Average time to close issues: 2 months
  • Average time to close pull requests: 23 days
  • Total issue authors: 7
  • Total pull request authors: 5
  • Average comments per issue: 0.63
  • Average comments per pull request: 1.07
  • Merged pull requests: 125
  • Bot issues: 0
  • Bot pull requests: 110
Past Year
  • Issues: 1
  • Pull requests: 36
  • Average time to close issues: N/A
  • Average time to close pull requests: 22 days
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.44
  • Merged pull requests: 21
  • Bot issues: 0
  • Bot pull requests: 30
Top Authors
Issue Authors
  • nobu-g (14)
  • hkiyomaru (10)
  • murawaki (4)
  • omukazu (2)
  • conan1024hao (2)
  • YasuOhara (1)
  • cromz22 (1)
Pull Request Authors
  • dependabot[bot] (110)
  • hkiyomaru (43)
  • nobu-g (35)
  • pre-commit-ci[bot] (9)
  • omukazu (1)
Top Labels
Issue Labels
enhancement (3) bug (1)
Pull Request Labels
dependencies (110) python (79) github_actions (19) python:uv (10)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 34,968 last-month
  • Total docker downloads: 203,237
  • Total dependent packages: 8
  • Total dependent repositories: 4
  • Total versions: 33
  • Total maintainers: 1
pypi.org: rhoknp

Yet another Python binding for Juman++/KNP/KWJA

  • Versions: 33
  • Dependent Packages: 8
  • Dependent Repositories: 4
  • Downloads: 34,968 Last month
  • Docker Downloads: 203,237
Rankings
Docker downloads count: 1.0%
Dependent packages count: 1.1%
Downloads: 2.5%
Average: 6.8%
Dependent repos count: 7.5%
Stargazers count: 12.0%
Forks count: 16.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
  • furo *
  • sphinx *
  • sphinx-copybutton *
  • sphinx-prompt *
poetry.lock pypi
  • alabaster 0.7.12 develop
  • appnope 0.1.3 develop
  • asttokens 2.0.5 develop
  • atomicwrites 1.4.0 develop
  • attrs 21.4.0 develop
  • babel 2.10.1 develop
  • backcall 0.2.0 develop
  • beautifulsoup4 4.11.1 develop
  • certifi 2022.5.18.1 develop
  • charset-normalizer 2.0.12 develop
  • colorama 0.4.4 develop
  • coverage 6.4.1 develop
  • decorator 5.1.1 develop
  • docutils 0.17.1 develop
  • executing 0.8.3 develop
  • filelock 3.7.1 develop
  • flake8 4.0.1 develop
  • furo 2022.4.7 develop
  • idna 3.3 develop
  • imagesize 1.3.0 develop
  • importlib-metadata 4.11.4 develop
  • iniconfig 1.1.1 develop
  • ipdb 0.13.9 develop
  • ipython 8.4.0 develop
  • jedi 0.18.1 develop
  • jinja2 3.1.2 develop
  • markupsafe 2.1.1 develop
  • matplotlib-inline 0.1.3 develop
  • mccabe 0.6.1 develop
  • mypy 0.960 develop
  • mypy-extensions 0.4.3 develop
  • packaging 21.3 develop
  • parso 0.8.3 develop
  • pexpect 4.8.0 develop
  • pickleshare 0.7.5 develop
  • pluggy 1.0.0 develop
  • prompt-toolkit 3.0.29 develop
  • ptyprocess 0.7.0 develop
  • pure-eval 0.2.2 develop
  • py 1.11.0 develop
  • pycodestyle 2.8.0 develop
  • pyflakes 2.4.0 develop
  • pygments 2.12.0 develop
  • pyparsing 3.0.9 develop
  • pytest 7.1.2 develop
  • pytest-cov 3.0.0 develop
  • pytest-mypy 0.9.1 develop
  • pytz 2022.1 develop
  • requests 2.27.1 develop
  • six 1.16.0 develop
  • snowballstemmer 2.2.0 develop
  • soupsieve 2.3.2.post1 develop
  • sphinx 4.5.0 develop
  • sphinx-copybutton 0.5.0 develop
  • sphinx-prompt 1.5.0 develop
  • sphinxcontrib-applehelp 1.0.2 develop
  • sphinxcontrib-devhelp 1.0.2 develop
  • sphinxcontrib-htmlhelp 2.0.0 develop
  • sphinxcontrib-jsmath 1.0.1 develop
  • sphinxcontrib-qthelp 1.0.3 develop
  • sphinxcontrib-serializinghtml 1.1.5 develop
  • stack-data 0.2.0 develop
  • toml 0.10.2 develop
  • tomli 2.0.1 develop
  • traitlets 5.2.2.post1 develop
  • typing-extensions 4.2.0 develop
  • urllib3 1.26.9 develop
  • wcwidth 0.2.5 develop
  • zipp 3.8.0 develop
pyproject.toml pypi
  • Sphinx ^4.2.0 develop
  • flake8 ^4.0.1 develop
  • furo ^2022.4.7 develop
  • ipdb ^0.13.9 develop
  • pytest ^7.1.2 develop
  • pytest-cov ^3.0.0 develop
  • pytest-mypy ^0.9.1 develop
  • sphinx-copybutton ^0.5.0 develop
  • sphinx-prompt ^1.5.0 develop
  • python >=3.9,<3.11
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/lint.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/release.yml actions
  • actions/checkout v3 composite
  • actions/create-release v1 composite
.github/workflows/test-example.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/test.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • codecov/codecov-action v3 composite
.github/workflows/build.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
.github/workflows/dependabot-auto-merge.yml actions
  • dependabot/fetch-metadata v1 composite
  • lewagon/wait-on-check-action v1.3.1 composite