https://github.com/bigbuildbench/rskmoi_namedivider-python

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: BigBuildBench
  • License: mit
  • Language: Python
  • Default Branch: master
  • Size: 55.7 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

namedivider-python

NameDivider is a tool for dividing a Japanese full name into a family name and a given name.

input: 菅義偉 -> output: 菅 義偉

NameDivider divides names using statistical information about the kanji used in them.
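To give a feel for what "statistical information of the kanji" can mean, here is a toy sketch. It is not NameDivider's actual algorithm: the score tables, the default value of 0.5 for unknown kanji, and the function `toy_divide` are all hypothetical placeholders made up for illustration. The idea is simply to try every split position and keep the one where the left-hand kanji look most like family-name kanji and the right-hand kanji look most like given-name kanji.

```python
# Toy illustration only: these scores are made-up placeholders,
# not NameDivider's real statistics or scoring formula.
FAMILY_SCORE = {"菅": 0.9, "義": 0.2, "偉": 0.1}
GIVEN_SCORE = {"菅": 0.1, "義": 0.8, "偉": 0.9}


def toy_divide(full_name: str, separator: str = " ") -> str:
    """Return the split whose family/given kanji scores sum highest."""
    if len(full_name) < 2:
        raise ValueError("need at least two characters to split")
    best_split, best_score = 1, float("-inf")
    for i in range(1, len(full_name)):
        family, given = full_name[:i], full_name[i:]
        # Unknown kanji get a neutral score of 0.5 (an arbitrary choice).
        score = sum(FAMILY_SCORE.get(c, 0.5) for c in family)
        score += sum(GIVEN_SCORE.get(c, 0.5) for c in given)
        if score > best_score:
            best_split, best_score = i, score
    return full_name[:best_split] + separator + full_name[best_split:]


print(toy_divide("菅義偉"))  # 菅 義偉
```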

Measured on a privately held dataset, the accuracy is 99.91%.

You can see how it works with this demo.

Documents

NameDivider(日本語)

Installation

pip install namedivider-python

Usage

It's simple to use.

```python
from namedivider import BasicNameDivider, GBDTNameDivider
from pprint import pprint

basic_divider = BasicNameDivider()  # BasicNameDivider is fast but accuracy is 99.2%
divided_name = basic_divider.divide_name("菅義偉")

gbdt_divider = GBDTNameDivider()  # GBDTNameDivider is slow but accuracy is 99.9%
divided_name = gbdt_divider.divide_name("菅義偉")

print(divided_name)
# 菅 義偉

pprint(divided_name.to_dict())
# {'algorithm': 'kanji_feature',
# 'family': '菅',
# 'given': '義偉',
# 'score': 0.7300634880343344,
# 'separator': ' '}
```

For more advanced features, see here.

NameDivider API

NameDivider API is a Docker container that provides a RESTful API for dividing the Japanese full name into a family name and a given name.

I am developing NameDivider API to make NameDivider available to users of languages other than Python.

Installation

docker pull rskmoi/namedivider-api

Usage

  • Run Docker Image

docker run -d --rm -p 8000:8000 rskmoi/namedivider-api

  • Send HTTP request

curl -X POST -H "Content-Type: application/json" -d '{"names":["竈門炭治郎", "竈門禰豆子"]}' localhost:8000/divide

  • Response

{ "divided_names": [ {"family":"竈門","given":"炭治郎","separator":" ","score":0.3004587452426102,"algorithm":"kanji_feature"}, {"family":"竈門","given":"禰豆子","separator":" ","score":0.30480429696983175,"algorithm":"kanji_feature"} ] }
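A client will usually want display strings rather than the raw JSON. The sketch below parses the response shown above and joins each family and given name with the separator the API returned. The helper name `format_divided_names` is an assumption for illustration and is not part of NameDivider API.

```python
import json
from typing import List

# Response body copied from the example above.
response_body = """
{ "divided_names": [
  {"family":"竈門","given":"炭治郎","separator":" ","score":0.3004587452426102,"algorithm":"kanji_feature"},
  {"family":"竈門","given":"禰豆子","separator":" ","score":0.30480429696983175,"algorithm":"kanji_feature"}
] }
"""


def format_divided_names(body: str) -> List[str]:
    """Join family and given names using the separator from the response."""
    payload = json.loads(body)
    return [
        item["family"] + item["separator"] + item["given"]
        for item in payload["divided_names"]
    ]


for name in format_divided_names(response_body):
    print(name)
# 竈門 炭治郎
# 竈門 禰豆子
```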

Notice

  • names is a list of undivided names. The maximum length of the list is 1000.
  • If you require speed or want to use GBDTNameDivider, please try v0.2.0-beta.
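Because a single request accepts at most 1000 names, a longer list has to be sent in batches. The following is a minimal sketch using only the standard library. It assumes the container is running on localhost:8000 as in the example above, and the helper names `chunk_names`, `build_request`, and `divide_all` are hypothetical.

```python
import json
import urllib.request
from typing import Iterator, List

MAX_NAMES_PER_REQUEST = 1000  # limit stated in the notice above
DIVIDE_URL = "http://localhost:8000/divide"  # assumes the docker run command above


def chunk_names(names: List[str], size: int = MAX_NAMES_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` names."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def build_request(names: List[str], url: str = DIVIDE_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    body = json.dumps({"names": names}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def divide_all(names: List[str], url: str = DIVIDE_URL) -> List[dict]:
    """Send every batch and collect the divided_names entries in order."""
    results: List[dict] = []
    for chunk in chunk_names(names):
        with urllib.request.urlopen(build_request(chunk, url)) as response:
            results.extend(json.load(response)["divided_names"])
    return results
```

Calling `divide_all(["竈門炭治郎", "竈門禰豆子"])` with the container running should return the two entries shown in the response above.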

CLI

Read namedivider/cli.py for more information.

$ nmdiv name 菅義偉
菅 義偉
$ nmdiv file undivided_names.txt
100%|███████████████████████████████████████████| 4/4 [00:00<00:00, 4194.30it/s]
原 敬
菅 義偉
阿部 晋三
中曽根 康弘
$ nmdiv accuracy divided_names.txt
100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 3673.41it/s]
0.8
True: 滝 登喜男, Pred: 滝登 喜男
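The accuracy output above can be read as "fraction of names divided correctly, followed by the mistakes". The sketch below shows one way such a check could be computed. It is an assumption about the general approach, not a copy of namedivider/cli.py. The function `measure_accuracy` and the canned stand-in divider are hypothetical, and the list of five names is assumed: only the 滝 登喜男 line comes from the output above.

```python
from typing import Callable, List, Tuple


def measure_accuracy(
    divided_names: List[str],
    divide: Callable[[str], str],
    separator: str = " ",
) -> Tuple[float, List[Tuple[str, str]]]:
    """Re-divide each correct name and compare it with the expected result."""
    mistakes: List[Tuple[str, str]] = []
    for expected in divided_names:
        undivided = expected.replace(separator, "")
        predicted = divide(undivided)
        if predicted != expected:
            mistakes.append((expected, predicted))
    total = len(divided_names)
    accuracy = (total - len(mistakes)) / total if total else 0.0
    return accuracy, mistakes


# Assumed file contents; only the last line appears in the CLI output above.
correct_names = ["原 敬", "菅 義偉", "阿部 晋三", "中曽根 康弘", "滝 登喜男"]

# Stand-in divider with canned predictions, used instead of a real divider
# so that the example runs without namedivider installed.
canned = {
    "原敬": "原 敬",
    "菅義偉": "菅 義偉",
    "阿部晋三": "阿部 晋三",
    "中曽根康弘": "中曽根 康弘",
    "滝登喜男": "滝登 喜男",
}

accuracy, mistakes = measure_accuracy(correct_names, canned.__getitem__)
print(accuracy)  # 0.8
for expected, predicted in mistakes:
    print(f"True: {expected}, Pred: {predicted}")
```

With the real library, the stand-in could presumably be replaced by a function that calls a divider's `divide_name` and converts the result to a string.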

License

Source code and gbdt_model_v1.txt

MIT License

bert_katakana_v0_3_0.pt

cc-by-sa-4.0

family_name_repository.pickle

  • English

(1) Purpose of use

family_name_repository.pickle is available for commercial/non-commercial use if you use this software to divide names and to develop algorithms for dividing names.

Any other use of family_name_repository.pickle is prohibited.

(2) Liability

The author or copyright holder assumes no responsibility for the software.

  • Japanese

(1) 利用目的

このソフトウェアを用いて姓名分割、および姓名分割アルゴリズムの開発をする場合、family_name_repository.pickleは商用/非商用問わず利用可能です。

それ以外の目的でのfamily_name_repository.pickleの利用を禁じます。

(2) 責任

作者または著作権者は、family_name_repository.pickleに関して一切の責任を負いません。

The family name data used in family_name_repository.pickle is provided by Myoji-Yurai.net (名字由来net).

Ongoing Projects

  • Porting Python to Rust

https://github.com/rskmoi/namedivider-rs

Owner

  • Name: BigBuildBench
  • Login: BigBuildBench
  • Kind: organization

abbr. B3, benchmarking the repo-level understanding capability of your LLMs by reconstructing project build-file.

GitHub Events

Total
  • Create event: 4
Last Year
  • Create event: 4

Dependencies

.github/workflows/python-package.yml actions
  • actions/checkout v3 composite
  • actions/checkout v4 composite
  • actions/setup-python v3 composite
  • actions/setup-python v5 composite
namedivider-api/Dockerfile docker
  • python 3.8 build
examples/demo/requirements.txt pypi
  • lightgbm *
  • namedivider-python *
  • requests *
  • streamlit *
namedivider-api/requirements.txt pypi
  • fastapi ==0.68.1
  • namedivider-python ==0.1.0rc1
  • uvicorn ==0.11.3
pyproject.toml pypi
requirements-test.txt pypi
  • black ==23.12.1 test
  • coverage ==7.6.1 test
  • mypy ==1.11.2 test
  • pandas-stubs * test
  • pytest >=7.1.3,<8.0.0 test
  • ruff ==0.6.8 test
  • types-regex * test
requirements.txt pypi