dict-from-pypinyin
Repository
Basic Info
- Host: GitHub
- Owner: stefantaubert
- License: mit
- Language: Python
- Default Branch: master
- Size: 173 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
dict-from-pypinyin
Command-line interface (CLI) for creating a pronunciation dictionary by looking up pinyin transcriptions with pypinyin. It can optionally ignore punctuation and split words on hyphens before transcribing them.
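As a rough illustration of what "ignoring punctuation" and "splitting words on hyphens" could mean for a single vocabulary entry, here is a minimal sketch. The helper `split_word` and the `PUNCTUATION` set are hypothetical names chosen for this example; they are not taken from the tool's source, and the actual implementation may differ.

```python
from typing import List

# Assumed punctuation set for illustration only; the real CLI may use another.
PUNCTUATION = set("?,.!。『』、")


def split_word(word: str, split_on_hyphen: bool = True) -> List[str]:
    """Separate leading/trailing punctuation and optionally split on hyphens.

    Hypothetical helper: returns the symbols that would be transcribed
    (word parts) or passed through unchanged (punctuation, hyphens).
    """
    start, end = 0, len(word)
    # Strip punctuation from the left and the right of the word.
    while start < end and word[start] in PUNCTUATION:
        start += 1
    while end > start and word[end - 1] in PUNCTUATION:
        end -= 1

    leading = list(word[:start])
    trailing = list(word[end:])
    core = word[start:end]

    parts: List[str] = []
    if split_on_hyphen:
        # Keep the hyphen itself as a separate symbol between the parts.
        for index, piece in enumerate(core.split("-")):
            if index > 0:
                parts.append("-")
            if piece:
                parts.append(piece)
    elif core:
        parts.append(core)

    return leading + parts + trailing


print(split_word("『机具-机呀?"))
```

Each returned word part would then be looked up separately, while punctuation and hyphens are carried over into the pronunciation as they are.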
Installation
```sh
pip install dict-from-pypinyin --user
```
Usage
```sh
dict-from-pypinyin-cli
```
Example
```sh
# Create example vocabulary
cat > /tmp/vocabulary.txt << EOF
社会语言学?
㐻,
『㑐
鲜-亮。
『占斌?
『机具-机呀?
EOF

# Create dictionary from vocabulary
dict-from-pypinyin-cli \
  /tmp/vocabulary.txt \
  /tmp/result.dict \
  --split-on-hyphen

cat /tmp/result.dict
```
Output:
```txt
社会语言学? shè huì yǔ yán xué ?
社会语言学? shè huì yǔ yàn xué ?
社会语言学? shè huì yǔ yín xué ?
社会语言学? shè huì yù yán xué ?
社会语言学? shè huì yù yàn xué ?
社会语言学? shè huì yù yín xué ?
社会语言学? shè kuài yǔ yán xué ?
社会语言学? shè kuài yǔ yàn xué ?
社会语言学? shè kuài yǔ yín xué ?
社会语言学? shè kuài yù yán xué ?
社会语言学? shè kuài yù yàn xué ?
社会语言学? shè kuài yù yín xué ?
㐻, nèi ,
『㑐 『 shū
鲜-亮。 xiān - liàng 。
鲜-亮。 xiān - liáng 。
鲜-亮。 xiǎn - liàng 。
鲜-亮。 xiǎn - liáng 。
『占斌? 『 zhàn bīn ?
『占斌? 『 zhān bīn ?
『占斌? 『 tiē bīn ?
『机具-机呀? 『 jī jù - jī ya ?
『机具-机呀? 『 jī jù - jī yā ?
『机具-机呀? 『 jī jù - jī xiā ?
『机具-机呀? 『 jī jù - wèi ya ?
『机具-机呀? 『 jī jù - wèi yā ?
『机具-机呀? 『 jī jù - wèi xiā ?
『机具-机呀? 『 wèi jù - jī ya ?
『机具-机呀? 『 wèi jù - jī yā ?
『机具-机呀? 『 wèi jù - jī xiā ?
『机具-机呀? 『 wèi jù - wèi ya ?
『机具-机呀? 『 wèi jù - wèi yā ?
『机具-机呀? 『 wèi jù - wèi xiā ?
```
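The number of entries per word follows from combining every candidate reading of each character. For 社会语言学, the output lists two readings for 会, two for 语 and three for 言, which gives 2 × 2 × 3 = 12 pronunciations. The sketch below reproduces this for 鲜-亮。 using `itertools.product`. The `READINGS` table is copied from the example output instead of being queried from pypinyin, and `expand` is a hypothetical helper; whether the tool builds the combinations this way internally is an assumption.

```python
from itertools import product
from typing import Dict, Iterator, List

# Candidate readings taken from the example output; the real tool obtains
# them from pypinyin rather than from a hard-coded table.
READINGS: Dict[str, List[str]] = {
    "鲜": ["xiān", "xiǎn"],
    "亮": ["liàng", "liáng"],
}


def expand(symbols: List[str]) -> Iterator[str]:
    """Yield one pronunciation per combination of candidate readings.

    Symbols without an entry in READINGS (hyphens, punctuation) are
    passed through unchanged.
    """
    options = [READINGS.get(symbol, [symbol]) for symbol in symbols]
    for combination in product(*options):
        yield " ".join(combination)


for pronunciation in expand(["鲜", "-", "亮", "。"]):
    print("鲜-亮。", pronunciation)
```

The loop prints four lines, in the same order as the four 鲜-亮。 entries of the example output.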
Development setup
```sh
# update
sudo apt update
# install Python 3.8, 3.9, 3.10, 3.11 & 3.12 for ensuring that tests can be run
sudo apt install python3-pip \
  python3.8 python3.8-dev python3.8-distutils python3.8-venv \
  python3.9 python3.9-dev python3.9-distutils python3.9-venv \
  python3.10 python3.10-dev python3.10-distutils python3.10-venv \
  python3.11 python3.11-dev python3.11-distutils python3.11-venv \
  python3.12 python3.12-dev python3.12-distutils python3.12-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user

# check out repo
git clone https://github.com/stefantaubert/dict-from-pypinyin.git
cd dict-from-pypinyin
# create virtual environment
python3.8 -m pipenv install --dev
```
Running the tests
```sh
# first install the tool like in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd dict-from-pypinyin
# activate environment
python3.8 -m pipenv shell
# run tests
tox
```
Final lines of test result output:
```log
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
py312: commands succeeded
congratulations :)
```
License
MIT License
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
Citation
If you want to cite this repo, you can use the citation entry generated by GitHub (see About => Cite this repository), which is available in APA and BibTeX format. The APA entry is:
```txt
Taubert, S. (2024). dict-from-pypinyin (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.10554720
```
Owner
- Name: Stefan Taubert
- Login: stefantaubert
- Kind: user
- Location: Chemnitz, Germany
- Company: Chemnitz University of Technology
- Website: https://stefantaubert.com
- Twitter: Stefan_Taubert
- Repositories: 75
- Profile: https://github.com/stefantaubert
I am currently working on my PhD on speech synthesis at Chemnitz University of Technology.
Citation (CITATION.cff)
cff-version: 1.2.0
title: dict-from-pypinyin
abstract: Command-line interface (CLI) to create a pronunciation dictionary by looking up pinyin transcriptions using pypinyin including the possibility of ignoring punctuation and splitting words on hyphens before transcribing them.
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - email: github@stefantaubert.com
    given-names: Stefan
    family-names: Taubert
    affiliation: Chemnitz University of Technology
    orcid: 'https://orcid.org/0000-0002-4932-2874'
    website: 'https://stefantaubert.com'
version: 0.0.2
date-released: 2024-01-23
license: MIT
url: https://github.com/stefantaubert/dict-from-pypinyin
doi: 10.5281/zenodo.10554720
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads: 15 last month (PyPI)
- Total dependent packages: 1
- Total dependent repositories: 1
- Total versions: 2
- Total maintainers: 1
pypi.org: dict-from-pypinyin
Command-line interface (CLI) to create a pronunciation dictionary by looking up pinyin transcriptions using pypinyin including the possibility of ignoring punctuation and splitting words on hyphens before transcribing them.
- Homepage: https://github.com/stefantaubert/dict-from-pypinyin
- Documentation: https://dict-from-pypinyin.readthedocs.io/
- License: MIT
- Latest release: 0.0.2 (published about 2 years ago)
Dependencies
- autoflake * develop
- autopep8 * develop
- build * develop
- dict-from-pypinyin * develop
- isort * develop
- pycodestyle * develop
- pylint * develop
- pytest * develop
- rope * develop
- tox * develop
- twine * develop
- ordered-set >=4.1.0
- pronunciation-dictionary >=0.0.5
- pypinyin >=0.47.1
- tqdm *
- word-to-pronunciation >=0.0.1
- astroid ==2.12.13 develop
- attrs ==22.2.0 develop
- autoflake ==2.0.0 develop
- autopep8 ==2.0.1 develop
- bleach ==5.0.1 develop
- build ==0.9.0 develop
- cachetools ==5.2.0 develop
- certifi ==2022.12.7 develop
- cffi ==1.15.1 develop
- chardet ==5.1.0 develop
- charset-normalizer ==2.1.1 develop
- colorama ==0.4.6 develop
- commonmark ==0.9.1 develop
- cryptography ==39.0.0 develop
- dict-from-pypinyin * develop
- dill ==0.3.6 develop
- distlib ==0.3.6 develop
- docutils ==0.19 develop
- exceptiongroup ==1.1.0 develop
- filelock ==3.9.0 develop
- idna ==3.4 develop
- importlib-metadata ==6.0.0 develop
- iniconfig ==1.1.1 develop
- isort ==5.11.4 develop
- jaraco.classes ==3.2.3 develop
- jeepney ==0.8.0 develop
- keyring ==23.13.1 develop
- lazy-object-proxy ==1.9.0 develop
- mccabe ==0.7.0 develop
- more-itertools ==9.0.0 develop
- ordered-set ==4.1.0 develop
- packaging ==22.0 develop
- pep517 ==0.13.0 develop
- pkginfo ==1.9.4 develop
- platformdirs ==2.6.2 develop
- pluggy ==1.0.0 develop
- pronunciation-dictionary ==0.0.5 develop
- pycodestyle ==2.10.0 develop
- pycparser ==2.21 develop
- pyflakes ==3.0.1 develop
- pygments ==2.14.0 develop
- pylint ==2.15.9 develop
- pypinyin ==0.47.1 develop
- pyproject-api ==1.4.0 develop
- pytest ==7.2.0 develop
- pytoolconfig ==1.2.4 develop
- readme-renderer ==37.3 develop
- requests ==2.28.1 develop
- requests-toolbelt ==0.10.1 develop
- rfc3986 ==2.0.0 develop
- rich ==13.0.0 develop
- rope ==1.6.0 develop
- secretstorage ==3.3.3 develop
- six ==1.16.0 develop
- tomli ==2.0.1 develop
- tomlkit ==0.11.6 develop
- tox ==4.2.4 develop
- tqdm ==4.64.1 develop
- twine ==4.0.2 develop
- urllib3 ==1.26.13 develop
- virtualenv ==20.17.1 develop
- webencodings ==0.5.1 develop
- word-to-pronunciation ==0.0.1 develop
- wrapt ==1.14.1 develop
- zipp ==3.11.0 develop
- ordered-set ==4.1.0
- pronunciation-dictionary ==0.0.5
- pypinyin ==0.47.1
- tqdm ==4.64.1
- word-to-pronunciation ==0.0.1
- ordered-set >= 4.1.0
- pronunciation-dictionary >= 0.0.6
- pypinyin >=0.50, < 0.51
- tqdm *
- word-to-pronunciation >= 0.0.1