dict-from-g2pe

Create a pronunciation dictionary using g2p

https://github.com/stefantaubert/dict-from-g2pe

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 5 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.5%) to scientific vocabulary

Last synced: 10 months ago

Repository

Create a pronunciation dictionary using g2p

Basic Info

Host: GitHub
Owner: stefantaubert
License: mit
Language: Python
Default Branch: master
Size: 88.9 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 2

Created about 4 years ago · Last pushed over 2 years ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation

dict-from-g2pE

CLI to create a pronunciation dictionary by predicting English ARPAbet phonemes with the seq2seq model from g2pE, with the option of ignoring punctuation and splitting on hyphens before prediction.
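The two preprocessing options can be pictured with a minimal Python sketch. This is not the tool's actual implementation: the names `split_affixes`, `pronounce`, and `toy_predict` are hypothetical, and the lookup table stands in for the g2pE model so that the example runs without any dependencies. It assumes that ignored punctuation is trimmed from both ends of a word and re-attached as separate symbols, and that hyphens split the remaining core into parts that are predicted individually.

```python
import string
from typing import Callable, List, Tuple

# Stand-in for the seq2seq model: a tiny lookup table (assumption for illustration).
TOY_LEXICON = {
    "test": ["T", "EH1", "S", "T"],
    "def": ["D", "EH1", "F"],
}


def toy_predict(word: str) -> List[str]:
    """Hypothetical predictor; a real model would handle unseen words too."""
    return TOY_LEXICON[word.lower()]


def split_affixes(word: str, ignore: str) -> Tuple[str, str, str]:
    """Split a word into leading punctuation, core, and trailing punctuation."""
    start = 0
    while start < len(word) and word[start] in ignore:
        start += 1
    end = len(word)
    while end > start and word[end - 1] in ignore:
        end -= 1
    return word[:start], word[start:end], word[end:]


def pronounce(
    word: str,
    predict: Callable[[str], List[str]],
    ignore: str = string.punctuation,
    split_on_hyphen: bool = True,
) -> List[str]:
    """Predict phonemes for the core of a word and keep punctuation as symbols."""
    lead, core, trail = split_affixes(word, ignore)
    parts = core.split("-") if split_on_hyphen else [core]
    phonemes: List[str] = list(lead)
    for index, part in enumerate(parts):
        if index > 0:
            phonemes.append("-")  # keep the hyphen between predicted parts
        if part:
            phonemes.extend(predict(part))
    phonemes.extend(trail)
    return phonemes


print(" ".join(pronounce("Test-def.", toy_predict)))
```

Passing a different callable as `predict` is the only change needed to swap the lookup table for a real grapheme-to-phoneme model.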

Installation

    pip install dict-from-g2pE --user

Usage

    dict-from-g2pE-cli

Example

```sh
# Create example vocabulary
cat > /tmp/vocabulary.txt << EOF
Test?
abc,
"def
Test-def.
"xyz?
"uv-w?
EOF

# Create dictionary from vocabulary and example dictionary
dict-from-g2pE-cli \
  /tmp/vocabulary.txt \
  /tmp/result.dict \
  --split-on-hyphen \
  --n-jobs 4

cat /tmp/result.dict
```

Output:

    Test?  T EH1 S T ?
    abc,  AE1 B K ,
    "def  " D EH1 F
    Test-def.  T EH1 S T - D EH1 F .
    "xyz?  " Z IH1 JH IH0 Z ?
    "uv-w?  " AH1 V - V IY1 ?
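To consume a dictionary like the one printed above from Python, a small reader is enough. The following is a sketch under assumptions: the function name `parse_dictionary` is hypothetical, each line is taken to hold one word followed by whitespace-separated symbols, and words are assumed to contain no whitespace. It does not use the `pronunciation-dictionary` package that the tool depends on.

```python
from typing import Dict, List


def parse_dictionary(text: str) -> Dict[str, List[List[str]]]:
    """Map each word to the list of pronunciations found for it."""
    entries: Dict[str, List[List[str]]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and words without a pronunciation
        word, phonemes = fields[0], fields[1:]
        entries.setdefault(word, []).append(phonemes)
    return entries


sample = 'Test?  T EH1 S T ?\n"def  " D EH1 F\n'
for word, pronunciations in parse_dictionary(sample).items():
    print(word, pronunciations)
```

Storing a list of pronunciations per word keeps the reader usable for dictionaries that list the same word more than once.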

Development setup

```sh
# update
sudo apt update

# install Python 3.8-3.12 for ensuring that tests can be run
sudo apt install python3-pip \
  python3.8 python3.8-dev python3.8-distutils python3.8-venv \
  python3.9 python3.9-dev python3.9-distutils python3.9-venv \
  python3.10 python3.10-dev python3.10-distutils python3.10-venv \
  python3.11 python3.11-dev python3.11-distutils python3.11-venv \
  python3.12 python3.12-dev python3.12-distutils python3.12-venv

# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user

# check out repo
git clone https://github.com/stefantaubert/dict-from-g2p.git
cd dict-from-g2p

# create virtual environment
python3.8 -m pipenv install --dev
```

Running the tests

```sh
# first install the tool like in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd dict-from-g2p

# activate environment
python3.8 -m pipenv shell

# run tests
tox
```

Final lines of test result output:

    py38: commands succeeded
    py39: commands succeeded
    py310: commands succeeded
    py311: commands succeeded
    py312: commands succeeded
    congratulations :)

License

MIT License

Acknowledgments

g2pE: A Simple Python Module for English Grapheme To Phoneme Conversion

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410

Citation

If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).

    Taubert, S. (2024). dict-from-g2pE (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.10561178

Owner

Name: Stefan Taubert
Login: stefantaubert
Kind: user
Location: Chemnitz, Germany
Company: Chemnitz University of Technology

Website: https://stefantaubert.com
Twitter: Stefan_Taubert
Repositories: 75
Profile: https://github.com/stefantaubert

I am currently working on my PhD on speech synthesis at Chemnitz University of Technology.

Citation (CITATION.cff)

cff-version: 1.2.0
title: dict-from-g2pE
abstract: CLI to create a pronunciation dictionary by predicting English ARPAbet phonemes using seq2seq model from g2pE and the possibility of ignoring punctuation and splitting on hyphens before prediction.
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - email: github@stefantaubert.com
    given-names: Stefan
    family-names: Taubert
    affiliation: Chemnitz University of Technology
    orcid: 'https://orcid.org/0000-0002-4932-2874'
    website: 'https://stefantaubert.com/'
version: 0.0.2
date-released: 2024-01-24
license: MIT
url: https://github.com/stefantaubert/dict-from-g2p
doi: 10.5281/zenodo.10561178

Issues and Pull Requests

Last synced: 12 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Dependencies

Pipfile pypi

autopep8 * develop
dict-from-g2pE * develop
isort * develop
pycodestyle * develop
pylint * develop
pytest * develop
rope * develop
g2p-en >=2.1.0
ordered-set >=4.1.0
pronunciation-dictionary >=0.0.4
word-to-pronunciation >=0.0.1

Pipfile.lock pypi

astroid ==2.11.3 develop
attrs ==21.4.0 develop
autopep8 ==1.6.0 develop
click ==8.1.2 develop
dict-from-g2pe * develop
dill ==0.3.4 develop
distance ==0.1.3 develop
g2p-en ==2.1.0 develop
inflect ==5.5.2 develop
iniconfig ==1.1.1 develop
isort ==5.10.1 develop
joblib ==1.1.0 develop
lazy-object-proxy ==1.7.1 develop
mccabe ==0.7.0 develop
nltk ==3.7 develop
numpy ==1.22.3 develop
ordered-set ==4.1.0 develop
packaging ==21.3 develop
platformdirs ==2.5.2 develop
pluggy ==1.0.0 develop
pronunciation-dictionary ==0.0.3 develop
py ==1.11.0 develop
pycodestyle ==2.8.0 develop
pylint ==2.13.6 develop
pyparsing ==3.0.8 develop
pytest ==7.1.1 develop
regex ==2022.3.15 develop
rope ==1.0.0 develop
setuptools ==62.1.0 develop
toml ==0.10.2 develop
tomli ==2.0.1 develop
tqdm ==4.64.0 develop
typing-extensions ==4.2.0 develop
word-to-pronunciation ==0.0.1 develop
wrapt ==1.14.0 develop
click ==8.1.2
distance ==0.1.3
g2p-en ==2.1.0
inflect ==5.5.2
joblib ==1.1.0
nltk ==3.7
numpy ==1.22.3
ordered-set ==4.1.0
pronunciation-dictionary ==0.0.3
regex ==2022.3.15
tqdm ==4.64.0
word-to-pronunciation ==0.0.1

pyproject.toml pypi

g2p-en >=2.1.0
nltk >=3.2.4
ordered-set >=4.1.0
pronunciation-dictionary >=0.0.6
word-to-pronunciation >=0.0.1
