textgrid-tools
Command-line interface which provides methods to modify TextGrids (.TextGrid) and their corresponding audio files (.wav).
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.7%) to scientific vocabulary
Statistics
- Stars: 10
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 13
Metadata Files
README.md
textgrid-tools
Command-line interface (CLI) to modify TextGrids and their corresponding audio files.
Features
- grids
  - `merge`: merge grids together
  - `plot-durations`: plot durations
  - `mark-durations`: mark intervals with specific durations with a text
  - `create-dictionary`: create pronunciation dictionary out of a word and a pronunciation tier
  - `plot-stats`: plot statistics
  - `export-vocabulary`: export vocabulary out of multiple grid files
  - `export-marks`: exports marks of a tier to a file
  - `export-durations`: exports durations of grids to a file
  - `export-paths`: exports grid paths to a file
  - `export-audio-paths`: exports audio paths to a file
  - `import-paths`: import grids from paths written in a file
  - `import-audio-paths`: import audio files from paths written in a file
  - `compare-interval-boundaries`: compare interval boundaries
- grid
  - `create`: convert text files to grid files
  - `sync`: synchronize grid minTime and maxTime according to the corresponding audio file
  - `split`: split a grid file on intervals into multiple grid files (incl. audio files)
  - `print-stats`: print statistics
- tiers
  - `apply-mapping`: apply mapping table to marks
  - `transcribe`: transcribe words of tiers using a pronunciation dictionary
  - `remove`: remove tiers
- tier
  - `rename`: rename tier
  - `clone`: clone tier
  - `map`: map tier to other tiers
  - `move`: move tier to another position
  - `export`: export content of tier to a txt file
  - `import`: import content of tier from a txt file
- intervals
  - `join`: join adjacent intervals
  - `join-between-marks`: join intervals between marks
  - `join-by-boundary`: join intervals by boundaries of a tier
  - `join-by-duration`: join intervals by a duration
  - `join-marks`: join intervals containing specific marks
  - `join-symbols`: join intervals containing specific symbols
  - `join-template`: join intervals according to a template
  - `split`: split intervals
  - `fix-boundaries`: align boundaries of tiers according to a reference tier
  - `remove`: remove intervals
  - `plot-durations`: plot durations
  - `replace-text`: replace text using regex pattern
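To make the idea behind joining adjacent intervals more concrete, here is a minimal sketch in plain Python. It is not the tool's implementation: the `Interval` type, the `join_adjacent` function, and the choice to concatenate marks with a separator while skipping empty marks are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Simplified stand-in for a TextGrid interval (hypothetical type)."""
    min_time: float
    max_time: float
    mark: str


def join_adjacent(intervals: List[Interval], separator: str = " ") -> Interval:
    """Merge a run of consecutive intervals into one spanning interval."""
    if not intervals:
        raise ValueError("at least one interval is required")
    # Adjacent means each interval starts exactly where the previous one ends.
    for left, right in zip(intervals, intervals[1:]):
        if left.max_time != right.min_time:
            raise ValueError("intervals are not adjacent")
    # Empty marks (e.g. pauses) are skipped so no stray separators appear.
    marks = [interval.mark for interval in intervals if interval.mark]
    return Interval(intervals[0].min_time, intervals[-1].max_time, separator.join(marks))


words = [
    Interval(0.0, 0.4, "hello"),
    Interval(0.4, 0.5, ""),
    Interval(0.5, 1.1, "world"),
]
joined = join_adjacent(words)
```

The joined interval spans from the start of the first interval to the end of the last one. How the actual `intervals join` command treats empty marks and separators may differ.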
Roadmap
- Performance improvement
- Adding more tests
Installation
```sh
pip install textgrid-tools --user
```
Usage
```txt
usage: textgrid-tools-cli [-h] [-v] {grids,grid,tiers,tier,intervals} ...

This program provides methods to modify TextGrids (.TextGrid) and their corresponding audio files (.wav).

positional arguments:
  {grids,grid,tiers,tier,intervals}
                        description
    grids               execute commands targeted at multiple grids at once
    grid                execute commands targeted at single grids
    tiers               execute commands targeted at multiple tiers at once
    tier                execute commands targeted at single tiers
    intervals           execute commands targeted at intervals of tiers

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
```
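The help output above follows the usual layout of a parser with subcommands. The sketch below shows how such a layout could be reproduced with Python's standard `argparse` module. It is an illustration only, not the project's source code: `build_parser`, the `command` destination name, and the placeholder version string are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose top-level layout mirrors the help text above."""
    parser = argparse.ArgumentParser(
        prog="textgrid-tools-cli",
        description="This program provides methods to modify TextGrids (.TextGrid) "
                    "and their corresponding audio files (.wav).",
    )
    # Placeholder version string; the real tool reports its installed version.
    parser.add_argument("-v", "--version", action="version", version="%(prog)s (sketch)")
    subparsers = parser.add_subparsers(dest="command", help="description")
    groups = {
        "grids": "execute commands targeted at multiple grids at once",
        "grid": "execute commands targeted at single grids",
        "tiers": "execute commands targeted at multiple tiers at once",
        "tier": "execute commands targeted at single tiers",
        "intervals": "execute commands targeted at intervals of tiers",
    }
    for name, help_text in groups.items():
        subparsers.add_parser(name, help=help_text)
    return parser


args = build_parser().parse_args(["intervals"])
help_text = build_parser().format_help()
```

Each group would then register its own commands (for example `join` under `intervals`) on the corresponding subparser.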
Dependencies
- `numpy>=1.18.5`
- `scipy>=1.8.0`
- `tqdm>=4.63.0`
- `TextGrid>=1.5`
- `pandas>=1.4.0`
- `ordered_set>=4.1.0`
- `matplotlib>=3.5.0`
- `pronunciation_dictionary>=0.0.5`
Contributing
If you notice an error, please don't hesitate to open an issue.
Development setup
```sh
# update
sudo apt update
# install Python 3.8-3.12 for ensuring that tests can be run
sudo apt install python3-pip \
  python3.8 python3.8-dev python3.8-distutils python3.8-venv \
  python3.9 python3.9-dev python3.9-distutils python3.9-venv \
  python3.10 python3.10-dev python3.10-distutils python3.10-venv \
  python3.11 python3.11-dev python3.11-distutils python3.11-venv \
  python3.12 python3.12-dev python3.12-distutils python3.12-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/textgrid-ipa.git
cd textgrid-ipa
# create virtual environment
python3.8 -m pipenv install --dev
```
Running the tests
```sh
# first install the tool like in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd textgrid-ipa
# activate environment
python3.8 -m pipenv shell
# run tests
tox
```
Final lines of test result output:
```log
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
py312: commands succeeded
congratulations :)
```
Troubleshooting
If recordings/audio files are not in .wav format they need to be converted, e.g.:
```sh
sudo apt install ffmpeg -y
# e.g., mp3 to wav conversion (one output file per input file)
for f in *.mp3; do
  ffmpeg -i "$f" -acodec pcm_s16le -ar 22050 "${f%.mp3}.wav"
done
```
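When many recordings need converting, it can help to generate one ffmpeg call per file from Python. The following sketch only builds the argument list with the same options as above (`-acodec pcm_s16le -ar 22050`). The helper name `build_ffmpeg_command` and the example file name are hypothetical, and running the command still requires ffmpeg to be installed.

```python
from pathlib import Path
from typing import List


def build_ffmpeg_command(source: Path, sample_rate: int = 22050) -> List[str]:
    """Return the ffmpeg arguments that convert one audio file to 16-bit PCM wav."""
    # The output file sits next to the input file, with the suffix swapped to .wav.
    target = source.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(source),
        "-acodec", "pcm_s16le",
        "-ar", str(sample_rate),
        str(target),
    ]


command = build_ffmpeg_command(Path("speaker1.mp3"))
# To actually run the conversion (requires ffmpeg on the PATH):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the arguments as a list instead of a shell string avoids quoting problems with file names that contain spaces.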
License
MIT License
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
Citation
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
Changelog
- v0.0.9 (unreleased)
  - Fixed:
    - Bugfix `grids import` error raise on file not found
  - Added:
    - Added `grids compare-interval-boundaries`
- v0.0.8 (2023-05-30)
  - Fixed:
    - Bugfix `intervals remove` copying on different in/out-locations
    - Bugfix `import-paths` and `import-audio-paths` option `--symlink` is now creating symbolic links instead of hard links
  - Changed:
    - Improved logging in `import-paths` and `import-audio-paths`
    - Improved logging of durations in `grids plot-stats`
  - Added:
    - Added option to get durations from audio files on `grids export-durations`
- v0.0.7 (2023-01-12)
  - Fixed:
    - Bugfix `grids import-paths` and `grids import-audio-paths`
  - Added:
    - Added option `--ignore` to ignore custom marks in `grids export-vocabulary`
    - Added option `--mode` to `intervals replace-text` to replace text on different interval positions
    - Added returning of an exit code
  - Removed:
    - Removed `tiers mark-silence` because `grids mark-durations` should be used
    - Removed `tiers remove-symbols` because `intervals replace-text` should be used
    - Removed `intervals join-between-pauses` because `join-between-marks` should be used
- v0.0.6 (2022-12-23)
  - improved validation for pronunciation dictionary creation
  - bugfix replace text logging
  - added intervals join-template
  - support Python 3.11
  - update pylint config
  - fix description of grid/audio import
- v0.0.5 (2022-11-25)
  - `intervals remove`: added parameter `mode` to better choose which intervals should be removed
  - Added method to plot statistics for all grids together
  - `tiers transcribe`: added option `assign-mark-to-missing` to replace missing transcriptions with a custom mark
  - Bugfix: `mark-durations` empty couldn't be assigned
  - Added `--min-count` to `mark-durations`
  - Improved sorting of phonemes in durations plotting
  - Changed marks exporting format to only contain tier marks
  - Added exporting/importing of audio paths
  - Added durations exporting
  - Added exporting/importing of grid paths
  - Added replacement of marks using regex pattern
  - Added `--dry` option to most methods
  - Make split symbol on split mandatory
  - Upper-cased metavars
- v0.0.4 (2022-06-09)
  - fixed bug while saving TextGrids
  - improved robustness against file system errors
- v0.0.3 (2022-05-31)
  - fixed invalid installation format and clarified dependencies
  - adjusted textgrid serialization equal to praat output
  - added option `include-empty` on vocabulary export
  - set default chunksize to `1`
  - added missing `__init__.py` files
  - improved logging
- v0.0.2 (2022-05-06)
  - improved logging
  - improved reading/saving speed of TextGrids
  - removed n_digits argument
  - added option to define encoding of TextGrids
  - added option to insert interval between grids which should be merged together
  - removed tier copy
  - added parser for tier export
- v0.0.1 (2022-04-29)
  - initial release
Owner
- Name: Stefan Taubert
- Login: stefantaubert
- Kind: user
- Location: Chemnitz, Germany
- Company: Chemnitz University of Technology
- Website: https://stefantaubert.com
- Twitter: Stefan_Taubert
- Repositories: 75
- Profile: https://github.com/stefantaubert
Currently I am working on my PhD about the topic of speech synthesis at Chemnitz University of Technology.
Citation (CITATION.cff)
cff-version: 1.2.0
title: textgrid-tools
abstract: Command-line interface (CLI) to modify TextGrids and their corresponding audio files.
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - email: github@stefantaubert.com
    given-names: Stefan
    family-names: Taubert
    affiliation: Chemnitz University of Technology
    orcid: 'https://orcid.org/0000-0002-4932-2874'
    website: 'https://stefantaubert.com/'
version: 0.0.8
date-released: 2023-01-12
license: MIT
url: https://github.com/stefantaubert/textgrid-ipa
doi: 10.5281/zenodo.7986716
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Stefan Taubert | s****t@p****e | 443 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- autoflake * develop
- autopep8 * develop
- build * develop
- isort * develop
- pycodestyle * develop
- pylint * develop
- pytest * develop
- rope * develop
- textgrid-tools * develop
- twine * develop
- TextGrid >=1.5
- matplotlib >=3.5.0
- numpy >=1.18.5
- ordered-set >=4.1.0
- pandas >=1.4.0
- pronunciation-dictionary >=0.0.5
- scipy >=1.8.0
- tqdm >=4.63.0
- astroid ==2.13.2 develop
- attrs ==22.2.0 develop
- autoflake ==2.0.0 develop
- autopep8 ==2.0.1 develop
- bleach ==5.0.1 develop
- build ==0.10.0 develop
- certifi ==2022.12.7 develop
- cffi ==1.15.1 develop
- charset-normalizer ==2.1.1 develop
- commonmark ==0.9.1 develop
- contourpy ==1.0.6 develop
- cryptography ==39.0.0 develop
- cycler ==0.11.0 develop
- dill ==0.3.6 develop
- docutils ==0.19 develop
- fonttools ==4.38.0 develop
- idna ==3.4 develop
- importlib-metadata ==6.0.0 develop
- iniconfig ==2.0.0 develop
- isort ==5.11.4 develop
- jaraco.classes ==3.2.3 develop
- jeepney ==0.8.0 develop
- keyring ==23.13.1 develop
- kiwisolver ==1.4.4 develop
- lazy-object-proxy ==1.9.0 develop
- matplotlib ==3.6.3 develop
- mccabe ==0.7.0 develop
- more-itertools ==9.0.0 develop
- numpy ==1.24.1 develop
- ordered-set ==4.1.0 develop
- packaging ==23.0 develop
- pandas ==1.5.2 develop
- pillow ==9.4.0 develop
- pkginfo ==1.9.6 develop
- platformdirs ==2.6.2 develop
- pluggy ==1.0.0 develop
- pronunciation-dictionary ==0.0.5 develop
- pycodestyle ==2.10.0 develop
- pycparser ==2.21 develop
- pyflakes ==3.0.1 develop
- pygments ==2.14.0 develop
- pylint ==2.15.10 develop
- pyparsing ==3.0.9 develop
- pyproject-hooks ==1.0.0 develop
- pytest ==7.2.0 develop
- python-dateutil ==2.8.2 develop
- pytoolconfig ==1.2.4 develop
- pytz ==2022.7 develop
- readme-renderer ==37.3 develop
- requests ==2.28.1 develop
- requests-toolbelt ==0.10.1 develop
- rfc3986 ==2.0.0 develop
- rich ==13.0.1 develop
- rope ==1.6.0 develop
- scipy ==1.10.0 develop
- secretstorage ==3.3.3 develop
- six ==1.16.0 develop
- textgrid ==1.5 develop
- textgrid-tools * develop
- tomlkit ==0.11.6 develop
- tqdm ==4.64.1 develop
- twine ==4.0.2 develop
- typing-extensions ==4.4.0 develop
- urllib3 ==1.26.14 develop
- webencodings ==0.5.1 develop
- wrapt ==1.14.1 develop
- zipp ==3.11.0 develop
- contourpy ==1.0.6
- cycler ==0.11.0
- fonttools ==4.38.0
- kiwisolver ==1.4.4
- matplotlib ==3.6.3
- numpy ==1.24.1
- ordered-set ==4.1.0
- packaging ==23.0
- pandas ==1.5.2
- pillow ==9.4.0
- pronunciation-dictionary ==0.0.5
- pyparsing ==3.0.9
- python-dateutil ==2.8.2
- pytz ==2022.7
- scipy ==1.10.0
- six ==1.16.0
- textgrid ==1.5
- tqdm ==4.64.1