Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.9%) to scientific vocabulary
Repository
Pitch-shifting and time-stretching with TD-PSOLA
Statistics
- Stars: 83
- Watchers: 3
- Forks: 2
- Open Issues: 0
- Releases: 0
README.md
Time-domain pitch-synchronous overlap-add (TD-PSOLA)
This module performs constant- and variable-rate pitch-shifting and
time-stretching of speech. It wraps parselmouth [1], which is itself a Python
interface to the Praat [2] implementation of TD-PSOLA [3]. Pitch-shifting
is performed by providing a numpy array of target pitch values equally spaced
over time. Variable-rate time-stretching uses forced phoneme alignments
represented as pypar.Alignment objects.
If you need to extract pitch features or phoneme alignments, see
penn for pitch estimation
and pyfoal for forced alignment.
If you only want to perform pitch-shifting, you do not need to extract
forced alignments. If you want to do variable-rate time stretching, you do not
need to perform pitch estimation.
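The target pitch described above is a numpy array of equally spaced pitch values. The following is a minimal sketch of how such a contour might be built, assuming the values are in Hz. The helpers `constant_contour` and `shift_semitones` are hypothetical names used for illustration and are not part of psola; the frame count is arbitrary.

```python
import numpy as np


def constant_contour(pitch_hz, frames):
    """Return a flat pitch contour of `frames` equally spaced values (Hz)."""
    return np.full(frames, float(pitch_hz))


def shift_semitones(pitch_hz, semitones):
    """Scale a pitch contour (Hz) by a number of semitones.

    One semitone corresponds to a frequency ratio of 2 ** (1 / 12).
    """
    return np.asarray(pitch_hz, dtype=float) * 2.0 ** (semitones / 12.0)


# Hypothetical example: a 100-frame contour at 220 Hz, raised by one octave
contour = constant_contour(220.0, 100)
raised = shift_semitones(contour, 12)
```

In practice the starting contour would more likely come from a pitch estimator than from a constant, and the shifted array would then be passed as the target pitch.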
Installation
pip install psola
Usage
If you want to perform pitch-shifting or time-stretching on audio already
loaded into memory, use psola.vocode. If you want to do this with audio
saved in a file, use psola.from_file. You can use psola.to_file or
psola.from_file_to_file to save the results to a file. To process many
files at once with multiprocessing, use psola.from_files_to_files.
Each of these functions is documented below. The command-line interface
wraps the arguments of psola.from_files_to_files and is described in
the next section.
psola.vocode
```python
"""Performs pitch vocoding using Praat

Arguments
    audio : np.array(shape=(samples,))
        The speech signal to process
    sample_rate : int
        The audio sampling rate
    source_alignment : pypar.Alignment
        The current alignment if performing time-stretching
    target_alignment : pypar.Alignment
        The target alignment if performing time-stretching
    target_pitch : np.array(shape=(frames,))
        The target pitch contour
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz
    fmax : int
        The maximum allowable frequency in Hz

Returns
    audio : np.array(shape=(samples,))
        The vocoded audio
"""
```
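As a hedged illustration of the docstring above, the sketch below wraps `psola.vocode` for pitch-shifting only. It assumes the keyword arguments are spelled with underscores (`target_pitch`, `fmin`, `fmax`), which the rendered text may not show. The helper `as_mono`, the wrapper `pitch_shift`, and the default frequency bounds of 40 and 500 Hz are hypothetical choices for this example, not values confirmed by the source.

```python
import numpy as np


def as_mono(audio):
    """Return audio as a 1-D array, matching the documented shape (samples,)."""
    audio = np.asarray(audio)
    if audio.ndim != 1:
        raise ValueError('expected audio of shape (samples,)')
    return audio


def pitch_shift(audio, sample_rate, target_pitch, fmin=40, fmax=500):
    """Pitch-shift speech toward `target_pitch` (illustrative wrapper)."""
    # Imported lazily so that as_mono can be used without psola installed
    import psola

    return psola.vocode(
        as_mono(audio),
        sample_rate,
        target_pitch=np.asarray(target_pitch, dtype=float),
        fmin=fmin,
        fmax=fmax)
```

No alignments are passed here because, as noted earlier, pitch-shifting alone does not require forced alignment.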
psola.from_file
```python
"""Performs vocoding using Praat

Arguments
    audio_file : string
        The file containing the speech signal to process
    source_alignment_file : string or None
        The file containing the original alignment
    target_alignment_file : string or None
        The file containing the target alignment
    target_pitch_file : string or None
        The file containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz
    fmax : int
        The maximum allowable frequency in Hz

Returns
    audio : np.array(shape=(samples,))
        The vocoded audio
    sample_rate : int
        The audio sampling rate
"""
```
psola.to_file
```python
"""Performs pitch vocoding and saves audio to disk

Arguments
    audio : np.array(shape=(samples,))
        The speech signal to process
    sample_rate : int
        The audio sampling rate
    output_file : string
        The file to save the vocoded speech
    source_alignment : pypar.Alignment
        The current alignment if performing time-stretching
    target_alignment : pypar.Alignment
        The target alignment if performing time-stretching
    target_pitch : np.array(shape=(frames,))
        The target pitch contour
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz
    fmax : int
        The maximum allowable frequency in Hz
"""
```
psola.from_file_to_file
```python
"""Performs vocoding using Praat and saves to disk

Arguments
    audio_file : string
        The file containing the speech signal to process
    output_file : string
        The file to save the vocoded speech
    source_alignment_file : string or None
        The file containing the original alignment
    target_alignment_file : string or None
        The file containing the target alignment
    target_pitch_file : string or None
        The file containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz
    fmax : int
        The maximum allowable frequency in Hz
"""
```
psola.from_files_to_files
```python
"""Performs vocoding using Praat and saves to disk

Arguments
    audio_files : list
        The files containing the speech signals to process
    output_files : list
        The files to save the vocoded speech
    source_alignment_files : list or None
        The files containing the original alignments
    target_alignment_files : list or None
        The files containing the target alignments
    target_pitch_files : list or None
        The files containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz
    fmax : int
        The maximum allowable frequency in Hz
"""
```
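Because the batch function takes parallel lists of input and output files, it can help to derive the output list from the inputs. The sketch below shows one way to do that. The helper `output_paths`, the wrapper `vocode_batch`, and the `-vocoded` filename suffix are hypothetical, and the call assumes the keyword `target_pitch_files` is spelled with underscores.

```python
from pathlib import Path


def output_paths(audio_files, output_directory, suffix='-vocoded'):
    """Map each input file to a same-named file in `output_directory`."""
    output_directory = Path(output_directory)
    return [
        str(output_directory / f'{Path(file).stem}{suffix}{Path(file).suffix}')
        for file in audio_files]


def vocode_batch(audio_files, target_pitch_files, output_directory):
    """Pitch-shift many files and return the output paths (illustrative)."""
    # Imported lazily so that output_paths can be used without psola installed
    import psola

    output_files = output_paths(audio_files, output_directory)
    psola.from_files_to_files(
        audio_files,
        output_files,
        target_pitch_files=target_pitch_files)
    return output_files
```

Keeping the lists in the same order matters, since each input is presumably paired with the output and pitch file at the same index.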
Command-line interface
```
usage: python -m psola [-h]
                       [--audio_files AUDIO_FILES [AUDIO_FILES ...]]
                       [--source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]]
                       [--target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]]
                       [--constant_stretch CONSTANT_STRETCH]
                       [--target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]]
                       [--fmin FMIN]
                       [--fmax FMAX]
                       [--output_files OUTPUT_FILES [OUTPUT_FILES ...]]

optional arguments:
  -h, --help            show this help message and exit
  --audio_files AUDIO_FILES [AUDIO_FILES ...]
                        The speech signal to process
  --source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]
                        The files containing the original alignments
  --target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]
                        The files containing the target alignments
  --constant_stretch CONSTANT_STRETCH
                        A constant value for time-stretching
  --target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]
                        The target pitch contour
  --fmin FMIN           The minimum allowable frequency in Hz
  --fmax FMAX           The maximum allowable frequency in Hz
  --output_files OUTPUT_FILES [OUTPUT_FILES ...]
                        Where to save the vocoded audio
```
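The command-line interface can also be driven from a script. The sketch below assembles an argument list for `python -m psola`, assuming the flags are spelled with underscores (for example `--audio_files`), which the rendered usage text may not show. The function `build_command` is a hypothetical helper, not part of psola.

```python
import sys


def build_command(
    audio_files,
    output_files,
    target_pitch_files=None,
    constant_stretch=None):
    """Assemble an argument list for the psola command-line interface."""
    command = [sys.executable, '-m', 'psola', '--audio_files', *audio_files]
    if target_pitch_files:
        command += ['--target_pitch_files', *target_pitch_files]
    if constant_stretch is not None:
        command += ['--constant_stretch', str(constant_stretch)]
    command += ['--output_files', *output_files]
    return command
```

The resulting list could then be passed to `subprocess.run(command, check=True)`; building it as a list rather than a single string avoids shell quoting problems with filenames that contain spaces.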
References
[1] Y. Jadoul, B. Thompson, and B. de Boer, "Introducing Parselmouth: A Python interface to Praat," Journal of Phonetics, vol. 71, pp. 1–15, 2018.
[2] P. Boersma, "Praat: doing phonetics by computer," http://www.praat.org/, 2006.
[3] E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Communication, vol. 9, no. 5-6, pp. 453–467, 1990.
Owner
- Name: Max Morrison
- Login: maxrmorrison
- Kind: user
- Website: https://www.maxrmorrison.com
- Twitter: maxrmorrison
- Repositories: 7
- Profile: https://github.com/maxrmorrison
Computer Science PhD student at Northwestern University researching machine learning and audio technology
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it using the following metadata."
authors:
  - family-names: "Morrison"
    given-names: "Max"
title: "psola"
version: 0.0.1
date-released: 2021-03-31
url: "https://github.com/maxrmorrison/psola"
GitHub Events
Total
- Watch event: 9
Last Year
- Watch event: 9
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Max Morrison | m****n@g****m | 17 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads: 2,664 last month (PyPI)
- Total dependent packages: 1
- Total dependent repositories: 6
- Total versions: 1
- Total maintainers: 1
pypi.org: psola
Time-domain pitch-synchronous overlap-add
- Homepage: https://github.com/maxrmorrison/psola
- Documentation: https://psola.readthedocs.io/
- License: GPLv3
- Latest release: 0.0.1 (published almost 5 years ago)
Dependencies
- numpy *
- praat-parselmouth *
- pypar *
- soundfile *
- tqdm *