conrad
Client for the Microsoft Cognitive Services Text to Speech REST API (reboot of the mscstts package)
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to science.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary
Keywords
azure
r
text-to-speech
tts
Last synced: 9 months ago
Repository
Basic Info
- Host: GitHub
- Owner: fhdsl
- License: other
- Language: R
- Default Branch: main
- Homepage: http://hutchdatascience.org/conrad/
- Size: 2.83 MB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 1
Created about 3 years ago · Last pushed over 1 year ago
Metadata Files
Readme
Changelog
License
Codemeta
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# conrad
[R-CMD-check](https://github.com/fhdsl/conrad/actions/workflows/R-CMD-check.yaml)
[CRAN status](https://CRAN.R-project.org/package=conrad)
:exclamation:*conrad is a reboot of [mscstts](https://github.com/jhudsl/mscstts). Instead of [httr](https://httr.r-lib.org/#status), which is superseded and not recommended, we use [httr2](https://httr2.r-lib.org/) to perform HTTP requests to the Microsoft Cognitive Services Text to Speech REST API.*
conrad serves as a client to the [Microsoft Cognitive Services Text to Speech REST API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech?tabs=streaming). The Text to Speech REST API supports neural text to speech voices, which support specific languages and dialects that are identified by locale. Each available endpoint is associated with a region.
Before you use the Text to Speech REST API, you must register an account with [Microsoft Azure Cognitive Services](https://azure.microsoft.com/en-us/free/ai-services/) and obtain an API key. Without an API key, this package will not work.
## Installation
Install the CRAN version:
```{r, eval = FALSE}
install.packages("conrad")
```
Or install the development version from GitHub:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("fhdsl/conrad")
```
## Getting an API key
1. Sign in/Create an Azure account on [Microsoft Azure Cognitive Services](https://azure.microsoft.com/en-us/free/cognitive-services/).
2. Click `+ Create a resource` (below "Azure services" or click on the Hamburger button).
3. Search for "Speech" and click `Create` -> `Speech`.
4. Create a [Resource group](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/manage-resource-groups-portal#what-is-a-resource-group) and enter a "Name".
5. Choose `Pricing tier` (you can choose the free version with `Free F0`)
6. Click `Review + create`, review the Terms, and click `Create`.
If the deployment was successful, you should see :white_check_mark: **Your deployment is complete** on the next page.
7. Under `Next steps`, click `Go to resource`
8. Look on the left sidebar and under `Resource Management`, click `Keys and Endpoint`
9. Copy either `KEY 1` or `KEY 2` to clipboard. Only one key is necessary to make an API call.
Once you complete these steps, you have the API keys you need to access the API.
:warning: Remember your `Location/Region`, which you use to make calls to the API. Specifying a different region will lead to an [HTTP 403 Forbidden response](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403).
For more detailed information on each step, refer to the [API Key vignette](http://hutchdatascience.org/conrad/articles/api-key.html).
## Setting your API key
You can set your API key in a number of ways:
1. Edit `~/.Renviron` and set `MS_TTS_API_KEY = "YOUR_API_KEY"`
2. In `R`, use `options(ms_tts_key = "YOUR_API_KEY")`.
3. Set `export MS_TTS_API_KEY=YOUR_API_KEY` in `.bash_profile`/`.bashrc` if you're using `R` in the terminal.
4. Pass `api_key = "YOUR_API_KEY"` in arguments of functions such as `ms_list_voice(api_key = "YOUR_API_KEY")`.
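As a rough illustration of how these sources could be combined, the sketch below looks up a key from the function argument, the `ms_tts_key` option, and the `MS_TTS_API_KEY` environment variable. The helper `resolve_api_key()` is hypothetical and not part of conrad, and the precedence order shown is an assumption rather than the package's documented behavior.

```r
# Hypothetical helper (not part of conrad): resolve the API key from the
# sources listed above. The precedence order is an assumption.
resolve_api_key <- function(api_key = NULL) {
  # 1. An explicitly passed argument wins
  if (!is.null(api_key) && nzchar(api_key)) {
    return(api_key)
  }
  # 2. Then the R option set with options(ms_tts_key = ...)
  opt <- getOption("ms_tts_key")
  if (!is.null(opt) && nzchar(opt)) {
    return(opt)
  }
  # 3. Finally the environment variable (from ~/.Renviron or the shell)
  env <- Sys.getenv("MS_TTS_API_KEY", unset = "")
  if (nzchar(env)) {
    return(env)
  }
  stop("No API key found. Set MS_TTS_API_KEY or pass `api_key`.", call. = FALSE)
}
```

Keeping the key in `~/.Renviron` or the shell environment avoids hard-coding it in scripts that may be shared.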
## Get a list of voices
`ms_list_voice()` uses the `tts.speech.microsoft.com/cognitiveservices/voices/list` endpoint to get a full [list of voices](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech?tabs=streaming#get-a-list-of-voices) for a specific region. It attaches a region prefix to this endpoint to get a list of voices for that region.
For example, to get a list of all the voices for the `westus` region, it uses the `https://westus.tts.speech.microsoft.com/cognitiveservices/voices/list` endpoint.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
```{r, eval = FALSE}
ms_list_voice(api_key = "YOUR_API_KEY", region = "westus")
```
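To make the region prefix concrete, here is a minimal sketch of how the regional URL can be assembled from the endpoint named above. The helper `voices_list_url()` is hypothetical and only illustrates the string construction; it is not how conrad is necessarily implemented.

```r
# Hypothetical helper (not part of conrad): build the regional voices-list URL
voices_list_url <- function(region) {
  stopifnot(is.character(region), length(region) == 1, nzchar(region))
  paste0(
    "https://", region,
    ".tts.speech.microsoft.com/cognitiveservices/voices/list"
  )
}

voices_list_url("westus")
#> [1] "https://westus.tts.speech.microsoft.com/cognitiveservices/voices/list"
```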
## Convert text to speech
`ms_synthesize()` uses the `tts.speech.microsoft.com/cognitiveservices/v1` endpoint to convert [text to speech](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech?tabs=streaming#convert-text-to-speech). The endpoint requires [Speech Synthesis Markup Language (SSML)](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup) to specify the language, gender, and full voice name.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
```{r, eval = FALSE}
# Convert text to speech
res <- ms_synthesize(script = "Hello world, this is a talking computer", region = "westus", gender = "Male")
# `res` holds the binary audio data (a raw vector, printed as hexadecimal)
# Create file to store audio output
output_path <- tempfile(fileext = ".wav")
# Write binary data to output path
writeBin(res, con = output_path)
# Play audio in browser
play_audio(audio = output_path)
```
If you want more examples of different voices with different scripts, refer to the Introduction to conrad [vignette](http://hutchdatascience.org/conrad/articles/conrad.html).
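For readers unfamiliar with SSML, the sketch below shows roughly what a request body with a language, gender, and voice name can look like. The helper `build_ssml()` is hypothetical and not part of conrad, and the voice name `en-US-GuyNeural` is used only as an example; check the voice list for your region before relying on it.

```r
# Hypothetical helper (not part of conrad): wrap a script in minimal SSML
build_ssml <- function(script, voice, language = "en-US", gender = "Male") {
  # Escape characters that are special in XML; "&" must be handled first
  script <- gsub("&", "&amp;", script, fixed = TRUE)
  script <- gsub("<", "&lt;", script, fixed = TRUE)
  script <- gsub(">", "&gt;", script, fixed = TRUE)
  sprintf(
    paste0(
      "<speak version='1.0' xml:lang='%s'>",
      "<voice xml:lang='%s' xml:gender='%s' name='%s'>%s</voice>",
      "</speak>"
    ),
    language, language, gender, voice, script
  )
}

# Example voice name; availability depends on the region
ssml <- build_ssml(
  "Hello world, this is a talking computer",
  voice = "en-US-GuyNeural"
)
```

Escaping the script text matters because an unescaped `&` or `<` would make the SSML invalid XML.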
## Get an access token
`ms_get_token()` makes a request to the `issueToken` endpoint to get an [access token](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech?tabs=streaming#how-to-get-an-access-token). The function requires an API key and region as inputs. The access token is used to send requests to the API.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
```{r, eval = FALSE}
ms_get_token(api_key = "YOUR_API_KEY", region = "westus")
```
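As a sketch of how a token fits into a request, the two hypothetical helpers below build the regional `issueToken` URL and the bearer header that carries the token. Neither is part of conrad; the URL pattern and header format follow the linked Azure documentation as an assumption and should be checked against it.

```r
# Hypothetical helper (not part of conrad): regional issueToken URL.
# The URL pattern is an assumption based on the Azure documentation.
issue_token_url <- function(region) {
  paste0("https://", region, ".api.cognitive.microsoft.com/sts/v1.0/issueToken")
}

# Hypothetical helper (not part of conrad): header that carries the token
bearer_header <- function(token) {
  c(Authorization = paste("Bearer", token))
}

issue_token_url("westus")
#> [1] "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken"
```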
## Major differences to mscstts
- To enhance the reliability of our package, we have transitioned from using [httr](https://httr.r-lib.org/) to [httr2](https://httr2.r-lib.org/) for handling HTTP requests to the [Text to Speech REST API](https://learn.microsoft.com/en-us/azure/cognitive-services/Speech-Service/rest-text-to-speech?tabs=streaming). This change was motivated by the fact that httr is [no longer](https://httr.r-lib.org/#status) being actively maintained, with updates limited to those necessary for CRAN compatibility. In contrast, httr2 is a modern reimagining of httr and is the recommended replacement.
- It resolves the HTTP 403 Forbidden [issue](https://github.com/jhudsl/mscstts/issues/13). An HTTP 403 Forbidden response status code signifies that the server comprehends the request but denies authorization. In the case of [`mscstts::ms_synthesize()`](https://github.com/jhudsl/mscstts/blob/master/R/ms_synthesize.R), the [problem](https://github.com/jhudsl/mscstts/issues/13) arose due to the use of an invalid voice within the HTTP request, specifically concerning the chosen region. For instance, the SSML might have contained a voice name that was not supported in the `westus` region. As a consequence, the server would reject the HTTP request.
- We have made significant improvements to the documentation across the entire package. These enhancements include simpler function names, commented functions for clarity, removal of unnecessary functions and arguments, and URLs directing users to pages that explain text-to-speech jargon.
We believe that these improvements will greatly enhance the usability of the package and make it even more reliable in the long term.
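The 403 issue described above comes down to requesting a voice that the chosen region does not offer. A minimal sketch of a guard against that is shown below. The helper `check_voice()` is hypothetical and not part of conrad, the `ShortName` column is an assumption about the shape of a voice list, and the data frame is mock data for illustration only.

```r
# Hypothetical helper (not part of conrad): fail early if a voice is not
# in the region's voice list. The `ShortName` column is an assumption.
check_voice <- function(voice, voices) {
  if (!voice %in% voices$ShortName) {
    stop(
      sprintf("Voice '%s' is not available in this region.", voice),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Mock voice list standing in for a real region's list
westus_voices <- data.frame(
  ShortName = c("en-US-GuyNeural", "en-US-JennyNeural")
)
check_voice("en-US-GuyNeural", westus_voices)
```

Checking locally before sending the request turns an opaque server-side rejection into a clear error message.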
## Acknowledgements
conrad wouldn't be possible without prior work on [mscstts](https://github.com/jhudsl/mscstts) by [John Muschelli](https://github.com/muschellij2) and [httr2](https://github.com/r-lib/httr2) by [Hadley Wickham](https://github.com/hadley).
Owner
- Name: Fred Hutch Data Science Lab
- Login: fhdsl
- Kind: organization
- Location: United States of America
- Website: https://hutchdatascience.org/
- Repositories: 19
- Profile: https://github.com/fhdsl
CodeMeta (codemeta.json)
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"identifier": "conrad",
"description": "Convert text into synthesized speech and get a list of supported voices for a region. The Text to Speech REST API supports neural text to speech voices, which support specific languages and dialects that are identified by locale. ",
"name": "conrad: Client for the Microsoft Cognitive Services Text to Speech REST API",
"license": "https://spdx.org/licenses/MIT",
"version": "1.0.0",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
"url": "https://r-project.org"
},
"runtimePlatform": "R version 4.2.3 (2023-03-15)",
"author": [
{
"@type": "Person",
"givenName": "Howard",
"familyName": "Baik",
"email": "howardbaik43@gmail.com",
"@id": "https://orcid.org/0009-0000-8942-1618"
},
{
"@type": "Person",
"givenName": "John",
"familyName": "Muschelli",
"email": "muschellij2@gmail.com",
"@id": "https://orcid.org/0009-0000-8942-1618"
}
],
"copyrightHolder": [
{
"@type": "Person",
"givenName": "Howard",
"familyName": "Baik",
"email": "howardbaik43@gmail.com",
"@id": "https://orcid.org/0009-0000-8942-1618"
}
],
"maintainer": [
{
"@type": "Person",
"givenName": "Howard",
"familyName": "Baik",
"email": "howardbaik43@gmail.com",
"@id": "https://orcid.org/0009-0000-8942-1618"
}
],
"softwareSuggestions": [
{
"@type": "SoftwareApplication",
"identifier": "testthat",
"name": "testthat",
"version": ">= 3.0.0",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=testthat"
}
],
"softwareRequirements": {
"1": {
"@type": "SoftwareApplication",
"identifier": "R",
"name": "R",
"version": ">= 2.10"
},
"2": {
"@type": "SoftwareApplication",
"identifier": "httr2",
"name": "httr2",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=httr2"
},
"3": {
"@type": "SoftwareApplication",
"identifier": "jsonlite",
"name": "jsonlite",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=jsonlite"
},
"4": {
"@type": "SoftwareApplication",
"identifier": "magrittr",
"name": "magrittr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=magrittr"
},
"SystemRequirements": null
},
"fileSize": "41.491KB",
"codeRepository": "https://github.com/fhdsl/conrad",
"releaseNotes": "https://github.com/fhdsl/conrad/blob/master/NEWS.md",
"readme": "https://github.com/fhdsl/conrad/blob/main/README.md",
"contIntegration": "https://github.com/fhdsl/conrad/actions/workflows/R-CMD-check.yaml",
"keywords": [
"azure",
"r",
"text-to-speech",
"tts"
]
}
GitHub Events
Total
- Issues event: 2
- Issue comment event: 1
- Push event: 1
- Pull request review event: 1
- Pull request event: 2
- Fork event: 1
Last Year
- Issues event: 2
- Issue comment event: 1
- Push event: 1
- Pull request review event: 1
- Pull request event: 2
- Fork event: 1
Issues and Pull Requests
All Time
- Total issues: 2
- Total pull requests: 3
- Average time to close issues: 4 months
- Average time to close pull requests: 7 days
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.33
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 2
- Average time to close issues: about 20 hours
- Average time to close pull requests: 3 days
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- howardbaik (1)
- howardbaek (1)
Pull Request Authors
- howardbaik (3)
- howardbaek (1)
Packages
- Total packages: 1
- Total downloads: 596 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: conrad
Client for the Microsoft's 'Cognitive Services Text to Speech REST' API
- Homepage: https://github.com/fhdsl/conrad
- Documentation: http://cran.r-project.org/web/packages/conrad/conrad.pdf
- License: MIT + file LICENSE
- Latest release: 1.0.0 (published almost 3 years ago)
Rankings
Forks count: 28.6%
Dependent packages count: 29.1%
Dependent repos count: 34.8%
Stargazers count: 35.2%
Average: 43.4%
Downloads: 89.3%
Maintainers (1)
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- R >= 2.10 depends
- httr2 * imports
- jsonlite * imports
- magrittr * imports
- testthat >= 3.0.0 suggests
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.4.1 composite
- actions/checkout v3 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite