https://github.com/ropensci/googlelanguager

R client for the Google Translation API, Google Cloud Natural Language API and Google Cloud Speech API

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary

Keywords

cloud-speech-api cloud-translation-api google-api-client google-cloud google-cloud-speech google-nlp googleauthr natural-language-processing peer-reviewed r r-package rstats sentiment-analysis speech-api translation-api

Keywords from Contributors

travis-ci
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 198
  • Watchers: 21
  • Forks: 42
  • Open Issues: 12
  • Releases: 4
Created almost 9 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License

README.md

googleLanguageR - R client for the Google Translation API, Natural Language API, Speech-to-Text API and Text-to-Speech API

Language tools for R via Google Machine Learning APIs

Read the introduction blogpost on rOpenSci's blog

This package contains functions for analysing language through the Google Cloud Machine Learning APIs.

Note that all of these are paid services: you will need to provide credit card details for your own Google Project to use them.

The package is for anyone who wants to take advantage of machine learning models that Google has trained on its massive datasets. Some applications include:

  • Translation of speech into text in another language, via speech-to-text then translation, with the results spoken back to you
  • Talking Shiny apps
  • Identification of sentiment within text, such as from Twitter feeds
  • Pulling out the objects of a sentence, to help classify texts and get metadata links from Wikipedia about them.

The API results could be relevant to businesses or researchers looking to scale text analysis.

Google Natural Language API

Google Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy to use REST API. You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can also use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app.

Read more on the Google Natural Language API

Google Cloud Translation API

Google Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language. Translation API is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language (e.g. French to English).

Read more on the Google Cloud Translation Website

Google Cloud Speech-to-Text API

Google Cloud Speech-to-Text API enables you to convert audio to text by applying neural network models in an easy to use API. The API recognizes over 80 languages and variants, to support your global user base. You can transcribe the text of users dictating to an application’s microphone or enable command-and-control through voice among many other use cases.

Read more on the Google Cloud Speech Website

Google Cloud Text-to-Speech API

Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 30 voices, available in multiple languages and variants. It applies DeepMind’s groundbreaking research in WaveNet and Google’s powerful neural networks to deliver the highest fidelity possible. With this easy-to-use API, you can create lifelike interactions with your users, across many applications and devices.

Read more on the Google Cloud Text-to-Speech Website

Installation

  1. Create a Google API Console Project
  2. Within your project, add a payment method
  3. Within your project, check that the relevant APIs are activated
  4. Generate a service account credential as a JSON file by first creating a service account and then creating credentials for that service account
  5. Return to R, and install the official release via install.packages("googleLanguageR"), or the development version with remotes::install_github("ropensci/googleLanguageR")

Docker image

Some Docker images are publicly available. In general gcr.io/gcer-public/googleLanguageR:$BRANCH_NAME carries that GitHub branch's version.

  • gcr.io/gcer-public/googleLanguageR:CRAN - the latest CRAN version
  • gcr.io/gcer-public/googleLanguageR:master - the latest GitHub master version
  • gcr.io/gcer-public/googleLanguageR:feature - a feature branch from GitHub

Usage

Authentication

The best way to authenticate is to use an environment file. See ?Startup. I usually place this in my home directory. (e.g. if using RStudio, click on Home in the file explorer, create a new TEXT file and call it .Renviron)

Set the file location of your downloaded Google Project JSON file in a GL_AUTH environment variable:

#.Renviron
GL_AUTH=location_of_json_file.json

Then, when you load the library you should auto-authenticate:

library(googleLanguageR)

You can also authenticate directly using the gl_auth function pointing at your JSON auth file:

library(googleLanguageR)
gl_auth("location_of_json_file.json")
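A misconfigured GL_AUTH is easy to miss, so it can help to check the path before authenticating. The following is a minimal sketch, not part of the package: `resolve_gl_auth()` is a hypothetical helper name, and it only assumes that GL_AUTH holds the path to the service account JSON file as described above.

``` r
# Hypothetical helper (not part of googleLanguageR): resolve the service
# account key path from GL_AUTH and fail early with a clear message.
resolve_gl_auth <- function(path = Sys.getenv("GL_AUTH")) {
  if (!nzchar(path)) {
    stop("GL_AUTH is not set; add it to .Renviron or pass a path explicitly")
  }
  if (!file.exists(path)) {
    stop("GL_AUTH points to a file that does not exist: ", path)
  }
  normalizePath(path, mustWork = TRUE)
}

# Usage, assuming googleLanguageR is installed and the key file exists:
# library(googleLanguageR)
# gl_auth(resolve_gl_auth())
```

Remember that changes to .Renviron only take effect after R is restarted.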

You can then call the APIs via the functions:

  • gl_nlp() - Natural Language API
  • gl_speech() - Cloud Speech-to-Text API
  • gl_translate() - Cloud Translation API
  • gl_talk() - Cloud Text-to-Speech API

Natural Language API

The Natural Language API returns several natural language understanding analyses. You can request them individually, or by default all of them are returned. The available analyses are:

  • Entity analysis - Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties. If possible, will also return metadata about that entity such as a Wikipedia URL. If using the v1beta2 endpoint this also includes sentiment for each entity.
  • Syntax - Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part of speech tags, dependency trees, and other properties.
  • Sentiment - The overall sentiment of the text, represented by a magnitude [0, +inf] and score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
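Score and magnitude are best read together: a score near zero with a low magnitude suggests neutral text, while a score near zero with a high magnitude suggests mixed feelings. The sketch below shows one way to turn the two numbers into a coarse label. `label_sentiment()` is a hypothetical helper, and the 0.25 and 0.5 cut-offs are illustrative assumptions rather than package or API defaults.

``` r
# Hypothetical helper (not part of googleLanguageR): map a score in [-1, 1]
# and a magnitude in [0, Inf) to a coarse sentiment label.
# The cut-offs are illustrative assumptions; tune them for your own data.
label_sentiment <- function(score, magnitude, score_cut = 0.25, mixed_cut = 0.5) {
  stopifnot(
    length(score) == length(magnitude),
    all(score >= -1 & score <= 1),
    all(magnitude >= 0)
  )
  ifelse(
    score >= score_cut, "positive",
    ifelse(
      score <= -score_cut, "negative",
      # Near-zero score: a high magnitude hints at mixed sentiment
      ifelse(magnitude >= mixed_cut, "mixed", "neutral")
    )
  )
}

labels <- label_sentiment(
  score     = c(0.8, -0.6, 0.0, 0.1),
  magnitude = c(0.9,  1.2, 3.0, 0.1)
)
print(labels)
# [1] "positive" "negative" "mixed"    "neutral"
```

The inputs here are made-up numbers; in practice they would come from the sentiment part of a gl_nlp() result.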

Demo for Entity Analysis

You can pass a vector of text which will call the API for each element. The return is a list of responses, each response being a list of tibbles holding the different types of analysis.

``` r
texts <- c("to administer medicine to animals is frequently a very difficult matter, and yet sometimes it's necessary to do so",
           "I don't know how to make a text demo that is sensible")
nlp_result <- gl_nlp(texts)

# two results of lists of tibbles
str(nlp_result, max.level = 2)
```

See more examples and details on the vignette("nlp", package = "googleLanguageR")

Google Translation API

You can detect the language via gl_translate_detect, or translate and detect the language in one call via gl_translate.

Note this is a lot more refined than the free version on Google’s translation website.

``` r
text <- "to administer medicine to animals is frequently a very difficult matter, and yet sometimes it's necessary to do so"

# translate British into Danish
gl_translate(text, target = "da")$translatedText
```

See more examples and details on the vignette("translate", package = "googleLanguageR")
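As a rough sketch of how detection and translation might be combined, the wrapper below calls both functions on the same text. `detect_then_translate()` is a hypothetical name, and the code only defines the function: actually calling it requires authentication and incurs API charges. The detection result is returned as-is, since its exact columns are not described here.

``` r
# Hypothetical wrapper (not part of googleLanguageR).
# Calling it needs prior authentication and uses the paid Translation API.
detect_then_translate <- function(text, target = "en") {
  # Language detection on its own
  detected <- googleLanguageR::gl_translate_detect(text)
  # Translation into the target language
  translated <- googleLanguageR::gl_translate(text, target = target)
  list(
    detected   = detected,
    translated = translated$translatedText
  )
}

# Usage, once authenticated:
# res <- detect_then_translate("Jeg taler ikke engelsk", target = "en")
# res$translated
```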

Google Cloud Speech-to-Text API

The Cloud Speech-to-Text API provides audio transcription. It's accessible via the gl_speech function.

A test audio file is installed with the package which reads:

“To administer medicine to animals is frequently a very difficult matter, and yet sometimes it’s necessary to do so”

The file is sourced from the University of Southampton’s speech detection (https://www-mobile.ecs.soton.ac.uk/newcomms/) group and is fairly difficult for computers to parse, as we see below:

``` r
# get the sample source file
test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR")

# it's not perfect but...:)
gl_speech(test_audio)$transcript

## # A tibble: 1 x 2
##   transcript                                                    confidence
##   <chr>                                                         <chr>     
## 1 to administer medicine to animals is frequency of very diffi… 0.9180294
```

See more examples and details on the vignette("speech", package = "googleLanguageR")

Google Cloud Text-to-Speech API

The Cloud Text-to-Speech API turns text into spoken audio files. It's accessible via the gl_talk function.

To use, supply your text to the function:

gl_talk("This is a talking computer. Hello Dave.")

See more examples and details on the vignette("text-to-speech", package = "googleLanguageR")
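The four functions can be chained, for example to transcribe audio, translate the transcript, and speak the translation. The sketch below is one possible arrangement under stated assumptions: `speech_to_translated_speech()` is a hypothetical name, the `output` and `languageCode` arguments of gl_talk are used as I understand them from the package documentation, and passing the translation target straight through as the voice language code is an assumption that may not hold for every language. The function is only defined here; calling it requires authentication and uses three paid APIs.

``` r
# Hypothetical pipeline (not part of googleLanguageR):
# audio file -> transcript -> translated text -> spoken audio file.
speech_to_translated_speech <- function(audio, target = "da",
                                        output = "translated.wav") {
  # 1. Transcribe; the transcript tibble has a 'transcript' column
  heard <- googleLanguageR::gl_speech(audio)$transcript$transcript

  # 2. Translate the transcript into the target language
  translated <- googleLanguageR::gl_translate(heard, target = target)$translatedText

  # 3. Speak the translation to an audio file.
  #    Assumption: the translation target is also a valid voice language code.
  googleLanguageR::gl_talk(translated, output = output, languageCode = target)
}

# Usage, once authenticated:
# test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR")
# speech_to_translated_speech(test_audio, target = "da")
```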

Owner

  • Name: rOpenSci
  • Login: ropensci
  • Kind: organization
  • Email: info@ropensci.org
  • Location: Berkeley, CA

GitHub Events

Total
  • Issues event: 3
  • Watch event: 3
  • Issue comment event: 8
  • Pull request event: 2
  • Fork event: 1
Last Year
  • Issues event: 3
  • Watch event: 3
  • Issue comment event: 8
  • Pull request event: 2
  • Fork event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 344
  • Total Committers: 9
  • Avg Commits per committer: 38.222
  • Development Distribution Score (DDS): 0.262
Past Year
  • Commits: 13
  • Committers: 2
  • Avg Commits per committer: 6.5
  • Development Distribution Score (DDS): 0.077
Top Committers
Name Email Commits
Mark Edmondson g****b@m****e 254
googleCloudRunner c****p@g****m 56
Aleksander Dietrichson d****n@g****m 12
muschellij2 m****2@g****m 9
Mark IIH m****k@i****m 6
smmurphy s****4@g****m 4
Maëlle Salmon m****n@y****e 1
David Dobolyi 4****d 1
Howard Baek 5****k 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 72
  • Total pull requests: 20
  • Average time to close issues: 3 months
  • Average time to close pull requests: 26 days
  • Total issue authors: 40
  • Total pull request authors: 10
  • Average comments per issue: 3.78
  • Average comments per pull request: 1.85
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 9 days
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • MarkEdmondson1234 (29)
  • LukasWallrich (3)
  • eric-kruger (2)
  • kgarnick (2)
  • iamserious (1)
  • GustavPeper (1)
  • dataning (1)
  • englianhu (1)
  • smmurphy (1)
  • xazip (1)
  • lohithmc44 (1)
  • davidmeza1 (1)
  • alovenegas (1)
  • caitlin91 (1)
  • albertostefanelli (1)
Pull Request Authors
  • MarkEdmondson1234 (5)
  • maelle (3)
  • muschellij2 (3)
  • retowyss (2)
  • smmurphy (2)
  • davedgd (1)
  • howardbaek (1)
  • cherylisabella (1)
  • dietrichson (1)
  • jeroen (1)
Top Labels
Issue Labels
enhancement (9) hacktoberfest (5) help wanted (3) documentation (3) question (1)
Pull Request Labels

Packages

  • Total packages: 3
  • Total downloads:
    • cran 1,229 last-month
  • Total docker downloads: 131,478
  • Total dependent packages: 3
    (may contain duplicates)
  • Total dependent repositories: 5
    (may contain duplicates)
  • Total versions: 14
  • Total maintainers: 1
proxy.golang.org: github.com/ropensci/googleLanguageR
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
proxy.golang.org: github.com/ropensci/googlelanguager
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
cran.r-project.org: googleLanguageR

Call Google's 'Natural Language' API, 'Cloud Translation' API, 'Cloud Speech' API and 'Cloud Text-to-Speech' API

  • Versions: 6
  • Dependent Packages: 3
  • Dependent Repositories: 5
  • Downloads: 1,229 Last month
  • Docker Downloads: 131,478
Rankings
Docker downloads count: 0.0%
Average: 9.2%
Dependent packages count: 10.9%
Downloads: 12.8%
Dependent repos count: 13.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3 depends
  • assertthat * imports
  • base64enc * imports
  • googleAuthR >= 1.1.1 imports
  • jsonlite * imports
  • magrittr * imports
  • purrr >= 0.2.4 imports
  • stats * imports
  • tibble * imports
  • utils * imports
  • cld2 * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • rvest * suggests
  • shiny * suggests
  • shinyjs * suggests
  • stringdist * suggests
  • testthat * suggests
  • tidyr * suggests
  • tuneR * suggests
  • xml2 * suggests
inst/shiny/capture_speech/DESCRIPTION cran
Dockerfile docker
  • gcr.io/mark-edmondson-gde/googleauthr latest build