SpeziSpeech
Spezi Module to Support Speech-related Features Including Text-to-Speech and Speech-to-Text
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.1%) to scientific vocabulary
Repository
Spezi Module to Support Speech-related Features Including Text-to-Speech and Speech-to-Text
Basic Info
- Host: GitHub
- Owner: StanfordSpezi
- License: mit
- Language: Swift
- Default Branch: main
- Homepage: https://swiftpackageindex.com/StanfordSpezi/SpeziSpeech/documentation
- Size: 47.9 KB
Statistics
- Stars: 3
- Watchers: 10
- Forks: 1
- Open Issues: 2
- Releases: 7
Metadata Files
README.md
SpeziSpeech
Recognize and synthesize natural language speech.
Overview
The Spezi Speech component provides an easy and convenient way to recognize (speech-to-text) and synthesize (text-to-speech) natural language content, facilitating seamless interaction with an application. It builds on top of Apple's Speech and AVFoundation frameworks.
Setup
1. Add Spezi Speech as a Dependency
You need to add the Spezi Speech Swift package as a dependency of your app in Xcode or of your own Swift package.
[!IMPORTANT]
If your application is not yet configured to use Spezi, follow the Spezi setup article to set up the core Spezi infrastructure.
2. Configure the SpeechRecognizer and the SpeechSynthesizer in the Spezi Configuration
The SpeechRecognizer and SpeechSynthesizer modules need to be registered in a Spezi-based application using the configuration
in a SpeziAppDelegate:
```swift
class ExampleAppDelegate: SpeziAppDelegate {
    override var configuration: Configuration {
        Configuration {
            SpeechRecognizer()
            SpeechSynthesizer()
            // ...
        }
    }
}
```
[!NOTE]
You can learn more about a Module in the Spezi documentation.
3. Configure target properties
To ensure that your application has the necessary permissions for microphone access and speech recognition, follow the steps below to configure the target properties within your Xcode project:
- Open your project settings in Xcode by selecting PROJECTNAME > TARGETNAME > Info tab.
- You will need to add two entries to the Custom iOS Target Properties (i.e., the Info.plist file) to provide descriptions for why your app requires these permissions:
  - Add a key named Privacy - Microphone Usage Description and provide a string value that describes why your application needs access to the microphone. This description will be displayed to the user when the app first requests microphone access.
  - Add another key named Privacy - Speech Recognition Usage Description with a string value that explains why your app requires the speech recognition capability. This will be presented to the user when the app first attempts to perform speech recognition.

These entries are mandatory for apps that utilize microphone and speech recognition features. Failing to provide them will result in your app being unable to access these features.
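In the raw Info.plist source, these two human-readable keys correspond to `NSMicrophoneUsageDescription` and `NSSpeechRecognitionUsageDescription`. A minimal fragment could look like this (the description strings below are placeholders; write your own, user-facing explanations):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app uses the microphone to record your speech for transcription.</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>This app transcribes your recorded speech into text.</string>
```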
Example
SpeechTestView provides a demonstration of the capabilities of Spezi Speech.
It showcases the interaction with the SpeechRecognizer to provide speech-to-text capabilities and the SpeechSynthesizer to generate speech from text.
```swift
struct SpeechTestView: View {
    // Get the `SpeechRecognizer` and `SpeechSynthesizer` from the SwiftUI `Environment`.
    @Environment(SpeechRecognizer.self) private var speechRecognizer
    @Environment(SpeechSynthesizer.self) private var speechSynthesizer
    // The transcribed message from the user's voice input.
    @State private var message = ""

    var body: some View {
        VStack {
            // Button used to start and stop recording by triggering the `microphoneButtonPressed()` function.
            Button("Record") {
                microphoneButtonPressed()
            }
                .padding(.bottom)
            // Button used to start and stop playback of the transcribed message by triggering the `playbackButtonPressed()` function.
            Button("Playback") {
                playbackButtonPressed()
            }
                .padding(.bottom)
            Text(message)
        }
    }

    private func microphoneButtonPressed() {
        if speechRecognizer.isRecording {
            // If speech is currently recognized, stop the transcribing.
            speechRecognizer.stop()
        } else {
            // If the recognizer is idle, start a new recording.
            Task {
                do {
                    // The `speechRecognizer.start()` function returns an `AsyncThrowingStream` that yields the transcribed text.
                    for try await result in speechRecognizer.start() {
                        // Access the string-based result of the transcribed result.
                        message = result.bestTranscription.formattedString
                    }
                } catch {
                    // Handle errors thrown by the recognizer stream.
                }
            }
        }
    }

    private func playbackButtonPressed() {
        if speechSynthesizer.isSpeaking {
            // If speech is currently synthesized, pause the playback.
            speechSynthesizer.pause()
        } else {
            // If the synthesizer is idle, start the text-to-speech functionality.
            speechSynthesizer.speak(message)
        }
    }
}
```
SpeziSpeech also supports selecting voices, including personal voices.
The following example shows how a user can be given a choice of voices in their current locale and the selected voice can be used to synthesize speech.
```swift
struct SpeechVoiceSelectionExample: View {
    @Environment(SpeechSynthesizer.self) private var speechSynthesizer
    @State private var selectedVoiceIndex = 0
    @State private var message = ""

    var body: some View {
        VStack {
            TextField("Enter text to be spoken", text: $message)
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .padding()
            Picker("Voice", selection: $selectedVoiceIndex) {
                ForEach(speechSynthesizer.voices.indices, id: \.self) { index in
                    Text(speechSynthesizer.voices[index].name)
                        .tag(index)
                }
            }
                .pickerStyle(.inline)
                .accessibilityIdentifier("voicePicker")
                .padding()
            Button("Speak") {
                speechSynthesizer.speak(
                    message,
                    voice: speechSynthesizer.voices[selectedVoiceIndex]
                )
            }
        }
            .padding()
    }
}
```
Personal voices are supported on iOS 17 and above. Users must first create a personal voice. Using personal voices also requires obtaining authorization from the user. To request access to any available personal voices, you can use the getPersonalVoices() method of the SpeechSynthesizer. Personal voices will then become available alongside system voices.
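The flow described above could be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes `getPersonalVoices()` can be triggered from a button action and that, once the user grants authorization, personal voices appear in the synthesizer's `voices` array alongside system voices; consult the API documentation for the exact signature and behavior.

```swift
struct PersonalVoiceExample: View {
    @Environment(SpeechSynthesizer.self) private var speechSynthesizer
    @State private var message = ""

    var body: some View {
        VStack {
            TextField("Enter text to be spoken", text: $message)
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .padding()
            // Requests authorization for personal voices (iOS 17+).
            // Once granted, personal voices become available in `voices`.
            Button("Request Personal Voices") {
                speechSynthesizer.getPersonalVoices()
            }
                .padding(.bottom)
            Button("Speak") {
                // Uses the default voice; a personal voice from `voices`
                // could be passed via the `voice:` parameter instead.
                speechSynthesizer.speak(message)
            }
        }
            .padding()
    }
}
```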
For more information, please refer to the API documentation.
Contributing
Contributions to this project are welcome. Please make sure to read the contribution guidelines and the contributor covenant code of conduct first.
License
This project is licensed under the MIT License. See Licenses for more information.

Owner
- Name: Stanford Spezi
- Login: StanfordSpezi
- Kind: organization
- Repositories: 1
- Profile: https://github.com/StanfordSpezi
Citation (CITATION.cff)
```yaml
#
# This source file is part of the Stanford Spezi open source project
#
# SPDX-FileCopyrightText: 2023 Stanford University and the project authors (see CONTRIBUTORS.md)
#
# SPDX-License-Identifier: MIT
#
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Schmiedmayer"
  given-names: "Paul"
  orcid: "https://orcid.org/0000-0002-8607-9148"
- family-names: "Zagar"
  given-names: "Philipp"
  orcid: "https://orcid.org/0009-0001-5934-2078"
- family-names: "Ravi"
  given-names: "Vishnu"
  orcid: "https://orcid.org/0000-0003-0359-1275"
- family-names: "Bauer"
  given-names: "Andreas"
  orcid: "https://orcid.org/0000-0002-1680-237X"
- family-names: "Adrit"
  given-names: "Rao"
  orcid: "https://orcid.org/0000-0002-0780-033X"
- family-names: "Aalami"
  given-names: "Oliver"
  orcid: "https://orcid.org/0000-0002-7799-2429"
title: "SpeziSpeech"
doi: 10.5281/zenodo.10146086
url: "https://github.com/StanfordSpezi/SpeziSpeech"
```
GitHub Events
Total
- Create event: 4
- Issues event: 2
- Release event: 2
- Watch event: 1
- Delete event: 3
- Issue comment event: 6
- Member event: 1
- Push event: 4
- Pull request review comment event: 4
- Pull request review event: 7
- Pull request event: 4
- Fork event: 1
Last Year
- Create event: 4
- Issues event: 2
- Release event: 2
- Watch event: 1
- Delete event: 3
- Issue comment event: 6
- Member event: 1
- Push event: 4
- Pull request review comment event: 4
- Pull request review event: 7
- Pull request event: 4
- Fork event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 3
- Total pull requests: 10
- Average time to close issues: 10 months
- Average time to close pull requests: 7 days
- Total issue authors: 2
- Total pull request authors: 4
- Average comments per issue: 0.67
- Average comments per pull request: 1.5
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: about 6 hours
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 1.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- PSchmiedmayer (1)
- philippzagar (1)
Pull Request Authors
- philippzagar (6)
- PSchmiedmayer (2)
- vishnuravi (2)
- Supereg (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 8
swiftpackageindex.com: github.com/StanfordSpezi/SpeziSpeech
Spezi Module to Support Speech-related Features Including Text-to-Speech and Speech-to-Text
- Homepage: https://swiftpackageindex.com/StanfordSpezi/SpeziSpeech/documentation
- Documentation: https://swiftpackageindex.com/StanfordSpezi/SpeziSpeech/documentation
- License: mit
- Latest release: 1.2.0 (published over 1 year ago)