audio_common
A PortAudio based audio_common with text to speech for ROS 2
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.3%) to scientific vocabulary
Statistics
- Stars: 17
- Watchers: 2
- Forks: 13
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
audio_capture
This repository provides a set of ROS 2 packages for audio. It includes C++ nodes that capture and play audio data using PortAudio.
Installation
```shell
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build
```
Docker
You can build a Docker image to test audio_common. Run the following command from the root directory of the audio_common repository.

```shell
docker build -t audio_common .
```

After the image is built, start a Docker container with the following command.

```shell
docker run -it --rm --device /dev/snd audio_common
```
Nodes
audio_capturer_node
Node to obtain audio data from a microphone and publish it into the audio topic.
#### Parameters

- **format**: Specifies the audio format to be used for capturing. Possible values are:
  - `1` (paFloat32 - 32-bit floating point)
  - `2` (paInt32 - 32-bit integer)
  - `8` (paInt16 - 16-bit integer)
  - `16` (paInt8 - 8-bit integer)
  - `32` (paUInt8 - 8-bit unsigned integer)

  Default: `8` (paInt16). The integer values correspond to PortAudio sample format flags.
- **channels**: The number of audio channels to capture. Typically, `1` for mono and `2` for stereo. Default: `1`
- **rate**: The sample rate, that is, how many samples per second are captured. Default: `16000`
- **chunk**: The size of each audio frame. Default: `512`
- **device**: The ID of the audio input device. A value of `-1` indicates that the default audio input device should be used. Default: `-1`
- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic to publish the audio data captured from the microphone. Type: `audio_common_msgs/msg/AudioStamped`

audio_player_node
Node to play the audio data obtained from the audio topic.
#### Parameters

- **channels**: The number of audio channels to play. Typically, `1` for mono and `2` for stereo. Default: `2`
  - The node automatically handles conversion between mono and stereo formats if needed.
- **device**: The ID of the audio output device. A value of `-1` indicates that the default audio output device should be used. Default: `-1`

#### ROS 2 Interfaces

- **audio**: Topic subscriber to get the audio data to be played. Type: `audio_common_msgs/msg/AudioStamped`

music_node
Node to play music from audio files in wav format.
#### Parameters

- **chunk**: The size of each audio frame. Default: `2048`
- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic to publish the audio data from the files. Type: `audio_common_msgs/msg/AudioStamped`
- **music_play**: Service to play audio files. Type: `audio_common_msgs/srv/MusicPlay`
  - Parameters:
    - `audio`: Name of a built-in audio sample (e.g., "elevator")
    - `file_path`: Path to a custom WAV file (ignored if audio is specified)
    - `loop`: Boolean to indicate if the audio should loop. Default: `false`
- **music_stop**: Service to stop the currently playing music. Type: `std_srvs/srv/Trigger`
- **music_pause**: Service to pause the currently playing music. Type: `std_srvs/srv/Trigger`
- **music_resume**: Service to resume paused music. Type: `std_srvs/srv/Trigger`

tts_node
Node to generate audio from text (TTS) using espeak.
#### Parameters

- **chunk**: The size of each audio frame. Default: `4096`
- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic publisher to send the audio data generated by the TTS. Type: `audio_common_msgs/msg/AudioStamped`
- **say**: Action to generate audio data from a text. Type: `audio_common_msgs/action/TTS`
  - Goal:
    - `text`: The text to convert to speech
    - `language`: The language to use for speech synthesis. Default: `"en"`
    - `volume`: The volume of the generated speech (0.0-1.0). Default: `1.0`
    - `rate`: The speech rate (1.0 is normal speed). Default: `1.0`
  - Feedback:
    - `audio`: The audio being currently played
  - Result:
    - `text`: The text that was converted to speech

Demos
Audio Capturer/Player
```shell
ros2 run audio_common audio_capturer_node
```

```shell
ros2 run audio_common audio_player_node
```
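
With the default parameters, this demo captures mono audio (`channels: 1`) and plays it on a stereo output (`channels: 2`). The following Python sketch illustrates what those defaults imply. It is not code from the package: it assumes that `chunk` counts frames (samples per channel), and the `mono_to_stereo` and `stereo_to_mono` helpers show one common way to convert between layouts, which may differ from what the player node actually does.

```python
# Byte width of one sample for each PortAudio format flag listed
# under the `format` parameter of audio_capturer_node.
SAMPLE_WIDTH_BYTES = {
    1: 4,   # paFloat32
    2: 4,   # paInt32
    8: 2,   # paInt16
    16: 1,  # paInt8
    32: 1,  # paUInt8
}


def chunk_stats(fmt=8, channels=1, rate=16000, chunk=512):
    """Return (duration in ms, payload size in bytes) of one chunk.

    Assumption: `chunk` is the number of frames, i.e. samples per channel.
    """
    duration_ms = 1000.0 * chunk / rate
    size_bytes = chunk * channels * SAMPLE_WIDTH_BYTES[fmt]
    return duration_ms, size_bytes


def mono_to_stereo(samples):
    """Duplicate each mono sample into an interleaved left/right pair."""
    stereo = []
    for sample in samples:
        stereo.extend((sample, sample))
    return stereo


def stereo_to_mono(samples):
    """Average each interleaved left/right pair into a single sample."""
    return [
        (samples[i] + samples[i + 1]) / 2
        for i in range(0, len(samples) - 1, 2)
    ]


if __name__ == "__main__":
    duration_ms, size_bytes = chunk_stats()
    print(f"default chunk: {duration_ms} ms, {size_bytes} bytes")
    print(mono_to_stereo([1, 2, 3]))
```

Under these assumptions, a larger `chunk` means fewer, bigger messages on the audio topic at the cost of added latency per message.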
TTS
```shell
ros2 run audio_common tts_node
```

```shell
ros2 run audio_common audio_player_node
```

```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"
```
Advanced TTS example with additional parameters:
```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}"
```
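
When the goal is generated by a script rather than typed by hand, it can help to build the goal string programmatically. The sketch below is a hypothetical helper, not part of audio_common: `build_tts_goal` and `send_goal_argv` are illustrative names. It fills in the documented defaults, checks the documented `volume` range, and serializes the goal as JSON, which is also valid YAML flow syntax.

```python
import json


def build_tts_goal(text, language="en", volume=1.0, rate=1.0):
    """Serialize a TTS goal using the defaults documented for the `say` action."""
    # The README documents volume as a value in the range 0.0-1.0.
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    goal = {"text": text, "language": language, "volume": volume, "rate": rate}
    return json.dumps(goal)


def send_goal_argv(goal):
    """Return the argument list for the `ros2 action send_goal` call shown above."""
    return [
        "ros2", "action", "send_goal",
        "/say", "audio_common_msgs/action/TTS", goal,
    ]


if __name__ == "__main__":
    goal = build_tts_goal("Hello World", language="en-us", volume=0.8, rate=1.2)
    # Print the command instead of running it, so no ROS 2 install is required.
    print(send_goal_argv(goal))
```

The argument list can be passed to `subprocess.run` on a machine where ROS 2 and the `tts_node` are running.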
Music Player
```shell
ros2 run audio_common music_node
```

```shell
ros2 run audio_common audio_player_node
```
Play a built-in sample:

```shell
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"
```

Play a custom WAV file:

```shell
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{file_path: '/path/to/your/file.wav'}"
```

Play with looping enabled:

```shell
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator', loop: true}"
```

Control playback:

```shell
ros2 service call /music_pause std_srvs/srv/Trigger "{}"
ros2 service call /music_resume std_srvs/srv/Trigger "{}"
ros2 service call /music_stop std_srvs/srv/Trigger "{}"
```
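
The `music_play` request documents that `file_path` is ignored when `audio` is set. The Python sketch below models that precedence rule for client-side scripts. It is an illustration, not the node's implementation: `resolve_music_source` and `music_play_request` are hypothetical names, and rejecting a request with neither field set is an assumption, since the README does not say how the node handles that case.

```python
import json


def resolve_music_source(audio="", file_path=""):
    """Decide which source a MusicPlay request refers to.

    `audio` (a built-in sample name) takes priority over `file_path`.
    """
    if audio:
        return "builtin", audio
    if file_path:
        return "file", file_path
    # Assumption: an empty request is treated as an error on the client side.
    raise ValueError("set either audio or file_path")


def music_play_request(audio="", file_path="", loop=False):
    """Serialize the request fields as JSON, which is also valid YAML."""
    resolve_music_source(audio, file_path)  # reject empty requests early
    return json.dumps({"audio": audio, "file_path": file_path, "loop": loop})


if __name__ == "__main__":
    print(resolve_music_source(audio="elevator", file_path="/path/to/your/file.wav"))
    print(music_play_request(audio="elevator", loop=True))
```

The string returned by `music_play_request` can be used as the last argument of the `ros2 service call /music_play` commands above.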
Owner
- Name: Miguel Ángel González Santamarta
- Login: mgonzs13
- Kind: user
- Location: León
- Company: University of León
- Twitter: miggsant
- Repositories: 2
- Profile: https://github.com/mgonzs13
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "González-Santamarta"
    given-names: "Miguel Á."
title: "audio_common"
date-released: 2023-05-06
url: "https://github.com/mgonzs13/audio_common"
GitHub Events
Total
- Create event: 9
- Issues event: 3
- Release event: 7
- Watch event: 12
- Delete event: 1
- Issue comment event: 5
- Push event: 38
- Pull request event: 7
- Fork event: 7
Last Year
- Create event: 9
- Issues event: 3
- Release event: 7
- Watch event: 12
- Delete event: 1
- Issue comment event: 5
- Push event: 38
- Pull request event: 7
- Fork event: 7
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Miguel Ángel González Santamarta | m****s@u****s | 82 |
| Claas de Boer | d****v@c****s | 3 |
| Jiuguang Wang | j****w@g****m | 1 |
| Alberto Tudela | a****a@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 7
- Average time to close issues: about 1 hour
- Average time to close pull requests: 1 day
- Total issue authors: 2
- Total pull request authors: 4
- Average comments per issue: 1.5
- Average comments per pull request: 0.43
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 5
- Average time to close issues: about 1 hour
- Average time to close pull requests: 2 days
- Issue authors: 2
- Pull request authors: 2
- Average comments per issue: 1.5
- Average comments per pull request: 0.6
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- cdboer (1)
- swiz23 (1)
Pull Request Authors
- cdboer (5)
- agonzc34 (2)
- ajtudela (2)
- jiuguangw (1)
- mgonzs13 (1)
Dependencies
- numpy *
- pyaudio *
- tts *