https://github.com/introlab/audio_utils
ROS node and utilities for audio streams.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.1%) to scientific vocabulary
Repository
ROS node and utilities for audio streams.
Basic Info
- Host: GitHub
- Owner: introlab
- License: gpl-3.0
- Language: C++
- Default Branch: ros2
- Size: 1.49 MB
Statistics
- Stars: 13
- Watchers: 4
- Forks: 4
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
audio_utils
ROS2 nodes and utilities for audio streams.
For ROS1, please see the ros1 branch.
Author(s): Marc-Antoine Maheux
Setup (Ubuntu)
The following subsections explain how to use the library on Ubuntu.
Install Dependencies
```bash
sudo apt-get install cmake build-essential gfortran texinfo libasound2-dev libpulse-dev libgfortran-*-dev
```
Install Python Dependencies
```bash
sudo pip install -r requirements.txt
```
or
```bash
sudo pip3 install -r requirements.txt
```
Setup Submodules
```bash
git submodule update --init --recursive
```
Nodes
capture_node
This node captures the sound from an ALSA or PulseAudio device and publishes it to a topic.
Parameters
- `backend` (string): The backend to use (`alsa` or `pulse_audio`). The default value is `alsa`.
- `device` (string): The device to capture (ex: `hw:CARD=1,DEV=0` or `default` for ALSA, or `alsa_input.usb-IntRoLab_16SoundsUSB_Audio_2.0-00.multichannel-input` for PulseAudio). The default value is `default`.
- `format` (string): The audio format (see `audio_utils_msgs/AudioFrame`). The default value is `signed_16`.
- `channel_count` (int): The device channel count. The default value is `1`.
- `sampling_frequency` (int): The device sampling frequency. The default value is `16000`.
- `frame_sample_count` (int): The number of samples in each frame. The default value is `1024`.
- `merge` (bool): Whether to merge the channels. The default value is `false`.
- `gain` (double): The gain to apply. The default value is `1.0`.
- `latency_us` (int): The capture latency in microseconds. The default value is `64000`.
- `channel_map` (Array of string): The PulseAudio channel mapping. If empty or omitted, the default mapping is used. This parameter must be set only with the PulseAudio backend. The default value is `[]`.
- `queue_size` (int): The publisher queue size. The default value is `1`.
Published Topics
- `audio_out` (`audio_utils_msgs/AudioFrame`): The captured sound.
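The default values above are consistent with each other: one frame of 1024 samples at 16000 Hz lasts 64 ms, which matches the default `latency_us` of 64000. The following is a minimal sketch of that arithmetic, not code from the node. The function names are hypothetical, and the byte size assumes that `signed_16` stores each sample on 2 bytes.

```python
def frame_duration_us(frame_sample_count: int, sampling_frequency: int) -> float:
    """Return the duration of one audio frame in microseconds."""
    if sampling_frequency <= 0:
        raise ValueError("sampling_frequency must be positive")
    return frame_sample_count * 1_000_000 / sampling_frequency


def frame_size_bytes(frame_sample_count: int, channel_count: int, bytes_per_sample: int) -> int:
    """Return the payload size of one frame (assumption: fixed-size samples)."""
    return frame_sample_count * channel_count * bytes_per_sample


# Default capture_node values: 1024 samples per frame, 16000 Hz, 1 channel.
default_duration_us = frame_duration_us(1024, 16000)
# Assumption: signed_16 means 2 bytes per sample.
default_size_bytes = frame_size_bytes(1024, 1, 2)
print(default_duration_us, default_size_bytes)
```

When changing `frame_sample_count` or `sampling_frequency`, the same calculation gives a starting point for choosing `latency_us`.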
playback_node
This node receives the sound from a topic and plays it on an ALSA or PulseAudio device.
Parameters
- `backend` (string): The backend to use (`alsa` or `pulse_audio`). The default value is `alsa`.
- `device` (string): The device to play to (ex: `hw:CARD=1,DEV=0` or `default` for ALSA, or `alsa_input.usb-IntRoLab_16SoundsUSB_Audio_2.0-00.multichannel-input` for PulseAudio). The default value is `default`.
- `format` (string): The audio format (see `audio_utils_msgs/AudioFrame`). The default value is `signed_16`.
- `channel_count` (int): The device channel count. The default value is `1`.
- `sampling_frequency` (int): The device sampling frequency. The default value is `16000`.
- `frame_sample_count` (int): The number of samples in each frame. The default value is `1024`.
- `latency_us` (int): The playback latency in microseconds. The default value is `64000`.
- `channel_map` (Array of string): The PulseAudio channel mapping. If empty or omitted, the default mapping is used. This parameter must be set only with the PulseAudio backend. The default value is `[]`.
- `queue_size` (int): The subscriber queue size. The default value is `1`.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound to play.
beat_detector_node
This node estimates the song tempo and detects if the beat is in the current frame.
Parameters
- `sampling_frequency` (int): The device sampling frequency. The default value is `44100`.
- `frame_sample_count` (int): The number of samples in each analysed frame. It must be a multiple of `oss_fft_window_size`. The default value is `128`.
- `oss_fft_window_size` (int): The onset strength signal window size. It must be greater than or equal to `frame_sample_count`. The default value is `1024`.
- `flux_hamming_size` (int): The flux hamming window size to calculate the onset strength signal. The default value is `15`.
- `oss_bpm_window_size` (int): The onset strength signal window size to calculate the BPM value. The default value is `1024`.
- `min_bpm` (double): The minimum valid BPM value. The default value is `50`.
- `max_bpm` (double): The maximum valid BPM value. The default value is `180`.
- `bpm_candidate_count` (int): The number of cross-correlations to perform to find the best BPM. The default value is `10`.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound to analyze. The channel count must be 1.
Published Topics
- `bpm` (`std_msgs/Float32`): The tempo in BPM (beats per minute) for each frame.
- `beat` (`std_msgs/Bool`): Indicates whether the beat is in the current frame.
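Because `bpm` and `beat` are published for each frame, it can help to know how many frames separate two beats at a given tempo. The sketch below only illustrates that conversion with the default `sampling_frequency` and `frame_sample_count`. The function names are hypothetical and it does not reproduce the node's detection algorithm.

```python
def frames_per_beat(bpm: float, sampling_frequency: int = 44100, frame_sample_count: int = 128) -> float:
    """Return the number of frames between two consecutive beats."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    beat_period_s = 60.0 / bpm
    frame_duration_s = frame_sample_count / sampling_frequency
    return beat_period_s / frame_duration_s


def is_valid_bpm(bpm: float, min_bpm: float = 50.0, max_bpm: float = 180.0) -> bool:
    """Check a tempo against the min_bpm and max_bpm parameters."""
    return min_bpm <= bpm <= max_bpm


# At 120 BPM with the defaults, a beat occurs roughly every 172 frames.
print(frames_per_beat(120.0))
print(is_valid_bpm(200.0))
```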
vad_node
This node performs voice activity detection with Silero VAD. The models folder contains the model trained by Silero VAD. The license of the model is MIT.
Parameters
- `silence_to_voice_threshold` (double): The threshold to detect voice activity when silence was previously detected. The default value is `0.5`.
- `voice_to_silence_threshold` (double): The threshold to detect silence when voice activity was previously detected. It must be lower than `silence_to_voice_threshold`. The default value is `0.4`.
- `min_silence_duration_ms` (double): The minimum silence duration in ms. The default value is `500`.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound to analyze. The channel count must be 1. The sampling frequency must be 16000 Hz. The frame sample count must be a multiple of 512.
Published Topics
- `voice_activity` (`audio_utils_msgs/VoiceActivity`): The voice activity detection result.
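The two thresholds form a hysteresis: a higher value is needed to enter the voice state than to stay in it. The following sketch shows one way such a decision could work, assuming the model outputs one speech probability per 512-sample block (32 ms at 16000 Hz) and that `min_silence_duration_ms` is the time the probability must stay below `voice_to_silence_threshold` before switching back to silence. The class `VadHysteresis` is hypothetical and is not the node's implementation.

```python
class VadHysteresis:
    """Two-threshold voice activity decision with a minimum silence duration."""

    def __init__(self, silence_to_voice_threshold=0.5, voice_to_silence_threshold=0.4,
                 min_silence_duration_ms=500.0, frame_duration_ms=32.0):
        if voice_to_silence_threshold >= silence_to_voice_threshold:
            raise ValueError("voice_to_silence_threshold must be lower than silence_to_voice_threshold")
        self.silence_to_voice_threshold = silence_to_voice_threshold
        self.voice_to_silence_threshold = voice_to_silence_threshold
        self.min_silence_duration_ms = min_silence_duration_ms
        self.frame_duration_ms = frame_duration_ms  # Assumption: 512 samples at 16000 Hz
        self.is_voice = False
        self._silence_ms = 0.0

    def update(self, probability: float) -> bool:
        """Update the state with one speech probability and return the new state."""
        if not self.is_voice:
            if probability >= self.silence_to_voice_threshold:
                self.is_voice = True
                self._silence_ms = 0.0
        elif probability < self.voice_to_silence_threshold:
            # Accumulate silence and switch only once it has lasted long enough.
            self._silence_ms += self.frame_duration_ms
            if self._silence_ms >= self.min_silence_duration_ms:
                self.is_voice = False
        else:
            # The probability is still high enough: reset the silence timer.
            self._silence_ms = 0.0
        return self.is_voice
```

With these assumed semantics, a probability that oscillates between 0.4 and 0.5 never changes the current state, which avoids rapid toggling around a single threshold.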
format_conversion_node.py
This node converts the format of an audio topic.
Parameters
- `input_format` (string): The input audio format (see `audio_utils_msgs/AudioFrame`).
- `output_format` (string): The output audio format (see `audio_utils_msgs/AudioFrame`).
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound topic to convert.
Published Topics
- `audio_out` (`audio_utils_msgs/AudioFrame`): The converted sound.
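As an illustration of what a format conversion involves, the sketch below turns `signed_16` samples into normalized floating-point values. It assumes little-endian 16-bit samples and a scale factor of 32768. Both are assumptions for the example, and the function name is hypothetical rather than part of the package.

```python
import struct
from typing import List


def signed_16_to_float(data: bytes) -> List[float]:
    """Convert little-endian signed 16-bit samples to floats in [-1.0, 1.0)."""
    if len(data) % 2 != 0:
        raise ValueError("data length must be a multiple of 2 bytes")
    sample_count = len(data) // 2
    samples = struct.unpack(f"<{sample_count}h", data)
    # Assumption: full scale (32768) maps to an amplitude of 1.0.
    return [sample / 32768.0 for sample in samples]


raw = struct.pack("<3h", 0, 16384, -32768)
print(signed_16_to_float(raw))
```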
resampling_node.py
This node resamples an audio topic.
Parameters
- `input_format` (string): The input audio format (see `audio_utils_msgs/AudioFrame`).
- `output_format` (string): The output audio format (see `audio_utils_msgs/AudioFrame`).
- `channel_count` (int): The device channel count.
- `input_sampling_frequency` (int): The input sampling frequency.
- `output_sampling_frequency` (int): The output sampling frequency.
- `input_frame_sample_count` (int): The number of samples in each frame of the input.
- `dynamic_input_resampling` (bool): If `true`, always adjust the input sampling information (format, sampling frequency and frame sample count) dynamically to match the received frames. In this mode, `input_format`, `input_sampling_frequency` and `input_frame_sample_count` are not required, but they can be set to save a recomputation if the initial input sampling information is known. The default value is `false`.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound topic to resample.
Published Topics
- `audio_out` (`audio_utils_msgs/AudioFrame`): The resampled sound.
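To make the relation between the input and output sampling frequencies concrete, here is a minimal resampling sketch for a single channel. It uses plain linear interpolation without any anti-aliasing filter, which is a simplification chosen for the example and not necessarily the method used by the node. The function name is hypothetical.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], input_sampling_frequency: int,
                    output_sampling_frequency: int) -> List[float]:
    """Resample one channel by linear interpolation (no anti-aliasing filter)."""
    if input_sampling_frequency <= 0 or output_sampling_frequency <= 0:
        raise ValueError("sampling frequencies must be positive")
    if not samples:
        return []
    # The output holds the same duration of audio at the new sampling frequency.
    output_count = len(samples) * output_sampling_frequency // input_sampling_frequency
    step = input_sampling_frequency / output_sampling_frequency
    output = []
    for i in range(output_count):
        position = i * step
        left = int(position)
        right = min(left + 1, len(samples) - 1)
        fraction = position - left
        output.append(samples[left] * (1.0 - fraction) + samples[right] * fraction)
    return output


print(resample_linear([0.0, 1.0, 2.0, 3.0], 2, 4))
print(resample_linear([0.0, 1.0, 2.0, 3.0], 4, 2))
```

Note that the output frame sample count is generally not an integer multiple of the input one, for example when going from 44100 Hz to 16000 Hz.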
split_channel_node.py
This node splits a multichannel audio topic into several mono audio topics.
Parameters
- `input_format` (string): The input audio format (see `audio_utils_msgs/AudioFrame`).
- `output_format` (string): The output audio format (see `audio_utils_msgs/AudioFrame`).
- `channel_count` (int): The device channel count.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound topic to split.
Published Topics
- `audio_out_0` (`audio_utils_msgs/AudioFrame`): The first channel sound.
- `audio_out_1` (`audio_utils_msgs/AudioFrame`): The second channel sound.
- ...
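Splitting a multichannel frame amounts to de-interleaving its samples. The sketch below assumes the samples are interleaved (channel 0, channel 1, ..., then the next sample of each channel), which is an assumption about the frame layout. The function name is hypothetical.

```python
from typing import List, Sequence


def split_channels(interleaved: Sequence[float], channel_count: int) -> List[List[float]]:
    """Split interleaved samples into one list of samples per channel."""
    if channel_count <= 0:
        raise ValueError("channel_count must be positive")
    if len(interleaved) % channel_count != 0:
        raise ValueError("the sample count must be a multiple of channel_count")
    # Channel c owns every channel_count-th sample, starting at index c.
    return [list(interleaved[c::channel_count]) for c in range(channel_count)]


print(split_channels([1, 2, 3, 4, 5, 6], 2))
```

Each resulting list would correspond to one `audio_out_N` topic.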
raw_file_writer_node.py
This node writes the raw sound data to a file.
Parameters
- `output_path` (string): The output file path.
Subscribed Topics
- `audio_in` (`audio_utils_msgs/AudioFrame`): The sound topic to write.
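A raw file has no header, so the format and the channel count must be known to read it back. The following sketch reads such a file under the assumption that it contains little-endian, interleaved `signed_16` samples. The function name and the demo file are hypothetical.

```python
import os
import struct
import tempfile
from typing import List, Tuple


def read_raw_signed_16(path: str, channel_count: int) -> List[Tuple[int, ...]]:
    """Read a headerless file of interleaved signed 16-bit samples.

    Returns one tuple per sample instant, with one value per channel.
    """
    if channel_count <= 0:
        raise ValueError("channel_count must be positive")
    with open(path, "rb") as raw_file:
        data = raw_file.read()
    frame_size = 2 * channel_count
    usable_size = len(data) - len(data) % frame_size  # Ignore a truncated tail.
    samples = struct.unpack(f"<{usable_size // 2}h", data[:usable_size])
    return [samples[i:i + channel_count] for i in range(0, len(samples), channel_count)]


# Usage: write two stereo sample instants to a temporary file, then read them back.
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as demo_file:
    demo_file.write(struct.pack("<4h", 1, -1, 2, -2))
    demo_path = demo_file.name
frames = read_raw_signed_16(demo_path, channel_count=2)
os.remove(demo_path)
print(frames)
```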
License
Sponsor

IntRoLab - Intelligent / Interactive / Integrated / Interdisciplinary Robot Lab
Owner
- Name: IntRoLab
- Login: introlab
- Kind: organization
- Location: Sherbrooke, Québec, Canada
- Website: https://introlab.3it.usherbrooke.ca
- Repositories: 65
- Profile: https://github.com/introlab
IntRoLab - Intelligent / Interactive / Integrated / Interdisciplinary Robot Lab @ Université de Sherbrooke
GitHub Events
Total
- Watch event: 1
- Delete event: 2
- Push event: 2
- Pull request review event: 1
- Pull request event: 3
- Create event: 2
Last Year
- Watch event: 1
- Delete event: 2
- Push event: 2
- Pull request review event: 1
- Pull request event: 3
- Create event: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 4
- Total pull requests: 29
- Average time to close issues: 2 months
- Average time to close pull requests: 2 days
- Total issue authors: 3
- Total pull request authors: 3
- Average comments per issue: 1.0
- Average comments per pull request: 0.1
- Merged pull requests: 28
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: 11 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Gaurav37 (2)
- ghadj (1)
- philippewarren (1)
Pull Request Authors
- mamaheux (21)
- philippewarren (8)
- doumdi (4)
Dependencies
- numpy *
- scipy *