https://github.com/introlab/audio_utils

ROS node and utilities for audio streams.

Repository

Basic Info
  • Host: GitHub
  • Owner: introlab
  • License: gpl-3.0
  • Language: C++
  • Default Branch: ros2
  • Size: 1.49 MB
Statistics
  • Stars: 13
  • Watchers: 4
  • Forks: 4
  • Open Issues: 1
  • Releases: 0
Created over 5 years ago · Last pushed about 1 year ago
README.md

audio_utils

ROS2 nodes and utilities for audio streams.

For ROS1, please see the ros1 branch.

Author(s): Marc-Antoine Maheux

Setup (Ubuntu)

The following subsections explain how to use the library on Ubuntu.

Install Dependencies

    sudo apt-get install cmake build-essential gfortran texinfo libasound2-dev libpulse-dev libgfortran-*-dev

Install Python Dependencies

    sudo pip install -r requirements.txt

or

    sudo pip3 install -r requirements.txt

Setup Submodules

    git submodule update --init --recursive

Nodes

capture_node

This node captures the sound from an ALSA or PulseAudio device and publishes it to a topic.

Parameters

  • backend (string): The backend to use (alsa or pulse_audio). The default value is alsa.
  • device (string): The device to capture (ex: hw:CARD=1,DEV=0 or default for ALSA, or alsa_input.usb-IntRoLab_16SoundsUSB_Audio_2.0-00.multichannel-input for PulseAudio). The default value is default.
  • format (string): The audio format (see audio_utils_msgs/AudioFrame). The default value is signed_16.
  • channel_count (int): The device channel count. The default value is 1.
  • sampling_frequency (int): The device sampling frequency. The default value is 16000.
  • frame_sample_count (int): The number of samples in each frame. The default value is 1024.
  • merge (bool): Indicates whether to merge the channels. The default value is false.
  • gain (double): The gain to apply. The default value is 1.0.
  • latency_us (int): The capture latency in microseconds. The default value is 64000.
  • channel_map (Array of string): The PulseAudio channel mapping. If empty or omitted, the default mapping is used. This parameter must be set only with the PulseAudio backend. The default value is [].
  • queue_size (int): The publisher queue size. The default value is 1.
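
As a quick sanity check on the defaults above, the duration of one frame follows directly from frame_sample_count and sampling_frequency. The sketch below is plain Python arithmetic, not code from the package, and the helper name frame_duration_us is hypothetical. With the defaults (1024 samples at 16000 Hz), one frame lasts 64000 microseconds, the same figure as the latency_us default; the README does not say whether the two values are meant to match.

```python
def frame_duration_us(frame_sample_count: int, sampling_frequency: int) -> float:
    """Return the duration of one audio frame in microseconds."""
    if frame_sample_count <= 0 or sampling_frequency <= 0:
        raise ValueError("frame_sample_count and sampling_frequency must be positive")
    # Multiply before dividing to keep the result exact for typical values.
    return frame_sample_count * 1_000_000 / sampling_frequency


# Defaults listed for capture_node: 1024 samples per frame at 16000 Hz.
default_frame_us = frame_duration_us(1024, 16000)
print(default_frame_us)
```

A longer frame or a lower sampling frequency increases the time between published messages in the same proportion.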

Published Topics

playback_node

This node reads the sound from a topic and plays it on an ALSA or PulseAudio device.

Parameters

  • backend (string): The backend to use (alsa or pulse_audio). The default value is alsa.
  • device (string): The device to play to (ex: hw:CARD=1,DEV=0 or default for ALSA, or alsa_input.usb-IntRoLab_16SoundsUSB_Audio_2.0-00.multichannel-input for PulseAudio). The default value is default.
  • format (string): The audio format (see audio_utils_msgs/AudioFrame). The default value is signed_16.
  • channel_count (int): The device channel count. The default value is 1.
  • sampling_frequency (int): The device sampling frequency. The default value is 16000.
  • frame_sample_count (int): The number of samples in each frame. The default value is 1024.
  • latency_us (int): The playback latency in microseconds. The default value is 64000.
  • channel_map (Array of string): The PulseAudio channel mapping. If empty or omitted, the default mapping is used. This parameter must be set only with the PulseAudio backend. The default value is [].
  • queue_size (int): The subscriber queue size. The default value is 1.

Subscribed Topics

beat_detector_node

This node estimates the song tempo and detects if the beat is in the current frame.

Parameters

  • sampling_frequency (int): The device sampling frequency. The default value is 44100.
  • frame_sample_count (int): The number of samples in each analysed frame. oss_fft_window_size must be a multiple of this value. The default value is 128.
  • oss_fft_window_size (int): The onset strength signal window size. It must be greater than or equal to frame_sample_count. The default value is 1024.
  • flux_hamming_size (int): The flux hamming window size to calculate the onset strength signal. The default value is 15.
  • oss_bpm_window_size (int): The onset strength signal window size to calculate the BPM value. The default value is 1024.
  • min_bpm (double): The minimum valid BPM value. The default value is 50.
  • max_bpm (double): The maximum valid BPM value. The default value is 180.
  • bpm_candidate_count (int): The number of cross-correlations to perform to find the best BPM. The default value is 10.

Subscribed Topics

Published Topics

  • bpm (std_msgs/Float32): The tempo in bpm (beats per minute) for each frame.
  • beat (std_msgs/Bool): Indicates whether a beat is in the current frame.
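
To give a feel for how often the beat topic can be true, the sketch below converts a tempo into a spacing measured in frames. It is an illustration under the default sampling_frequency and frame_sample_count listed above, not code from the node, and the helper name frames_per_beat is hypothetical.

```python
def frames_per_beat(
    bpm: float,
    sampling_frequency: int = 44100,
    frame_sample_count: int = 128,
) -> float:
    """Return how many analysed frames elapse between two consecutive beats."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    if sampling_frequency <= 0 or frame_sample_count <= 0:
        raise ValueError("sampling_frequency and frame_sample_count must be positive")
    frames_per_second = sampling_frequency / frame_sample_count
    return frames_per_second * 60.0 / bpm


# Spacing at the default BPM bounds (min_bpm=50, max_bpm=180) and at 120 BPM.
for tempo in (50.0, 120.0, 180.0):
    print(tempo, frames_per_beat(tempo))
```

At 120 BPM with the defaults, a beat falls in roughly one frame out of every 172, so most beat messages are expected to be false.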

vad_node

This node performs voice activity detection with Silero VAD. The models folder contains the model trained by Silero VAD. The license of the model is MIT.

Parameters

  • silence_to_voice_threshold (double): The threshold to detect voice activity when silence was previously detected. The default value is 0.5.
  • voice_to_silence_threshold (double): The threshold to detect silence when voice activity was previously detected. It must be lower than silence_to_voice_threshold. The default value is 0.4.
  • min_silence_duration_ms (double): The minimum silence duration in ms. The default value is 500.
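
The three parameters above describe a hysteresis: a higher threshold to enter the voice state, a lower one to leave it, and a minimum silence duration before leaving. The following is a minimal sketch of how such parameters can combine, assuming one speech probability per 512-sample frame at 16000 Hz (32 ms). The class HysteresisVad and its comparison rules are assumptions for illustration; the node's actual logic may differ.

```python
from dataclasses import dataclass


@dataclass
class HysteresisVad:
    """Hypothetical two-threshold voice activity state machine."""

    silence_to_voice_threshold: float = 0.5
    voice_to_silence_threshold: float = 0.4
    min_silence_duration_ms: float = 500.0
    frame_duration_ms: float = 32.0  # 512 samples at 16000 Hz
    is_voice: bool = False
    silence_ms: float = 0.0

    def update(self, probability: float) -> bool:
        """Consume one per-frame speech probability and return the new state."""
        if not self.is_voice:
            # In silence: only the higher threshold can start a voice segment.
            if probability >= self.silence_to_voice_threshold:
                self.is_voice = True
                self.silence_ms = 0.0
        elif probability < self.voice_to_silence_threshold:
            # In voice: accumulate silence until it has lasted long enough.
            self.silence_ms += self.frame_duration_ms
            if self.silence_ms >= self.min_silence_duration_ms:
                self.is_voice = False
        else:
            # Still above the lower threshold: reset the silence timer.
            self.silence_ms = 0.0
        return self.is_voice


vad = HysteresisVad()
states = [vad.update(p) for p in (0.45, 0.6, 0.45, 0.1)]
print(states)
```

Keeping the two thresholds apart avoids rapid toggling when the probability hovers around a single cut-off.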

Subscribed Topics

  • audio_in (audio_utils_msgs/AudioFrame): The sound to analyze. The channel count must be 1. The sampling frequency must be 16000 Hz. The frame sample count must be a multiple of 512.

Published Topics

format_conversion_node.py

This node converts the format of an audio topic.

Parameters

Subscribed Topics

Published Topics

resampling_node.py

This node resamples an audio topic.

Parameters

  • input_format (string): The input audio format (see audio_utils_msgs/AudioFrame).
  • output_format (string): The output audio format (see audio_utils_msgs/AudioFrame).
  • channel_count (int): The device channel count.
  • input_sampling_frequency (int): The input sampling frequency.
  • output_sampling_frequency (int): The output sampling frequency.
  • input_frame_sample_count (int): The number of samples in each frame of the input.
  • dynamic_input_resampling (bool): If true, the node dynamically adjusts the input sampling information (format, sampling frequency and frame sample count) to match the received frames. In this mode, input_format, input_sampling_frequency and input_frame_sample_count are not required, but providing them saves a recomputation when the initial input sampling information is known. The default value is false.
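
The input and output sampling frequencies fix the ratio between input and output frame sizes. The sketch below works this ratio out with exact fractions; it is standalone arithmetic with hypothetical helper names, and how the node handles a non-integer output frame size is not documented here.

```python
from fractions import Fraction


def resampling_ratio(input_sampling_frequency: int, output_sampling_frequency: int) -> Fraction:
    """Return the output/input sampling frequency ratio in lowest terms."""
    if input_sampling_frequency <= 0 or output_sampling_frequency <= 0:
        raise ValueError("sampling frequencies must be positive")
    return Fraction(output_sampling_frequency, input_sampling_frequency)


def output_frame_sample_count(
    input_frame_sample_count: int,
    input_sampling_frequency: int,
    output_sampling_frequency: int,
) -> Fraction:
    """Return the exact (possibly fractional) number of output samples per input frame."""
    if input_frame_sample_count <= 0:
        raise ValueError("input_frame_sample_count must be positive")
    ratio = resampling_ratio(input_sampling_frequency, output_sampling_frequency)
    return input_frame_sample_count * ratio


# 44100 Hz to 16000 Hz reduces to 160/441.
print(resampling_ratio(44100, 16000))
# 960 input samples at 48000 Hz map to a whole number of samples at 16000 Hz.
print(output_frame_sample_count(960, 48000, 16000))
```

Choosing an input frame size that is a multiple of the ratio's denominator (441 in the first example, 3 in the second) gives a whole number of output samples per frame.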

Subscribed Topics

Published Topics

split_channel_node.py

This node splits a multichannel audio topic into several mono audio topics.
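
As an illustration of the operation, the sketch below separates an interleaved signed_16 buffer into one buffer per channel. It assumes that samples are interleaved (channel 0, channel 1, ..., then the next sample of each channel) and stored in native byte order; neither point is stated in this README, and the function name split_channels is hypothetical.

```python
from array import array
from typing import List


def split_channels(data: bytes, channel_count: int) -> List[bytes]:
    """Split interleaved 16-bit signed samples into one mono buffer per channel."""
    if channel_count <= 0:
        raise ValueError("channel_count must be positive")
    samples = array("h")  # 16-bit signed integers, native byte order
    samples.frombytes(data)
    if len(samples) % channel_count != 0:
        raise ValueError("data does not contain a whole number of multichannel samples")
    # Channel c owns every channel_count-th sample, starting at index c.
    return [samples[c::channel_count].tobytes() for c in range(channel_count)]


interleaved = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
left, right = split_channels(interleaved, 2)
print(array("h", left).tolist(), array("h", right).tolist())
```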

Parameters

Subscribed Topics

Published Topics

raw_file_writer_node.py

This node writes the raw sound data to a file.

Parameters

  • output_path (string): The output file path.
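
A raw file carries no header, so a reader has to know the format, channel count and sampling frequency that were in use. The sketch below wraps raw bytes in a WAV container with Python's standard wave module so that common tools can open them. It assumes the data is interleaved, little-endian signed_16; the function name raw_to_wav_bytes is hypothetical and nothing here is part of the package.

```python
import io
import wave


def raw_to_wav_bytes(
    raw: bytes,
    channel_count: int,
    sampling_frequency: int,
    sample_width: int = 2,
) -> bytes:
    """Return a WAV file image containing the given raw PCM data."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channel_count)
        wav_file.setsampwidth(sample_width)  # 2 bytes per sample for signed_16
        wav_file.setframerate(sampling_frequency)
        wav_file.writeframes(raw)
    return buffer.getvalue()


# Eight mono samples at 16000 Hz, as they might appear in a raw capture.
wav_bytes = raw_to_wav_bytes(b"\x00\x00\x01\x00" * 4, channel_count=1, sampling_frequency=16000)
with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    print(reader.getnchannels(), reader.getframerate(), reader.getnframes())
```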

Subscribed Topics

License

Sponsor

IntRoLab

IntRoLab - Intelligent / Interactive / Integrated / Interdisciplinary Robot Lab

Owner

  • Name: IntRoLab
  • Login: introlab
  • Kind: organization
  • Location: Sherbrooke, Québec, Canada

IntRoLab - Intelligent / Interactive / Integrated / Interdisciplinary Robot Lab @ Université de Sherbrooke

GitHub Events

Total
  • Watch event: 1
  • Delete event: 2
  • Push event: 2
  • Pull request review event: 1
  • Pull request event: 3
  • Create event: 2
Last Year
  • Watch event: 1
  • Delete event: 2
  • Push event: 2
  • Pull request review event: 1
  • Pull request event: 3
  • Create event: 2

Issues and Pull Requests

All Time
  • Total issues: 4
  • Total pull requests: 29
  • Average time to close issues: 2 months
  • Average time to close pull requests: 2 days
  • Total issue authors: 3
  • Total pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.1
  • Merged pull requests: 28
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 11 minutes
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Gaurav37 (2)
  • ghadj (1)
  • philippewarren (1)
Pull Request Authors
  • mamaheux (21)
  • philippewarren (8)
  • doumdi (4)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • numpy *
  • scipy *
setup.py pypi