whisper_ros

Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2

https://github.com/mgonzs13/whisper_ros

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.3%) to scientific vocabulary

Keywords

asr automatic-speech-recognition ggml ros2 speech-recognition speech-to-text vad voice-activity-detection whisper whisper-cpp

Keywords from Contributors

rerank audio vlm espeak pyaudio text-to-speech tts embeddings gguf langchain
Last synced: 6 months ago

Repository

Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2

Basic Info
  • Host: GitHub
  • Owner: mgonzs13
  • License: mit
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 1.94 MB
Statistics
  • Stars: 81
  • Watchers: 7
  • Forks: 19
  • Open Issues: 0
  • Releases: 49
Created almost 3 years ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.md

whisper_ros

This repository provides a set of ROS 2 packages that integrate whisper.cpp into ROS 2 using audio_common 4.0.6. In addition, silero-vad is used to perform VAD (Voice Activity Detection).

[![License: MIT](https://img.shields.io/badge/GitHub-MIT-informational)](https://opensource.org/license/mit)
[![GitHub release](https://img.shields.io/github/release/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/releases)
[![Code Size](https://img.shields.io/github/languages/code-size/mgonzs13/whisper_ros.svg?branch=main)](https://github.com/mgonzs13/whisper_ros?branch=main)
[![Last Commit](https://img.shields.io/github/last-commit/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/commits/main)
[![GitHub issues](https://img.shields.io/github/issues/mgonzs13/whisper_ros)](https://github.com/mgonzs13/whisper_ros/issues)
[![GitHub pull requests](https://img.shields.io/github/issues-pr/mgonzs13/whisper_ros)](https://github.com/mgonzs13/whisper_ros/pulls)
[![Contributors](https://img.shields.io/github/contributors/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/graphs/contributors)
[![Python Formatter Check](https://github.com/mgonzs13/whisper_ros/actions/workflows/python-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/python-formatter.yml?branch=main)
[![C++ Formatter Check](https://github.com/mgonzs13/whisper_ros/actions/workflows/cpp-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/cpp-formatter.yml?branch=main)

| ROS 2 Distro | Branch | Build status | Docker Image | Documentation |
| :----------: | :----: | :----------: | :----------: | :-----------: |
| **Humble** | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Humble Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/humble-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/humble-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-humble-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=humble) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |
| **Iron** | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Iron Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/iron-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/iron-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-iron-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=iron) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |
| **Jazzy** | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Jazzy Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/jazzy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/jazzy-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-jazzy-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=jazzy) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |
| **Kilted** | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Kilted Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/kilted-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/kilted-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-kilted-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=kilted) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |
| **Rolling** | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Rolling Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/rolling-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/rolling-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-rolling-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=rolling) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |

Table of Contents

  1. Related Projects
  2. Installation
  3. Docker
  4. Usage
  5. Demos

Related Projects

  • chatbot_ros → This chatbot, integrated into ROS 2, uses whisper_ros to listen to people's speech and llama_ros to generate responses. The chatbot is controlled by a state machine created with YASMIN.

Installation

To run whisper_ros with CUDA, you must first install the CUDA Toolkit. To run SileroVAD with ONNX and CUDA, you must also install cuDNN.

    cd ~/ros2_ws/src
    git clone https://github.com/mgonzs13/audio_common.git
    git clone https://github.com/mgonzs13/whisper_ros.git
    cd ~/ros2_ws
    rosdep install --from-paths src --ignore-src -r -y
    colcon build --cmake-args -DGGML_CUDA=ON -DONNX_GPU=ON # To use CUDA on Whisper and on Silero, respectively
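The two CMake flags in the build command above each target one component, so it can help to choose them separately. The following is a minimal sketch and not part of the repository: `whisper_ros_build_cmd`, `WHISPER_CUDA`, and `SILERO_CUDA` are hypothetical names introduced here, and the sketch assumes that omitting both flags gives a CPU-only build. The helper only prints the command so that it can be checked before running it.

```shell
# Hypothetical helper (not part of whisper_ros): prints the colcon build
# command for the requested CUDA options.
# WHISPER_CUDA=1 adds -DGGML_CUDA=ON (CUDA for Whisper).
# SILERO_CUDA=1  adds -DONNX_GPU=ON  (CUDA for Silero).
whisper_ros_build_cmd() {
  cmake_args=""
  if [ "${WHISPER_CUDA:-0}" = "1" ]; then
    cmake_args="$cmake_args -DGGML_CUDA=ON"
  fi
  if [ "${SILERO_CUDA:-0}" = "1" ]; then
    cmake_args="$cmake_args -DONNX_GPU=ON"
  fi
  if [ -n "$cmake_args" ]; then
    echo "colcon build --cmake-args$cmake_args"
  else
    # Assumption: with no flags, the packages build without CUDA.
    echo "colcon build"
  fi
}

# Print the command for a build with CUDA enabled for Whisper only.
WHISPER_CUDA=1 whisper_ros_build_cmd
```

Run the printed command from `~/ros2_ws` after the `rosdep install` step.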

Docker

Build the whisper_ros Docker image. Optionally, you can build whisper_ros with CUDA (USE_CUDA) and choose the CUDA version (CUDA_VERSION). Remember to set DOCKER_BUILDKIT=0 when building the image so that whisper_ros compiles with CUDA.

    DOCKER_BUILDKIT=0 docker build -t whisper_ros --build-arg USE_CUDA=1 --build-arg CUDA_VERSION=12-6 .

Run the Docker container. To use CUDA, you must install the NVIDIA Container Toolkit and add --gpus all.

    docker run -it --rm --gpus all whisper_ros
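The build and run commands above both target a CUDA setup. As a rough sketch, the hypothetical helper below (`whisper_ros_docker_cmds` is not part of the repository) prints the matching pair of commands for either mode. It assumes that building without the `USE_CUDA` and `CUDA_VERSION` build arguments produces a CPU-only image, which the README does not state explicitly.

```shell
# Hypothetical helper (not part of whisper_ros): prints the docker build and
# run commands for "gpu" or "cpu" mode. The second argument is the CUDA
# version in the hyphenated form used above (default: 12-6).
whisper_ros_docker_cmds() {
  mode="${1:-cpu}"
  cuda_version="${2:-12-6}"
  if [ "$mode" = "gpu" ]; then
    echo "DOCKER_BUILDKIT=0 docker build -t whisper_ros --build-arg USE_CUDA=1 --build-arg CUDA_VERSION=$cuda_version ."
    echo "docker run -it --rm --gpus all whisper_ros"
  else
    # Assumption: without the build arguments the image is built without CUDA.
    echo "docker build -t whisper_ros ."
    echo "docker run -it --rm whisper_ros"
  fi
}

# Print the commands for a CUDA build.
whisper_ros_docker_cmds gpu
```

The `gpu` mode still requires the NVIDIA Container Toolkit on the host.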

Usage

Run Silero for VAD and Whisper for STT:

    ros2 launch whisper_bringup whisper.launch.py

Add the parameter silero_vad_use_cuda:=True to use Silero with CUDA.
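Put together, the launch command with the CUDA parameter would look like the sketch below. The `LAUNCH_CMD` variable and the check for `ros2` on the `PATH` are additions for illustration only. The launch file and parameter names are the ones given above.

```shell
# Launch command with Silero running on CUDA (parameter named in the text above).
LAUNCH_CMD="ros2 launch whisper_bringup whisper.launch.py silero_vad_use_cuda:=True"

if command -v ros2 >/dev/null 2>&1; then
  # Left unquoted on purpose so the shell splits the string into arguments.
  $LAUNCH_CMD
else
  # ros2 is only on the PATH after the ROS 2 environment has been sourced.
  echo "ros2 not found; source your ROS 2 workspace, then run: $LAUNCH_CMD"
fi
```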

Demos

Send an action goal to listen:

    ros2 action send_goal /whisper/listen whisper_msgs/action/STT "{}"

Or try the example of a whisper client:

    ros2 run whisper_demos whisper_demo_node
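The goal can only be accepted once the launch file from the Usage section is running. A small sketch of a guard, using the standard `ros2 action list` command, is shown below. `has_action` and `ACTION` are hypothetical names introduced here and are not part of whisper_ros.

```shell
# Hypothetical helper: succeeds if the action name given as $1 appears as a
# whole line in the list read from stdin.
has_action() {
  grep -qxF "$1"
}

ACTION="/whisper/listen"

if command -v ros2 >/dev/null 2>&1 && ros2 action list | has_action "$ACTION"; then
  # Same goal as in the demo above.
  ros2 action send_goal "$ACTION" whisper_msgs/action/STT "{}"
else
  echo "Action $ACTION is not available; start whisper.launch.py first."
fi
```

Matching the whole line avoids a false positive from another action whose name merely contains `/whisper/listen`.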

Owner

  • Name: Miguel Ángel González Santamarta
  • Login: mgonzs13
  • Kind: user
  • Location: León
  • Company: University of León

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "González-Santamarta"
    given-names: "Miguel Á."
title: "whisper_ros"
date-released: 2023-05-01
url: "https://github.com/mgonzs13/whisper_ros"

GitHub Events

Total
  • Create event: 16
  • Issues event: 1
  • Release event: 13
  • Watch event: 35
  • Delete event: 3
  • Issue comment event: 2
  • Push event: 78
  • Pull request event: 9
  • Fork event: 6
Last Year
  • Create event: 16
  • Issues event: 1
  • Release event: 13
  • Watch event: 35
  • Delete event: 3
  • Issue comment event: 2
  • Push event: 78
  • Pull request event: 9
  • Fork event: 6

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 249
  • Total Committers: 5
  • Avg Commits per committer: 49.8
  • Development Distribution Score (DDS): 0.02
Past Year
  • Commits: 144
  • Committers: 3
  • Avg Commits per committer: 48.0
  • Development Distribution Score (DDS): 0.021
Top Committers
Name Email Commits
Miguel Ángel González Santamarta m****s@u****s 244
Alejandro González 5****4 2
Matt Williamson m****t@a****m 1
Jiuguang Wang j****w@g****m 1
Alberto Tudela a****a@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 10
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 3
  • Total pull request authors: 6
  • Average comments per issue: 0.67
  • Average comments per pull request: 0.1
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 4
  • Average time to close issues: N/A
  • Average time to close pull requests: about 12 hours
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.25
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • racket405 (1)
  • okdhryk (1)
  • josemgaleas (1)
Pull Request Authors
  • jiuguangw (2)
  • agonzc34 (2)
  • mattwilliamson (2)
  • Hiromasa56 (2)
  • ajtudela (1)
  • IS0moto (1)