https://github.com/gradio-app/fastrtc

The python library for real-time communication

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 37 committers (2.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.6%) to scientific vocabulary

Keywords

artificial-intelligence llm python real-time speech-to-text text-to-speech

Keywords from Contributors

ui-components python-notebook interface transformer gradio-interface gradio vlm speech-recognition qwen pytorch-transformers
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: gradio-app
  • License: mit
  • Language: JavaScript
  • Default Branch: main
  • Homepage: https://fastrtc.org/
  • Size: 6.55 MB
Statistics
  • Stars: 4,286
  • Watchers: 38
  • Forks: 393
  • Open Issues: 53
  • Releases: 0
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

FastRTC

The Real-Time Communication Library for Python.

Turn any Python function into a real-time audio and video stream over WebRTC or WebSockets.

Installation

```bash
pip install fastrtc
```

To use the built-in pause detection (see ReplyOnPause) and text-to-speech (see Text To Speech), install the vad and tts extras:

```bash
pip install "fastrtc[vad, tts]"
```
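After installing, a quick check that the package is importable can save a confusing traceback later. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of fastrtc, and it only checks top-level module names.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


# Report whether the fastrtc package is visible to this interpreter.
print("fastrtc installed:", is_installed("fastrtc"))
```

The same helper can be pointed at any other top-level module your handler depends on.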

Key Features

  • 🗣️ Automatic Voice Detection and Turn Taking - Built in, so you only need to write the logic for responding to the user.
  • 💻 Automatic UI - Use the .ui.launch() method to launch the built-in, WebRTC-enabled Gradio UI.
  • 🔌 Automatic WebRTC Support - Use the .mount(app) method to mount the stream on a FastAPI app and get a WebRTC endpoint for your own frontend.
  • ⚡️ WebSocket Support - Use the .mount(app) method to mount the stream on a FastAPI app and get a WebSocket endpoint for your own frontend.
  • 📞 Automatic Telephone Support - Use the fastphone() method of the stream to launch the application and get a free temporary phone number.
  • 🤖 Completely Customizable Backend - A Stream can be mounted on a FastAPI app, so you can extend it to fit your production application. See the Talk To Claude demo for an example of how to serve a custom JS frontend.

Docs

https://fastrtc.org

Examples

See the Cookbook for examples of how to use the library.

🗣️👀 Gemini Audio Video Chat

Stream BOTH your webcam video and audio feeds to Google Gemini. You can also upload images to augment your conversation!

Demo | Code

🗣️ Google Gemini Real Time Voice API

Talk to Gemini in real time using Google's voice API.

Demo | Code

🗣️ OpenAI Real Time Voice API

Talk to ChatGPT in real time using OpenAI's voice API.

Demo | Code

🤖 Hello Computer

Say "computer" before asking your question!

Demo | Code

🤖 Llama Code Editor

Create and edit HTML pages with just your voice! Powered by SambaNova Systems.

Demo | Code

🗣️ Talk to Claude

Use the Anthropic and Play.Ht APIs to have an audio conversation with Claude.

Demo | Code

🎵 Whisper Transcription

Have Whisper transcribe your speech in real time!

Demo | Code

📷 YOLOv10 Object Detection

Run the YOLOv10 model on a user webcam stream in real time!

Demo | Code

🗣️ Kyutai Moshi

Kyutai's Moshi is a novel speech-to-speech model for modeling human conversations.

Demo | Code

🗣️ Hello Llama: Stop Word Detection

A code editor built with Llama 3.3 70B that is triggered by the phrase "Hello Llama". Build a Siri-like coding assistant in 100 lines of code!

Demo | Code

Usage

This is a shortened version of the official usage guide.

  • .ui.launch(): Launch a built-in UI for easily testing and sharing your stream. Built with Gradio.
  • .fastphone(): Get a free temporary phone number to call into your stream. Hugging Face token required.
  • .mount(app): Mount the stream on a FastAPI app. Perfect for integrating with your already existing production system.
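
The three entry points listed above can be wrapped in one small dispatcher, which is handy when the same script should run locally with the UI and in production behind FastAPI. This is only a sketch: `run_stream` and its `target` values are hypothetical names, and the stand-in `fake_stream` object exists solely so the example runs without fastrtc installed. With a real Stream, the same calls would go to `.ui.launch()`, `.fastphone()`, and `.mount(app)`.

```python
from types import SimpleNamespace


def run_stream(stream, target, app=None):
    """Start `stream` through one of the three entry points named above."""
    if target == "ui":
        return stream.ui.launch()
    if target == "phone":
        return stream.fastphone()
    if target == "fastapi":
        if app is None:
            raise ValueError("target='fastapi' needs a FastAPI app")
        return stream.mount(app)
    raise ValueError(f"unknown target: {target!r}")


# Stand-in object (an assumption for illustration) that records which
# method was called instead of starting a real server.
calls = []
fake_stream = SimpleNamespace(
    ui=SimpleNamespace(launch=lambda: calls.append("ui.launch")),
    fastphone=lambda: calls.append("fastphone"),
    mount=lambda app: calls.append(("mount", app)),
)

run_stream(fake_stream, "ui")
run_stream(fake_stream, "fastapi", app="my-app")
print(calls)
```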

Quickstart

Echo Audio

```python
from fastrtc import Stream, ReplyOnPause
import numpy as np


def echo(audio: tuple[int, np.ndarray]):
    # The function will be passed the audio until the user pauses
    # Implement any iterator that yields audio
    # See "LLM Voice Chat" for a more complete example
    yield audio


stream = Stream(
    handler=ReplyOnPause(echo),
    modality="audio",
    mode="send-receive",
)
```
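
Because the handler is an ordinary generator over `(sample_rate, array)` tuples, it can be exercised locally with synthetic audio before it is wrapped in ReplyOnPause. The sketch below yields the input back in fixed-duration pieces rather than all at once. `echo_in_chunks` and `chunk_ms` are hypothetical names, and the `(1, n_samples)` mono layout is an assumption borrowed from the LLM Voice Chat example, not a documented requirement.

```python
import numpy as np


def echo_in_chunks(audio, chunk_ms=20):
    """Yield the incoming audio back in fixed-duration pieces."""
    sample_rate, samples = audio
    samples = samples.reshape(1, -1)  # normalise mono audio to (1, n_samples)
    step = max(1, sample_rate * chunk_ms // 1000)  # samples per chunk
    for start in range(0, samples.shape[1], step):
        yield sample_rate, samples[:, start:start + step]


# One second of silence at 24 kHz as a synthetic input.
fake_audio = (24000, np.zeros((1, 24000), dtype=np.int16))
chunks = list(echo_in_chunks(fake_audio))
print(len(chunks), chunks[0][1].shape)
```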

LLM Voice Chat

```py
from fastrtc import (
    ReplyOnPause, AdditionalOutputs, Stream,
    audio_to_bytes, aggregate_bytes_to_16bit
)
import gradio as gr
import numpy as np
from groq import Groq
import anthropic
from elevenlabs import ElevenLabs

groq_client = Groq()
claude_client = anthropic.Anthropic()
tts_client = ElevenLabs()


# See "Talk to Claude" in Cookbook for an example of how to keep
# track of the chat history.
def response(
    audio: tuple[int, np.ndarray],
):
    # Transcribe the user's speech
    prompt = groq_client.audio.transcriptions.create(
        file=("audio-file.mp3", audio_to_bytes(audio)),
        model="whisper-large-v3-turbo",
        response_format="verbose_json",
    ).text
    # Generate a text reply
    response = claude_client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    response_text = " ".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )
    # Stream the reply back as 24 kHz PCM audio
    iterator = tts_client.text_to_speech.convert_as_stream(
        text=response_text,
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id="eleven_multilingual_v2",
        output_format="pcm_24000"
    )
    for chunk in aggregate_bytes_to_16bit(iterator):
        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)
        yield (24000, audio_array)


stream = Stream(
    modality="audio",
    mode="send-receive",
    handler=ReplyOnPause(response),
)
```
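
In the example above, each chunk is decoded with `np.frombuffer(chunk, dtype=np.int16)`, which raises an error unless the buffer length is a multiple of two bytes. The name `aggregate_bytes_to_16bit` suggests it regroups the streamed bytes so that this holds. The following is a stand-in sketch of that idea under that assumption, not fastrtc's implementation; `even_byte_chunks` is a hypothetical name.

```python
def even_byte_chunks(byte_iter):
    """Regroup byte chunks so each yielded chunk has an even length."""
    leftover = b""
    for chunk in byte_iter:
        data = leftover + chunk
        usable = len(data) - (len(data) % 2)  # largest even prefix
        if usable:
            yield data[:usable]
        leftover = data[usable:]  # at most one byte carried to the next chunk
    # A trailing odd byte cannot form a 16-bit sample and is dropped.


# Chunks of odd sizes, as a network stream might deliver them.
pieces = [b"\x01", b"\x00\x02\x00\x03", b"\x00"]
regrouped = list(even_byte_chunks(pieces))
print(regrouped)
```

Every regrouped chunk can then be passed to `np.frombuffer(..., dtype=np.int16)` safely.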

Webcam Stream

```python
from fastrtc import Stream
import numpy as np


def flip_vertically(image):
    # Reverse the row order so the frame appears upside down
    return np.flip(image, axis=0)


stream = Stream(
    handler=flip_vertically,
    modality="video",
    mode="send-receive",
)
```
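
To see what `axis=0` does, it helps to run `np.flip` on a tiny array. This sketch assumes a frame is laid out as rows first (height, width, channels), which is the usual convention for image arrays but is not stated in the source.

```python
import numpy as np

# 2x2 stand-in for a single-channel image, rows listed top to bottom.
frame = np.array([[1, 2], [3, 4]])
flipped = np.flip(frame, axis=0)  # reverses the order of the rows
print(flipped.tolist())

# A colour frame keeps its shape; only the row order changes.
colour = np.zeros((4, 6, 3), dtype=np.uint8)
colour[0] = 255  # mark the top row
flipped_colour = np.flip(colour, axis=0)
print(flipped_colour.shape, int(flipped_colour[-1].max()))
```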

Object Detection

```python
from fastrtc import Stream
import gradio as gr
import cv2
from huggingface_hub import hf_hub_download
from .inference import YOLOv10

model_file = hf_hub_download(
    repo_id="onnx-community/yolov10n", filename="onnx/model.onnx"
)

# git clone https://huggingface.co/spaces/fastrtc/object-detection
# for YOLOv10 implementation
model = YOLOv10(model_file)


def detection(image, conf_threshold=0.3):
    image = cv2.resize(image, (model.input_width, model.input_height))
    new_image = model.detect_objects(image, conf_threshold)
    return cv2.resize(new_image, (500, 500))


stream = Stream(
    handler=detection,
    modality="video",
    mode="send-receive",
    additional_inputs=[
        gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)
    ]
)
```

Running the Stream

Run:

Gradio

```py
stream.ui.launch()
```

Telephone (Audio Only)

```py
stream.fastphone()
```

FastAPI

```py
from fastapi import FastAPI
from fastapi.responses import HTMLResponse

app = FastAPI()
stream.mount(app)


# Optional: Add routes
@app.get("/")
async def _():
    return HTMLResponse(content=open("index.html").read())

# uvicorn app:app --host 0.0.0.0 --port 8000
```

GitHub Events

Total
  • Create event: 32
  • Release event: 7
  • Issues event: 83
  • Watch event: 645
  • Delete event: 5
  • Issue comment event: 204
  • Push event: 147
  • Pull request review event: 13
  • Pull request review comment event: 8
  • Pull request event: 59
  • Fork event: 66
Last Year
  • Create event: 32
  • Release event: 7
  • Issues event: 83
  • Watch event: 645
  • Delete event: 5
  • Issue comment event: 204
  • Push event: 147
  • Pull request review event: 13
  • Pull request review comment event: 8
  • Pull request event: 59
  • Fork event: 66

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 224
  • Total Committers: 37
  • Avg Commits per committer: 6.054
  • Development Distribution Score (DDS): 0.411
Past Year
  • Commits: 224
  • Committers: 37
  • Avg Commits per committer: 6.054
  • Development Distribution Score (DDS): 0.411
Top Committers
Name Email Commits
Freddy Boulton a****n@g****m 132
Freddy Boulton 4****n@u****m 37
Marcus Valtonen Örnhag m****n@g****m 8
Sourabh S****1@g****m 5
Václav Volhejn 8****n@u****m 5
Dawood Khan d****2@g****m 2
Freddy Boulton f****n@h****l 2
Lucain l****p@g****m 2
Mahimai Raja m****3@g****m 2
Sofia Casadei 6****4@u****m 2
Aki Miyazaki a****3@g****m 1
AlbertMingXu a****u@g****m 1
AleksanderWWW a****z@g****m 1
Aman Chauhan a****0@g****m 1
Derek d****n@g****m 1
EasyTop 1****p@u****m 1
Erik Wasmosy e****8@g****m 1
Fabien F****u@u****m 1
Marcus Valtonen Örnhag m****g@e****m 1
MechanicCoder 6****g@u****m 1
Michael Hart m****t@c****m 1
Michael Hart m****u@g****m 1
Mohamed Ted Meftah 7****h@u****m 1
Rohan Richard 6****d@u****m 1
Ryan Ellis 5****z@u****m 1
Shane Blair 3****2@u****m 1
Shaon Debnath s****2@g****m 1
Shubham Rasal 9****l@u****m 1
Siddharth Garg s****2@g****m 1
Sofian Mejjoute 7****5@u****m 1
and 7 more...

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 68
  • Total pull requests: 68
  • Average time to close issues: 6 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 54
  • Total pull request authors: 12
  • Average comments per issue: 1.69
  • Average comments per pull request: 0.76
  • Merged pull requests: 54
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 68
  • Pull requests: 68
  • Average time to close issues: 6 days
  • Average time to close pull requests: 1 day
  • Issue authors: 54
  • Pull request authors: 12
  • Average comments per issue: 1.69
  • Average comments per pull request: 0.76
  • Merged pull requests: 54
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ANYMS-A (4)
  • BajaMexico (3)
  • mohamed99akram (3)
  • sofi444 (2)
  • thomaschhh (2)
  • marcusvaltonen (2)
  • Tobsad (2)
  • AndreasKarasenko (2)
  • Borzyszkowski (2)
  • weynechen (2)
  • Redcoder007 (1)
  • yondonfu (1)
  • rybakov-ks (1)
  • msxpwr (1)
  • sblair12 (1)
Pull Request Authors
  • freddyaboulton (50)
  • dawoodkhan82 (3)
  • leopardracer (2)
  • Shubham-Rasal (2)
  • mahimairaja (2)
  • omahs (2)
  • sofi444 (2)
  • tedmeftah (1)
  • FabienDanieau (1)
  • amanchauhan11 (1)
  • sblair12 (1)
  • AlbertMingXu (1)
Top Labels
Issue Labels
good first issue (1) question (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 74,317 last-month
    • npm 12 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 95
  • Total maintainers: 1
npmjs.org: @freddyaboulton/fastrtc-component

Gradio UI packages

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 12 Last month
Rankings
Dependent repos count: 24.6%
Average: 30.0%
Dependent packages count: 35.5%
Maintainers (1)
Last synced: 4 months ago
pypi.org: fastrtc

The realtime communication library for Python

  • Versions: 94
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 74,317 Last month
Rankings
Dependent packages count: 9.7%
Average: 32.1%
Dependent repos count: 54.6%
Maintainers (1)
Last synced: 5 months ago

Dependencies

frontend/package-lock.json npm
  • 452 dependencies
frontend/package.json npm
  • @gradio/preview 0.12.0 development
  • @ffmpeg/ffmpeg ^0.12.10
  • @ffmpeg/util ^0.12.1
  • @gradio/atoms 0.9.0
  • @gradio/client 1.6.0
  • @gradio/icons 0.8.0
  • @gradio/image 0.16.0
  • @gradio/markdown ^0.10.0
  • @gradio/statustracker 0.8.0
  • @gradio/upload 0.13.0
  • @gradio/utils 0.7.0
  • @gradio/wasm 0.14.0
  • hls.js ^1.5.16
  • mrmime ^2.0.0
demo/requirements.txt pypi
  • onnxruntime-gpu *
  • opencv-python *
  • safetensors ==0.4.3
  • twilio *
pyproject.toml pypi
  • aiortc *
  • gradio >=4.0,<6.0