https://github.com/gradio-app/fastrtc

The Python library for real-time communication

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 37 committers (2.7%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.6%) to scientific vocabulary

Keywords

artificial-intelligence llm python real-time speech-to-text text-to-speech

Keywords from Contributors

ui-components python-notebook interface transformer gradio-interface gradio vlm speech-recognition qwen pytorch-transformers

Last synced: 4 months ago

Repository

The Python library for real-time communication

Basic Info

Host: GitHub
Owner: gradio-app
License: mit
Language: JavaScript
Default Branch: main
Homepage: https://fastrtc.org/
Size: 6.55 MB

Statistics

Stars: 4,286
Watchers: 38
Forks: 393
Open Issues: 53
Releases: 0

Topics

artificial-intelligence llm python real-time speech-to-text text-to-speech

Created over 1 year ago · Last pushed 6 months ago

Metadata Files

Readme License

FastRTC

The Real-Time Communication Library for Python.

Turn any Python function into a real-time audio and video stream over WebRTC or WebSockets.

Installation

```bash
pip install fastrtc
```

To use built-in pause detection (see ReplyOnPause) and text-to-speech (see Text To Speech), install the vad and tts extras:

```bash
pip install "fastrtc[vad, tts]"
```

Key Features

🗣️ Automatic Voice Detection and Turn Taking - Built in, so you only need to worry about the logic for responding to the user.
💻 Automatic UI - Use the .ui.launch() method to launch the WebRTC-enabled built-in Gradio UI.
🔌 Automatic WebRTC Support - Use the .mount(app) method to mount the stream on a FastAPI app and get a WebRTC endpoint for your own frontend!
⚡️ WebSocket Support - Use the .mount(app) method to mount the stream on a FastAPI app and get a WebSocket endpoint for your own frontend!
📞 Automatic Telephone Support - Use the fastphone() method of the stream to launch the application and get a free temporary phone number!
🤖 Completely Customizable Backend - A Stream can be mounted on a FastAPI app, so you can extend it to fit your production application. See the Talk To Claude demo for an example of how to serve a custom JS frontend.

Docs

https://fastrtc.org

Examples

See the Cookbook for examples of how to use the library.

🗣️👀 Gemini Audio Video Chat

Stream BOTH your webcam video and audio feeds to Google Gemini. You can also upload images to augment your conversation!

Demo | Code

🗣️ Google Gemini Real Time Voice API

Talk to Gemini in real time using Google's voice API.

Demo | Code

🗣️ OpenAI Real Time Voice API

Talk to ChatGPT in real time using OpenAI's voice API.

Demo | Code

🤖 Hello Computer

Say "computer" before asking your question!

Demo | Code

🤖 Llama Code Editor

Create and edit HTML pages with just your voice! Powered by SambaNova systems.

Demo | Code

🗣️ Talk to Claude

Use the Anthropic and Play.Ht APIs to have an audio conversation with Claude.

Demo | Code

🎵 Whisper Transcription

Have Whisper transcribe your speech in real time!

Demo | Code

📷 YOLOv10 Object Detection

Run the YOLOv10 model on a user webcam stream in real time!

Demo | Code

🗣️ Kyutai Moshi

Kyutai's Moshi is a novel speech-to-speech model for modeling human conversations.

Demo | Code

🗣️ Hello Llama: Stop Word Detection

A code editor built with Llama 3.3 70b that is triggered by the phrase "Hello Llama". Build a Siri-like coding assistant in 100 lines of code!

Demo | Code

Usage

This is a shortened version of the official usage guide.

.ui.launch(): Launch a built-in UI for easily testing and sharing your stream. Built with Gradio.
.fastphone(): Get a free temporary phone number to call into your stream. Hugging Face token required.
.mount(app): Mount the stream on a FastAPI app. Perfect for integrating with your already existing production system.

Quickstart

Echo Audio

```python
from fastrtc import Stream, ReplyOnPause
import numpy as np

def echo(audio: tuple[int, np.ndarray]):
    # The function will be passed the audio until the user pauses
    # Implement any iterator that yields audio
    # See "LLM Voice Chat" for a more complete example
    yield audio

stream = Stream(
    handler=ReplyOnPause(echo),
    modality="audio",
    mode="send-receive",
)
```
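
Because the handler is a generator, it can also yield its reply in several pieces rather than all at once. The following is a minimal sketch of that idea, assuming the audio array is shaped (1, num_samples) like the array yielded in the LLM Voice Chat example. The names `echo_in_chunks`, `chunk_ms`, and `build_stream` are hypothetical, and the Stream wiring sits inside a function so the chunking logic can be tried without fastrtc installed.

```python
import numpy as np


def echo_in_chunks(audio: tuple[int, np.ndarray], chunk_ms: int = 500):
    """Yield the received audio back in pieces of roughly chunk_ms each.

    Assumes the array can be viewed as (1, num_samples); chunk_ms is a
    hypothetical parameter, not part of the library.
    """
    sample_rate, samples = audio
    samples = samples.reshape(1, -1)
    # Number of samples that fit in one chunk (at least one).
    chunk_len = max(1, sample_rate * chunk_ms // 1000)
    for start in range(0, samples.shape[1], chunk_len):
        yield sample_rate, samples[:, start:start + chunk_len]


def build_stream():
    # Imported lazily so the generator above can be tested without fastrtc.
    from fastrtc import ReplyOnPause, Stream

    return Stream(
        handler=ReplyOnPause(echo_in_chunks),
        modality="audio",
        mode="send-receive",
    )
```

Yielding smaller pieces lets playback start before the whole reply is ready, which matters more when the audio comes from a slow source such as a text-to-speech service.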

LLM Voice Chat

```py
from fastrtc import (
    ReplyOnPause, AdditionalOutputs, Stream,
    audio_to_bytes, aggregate_bytes_to_16bit
)
import gradio as gr
import numpy as np
from groq import Groq
import anthropic
from elevenlabs import ElevenLabs

groq_client = Groq()
claude_client = anthropic.Anthropic()
tts_client = ElevenLabs()


# See "Talk to Claude" in Cookbook for an example of how to keep
# track of the chat history.
def response(
    audio: tuple[int, np.ndarray],
):
    # Transcribe the user's speech
    prompt = groq_client.audio.transcriptions.create(
        file=("audio-file.mp3", audio_to_bytes(audio)),
        model="whisper-large-v3-turbo",
        response_format="verbose_json",
    ).text
    # Generate a text reply
    response = claude_client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    response_text = " ".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )
    # Convert the reply to speech and stream it back
    iterator = tts_client.text_to_speech.convert_as_stream(
        text=response_text,
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        model_id="eleven_multilingual_v2",
        output_format="pcm_24000",
    )
    for chunk in aggregate_bytes_to_16bit(iterator):
        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)
        yield (24000, audio_array)

stream = Stream(
    modality="audio",
    mode="send-receive",
    handler=ReplyOnPause(response),
)
```
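
The final loop in the example turns a stream of raw PCM bytes into NumPy arrays. Network chunks can end in the middle of a 2-byte sample, so the bytes have to be regrouped before `np.frombuffer` can read them as `int16`. The sketch below illustrates that step with two hypothetical helpers, `regroup_to_int16` and `pcm_to_audio`. It is not the library's implementation of `aggregate_bytes_to_16bit`, only an approximation of what such a step needs to do.

```python
from collections.abc import Iterable, Iterator

import numpy as np


def regroup_to_int16(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Re-emit byte chunks so each holds a whole number of 16-bit samples.

    Hypothetical helper for illustration. A trailing odd byte is carried
    over to the next chunk; an odd byte left at the very end is dropped.
    """
    leftover = b""
    for chunk in chunks:
        data = leftover + chunk
        usable = len(data) - (len(data) % 2)
        leftover = data[usable:]
        if usable:
            yield data[:usable]


def pcm_to_audio(chunks: Iterable[bytes], sample_rate: int = 24000):
    """Yield (sample_rate, array) tuples with arrays shaped (1, num_samples)."""
    for aligned in regroup_to_int16(chunks):
        samples = np.frombuffer(aligned, dtype=np.int16).reshape(1, -1)
        yield sample_rate, samples
```

The sample rate in the yielded tuple should match the format requested from the speech service, which is why the example pairs `pcm_24000` with `24000`.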

Webcam Stream

```python
from fastrtc import Stream
import numpy as np

def flip_vertically(image):
    return np.flip(image, axis=0)

stream = Stream(
    handler=flip_vertically,
    modality="video",
    mode="send-receive",
)
```
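
Any function that maps one frame to another can serve as the video handler. Here is a sketch of another NumPy-only handler, assuming frames arrive as (height, width, 3) arrays of dtype uint8. The name `mirror_and_invert` is hypothetical.

```python
import numpy as np


def mirror_and_invert(image: np.ndarray) -> np.ndarray:
    """Flip the frame left-to-right and invert its colors.

    Assumes a (height, width, 3) uint8 frame, so 255 - value stays in range.
    """
    mirrored = np.flip(image, axis=1)
    return 255 - mirrored
```

It would be passed to `Stream` as `handler=mirror_and_invert`, in place of `flip_vertically`.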

Object Detection

```python
from fastrtc import Stream
import gradio as gr
import cv2
from huggingface_hub import hf_hub_download
from .inference import YOLOv10

model_file = hf_hub_download(
    repo_id="onnx-community/yolov10n", filename="onnx/model.onnx"
)

# git clone https://huggingface.co/spaces/fastrtc/object-detection
# for YOLOv10 implementation
model = YOLOv10(model_file)

def detection(image, conf_threshold=0.3):
    # Resize to the model's input size, run detection, then resize for display
    image = cv2.resize(image, (model.input_width, model.input_height))
    new_image = model.detect_objects(image, conf_threshold)
    return cv2.resize(new_image, (500, 500))

stream = Stream(
    handler=detection,
    modality="video",
    mode="send-receive",
    additional_inputs=[
        gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)
    ]
)
```
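
In the example above, the slider listed in `additional_inputs` supplies the handler's second argument, `conf_threshold`. The same pattern works without a model. The sketch below uses a hypothetical `highlight_bright` handler and assumes (height, width, 3) uint8 frames. The Stream wiring sits in a hypothetical `build_stream` function so the handler can be tested on its own.

```python
import numpy as np


def highlight_bright(image: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    """Black out pixels whose mean intensity is below threshold (0 to 1).

    Hypothetical handler; assumes a (height, width, 3) uint8 frame.
    """
    brightness = image.mean(axis=2) / 255.0
    mask = brightness >= threshold
    # Broadcast the (height, width) mask over the color channels.
    return image * mask[..., None].astype(image.dtype)


def build_stream():
    # Imported lazily so the handler can be exercised without these packages.
    import gradio as gr
    from fastrtc import Stream

    return Stream(
        handler=highlight_bright,
        modality="video",
        mode="send-receive",
        # The slider value is passed to highlight_bright as `threshold`.
        additional_inputs=[
            gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)
        ],
    )
```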

Running the Stream

Run:

Gradio

```py
stream.ui.launch()
```

Telephone (Audio Only)

```py
stream.fastphone()
```

FastAPI

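```py
from fastapi import FastAPI
from fastapi.responses import HTMLResponse

# `stream` is the Stream object created in the Quickstart
app = FastAPI()
stream.mount(app)

# Optional: Add routes
@app.get("/")
async def _():
    return HTMLResponse(content=open("index.html").read())

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
```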

GitHub Events

Total

Create event: 32
Release event: 7
Issues event: 83
Watch event: 645
Delete event: 5
Issue comment event: 204
Push event: 147
Pull request review event: 13
Pull request review comment event: 8
Pull request event: 59
Fork event: 66

Last Year

Create event: 32
Release event: 7
Issues event: 83
Watch event: 645
Delete event: 5
Issue comment event: 204
Push event: 147
Pull request review event: 13
Pull request review comment event: 8
Pull request event: 59
Fork event: 66

Committers

Last synced: 5 months ago

All Time

Total Commits: 224
Total Committers: 37
Avg Commits per committer: 6.054
Development Distribution Score (DDS): 0.411
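
These figures can be reproduced from the commit counts on this page. The sketch below assumes that the Development Distribution Score is one minus the top committer's share of all commits, with each email identity counted separately. That definition is an assumption, although the numbers listed here are consistent with it. The function names are hypothetical.

```python
def average_commits(total_commits: int, total_committers: int) -> float:
    """Mean number of commits per committer."""
    return total_commits / total_committers


def development_distribution_score(top_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    return 1 - top_commits / total_commits


# Figures taken from this page: 224 commits, 37 committers,
# and 132 commits for the most active email identity.
print(round(average_commits(224, 37), 3))                  # 6.054
print(round(development_distribution_score(132, 224), 3))  # 0.411
```

A score near 0 would mean that a single identity wrote almost every commit, while higher values indicate that work is spread across more contributors.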

Past Year

Commits: 224
Committers: 37
Avg Commits per committer: 6.054
Development Distribution Score (DDS): 0.411

Top Committers

Name	Email	Commits
Freddy Boulton	a**n@g**m	132
Freddy Boulton	4**n@u**m	37
Marcus Valtonen Örnhag	m**n@g**m	8
Sourabh	S**1@g**m	5
Václav Volhejn	8**n@u**m	5
Dawood Khan	d**2@g**m	2
Freddy Boulton	f**n@h**l	2
Lucain	l**p@g**m	2
Mahimai Raja	m**3@g**m	2
Sofia Casadei	6**4@u**m	2
Aki Miyazaki	a**3@g**m	1
AlbertMingXu	a**u@g**m	1
AleksanderWWW	a**z@g**m	1
Aman Chauhan	a**0@g**m	1
Derek	d**n@g**m	1
EasyTop	1**p@u**m	1
Erik Wasmosy	e**8@g**m	1
Fabien	F**u@u**m	1
Marcus Valtonen Örnhag	m**g@e**m	1
MechanicCoder	6**g@u**m	1
Michael Hart	m**t@c**m	1
Michael Hart	m**u@g**m	1
Mohamed Ted Meftah	7**h@u**m	1
Rohan Richard	6**d@u**m	1
Ryan Ellis	5**z@u**m	1
Shane Blair	3**2@u**m	1
Shaon Debnath	s**2@g**m	1
Shubham Rasal	9**l@u**m	1
Siddharth Garg	s**2@g**m	1
Sofian Mejjoute	7**5@u**m	1
and 7 more...

Committer Domains (Top 20 + Academic)

nyu.edu: 1
cloudflare.com: 1
ericsson.com: 1

Issues and Pull Requests

Last synced: 5 months ago

All Time

Total issues: 68
Total pull requests: 68
Average time to close issues: 6 days
Average time to close pull requests: 1 day
Total issue authors: 54
Total pull request authors: 12
Average comments per issue: 1.69
Average comments per pull request: 0.76
Merged pull requests: 54
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 68
Pull requests: 68
Average time to close issues: 6 days
Average time to close pull requests: 1 day
Issue authors: 54
Pull request authors: 12
Average comments per issue: 1.69
Average comments per pull request: 0.76
Merged pull requests: 54
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

ANYMS-A (4)
BajaMexico (3)
mohamed99akram (3)
sofi444 (2)
thomaschhh (2)
marcusvaltonen (2)
Tobsad (2)
AndreasKarasenko (2)
Borzyszkowski (2)
weynechen (2)
Redcoder007 (1)
yondonfu (1)
rybakov-ks (1)
msxpwr (1)
sblair12 (1)

Pull Request Authors

freddyaboulton (50)
dawoodkhan82 (3)
leopardracer (2)
Shubham-Rasal (2)
mahimairaja (2)
omahs (2)
sofi444 (2)
tedmeftah (1)
FabienDanieau (1)
amanchauhan11 (1)
sblair12 (1)
AlbertMingXu (1)

Top Labels

Issue Labels

good first issue (1)
question (1)

Pull Request Labels

Packages

Total packages: 2
Total downloads:
- pypi 74,317 last-month
- npm 12 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 0
(may contain duplicates)
Total versions: 95
Total maintainers: 1

npmjs.org: @freddyaboulton/fastrtc-component

Gradio UI packages

Homepage: https://github.com/gradio-app/fastrtc#readme
License: ISC
Latest release: 0.0.1
published 9 months ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 12 Last month

Rankings

Dependent repos count: 24.6%

Average: 30.0%

Dependent packages count: 35.5%

Maintainers (1)

freddyaboulton

Last synced: 4 months ago

pypi.org: fastrtc

The realtime communication library for Python

Documentation: https://fastrtc.org/
License: mit
Latest release: 0.0.33
published 5 months ago

Versions: 94
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 74,317 Last month

Rankings

Dependent packages count: 9.7%

Average: 32.1%

Dependent repos count: 54.6%

Maintainers (1)

freddyaboulton

Last synced: 5 months ago

Dependencies

frontend/package-lock.json npm

452 dependencies

frontend/package.json npm

@gradio/preview 0.12.0 development
@ffmpeg/ffmpeg ^0.12.10
@ffmpeg/util ^0.12.1
@gradio/atoms 0.9.0
@gradio/client 1.6.0
@gradio/icons 0.8.0
@gradio/image 0.16.0
@gradio/markdown ^0.10.0
@gradio/statustracker 0.8.0
@gradio/upload 0.13.0
@gradio/utils 0.7.0
@gradio/wasm 0.14.0
hls.js ^1.5.16
mrmime ^2.0.0

demo/requirements.txt pypi

onnxruntime-gpu *
opencv-python *
safetensors ==0.4.3
twilio *

pyproject.toml pypi

aiortc *
gradio >=4.0,<6.0
