https://github.com/comput3ai/c3-csm-gradio

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.3%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: comput3ai
Language: Python
Default Branch: main
Size: 7.81 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created 12 months ago · Last pushed 12 months ago

Metadata Files

Readme

🎙️ CSM-1B Gradio Interface 🎧

A Gradio interface for Sesame's CSM-1B (Conversational Speech Model) that generates multi-speaker conversations and single-speaker monologues from text.

✨ Features

🗣️ Multi-Speaker Conversations: Generate natural-sounding conversations between up to 10 speakers
🎭 Voice Cloning: Upload your own voice samples to personalize the generated speech
🔊 Built-in Voices: Use the included reference voices or generate random voices
📝 Simple JSON Input: Easily format your conversations with a simple JSON structure
🎚️ Advanced Controls: Fine-tune generation parameters like temperature and audio length
🌐 Web Interface: Intuitive UI powered by Gradio

🚀 Quick Start

Prerequisites

Python 3.10 (recommended)
CUDA-compatible GPU (for optimal performance)
ffmpeg installed on your system

Installation

1. Clone this repository:

```bash
git clone https://github.com/comput3ai/c3-csm-gradio.git
cd c3-csm-gradio
```

2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

3. Authenticate with Hugging Face (to access the model):

```bash
huggingface-cli login
```

4. Launch the application:

```bash
python app.py
```

5. Open your browser and navigate to http://localhost:7860

🧩 Usage Examples

Conversation Mode

Create conversations between multiple speakers using this JSON format:

```json
[
  {"speaker_id": 0, "text": "This voice synthesis is amazing!"},
  {"speaker_id": 1, "text": "I agree, it sounds so natural!"},
  {"speaker_id": 2, "text": "And it's simple to customize voices too."}
]
```
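Before pasting a conversation into the interface, it can help to check its shape locally. The following is a minimal sketch, not the application's own validation: the function name `validate_conversation` is hypothetical, and the assumption that speaker IDs run from 0 to 9 is inferred from the "up to 10 speakers" feature rather than stated in this README.

```python
import json

MAX_SPEAKERS = 10  # assumption: IDs 0-9, inferred from "up to 10 speakers"


def validate_conversation(raw: str) -> list[dict]:
    """Parse a conversation JSON string and check the shape of each turn."""
    turns = json.loads(raw)
    if not isinstance(turns, list) or not turns:
        raise ValueError("expected a non-empty JSON array of turns")
    for i, turn in enumerate(turns):
        if not isinstance(turn, dict):
            raise ValueError(f"turn {i}: expected an object")
        speaker = turn.get("speaker_id")
        text = turn.get("text")
        # bool is a subclass of int in Python, so reject it explicitly
        if not isinstance(speaker, int) or isinstance(speaker, bool):
            raise ValueError(f"turn {i}: speaker_id must be an integer")
        if not 0 <= speaker < MAX_SPEAKERS:
            raise ValueError(f"turn {i}: speaker_id {speaker} is out of range")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"turn {i}: text must be a non-empty string")
    return turns
```

Calling `validate_conversation` on the example above returns the three turns unchanged; a turn with a missing `text` field raises `ValueError`.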

Monologue Mode

Generate a speech from a single speaker:

```json
[
  "Welcome to my presentation.",
  "Today we'll explore the future of AI speech synthesis.",
  "Let's begin with the fundamentals."
]
```
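A monologue is a plain array of strings, so it can be mapped onto the conversation format by attaching one speaker ID to every line. The sketch below illustrates that relationship only; `monologue_to_turns` is a hypothetical helper, and how the application handles monologues internally is not described in this README.

```python
import json


def monologue_to_turns(raw: str, speaker_id: int = 0) -> list[dict]:
    """Convert a monologue JSON array into conversation-style turns."""
    lines = json.loads(raw)
    if not isinstance(lines, list) or not all(isinstance(s, str) for s in lines):
        raise ValueError("expected a JSON array of strings")
    # Skip blank lines so they do not become empty utterances
    return [{"speaker_id": speaker_id, "text": s} for s in lines if s.strip()]
```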

🐳 Docker Support

Using Pre-built Image

Pull and run the pre-built Docker image from GitHub Container Registry:

```bash
# Pull the image
docker pull ghcr.io/comput3ai/c3-csm-gradio:latest

# Run the container with your Hugging Face token
docker run -p 7860:7860 --gpus all -e HF_TOKEN=your_huggingface_token ghcr.io/comput3ai/c3-csm-gradio
```

Building Locally

Build and run the application using Docker:

```bash
# Build the image
docker build -t csm-gradio .

# Run the container with your Hugging Face token
docker run -p 7860:7860 --gpus all -e HF_TOKEN=your_huggingface_token csm-gradio
```

About HF_TOKEN

The HF_TOKEN environment variable is required for the container to authenticate with Hugging Face Hub and download the model files. You can obtain this token from your Hugging Face account settings.
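As an illustration of how a container entry point might consume this variable, here is a minimal sketch. The helper names `read_hf_token` and `login_to_hub` are hypothetical; only `huggingface_hub.login(token=...)` is a real library call. Whether the application logs in explicitly like this or relies on the library picking up `HF_TOKEN` on its own is not stated in this README.

```python
import os
from collections.abc import Mapping


def read_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Hugging Face token from the environment, or fail early."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; pass it with `docker run -e HF_TOKEN=...`"
        )
    return token


def login_to_hub() -> None:
    """Authenticate with the Hugging Face Hub using the token above."""
    # Imported lazily so read_hf_token works without the package installed
    from huggingface_hub import login

    login(token=read_hf_token())
```

Failing early with a clear message is a design choice in this sketch: without it, a missing token would only surface later as a download error.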

⚙️ Advanced Configuration

Temperature: Controls randomness (0.1-2.0, default: 0.9)
Top-k: Limits token selection (1-100, default: 50)
Max Audio Length: Maximum duration per utterance (1000-30000ms)
Pause Duration: Silence between utterances (0-1000ms)
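The ranges above can be enforced in code by clamping each requested value to its bounds. This is a sketch under stated assumptions: `BOUNDS` and `clamp_settings` are hypothetical names, the bounds are copied from the list above, and the defaults for temperature and top-k are the ones given there. No defaults are documented for the two duration settings, so they are required arguments here.

```python
# (low, high) bounds copied from the Advanced Configuration list
BOUNDS = {
    "temperature": (0.1, 2.0),
    "top_k": (1, 100),
    "max_audio_length_ms": (1000, 30000),
    "pause_ms": (0, 1000),
}


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def clamp_settings(
    *,
    max_audio_length_ms: int,
    pause_ms: int,
    temperature: float = 0.9,
    top_k: int = 50,
) -> dict:
    """Return the settings with each value clamped to its documented range."""
    requested = {
        "temperature": temperature,
        "top_k": top_k,
        "max_audio_length_ms": max_audio_length_ms,
        "pause_ms": pause_ms,
    }
    return {name: clamp(value, *BOUNDS[name]) for name, value in requested.items()}
```

For example, requesting a temperature of 3.0 yields 2.0, and a negative pause duration yields 0.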

🔍 Implementation Details

This application is built on:

CSM-1B Model: Sesame's Conversational Speech Model
Llama-3.2-1B: For text processing
Mimi: For audio codec operations
Gradio: For the web interface

⚠️ Ethical Use Guidelines

This tool is provided for research, education, and legitimate creative purposes. Please:

Do not use for impersonation without explicit consent
Do not create misleading or deceptive content
Follow all applicable laws and ethical guidelines regarding synthetic media

📄 License

The Gradio interface is licensed under the Apache 2.0 License. The CSM-1B model has its own license terms available at Hugging Face.

🙏 Acknowledgements

Sesame AI Labs for creating and open-sourcing CSM-1B
Hugging Face for hosting the model
Gradio for the web interface framework

Owner

Name: comput3.AI
Login: comput3ai
Kind: organization
Email: hello@comput3.ai

Website: https://comput3.ai
Twitter: comput3ai
Repositories: 1
Profile: https://github.com/comput3ai

Cloud infrastructure for the future of AI.

GitHub Events

Total

Watch event: 1
Push event: 3
Fork event: 1
Create event: 3

Last Year

Watch event: 1
Push event: 3
Fork event: 1
Create event: 3

Dependencies

Dockerfile docker

nvidia/cuda 12.8.1-runtime-ubuntu24.04 build

requirements.txt pypi

gradio *
huggingface_hub *
