Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ktheindifferent
- License: mpl-2.0
- Language: Python
- Default Branch: dev
- Size: 126 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
SamTTS - Multi-Backend Offline Text-to-Speech API
A unified HTTP REST API providing access to multiple offline Text-to-Speech (TTS) engines through a single interface. Built on top of 🐸Coqui TTS and other leading TTS technologies.
🚀 Features
- Unified API - Single HTTP interface for multiple TTS backends
- Offline Operation - No internet connectivity required
- Multiple Engines - Support for Coqui TTS, eSpeak, eSpeak-NG, MaryTTS, pyttsx3, and Festival
- Auto-Detection - Automatic detection of available TTS engines
- Streaming & Batch - Real-time streaming and batch synthesis support
- Voice & Language Support - Multiple voices and languages per backend
- Adjustable Parameters - Control speed, pitch, and other speech parameters
🎯 Quick Start
Start the API Server
```bash
cd multi_tts_api
python -m uvicorn api:app --host 0.0.0.0 --port 8000
```
Synthesize Speech
```bash
curl -X POST http://localhost:8000/synthesize \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "backend": "espeak"}' \
  --output speech.wav
```
📋 Supported TTS Backends
| Backend | Quality | Speed | Languages | Features |
|---------|---------|-------|-----------|----------|
| Coqui TTS | Excellent | Medium | 20+ | Neural models, voice cloning |
| eSpeak | Fair | Very Fast | 50+ | Lightweight, reliable |
| eSpeak-NG | Good | Very Fast | 100+ | Improved quality |
| MaryTTS | Good | Medium | Multiple | Modular, customizable |
| pyttsx3 | Varies | Fast | Varies | Cross-platform wrapper |
| Festival | Good | Medium | Multiple | Highly configurable |
📖 API Documentation
Core Endpoints
List Available Backends
```bash
GET /backends
```
Get Backend Information
```bash
GET /backends/{backend_id}
```
Synthesize Speech
```bash
POST /synthesize
Content-Type: application/json

{
  "text": "Hello, this is a test",
  "backend": "espeak",
  "language": "en",
  "speed": 1.0,
  "pitch": 1.0,
  "format": "wav"
}
```
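The request body above can also be assembled from Python using only the standard library. This is a minimal sketch, not the project's client: the helper names `build_synthesize_request` and `synthesize_to_file` are hypothetical, the base URL assumes the server was started as in Quick Start, and the response is assumed to be raw audio bytes (as the `--output speech.wav` example suggests).

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: server started as in Quick Start


def build_synthesize_request(text, backend="espeak", language="en",
                             speed=1.0, pitch=1.0, audio_format="wav"):
    """Build (but do not send) a POST /synthesize request (hypothetical helper)."""
    payload = {
        "text": text,
        "backend": backend,
        "language": language,
        "speed": speed,
        "pitch": pitch,
        "format": audio_format,
    }
    return urllib.request.Request(
        f"{BASE_URL}/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(path, text, **options):
    """Send the request and write the returned bytes to `path`.

    Assumes the response body is the audio itself.
    """
    request = build_synthesize_request(text, **options)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With the server running, `synthesize_to_file("speech.wav", "Hello, this is a test")` would mirror the curl call from Quick Start.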
Streaming Synthesis
```bash
POST /synthesize/stream
```
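The README does not document the request or response format of the streaming endpoint. The sketch below assumes it accepts the same JSON body as `/synthesize` and returns audio bytes incrementally; under that assumption, a client can write the audio to disk chunk by chunk instead of buffering it all in memory. The helper names `copy_in_chunks` and `stream_to_file` are hypothetical.

```python
import json
import urllib.request


def copy_in_chunks(source, sink, chunk_size=4096):
    """Copy a file-like `source` to `sink` in fixed-size chunks; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream has ended
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_to_file(path, text, backend="espeak", base_url="http://localhost:8000"):
    """POST to /synthesize/stream and save the body as it arrives.

    Assumption: the endpoint takes the same JSON fields as /synthesize.
    """
    body = json.dumps({"text": text, "backend": backend}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/synthesize/stream",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        return copy_in_chunks(response, out)
```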
Batch Synthesis
```bash
POST /synthesize/batch
```
For detailed API documentation, see multi_tts_api/README.md.
🛠️ Installation
Prerequisites
Install system dependencies for the TTS backends you want to use:
```bash
# For eSpeak
sudo apt-get install espeak

# For eSpeak-NG
sudo apt-get install espeak-ng

# For Festival
sudo apt-get install festival

# For MaryTTS (requires Java)
sudo apt-get install default-jre
```
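After installing, it can help to confirm which engines are actually on the `PATH`. The following is a minimal sketch using `shutil.which`; the mapping of backend names to executables is an assumption based on the packages above (with `java` standing in for MaryTTS), and it is not necessarily how SamTTS performs its own auto-detection.

```python
import shutil

# Assumption: executable names provided by the packages listed above.
BACKEND_EXECUTABLES = {
    "espeak": "espeak",
    "espeak-ng": "espeak-ng",
    "festival": "festival",
    "marytts": "java",  # MaryTTS needs a Java runtime
}


def find_installed(executables=BACKEND_EXECUTABLES, which=shutil.which):
    """Return {backend_name: True/False} depending on whether its executable is on PATH."""
    return {name: which(exe) is not None for name, exe in executables.items()}


if __name__ == "__main__":
    for name, present in find_installed().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

The `which` parameter exists only so the lookup can be replaced in tests.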
Python Dependencies
```bash
# Clone the repository
git clone [your-repo-url]
cd SamTTS

# Install Python dependencies
cd multi_tts_api
pip install -r requirements.txt
```
🎭 Backend Details
Coqui TTS
Based on the original 🐸TTS library with support for:
- High-performance Deep Learning models (Tacotron2, Glow-TTS, VITS, YourTTS)
- Neural vocoders (HiFiGAN, MelGAN, WaveRNN)
- Multi-speaker synthesis and voice cloning
- 20+ languages with pretrained models
Other Backends
- eSpeak/eSpeak-NG: Lightweight, rule-based synthesis
- MaryTTS: Modular Java-based platform
- pyttsx3: Cross-platform TTS wrapper
- Festival: Configurable speech synthesis system
💻 Usage Examples
Python Client
```python
import requests

# Start the API server first:
#   python -m uvicorn multi_tts_api.api:app --host 0.0.0.0 --port 8000

base_url = "http://localhost:8000"

# List available backends
response = requests.get(f"{base_url}/backends")
backends = response.json()
print("Available backends:", backends)

# Synthesize with eSpeak
tts_request = {
    "text": "Hello from SamTTS!",
    "backend": "espeak",
    "language": "en",
    "speed": 1.2,
}

response = requests.post(f"{base_url}/synthesize", json=tts_request)
response.raise_for_status()  # avoid writing an error body to the .wav file
with open("output.wav", "wb") as f:
    f.write(response.content)
```
Command Line Interface
```bash
# List available backends
curl http://localhost:8000/backends

# Synthesize speech with different backends
curl -X POST http://localhost:8000/synthesize \
  -H "Content-Type: application/json" \
  -d '{"text": "Fast synthesis", "backend": "espeak"}' \
  --output fast.wav

curl -X POST http://localhost:8000/synthesize \
  -H "Content-Type: application/json" \
  -d '{"text": "High quality synthesis", "backend": "coqui"}' \
  --output quality.wav

# Batch synthesis
curl -X POST http://localhost:8000/synthesize/batch \
  -H "Content-Type: application/json" \
  -d '[
    {"text": "First sentence", "backend": "espeak"},
    {"text": "Second sentence", "backend": "festival"}
  ]' \
  --output batch.zip
```
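The batch call can be prepared and its result inspected from Python as well. This sketch assumes the response is a ZIP archive, which is inferred only from the `batch.zip` output filename in the curl example; the helper names `build_batch_body` and `list_archive` are hypothetical.

```python
import io
import json
import zipfile


def build_batch_body(items):
    """Encode (text, backend) pairs as the JSON array sent to /synthesize/batch."""
    return json.dumps(
        [{"text": text, "backend": backend} for text, backend in items]
    ).encode("utf-8")


def list_archive(zip_bytes):
    """Return the file names inside a ZIP held in memory.

    Assumption: the batch endpoint responds with a ZIP archive.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()
```

For a file saved by the curl command, `list_archive(open("batch.zip", "rb").read())` would show which audio files were returned.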
📁 Project Structure
├── multi_tts_api/ # Main API application
│ ├── api.py # FastAPI application and endpoints
│ ├── backend_manager.py # Backend management and orchestration
│ ├── backends/ # TTS backend implementations
│ │ ├── base.py # Abstract base class for backends
│ │ ├── coqui.py # Coqui TTS backend
│ │ ├── espeak.py # eSpeak backend
│ │ ├── espeak_ng.py # eSpeak-NG backend
│ │ ├── festival.py # Festival backend
│ │ ├── marytts.py # MaryTTS backend
│ │ └── pyttsx3.py # pyttsx3 backend
│ ├── requirements.txt # Python dependencies
│ ├── run_server.py # Server startup script
│ ├── test_api.py # API test suite
│ └── README.md # Detailed API documentation
├── TTS/ # Original Coqui TTS library
└── README.md # This file
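The layout above suggests that each engine implements a shared abstract interface from `backends/base.py`. The actual class is not shown in this README, so the following is only an illustrative sketch of that pattern: `TTSBackend`, `SilenceBackend`, `available_backends`, and every method signature here are hypothetical.

```python
import io
import wave
from abc import ABC, abstractmethod


class TTSBackend(ABC):
    """Hypothetical interface that every backend would implement."""

    name = "base"

    @abstractmethod
    def is_available(self) -> bool:
        """Report whether the engine can be used on this machine."""

    @abstractmethod
    def synthesize(self, text: str, language: str = "en", speed: float = 1.0) -> bytes:
        """Return WAV-encoded audio for `text`."""


class SilenceBackend(TTSBackend):
    """Toy backend for illustration: ignores the text and returns 0.1 s of silence."""

    name = "silence"

    def is_available(self) -> bool:
        return True

    def synthesize(self, text, language="en", speed=1.0):
        buffer = io.BytesIO()
        with wave.open(buffer, "wb") as wav:
            wav.setnchannels(1)      # mono
            wav.setsampwidth(2)      # 16-bit samples
            wav.setframerate(16000)  # 16 kHz
            wav.writeframes(b"\x00\x00" * 1600)
        return buffer.getvalue()


def available_backends(backends):
    """Keep only the backends that report themselves as usable, keyed by name."""
    return {backend.name: backend for backend in backends if backend.is_available()}
```

Keeping availability checks on the backend itself lets a manager build its registry without knowing how each engine is detected.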
🤝 Contributing
Contributions are welcome! Areas for improvement:
- New TTS backend implementations
- Performance optimizations
- Additional audio format support
- Better error handling and logging
- Extended language support
📄 License
This project builds upon multiple open-source TTS libraries:
- Coqui TTS: Mozilla Public License 2.0
- eSpeak/eSpeak-NG: GPL v3
- MaryTTS: LGPL v3
- Festival: Custom license
See individual backend documentation for specific license requirements.
Owner
- Login: ktheindifferent
- Kind: user
- Repositories: 9
- Profile: https://github.com/ktheindifferent
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you want to cite 🐸💬, feel free to use this (but only if you loved it 😊)"
title: "Coqui TTS"
abstract: "A deep learning toolkit for Text-to-Speech, battle-tested in research and production"
date-released: 2021-01-01
authors:
- family-names: "Eren"
given-names: "Gölge"
- name: "The Coqui TTS Team"
version: 1.4
doi: 10.5281/zenodo.6334862
license: "MPL-2.0"
url: "https://www.coqui.ai"
repository-code: "https://github.com/coqui-ai/TTS"
keywords:
- machine learning
- deep learning
- artificial intelligence
- text to speech
- TTS
GitHub Events
Total
- Delete event: 1
- Issue comment event: 2
- Push event: 4
- Pull request event: 11
- Create event: 4
Last Year
- Delete event: 1
- Issue comment event: 2
- Push event: 4
- Pull request event: 11
- Create event: 4
Dependencies
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- docker/build-push-action v2 composite
- docker/login-action v1 composite
- docker/setup-buildx-action v1 composite
- docker/setup-qemu-action v1 composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- actions/download-artifact v2 composite
- actions/setup-python v2 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- actions/checkout v2 composite
- coqui-ai/setup-python pip-cache-key-py-ver composite
- ${BASE} latest build
- numpy >=1.17.0
- umap-learn *
- furo *
- linkify-it-py *
- myst-parser ==0.15.1
- sphinx ==4.0.2
- sphinx_copybutton *
- sphinx_inline_tabs *
- black * development
- coverage * development
- isort * development
- nose2 * development
- pylint ==2.10.2 development
- bokeh ==1.4.0
- anyascii *
- coqpit >=0.0.16
- cython ==0.29.28
- flask *
- fsspec >=2021.04.0
- g2pkk >=0.1.1
- gruut ==2.2.3
- gunicorn ==20.1.0
- inflect ==5.6.0
- jamo *
- jieba *
- librosa ==0.8.0
- matplotlib *
- mecab-python3 ==1.0.5
- nltk *
- numba ==0.55.2
- numba ==0.55.1
- numpy ==1.22.4
- numpy ==1.21.6
- pandas *
- pypinyin *
- pysbd *
- pyyaml *
- scipy >=1.4.0
- soundfile *
- torch >=1.7
- torchaudio *
- tqdm *
- trainer *
- umap-learn ==0.5.1
- unidic-lite ==1.0.8