---
title: TTS API
emoji: 🏆
colorFrom: green
colorTo: purple
sdk: docker
pinned: false
---
# Text-to-Speech API 🎤
A public Text-to-Speech API built with FastAPI and Microsoft Edge TTS, optimized for Hugging Face Spaces deployment.
## 🚀 Features
- **Convert text to natural-sounding speech** using Microsoft Edge TTS
- **Multiple voice options** with different languages and accents
- **Customizable speech parameters** (pitch and rate adjustment)
- **RESTful API** with automatic OpenAPI documentation
- **Public access** with CORS enabled
- **Real-time audio generation** and streaming
## 📖 API Documentation
Once deployed, visit the root URL to access the interactive API documentation (Swagger UI).
## 🔧 API Endpoints
### Core Endpoints
- `GET /` - API information and documentation links
- `GET /health` - Health check endpoint
- `GET /voices` - List all available voices
- `POST /synthesize` - Convert text to speech (JSON)
- `POST /synthesize-form` - Convert text to speech (Form data)
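As a rough sketch of how these endpoints map onto HTTP requests, the helper below builds (but does not send) a request for any of them using only Python's standard library. The name `build_request` and the `BASE_URL` placeholder are illustrative assumptions, not part of this project. The JSON returned by `/health` and `/voices` is not documented here, so the sketch does not try to parse it.

```python
import json
import urllib.request

BASE_URL = "https://your-space-url"  # placeholder, replace with your Space URL


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a GET request, or a JSON POST request when a payload is given."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


health = build_request("/health")
synth = build_request("/synthesize", {"text": "Hello World!"})
print(health.get_method(), health.full_url)
print(synth.get_method(), synth.full_url)
```

Passing either object to `urllib.request.urlopen(request, timeout=30)` sends it.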
### Example Usage
#### Using cURL with JSON:
```bash
curl -X POST 'https://your-space-url/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Hello from Hugging Face Spaces!",
    "voice": "en-GB-SoniaNeural",
    "pitch": "-10Hz",
    "rate": "+15%"
  }' \
  --output speech.mp3
```
#### Using cURL with Form Data:
```bash
curl -X POST 'https://your-space-url/synthesize-form' \
  -F 'text=Hello World!' \
  -F 'voice=en-US-AriaNeural' \
  -F 'pitch=+5Hz' \
  -F 'rate=+10%' \
  --output speech.mp3
```
#### Using Python requests:
```python
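import requests

response = requests.post(
    'https://your-space-url/synthesize',
    json={
        'text': 'Hello from Python!',
        'voice': 'en-US-AriaNeural',
        'pitch': '+0Hz',
        'rate': '+0%',
    },
    timeout=60,
)
response.raise_for_status()  # stop here instead of saving an error body as MP3

# Save the returned MP3 bytes to disk
with open('speech.mp3', 'wb') as f:
    f.write(response.content)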
```
## 📝 Parameters
### Request Parameters
| Parameter | Type | Default | Description | Example |
|-----------|------|---------|-------------|---------|
| `text` | string | required | Text to convert to speech | "Hello World!" |
| `voice` | string | "en-US-AriaNeural" | Voice identifier | "en-GB-SoniaNeural" |
| `pitch` | string | "+0Hz" | Pitch adjustment | "+10Hz", "-15Hz" |
| `rate` | string | "+0%" | Rate adjustment | "+20%", "-10%" |
### Voice Examples
- `en-US-AriaNeural` - US English, Female
- `en-GB-SoniaNeural` - UK English, Female
- `en-AU-NatashaNeural` - Australian English, Female
- `de-DE-KatjaNeural` - German, Female
- `fr-FR-DeniseNeural` - French, Female
- `es-ES-ElviraNeural` - Spanish, Female
*Use the `/voices` endpoint to get the complete list of available voices.*
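The voice identifiers above follow a `language-REGION-Name` pattern, which makes them easy to group on the client side. The sketch below assumes that pattern holds and that you already have a plain list of identifier strings. The helper names are hypothetical, and identifiers with extra segments are not handled specially.

```python
from collections import defaultdict


def parse_voice_id(voice_id):
    """Split 'en-US-AriaNeural' into ('en', 'US', 'AriaNeural')."""
    language, region, name = voice_id.split("-", 2)
    return language, region, name


def group_by_language(voice_ids):
    """Map each language code to the voice IDs that start with it."""
    groups = defaultdict(list)
    for voice_id in voice_ids:
        language, _, _ = parse_voice_id(voice_id)
        groups[language].append(voice_id)
    return dict(groups)


voices = [
    "en-US-AriaNeural",
    "en-GB-SoniaNeural",
    "en-AU-NatashaNeural",
    "de-DE-KatjaNeural",
    "fr-FR-DeniseNeural",
    "es-ES-ElviraNeural",
]
grouped = group_by_language(voices)
print(grouped["en"])
```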
### Parameter Ranges
- **Pitch**: -50Hz to +50Hz (e.g., "-25Hz", "+0Hz", "+30Hz")
- **Rate**: -50% to +50% (e.g., "-20%", "+0%", "+25%")
## 🛠️ Local Development
### Installation
1. Clone the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Run the server:
   ```bash
   python app.py
   ```
4. Open http://localhost:7860 for API documentation
### Docker Deployment
```bash
# Build the image
docker build -t tts-api .
# Run the container
docker run -p 7860:7860 tts-api
```
## 🌐 Hugging Face Spaces Deployment
1. Create a new Space on Hugging Face
2. Choose "Docker" as the SDK
3. Upload the following files:
   - `app.py` (main application)
   - `requirements.txt` (dependencies)
   - `Dockerfile` (container configuration)
   - `README.md` (this file)
4. Your API will be publicly accessible once deployed!
## 📋 Response Format
### Successful Response
- **Content-Type**: `audio/mpeg`
- **Body**: MP3 audio file
### Error Response
```json
{
"detail": "Error description"
}
```
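Because a successful call returns raw audio and a failed one returns JSON, a client has to inspect the response before writing it to disk. The sketch below shows one way to do that. `SynthesisError` and `extract_audio` are hypothetical names, and the 400 status in the demo is only an illustration, since the exact status codes are not listed here.

```python
import json


class SynthesisError(Exception):
    """Raised when the API answers with something other than audio."""


def extract_audio(status_code, content_type, body):
    """Return MP3 bytes on success, otherwise raise with the server's detail."""
    if status_code == 200 and content_type.startswith("audio/mpeg"):
        return body
    try:
        detail = json.loads(body.decode("utf-8"))["detail"]
    except (ValueError, KeyError, TypeError):
        # Body was not JSON, or had no "detail" field.
        detail = "unexpected response"
    raise SynthesisError(f"HTTP {status_code}: {detail}")


try:
    extract_audio(400, "application/json", b'{"detail": "Voice not found"}')
except SynthesisError as error:
    print(error)
```

With `requests`, the three arguments would be `response.status_code`, `response.headers.get("Content-Type", "")`, and `response.content`.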
## 🔒 Rate Limiting & Usage
This is a public API, but please use it responsibly:
- Maximum text length: 5,000 characters
- Recommended request rate: at most 100 requests per minute
- For production use, consider implementing authentication
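Text longer than the 5,000-character limit has to be split on the client and sent as several requests. The sketch below is one possible approach, not something this API provides: the hypothetical `split_text` helper breaks at sentence-ending punctuation where it can and hard-wraps any single sentence that is itself too long.

```python
import re

MAX_CHARS = 5000  # limit stated above


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", max_chars=10))
```

Each chunk would then be sent in its own `/synthesize` request, producing one audio file per chunk.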
## 🐛 Troubleshooting
### Common Issues
1. **Voice not found**: Use the `/voices` endpoint to check available voices
2. **Invalid parameters**: Check pitch/rate format (must include Hz/% suffix)
3. **Text too long**: Maximum 5,000 characters per request
4. **Network timeout**: Large texts may take longer to process
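For occasional timeouts, retrying with an increasing delay is a common client-side remedy. The sketch below is a generic pattern under that assumption and is not specific to this API. `with_retries` is a hypothetical helper, and the `flaky` function only simulates a call that times out twice before succeeding.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action(); on an exception, wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)


calls = []


def flaky():
    """Stand-in for a request that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return b"audio bytes"


# Skip real sleeping in this demo by passing a no-op sleep function.
result = with_retries(flaky, sleep=lambda seconds: None)
print(result, len(calls))
```

In real use, catching every `Exception` is too broad. Narrowing it to timeout and connection errors, such as `requests.exceptions.Timeout`, avoids retrying requests that will never succeed.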
## 📄 License
This project uses the Microsoft Edge TTS service. Please review Microsoft's terms of service for usage guidelines.
## 🤝 Contributing
Feel free to open issues or submit pull requests to improve this API!
---
**Made with ❤️ for the Hugging Face community**