metadata
title: TTS API
emoji: 🏆
colorFrom: green
colorTo: purple
sdk: docker
pinned: false

Text-to-Speech API 🎤

A public Text-to-Speech API built with FastAPI and Microsoft Edge TTS, optimized for Hugging Face Spaces deployment.

🚀 Features

  • Convert text to natural-sounding speech using Microsoft Edge TTS
  • Multiple voice options with different languages and accents
  • Customizable speech parameters (pitch and rate adjustment)
  • RESTful API with automatic OpenAPI documentation
  • Public access with CORS enabled
  • Real-time audio generation and streaming

📖 API Documentation

Once deployed, visit the root URL to access the interactive API documentation (Swagger UI).

🔧 API Endpoints

Core Endpoints

  • GET / - API information and documentation links
  • GET /health - Health check endpoint
  • GET /voices - List all available voices
  • POST /synthesize - Convert text to speech (JSON)
  • POST /synthesize-form - Convert text to speech (Form data)

Example Usage

Using cURL with JSON:

curl -X POST 'https://your-space-url/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Hello from Hugging Face Spaces!",
    "voice": "en-GB-SoniaNeural",
    "pitch": "-10Hz",
    "rate": "+15%"
  }' \
  --output speech.mp3

Using cURL with Form Data:

curl -X POST 'https://your-space-url/synthesize-form' \
  -F 'text=Hello World!' \
  -F 'voice=en-US-AriaNeural' \
  -F 'pitch=+5Hz' \
  -F 'rate=+10%' \
  --output speech.mp3

Using Python requests:

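import requests

response = requests.post(
    'https://your-space-url/synthesize',
    json={
        'text': 'Hello from Python!',
        'voice': 'en-US-AriaNeural',
        'pitch': '+0Hz',
        'rate': '+0%'
    },
    timeout=60  # seconds; long texts can take a while to synthesize
)
# Fail loudly on an error response instead of saving its JSON body as an MP3
response.raise_for_status()

# Save the returned MP3 audio
with open('speech.mp3', 'wb') as f:
    f.write(response.content)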
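
Where installing requests is not an option, the same JSON call can be sketched with the standard library alone. The helper names `build_synthesize_request` and `synthesize_to_file` are illustrative and not part of this project; the defaults mirror the parameter table in this README.

```python
import json
import urllib.request


def build_synthesize_request(base_url, text, voice="en-US-AriaNeural",
                             pitch="+0Hz", rate="+0%"):
    """Build (but do not send) a POST request for the /synthesize endpoint."""
    payload = {"text": text, "voice": voice, "pitch": pitch, "rate": rate}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(base_url, text, path, timeout=60, **options):
    """Send the request and write the returned bytes to `path`."""
    request = build_synthesize_request(base_url, text, **options)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

A call such as `synthesize_to_file('https://your-space-url', 'Hello!', 'speech.mp3', voice='en-GB-SoniaNeural')` would then write the audio and return its size in bytes.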

📝 Parameters

Request Parameters

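| Parameter | Type   | Default            | Description               | Example              |
|-----------|--------|--------------------|---------------------------|----------------------|
| text      | string | (required)         | Text to convert to speech | "Hello World!"       |
| voice     | string | "en-US-AriaNeural" | Voice identifier          | "en-GB-SoniaNeural"  |
| pitch     | string | "+0Hz"             | Pitch adjustment          | "+10Hz", "-15Hz"     |
| rate      | string | "+0%"              | Rate adjustment           | "+20%", "-10%"       |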

Voice Examples

  • en-US-AriaNeural - US English, Female
  • en-GB-SoniaNeural - UK English, Female
  • en-AU-NatashaNeural - Australian English, Female
  • de-DE-KatjaNeural - German, Female
  • fr-FR-DeniseNeural - French, Female
  • es-ES-ElviraNeural - Spanish, Female

Use the /voices endpoint to get the complete list of available voices.
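
The identifiers above follow a locale-then-name pattern, so a client can narrow a voice list by language without extra metadata. The sketch below works on a plain list of identifier strings; how to obtain that list from the /voices response is left open, because the response shape is not documented here. `VoiceId`, `parse_voice_id`, and `voices_for_language` are hypothetical helper names.

```python
from typing import NamedTuple


class VoiceId(NamedTuple):
    locale: str  # e.g. "en-GB"
    name: str    # e.g. "SoniaNeural"

    @property
    def language(self) -> str:
        # The language code is the first segment of the locale.
        return self.locale.split("-")[0]


def parse_voice_id(voice: str) -> VoiceId:
    """Split 'en-GB-SoniaNeural' into its locale and voice name."""
    locale, _, name = voice.rpartition("-")
    if not locale or not name:
        raise ValueError(f"unexpected voice identifier: {voice!r}")
    return VoiceId(locale, name)


def voices_for_language(voices, language):
    """Keep only the identifiers whose language code matches."""
    return [v for v in voices if parse_voice_id(v).language == language]
```

Splitting on the last hyphen rather than the first keeps the voice name intact even if a locale has more than two segments.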

Parameter Ranges

  • Pitch: -50Hz to +50Hz (e.g., "-25Hz", "+0Hz", "+30Hz")
  • Rate: -50% to +50% (e.g., "-20%", "+0%", "+25%")
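
Checking these values on the client avoids a round trip for malformed input. The sketch below assumes the format used by every example in this README (an explicit sign, digits, then the unit) and the ranges listed above; the server's own validation may differ, and the function names are illustrative.

```python
import re

# A signed integer followed by its unit, e.g. "+10Hz" or "-20%".
_ADJUSTMENT = re.compile(r"([+-]\d+)(Hz|%)")


def check_adjustment(value: str, unit: str, limit: int = 50) -> int:
    """Return the signed amount in `value`, or raise ValueError."""
    match = _ADJUSTMENT.fullmatch(value)
    if match is None or match.group(2) != unit:
        raise ValueError(f"expected a signed number ending in {unit!r}, got {value!r}")
    amount = int(match.group(1))
    if abs(amount) > limit:
        raise ValueError(f"{value!r} is outside -{limit}{unit} to +{limit}{unit}")
    return amount


def check_pitch(value: str) -> int:
    """Validate a pitch string such as '+10Hz'."""
    return check_adjustment(value, "Hz")


def check_rate(value: str) -> int:
    """Validate a rate string such as '-20%'."""
    return check_adjustment(value, "%")
```

Returning the parsed integer lets callers reuse the amount, for example to clamp it instead of rejecting it.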

🛠️ Local Development

Installation

  1. Clone the repository
  2. Install dependencies:
    pip install -r requirements.txt
    
  3. Run the server:
    python app.py
    
  4. Open http://localhost:7860 for API documentation

Docker Deployment

# Build the image
docker build -t tts-api .

# Run the container
docker run -p 7860:7860 tts-api
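
A freshly started server or container may need a moment before it accepts requests. The following sketch polls GET /health until it answers; it assumes that a healthy server responds with HTTP 200 (the response body is not documented here), and `fetch_status` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:  # includes URLError, refused connections and timeouts
        return None


def wait_until_healthy(base_url="http://localhost:7860", attempts=10, delay=1.0,
                       get_status=fetch_status, sleep=time.sleep):
    """Poll GET /health until it answers 200; return True on success."""
    url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        if get_status(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between attempts, not after the last one
    return False
```

Passing `get_status` and `sleep` as parameters keeps the polling loop testable without a running server.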

🌐 Hugging Face Spaces Deployment

  1. Create a new Space on Hugging Face
  2. Choose "Docker" as the SDK
  3. Upload the following files:
    • app.py (main application)
    • requirements.txt (dependencies)
    • Dockerfile (container configuration)
    • README.md (this file)
  4. Your API will be publicly accessible once deployed!

📋 Response Format

Successful Response

  • Content-Type: audio/mpeg
  • Body: MP3 audio file

Error Response

{
  "detail": "Error description"
}
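
Because a failed request returns JSON rather than audio, a client should check the response before writing a file. This sketch relies only on the two response shapes described above; `SynthesisError` and `extract_audio` are illustrative names, and treating any non-audio reply as an error is an assumption.

```python
import json


class SynthesisError(Exception):
    """Raised when the API answers with an error instead of audio."""


def extract_audio(status_code: int, content_type: str, body: bytes) -> bytes:
    """Return MP3 bytes from a successful response, else raise SynthesisError."""
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if 200 <= status_code < 300 and media_type == "audio/mpeg":
        return body
    try:
        detail = json.loads(body)["detail"]
    except (ValueError, KeyError, TypeError):
        # Not the documented error shape: fall back to the raw body.
        detail = body[:200].decode("utf-8", errors="replace")
    raise SynthesisError(f"HTTP {status_code}: {detail}")
```

With requests, this could be called as `extract_audio(response.status_code, response.headers.get('Content-Type', ''), response.content)`.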

🔒 Rate Limiting & Usage

This is a public API, but please use it responsibly:

  • Maximum text length: 5,000 characters
  • Recommended: Don't exceed 100 requests per minute
  • For production use, consider implementing authentication
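
One way for a client to stay near the recommended request rate is to space its calls evenly. The class below is a minimal client-side sketch, not something this API provides; `Throttle` is a hypothetical name, and the clock and sleep functions are parameters so that the behaviour can be checked without waiting.

```python
import time


class Throttle:
    """Keep consecutive calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=100, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()  # re-read in case the sleep overshot
        self._next_allowed = now + self.interval
```

Calling `throttle.wait()` immediately before each request keeps a loop over many texts at roughly 100 requests per minute with the default setting.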

🐛 Troubleshooting

Common Issues

  1. Voice not found: Use the /voices endpoint to check available voices
  2. Invalid parameters: Check the pitch/rate format (a signed number with the Hz or % suffix, as in "+10Hz" or "-10%")
  3. Text too long: Maximum 5,000 characters per request
  4. Network timeout: Large texts take longer to process, so increase the client timeout or send shorter texts
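
For texts over the limit, or ones that time out, a client can split the input and send each piece as its own request. This is a sketch under the 5,000-character limit stated above; `split_text` is a hypothetical helper, and how to join the resulting audio files is not covered here.

```python
def split_text(text: str, max_chars: int = 5000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        window = remaining[:max_chars]
        # Prefer the last sentence end in the window, then the last space.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation with its sentence
        else:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = max_chars  # no break point found: hard cut
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Cutting at sentence boundaries tends to keep the pauses between separately synthesized pieces sounding natural.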

📄 License

This project uses the Microsoft Edge TTS service. Please review Microsoft's terms of service for usage guidelines.

🀝 Contributing

Feel free to open issues or submit pull requests to improve this API!


Made with ❤️ for the Hugging Face community