	---
	title: TTS API
	emoji: 🏆
	colorFrom: green
	colorTo: purple
	sdk: docker
	pinned: false
	---



	# Text-to-Speech API 🎤

	A public Text-to-Speech API built with FastAPI and Microsoft Edge TTS, optimized for Hugging Face Spaces deployment.

	## 🚀 Features

	- Convert text to natural-sounding speech using Microsoft Edge TTS
	- Multiple voice options with different languages and accents
	- Customizable speech parameters (pitch and rate adjustment)
	- RESTful API with automatic OpenAPI documentation
	- Public access with CORS enabled
	- Real-time audio generation and streaming

	## 📖 API Documentation

	Once deployed, visit the root URL to access the interactive API documentation (Swagger UI).

	## 🔧 API Endpoints

	### Core Endpoints

	- `GET /` - API information and documentation links
	- `GET /health` - Health check endpoint
	- `GET /voices` - List all available voices
	- `POST /synthesize` - Convert text to speech (JSON)
	- `POST /synthesize-form` - Convert text to speech (Form data)

	### Example Usage

	#### Using cURL with JSON:
	```bash
	curl -X POST 'https://your-space-url/synthesize' \
	  -H 'Content-Type: application/json' \
	  -d '{
	    "text": "Hello from Hugging Face Spaces!",
	    "voice": "en-GB-SoniaNeural",
	    "pitch": "-10Hz",
	    "rate": "+15%"
	  }' \
	  --output speech.mp3
	```

	#### Using cURL with Form Data:
	```bash
	curl -X POST 'https://your-space-url/synthesize-form' \
	  -F 'text=Hello World!' \
	  -F 'voice=en-US-AriaNeural' \
	  -F 'pitch=+5Hz' \
	  -F 'rate=+10%' \
	  --output speech.mp3
	```

	#### Using Python requests:
	```python
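	import requests

	# Send the text and voice settings as a JSON body
	response = requests.post(
	    'https://your-space-url/synthesize',
	    json={
	        'text': 'Hello from Python!',
	        'voice': 'en-US-AriaNeural',
	        'pitch': '+0Hz',
	        'rate': '+0%'
	    }
	)
	response.raise_for_status()  # Fail early instead of saving an error body as MP3

	# Write the returned MP3 bytes to disk
	with open('speech.mp3', 'wb') as f:
	    f.write(response.content)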
	```
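If `requests` is not available, the same JSON call can be sketched with the standard library. This is a minimal sketch: `build_synthesize_request` and `save_speech` are hypothetical helper names, `https://your-space-url` is the placeholder used throughout this README, and the default values are taken from the Parameters section.

```python
import json
import urllib.request

BASE_URL = "https://your-space-url"  # placeholder, replace with your Space URL


def build_synthesize_request(text, voice="en-US-AriaNeural", pitch="+0Hz",
                             rate="+0%", base_url=BASE_URL):
    """Build (but do not send) a POST request for the /synthesize endpoint."""
    payload = {"text": text, "voice": voice, "pitch": pitch, "rate": rate}
    return urllib.request.Request(
        url=f"{base_url}/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(request, path, timeout=60.0):
    """Send the request and write the response body to `path`.

    Returns the number of bytes written. urlopen raises HTTPError on
    4xx/5xx responses, so an error body is never saved as audio.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

Typical use would be `save_speech(build_synthesize_request("Hello!"), "speech.mp3")` once `BASE_URL` points at a running Space.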

	## 📝 Parameters

	### Request Parameters

	| Parameter | Type | Default | Description | Example |
	|-----------|------|---------|-------------|---------|
	| `text` | string | required | Text to convert to speech | "Hello World!" |
	| `voice` | string | "en-US-AriaNeural" | Voice identifier | "en-GB-SoniaNeural" |
	| `pitch` | string | "+0Hz" | Pitch adjustment | "+10Hz", "-15Hz" |
	| `rate` | string | "+0%" | Rate adjustment | "+20%", "-10%" |

	### Voice Examples

	- `en-US-AriaNeural` - US English, Female
	- `en-GB-SoniaNeural` - UK English, Female
	- `en-AU-NatashaNeural` - Australian English, Female
	- `de-DE-KatjaNeural` - German, Female
	- `fr-FR-DeniseNeural` - French, Female
	- `es-ES-ElviraNeural` - Spanish, Female

	Use the `/voices` endpoint to get the complete list of available voices.
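The voice identifiers listed above follow a `language-REGION-Name` pattern, which makes it easy to filter a list of identifiers on the client. The sketch below assumes you already have the identifiers as plain strings; the response shape of `/voices` is not documented here, so extracting them from that response is left out. `parse_voice_id` and `filter_by_language` are hypothetical helper names.

```python
from typing import NamedTuple


class VoiceId(NamedTuple):
    language: str  # e.g. "en"
    region: str    # e.g. "GB"
    name: str      # e.g. "SoniaNeural"


def parse_voice_id(voice: str) -> VoiceId:
    """Split an identifier such as "en-GB-SoniaNeural" into its three parts."""
    # Split on the first two hyphens only, so any hyphen in the name is kept.
    parts = voice.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected voice identifier: {voice!r}")
    return VoiceId(*parts)


def filter_by_language(voices: list[str], language: str) -> list[str]:
    """Keep identifiers whose language code matches, ignoring case."""
    wanted = language.lower()
    return [v for v in voices if parse_voice_id(v).language.lower() == wanted]
```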

	### Parameter Ranges

	- Pitch: -50Hz to +50Hz (e.g., "-25Hz", "+0Hz", "+30Hz")
	- Rate: -50% to +50% (e.g., "-20%", "+0%", "+25%")
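Checking these values on the client before sending a request avoids a round trip for malformed input. The following is a minimal sketch under two assumptions: that a leading `+` or `-` sign is required (every example in this README has one), and that the documented bounds are inclusive. `validate_pitch` and `validate_rate` are hypothetical helper names, not part of the API.

```python
import re

MAX_OFFSET = 50  # documented bound for both pitch (Hz) and rate (%)

_PITCH = re.compile(r"[+-](\d+)Hz")
_RATE = re.compile(r"[+-](\d+)%")


def _validate(value: str, pattern: re.Pattern, label: str, unit: str) -> str:
    """Return `value` unchanged if it is well formed and in range."""
    match = pattern.fullmatch(value)
    if match is None:
        raise ValueError(f"{label} must look like '+10{unit}' or '-10{unit}', got {value!r}")
    if int(match.group(1)) > MAX_OFFSET:
        raise ValueError(f"{label} must be within -{MAX_OFFSET}{unit} to +{MAX_OFFSET}{unit}, got {value!r}")
    return value


def validate_pitch(value: str) -> str:
    return _validate(value, _PITCH, "pitch", "Hz")


def validate_rate(value: str) -> str:
    return _validate(value, _RATE, "rate", "%")
```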

	## 🛠️ Local Development

	### Installation

	1. Clone the repository
	2. Install dependencies:
	   ```bash
	   pip install -r requirements.txt
	   ```
	3. Run the server:
	   ```bash
	   python app.py
	   ```
	4. Open http://localhost:7860 for API documentation

	### Docker Deployment

	```bash
	# Build the image
	docker build -t tts-api .

	# Run the container
	docker run -p 7860:7860 tts-api
	```

	## 🌐 Hugging Face Spaces Deployment

	1. Create a new Space on Hugging Face
	2. Choose "Docker" as the SDK
	3. Upload the following files:
	   - `app.py` (main application)
	   - `requirements.txt` (dependencies)
	   - `Dockerfile` (container configuration)
	   - `README.md` (this file)
	4. Your API will be publicly accessible once deployed!

	## 📋 Response Format

	### Successful Response
	- Content-Type: `audio/mpeg`
	- Body: MP3 audio file

	### Error Response
	```json
	{
	  "detail": "Error description"
	}
	```
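Because a success returns raw audio and a failure returns JSON, a client should look at the status code and `Content-Type` before writing anything to disk. The sketch below shows one way to do that, independent of the HTTP library used. `SynthesisError` and `extract_audio` are hypothetical names, and treating every non-2xx status as an error is an assumption about how the server reports failures.

```python
import json


class SynthesisError(Exception):
    """Raised when the API answers with an error payload instead of audio."""


def extract_audio(status_code: int, content_type: str, body: bytes) -> bytes:
    """Return the MP3 bytes on success, otherwise raise SynthesisError."""
    # Ignore any parameters after the media type, e.g. "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if 200 <= status_code < 300 and media_type == "audio/mpeg":
        return body
    try:
        # Error responses are documented as {"detail": "..."}.
        detail = json.loads(body.decode("utf-8"))["detail"]
    except (ValueError, KeyError, TypeError):
        # Not the documented shape: fall back to a short excerpt of the body.
        detail = body[:200].decode("utf-8", errors="replace")
    raise SynthesisError(f"HTTP {status_code}: {detail}")
```

With `requests`, this could be called as `extract_audio(response.status_code, response.headers.get("Content-Type", ""), response.content)`.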

	## 🔒 Rate Limiting & Usage

	This is a public API, but please use it responsibly:
	- Maximum text length: 5,000 characters
	- Recommended: Don't exceed 100 requests per minute
	- For production use, consider implementing authentication
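Nothing above says the server enforces the recommended request rate, so a well-behaved client may want to throttle itself. Here is a minimal sliding-window sketch that blocks once 100 calls have been made within the last 60 seconds. `MinuteThrottle` is a hypothetical class name, and the `clock` and `sleep` arguments exist only so the behavior can be tested without waiting.

```python
import time
from collections import deque


class MinuteThrottle:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self) -> float:
        """Block until another call is allowed; return the seconds slept."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        slept = 0.0
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest call falls out of the window.
            slept = self.period - (now - self._calls[0])
            self._sleep(slept)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return slept
```

Call `throttle.wait()` immediately before each request to the API.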

	## 🐛 Troubleshooting

	### Common Issues

	1. Voice not found: Use the `/voices` endpoint to check available voices
	2. Invalid parameters: Check pitch/rate format (must include Hz/% suffix)
	3. Text too long: Maximum 5,000 characters per request
	4. Network timeout: Large texts may take longer to process
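For texts over the 5,000-character limit, one workaround is to split the input on the client and synthesize each piece in its own request. The sketch below prefers sentence boundaries, then spaces, and only cuts mid-word as a last resort. `split_text` is a hypothetical helper, and how the resulting audio files are joined afterwards is outside its scope.

```python
MAX_CHARS = 5000  # documented maximum text length per request


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split `text` into chunks of at most `limit` characters."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        # Prefer the last sentence end that is followed by a space.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation mark with the chunk
        else:
            cut = window.rfind(" ")  # otherwise break at the last space
        if cut <= 0:
            cut = limit  # no boundary found: hard cut at the limit
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Smaller chunks also keep each request short, which may help with the timeouts mentioned in item 4.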

	## 📄 License

	This project uses the Microsoft Edge TTS service. Review Microsoft's terms of service for usage guidelines.

	## 🤝 Contributing

	Feel free to open issues or submit pull requests to improve this API!

	---

	Made with ❤️ for the Hugging Face community