anuragsingh922 committed · verified
Commit 606f718 · 1 Parent(s): d7dfeff

Update README.md

Files changed (1):
  1. README.md +25 -24
README.md CHANGED
@@ -1,5 +1,5 @@
- # **Realtime TTS System**
- This repository contains the complete codebase for building your personal Realtime Text-to-Speech (TTS) solution. It integrates a powerful TTS model, gRPC communication, an Express server, and a React-based client. Follow this guide to set up and explore the system effectively.

  ---

@@ -8,8 +8,8 @@ This repository contains the complete codebase for building your personal Realti
  ├── backend/ # Express server for handling API requests
  ├── frontend/ # React client for user interaction
  ├── .env # Environment variables (OpenAI API key, etc.)
- ├── voices # all available voices
- ├── demo # demo files of model
  ├── other...
  ```

@@ -20,8 +20,8 @@ This repository contains the complete codebase for building your personal Realti
  ### **Step 1: Clone the Repository**
  Clone this repository to your local machine:
  ```bash
- git clone https://huggingface.co/anuragsingh922/realtime-tts
- cd realtime-tts
  ```

  ---
@@ -52,7 +52,7 @@ pip install -r requirements.txt
  ```

  ### **Installing eSpeak**
- `eSpeak` is a necessary dependency for the TTS system. Follow the instructions below to install it on your platform:

  #### **Ubuntu/Linux**
  Use the `apt-get` package manager to install `eSpeak`:
@@ -134,12 +134,12 @@ This should output "Hello, world!" as audio on your system.

  ---

- ### **Step 6: Start the TTS Server**
  1. Add your OpenAI API key to the `.env` file:
     - Open `.env` in a text editor.
     - Replace `<openai_api_key>` with your actual OpenAI API key.

- 2. Start the TTS server:
  ```bash
  python3 app.py
  ```
@@ -149,36 +149,37 @@ This should output "Hello, world!" as audio on your system.
  ### **Step 7: Test the Full System**
  - Once all servers are running:
    1. Access the React client at [http://localhost:3000](http://localhost:3000).
- 2. Interact with the TTS system via the web interface.

  ---

  ## **Model Used**
- This project utilizes the [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) TTS model hosted on Hugging Face. The model generates high-quality, realtime text-to-speech outputs.

  ---

  ## **Key Features**
- 1. **Realtime TTS Generation**: Convert text input into speech with minimal latency.
  2. **React Client**: A user-friendly frontend for interaction.
- 3. **Express Backend**: Handles API requests and integrates the TTS system with external services.
- 4. **gRPC Communication**: Seamless communication between the TTS server and other components.
- 5. **Configurable APIs**: Supports OpenAI and Deepgram API integrations.

  ---

  ## **Dependencies**

  ### Python:
- - `torch`, `torchvision`, `torchaudio`
- - `phonemizer`
- - `transformers`
- - `scipy`
- - `munch`
- - `python-dotenv`
- - `openai`
- - `grpcio`, `grpcio-tools`
- - `espeak`

  ### Node.js:
  - Express server dependencies (`npm install` in `backend`).

+ # **VocRT**
+ This repository contains the complete codebase for building your personal Realtime Voice-to-Voice (V2V) solution. It integrates a powerful TTS model, gRPC communication, an Express server, and a React-based client. Follow this guide to set up and explore the system effectively.

  ---

  ├── backend/ # Express server for handling API requests
  ├── frontend/ # React client for user interaction
  ├── .env # Environment variables (OpenAI API key, etc.)
+ ├── voices # All available voices
+ ├── demo # Contains sample audio and demo files
  ├── other...
  ```

  ### **Step 1: Clone the Repository**
  Clone this repository to your local machine:
  ```bash
+ git clone https://huggingface.co/anuragsingh922/VocRT
+ cd VocRT
  ```

  ---
 
  ```

  ### **Installing eSpeak**
+ `eSpeak` is a necessary dependency for the VocRT system. Follow the instructions below to install it on your platform:

  #### **Ubuntu/Linux**
  Use the `apt-get` package manager to install `eSpeak`:
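Once eSpeak is installed, its presence on `PATH` can be sanity-checked before starting the server. A minimal stdlib sketch (the binary name `espeak` is an assumption; some distributions ship `espeak-ng` instead):

```python
import shutil

def espeak_available() -> bool:
    """Return True if an `espeak` (or `espeak-ng`) binary is on PATH."""
    return shutil.which("espeak") is not None or shutil.which("espeak-ng") is not None

print("eSpeak found:", espeak_available())
```

If this prints `False`, phoneme generation via `phonemizer` (listed under Dependencies) will fail at runtime, so it is worth checking before Step 6.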
 

  ---

+ ### **Step 6: Start the VocRT Server**
  1. Add your OpenAI API key to the `.env` file:
     - Open `.env` in a text editor.
     - Replace `<openai_api_key>` with your actual OpenAI API key.

+ 2. Start the VocRT server:
  ```bash
  python3 app.py
  ```
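The server presumably reads the key from `.env` via `python-dotenv` (listed under Dependencies). As an illustration of the pattern, here is a dependency-free sketch; the variable name `OPENAI_API_KEY` is an assumption, not taken from the repository:

```python
import os
import tempfile

def load_env(path: str) -> dict:
    """Parse simple KEY=VALUE lines from a .env file (blanks and # comments skipped)."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Demonstrate with a throwaway .env file.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# VocRT settings\nOPENAI_API_KEY=sk-example\n")
    path = fh.name

config = load_env(path)
print(config["OPENAI_API_KEY"])  # → sk-example
os.unlink(path)
```

In the real project, `python-dotenv`'s `load_dotenv()` does this parsing and exports the values into `os.environ`.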
 
  ### **Step 7: Test the Full System**
  - Once all servers are running:
    1. Access the React client at [http://localhost:3000](http://localhost:3000).
+   2. Interact with the VocRT system via the web interface.

  ---
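Before opening the browser, a quick reachability check can confirm the client is serving. A minimal stdlib sketch; only port 3000 is documented in this guide, the backend and gRPC ports are not stated here:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("React client reachable:", port_open("localhost", 3000))
```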

  ## **Model Used**
+ VocRT uses [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) for text-to-speech synthesis, processing user inputs into high-quality voice responses.

  ---

  ## **Key Features**
+ 1. **Realtime voice response generation**: Convert speech input into speech with minimal latency.
  2. **React Client**: A user-friendly frontend for interaction.
+ 3. **Express Backend**: Handles API requests and integrates the VocRT system with external services.
+ 4. **gRPC Communication**: Seamless communication between the VocRT server and other components.
+ 5. **Configurable APIs**: Integrates with OpenAI and Deepgram APIs for speech recognition and text generation.

  ---

  ## **Dependencies**

  ### Python:
+ - torch, torchvision, torchaudio
+ - phonemizer
+ - transformers
+ - scipy
+ - munch
+ - python-dotenv
+ - openai
+ - grpcio, grpcio-tools
+ - espeak
+
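The Python dependencies above can be verified as importable before launching the server. A stdlib sketch; note that some pip names differ from their import names (`python-dotenv` imports as `dotenv`, `grpcio` as `grpc`, `grpcio-tools` as `grpc_tools`), and `espeak` is the system package from the eSpeak step rather than a Python module, so it is excluded:

```python
import importlib.util

# Import names corresponding to the pip packages listed above.
MODULES = ["torch", "torchvision", "torchaudio", "phonemizer", "transformers",
           "scipy", "munch", "dotenv", "openai", "grpc", "grpc_tools"]

def missing_modules(modules) -> list:
    """Return the subset of `modules` that cannot be found by the import system."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

print("Missing:", missing_modules(MODULES))
```

An empty list means `pip install -r requirements.txt` covered everything.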

  ### Node.js:
  - Express server dependencies (`npm install` in `backend`).