test / README.md

dcrey7

	---
	title: Test
	emoji: 🐠
	colorFrom: pink
	colorTo: pink
	sdk: docker
	pinned: false
	---


	# AI-Powered Question & Answer Generator with Voice Cloning

	---

	## Overview

	This project combines a fine-tuned language model with voice cloning so that AI-generated answers are delivered in a cloned voice. Its main components are:

	1. Text Generation: A fine-tuned Mistral-7B-v0.1 model generates realistic, human-like answers to user-provided questions.
	2. Voice Cloning: The ElevenLabs API clones a voice and synthesizes the AI-generated answers into natural-sounding speech.
	3. Deception for Interaction: The system is designed to mislead players by making the responses appear to come from a real human.

	---

	## Key Features

	1. Fine-Tuned Model for Text Generation:
	- The project utilizes the Mistral-7B-v0.1 model fine-tuned on a custom dataset.
	- The model generates contextually accurate, human-like responses to a wide range of questions.

	2. Voice Cloning with ElevenLabs:
	- The ElevenLabs Voice Cloning API replicates a target voice (its Speech-to-Text feature is used only to transcribe spoken questions).
	- The cloned voice delivers the AI-generated answers in a natural and believable manner.

	3. Integration for Immersion:
	- The generated answers and synthesized speech are integrated to provide seamless interaction.
	- Designed for applications in gaming, interactive storytelling, or prank scenarios.

	---

	## How It Works

	### 1. Question Input:
	- Users provide a question in text form (e.g., "What’s the best way to prepare for a long flight?").
	- Alternatively, voice input can be transcribed to text using ElevenLabs’ speech-to-text feature.
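
The two input paths above can be sketched as a single helper that always returns clean text. This is a minimal illustration, not the project's code: `resolve_question` and its `transcribe` parameter are hypothetical names, and the transcription call itself (for example, a wrapper around ElevenLabs' speech-to-text feature) is left to the caller.

```python
from typing import Callable, Optional


def resolve_question(
    text: Optional[str] = None,
    audio: Optional[bytes] = None,
    transcribe: Optional[Callable[[bytes], str]] = None,
) -> str:
    """Return the question as clean text, from typed text or transcribed audio."""
    if text is not None and text.strip():
        question = text
    elif audio is not None:
        if transcribe is None:
            raise ValueError("audio input needs a transcribe callable")
        # `transcribe` is a hypothetical hook, e.g. a speech-to-text API wrapper.
        question = transcribe(audio)
    else:
        raise ValueError("provide either text or audio")
    # Collapse runs of whitespace so the model receives a single clean line.
    return " ".join(question.split())
```

Typed text takes priority when both inputs are supplied, which keeps the behavior predictable if a client sends a transcript alongside the recording.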

	### 2. Text Generation:
	- The Mistral-7B-v0.1 model processes the input question and generates a natural response.
	- Example:
	  - Question: "What’s your favorite place to relax?"
	  - Answer: "My room, where I can unwind and enjoy some quiet time."
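
A rough sketch of this step with the `transformers` text-generation pipeline follows. Several details are assumptions rather than facts from this README: the `Question:`/`Answer:` prompt template, the sampling settings, and the use of the base checkpoint `mistralai/Mistral-7B-v0.1` as a stand-in for the fine-tuned weights, whose location is not given here. The generator is passed in as a callable so the prompt handling can be exercised without loading the model.

```python
from typing import Callable

# Hypothetical prompt template; the project's fine-tuning format is not documented here.
PROMPT_TEMPLATE = "Question: {question}\nAnswer:"


def build_prompt(question: str) -> str:
    """Format a user question into the (assumed) prompt layout."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_answer(generated: str, prompt: str) -> str:
    """Drop the echoed prompt and keep only the first line of the completion."""
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    completion = completion.strip()
    return completion.splitlines()[0].strip() if completion else ""


def load_generator(model_id: str = "mistralai/Mistral-7B-v0.1") -> Callable[[str], str]:
    """Wrap a transformers pipeline; downloads model weights on first use."""
    from transformers import pipeline  # imported lazily because it is a heavy dependency

    pipe = pipeline("text-generation", model=model_id)

    def generate(prompt: str) -> str:
        # Sampling settings are illustrative assumptions, not the project's values.
        out = pipe(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
        return out[0]["generated_text"]

    return generate


def answer_question(question: str, generate: Callable[[str], str]) -> str:
    """Build the prompt, run the generator, and return the cleaned answer."""
    prompt = build_prompt(question)
    return extract_answer(generate(prompt), prompt)
```

With the real model, usage would look like `answer_question("What's your favorite place to relax?", load_generator())`. Keeping only the first completion line is one simple way to stop a base-style model from continuing with further invented questions.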

	### 3. Voice Cloning:
	- The generated text is sent to ElevenLabs’ API, where it is converted into speech using a cloned voice.
	- The synthesized voice is intended to sound human, with natural intonation and emotion.
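
As an illustration of this step, the sketch below builds a request for the ElevenLabs text-to-speech REST endpoint using only the standard library. It assumes the voice has already been cloned and that its voice ID is known. The function names, the default `model_id`, and the MP3 output file name are assumptions for illustration, and the project may well use an API wrapper library instead.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text: str, voice_id: str, api_key: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build (but do not send) a POST request for the text-to-speech endpoint."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice_id: str, api_key: str,
               out_path: str = "answer.mp3") -> str:
    """Send the request and save the returned audio bytes (assumed to be MP3)."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

Separating request construction from sending makes it possible to inspect the URL, headers, and payload without an API key or network access.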

	### 4. Output Delivery:
	- The final output is an audio response delivered in the cloned voice, intended to be hard to distinguish from a real human speaker.
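
The four steps can be tied together roughly as shown below. This is a sketch under stated assumptions: `SpokenAnswer` and `run_pipeline` are hypothetical names, and the text generator and speech synthesizer are injected as plain callables, so any model or API client with matching signatures can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpokenAnswer:
    question: str
    answer: str
    audio: bytes


def run_pipeline(
    question: str,
    generate_answer: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> SpokenAnswer:
    """Generate a text answer, then convert it to speech."""
    answer = generate_answer(question)
    if not answer.strip():
        # Avoid sending empty text to the speech service.
        raise ValueError("model returned an empty answer")
    return SpokenAnswer(question=question, answer=answer, audio=synthesize(answer))


if __name__ == "__main__":
    # Stand-in callables; real ones would wrap the model and the ElevenLabs API.
    result = run_pipeline(
        "What's your favorite place to relax?",
        generate_answer=lambda q: "My room, where I can unwind.",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(result.answer, len(result.audio))
```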

	---

	## Applications

	- Gaming: Use in trivia or role-playing games to simulate human-like NPCs.
	- Storytelling: Create immersive audio experiences by combining generated text with realistic voiceovers.
	- Social Experiments: Test human reactions to AI-generated, voice-synthesized responses in various scenarios.
	- Entertainment/Pranks: Surprise players or audiences with a system that convincingly mimics a real human.

	---

	## Technologies Used

	1. Mistral-7B-v0.1:
	- A 7-billion-parameter large language model, fine-tuned for this project to generate answers.
	- Delivers contextually accurate and relatable answers.

	2. ElevenLabs API:
	- Speech-to-Text: Converts spoken questions into text for the model to process.
	- Voice Cloning: Synthesizes text into speech using a cloned voice.

	3. Python:
	- Backend logic for integrating text generation, voice synthesis, and API calls.
	- Frameworks and libraries include `transformers`, `torch`, and API wrappers for ElevenLabs.

	---

	## Setup Instructions

	### 1. Clone the Repository:
	```bash
	git clone https://github.com/Lirone/NotMe.git
	cd NotMe