---
title: Sonisphere
emoji: 🐒
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.20.0
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Sonisphere Demo
This is a Hugging Face Spaces demo for [MMAudio](https://hkchengrex.com/MMAudio/), a powerful model for generating realistic audio for videos.
## 🎥 Features
- Upload any video and generate matching audio
- Control the generation with text prompts
- Adjust generation parameters like steps and guidance strength
- Process videos up to 30 seconds in length
## 🚀 Usage
1. Upload a video or use one of the example videos
2. Enter a text prompt describing the desired audio
3. (Optional) Add a negative prompt to specify what you don't want
4. Adjust the generation parameters if needed
5. Click "Submit" and wait for the generation to complete
## βš™οΈ Parameters
- **Prompt**: Describe the audio you want to generate
- **Negative prompt**: Specify what you don't want in the audio (default: "music")
- **Seed**: Control randomness (-1 for random seed)
- **Number of steps**: More steps = better quality but slower (default: 25)
- **Guidance Strength**: Controls how closely the generation follows the prompt (default: 4.5)
- **Duration**: Length of the generated audio in seconds (default: 8)
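If you drive the demo programmatically rather than through the UI, the parameters above map naturally to a small settings dictionary. The sketch below is illustrative only (the key names and the `resolve_seed` helper are assumptions, not the actual `app.py` API); it shows the documented defaults and the "-1 means random" seed convention:

```python
import random

# Defaults as documented in the Parameters section above.
# Key names are hypothetical; check app.py for the real argument names.
DEFAULTS = {
    "negative_prompt": "music",
    "seed": -1,
    "num_steps": 25,
    "guidance_strength": 4.5,
    "duration_s": 8,
}

def resolve_seed(seed: int) -> int:
    """Turn the UI's seed value into a concrete seed (-1 = pick randomly)."""
    if seed == -1:
        return random.randint(0, 2**31 - 1)
    return seed
```

For example, `resolve_seed(DEFAULTS["seed"])` returns a fresh random seed each call, while `resolve_seed(42)` reproduces a previous generation.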
## πŸ“ Notes
- Processing high-resolution videos (>384px on shorter side) takes longer and doesn't improve results
- The model works best with videos between 5 and 30 seconds long
- Generation time depends on video length and number of steps
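Since resolution above 384px on the shorter side only slows things down, it can help to downscale before uploading. A minimal sketch of that check (the function name and the 384px threshold's exact handling are illustrative, based on the note above):

```python
def downscale_factor(width: int, height: int, max_short_side: int = 384) -> float:
    """Return the scale factor that brings the shorter video side to
    max_short_side, or 1.0 if the video is already small enough.
    Per the note above, higher resolution does not improve results."""
    short = min(width, height)
    if short <= max_short_side:
        return 1.0
    return max_short_side / short
```

For a 1920x1080 video this gives 384/1080 ≈ 0.356, i.e. resize to about 683x384 before uploading.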
## 🔗 Links
- [Project Page](https://hkchengrex.com/MMAudio/)
- [GitHub Repository](https://github.com/hkchengrex/MMAudio)
- [Paper](https://arxiv.org/abs/2401.09774)
## 📜 License
This demo uses the MMAudio model which is released under the [MIT license](https://github.com/hkchengrex/MMAudio/blob/main/LICENSE).