---
license: cc0-1.0
---

# AEROMamba: Efficient Audio Super-Resolution

*AI-generated README. Original project: [GitHub](https://github.com/aeromamba-super-resolution/aeromamba) | [Demo](https://aeromamba-super-resolution.github.io/)*

---

## Model Overview

**Architecture**: Hybrid GAN + Mamba SSM
**Task**: 11.025 kHz → 44.1 kHz audio upsampling (4x)
**Key improvements over AERO**:

- 14x faster inference
- 5x less GPU memory usage
- Higher subjective score (66.47 vs. AERO's 60.03)

**Checkpoint**: [MUSDB18-HQ model](https://huggingface.co/KingNish/AEROMamba/blob/main/checkpoint.th)

---

## Quick Start

```bash
# Installation
pip install torch==1.12.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install causal-conv1d==1.1.2 mamba-ssm==1.1.3
```

```python
# Inference
import torchaudio

from src.models.aeromamba import AEROMamba

model = AEROMamba.load_from_checkpoint("checkpoint.th")
lr_audio, sr = torchaudio.load("low_res.wav")  # 11.025 kHz input
hr_audio = model(lr_audio)                     # 44.1 kHz output
```

---

## Performance (MUSDB18)

| Metric       | Low-Res | AERO  | AEROMamba |
|--------------|---------|-------|-----------|
| ViSQOL ↑     | 1.82    | 2.90  | **2.93**  |
| LSD ↓        | 3.98    | 1.34  | **1.23**  |
| Subjective ↑ | 38.22   | 60.03 | **66.47** |

**Hardware**: ~14x faster inference than AERO on an RTX 3090 (0.087 s vs. 1.246 s).

---

## Training Data

**MUSDB18-HQ**:

- 150 full-track music recordings
- 44.1 kHz originals paired with 11.025 kHz downsampled versions
- 87.5/12.5 train/validation split

---

## Citation

```bibtex
@inproceedings{Abreu2024lamir,
  author    = {Wallace Abreu and Luiz Wagner Pereira Biscainho},
  title     = {AEROMamba: Efficient Audio SR with GANs and SSMs},
  booktitle = {Proc. Latin American Music IR Workshop},
  year      = {2024}
}
```

*This README was AI-generated from the original project materials. For training code and OLA (overlap-add) inference scripts, see the [GitHub repo](https://github.com/aeromamba-super-resolution/aeromamba).*
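
The repository's OLA scripts are the reference for running inference on long recordings. As a rough illustration only, the hypothetical helper below applies the same chunk-and-blend idea on top of the `model` object from Quick Start; the chunk and overlap lengths, and the assumption that the model returns exactly 4x as many samples as it receives, are illustrative and not taken from the original project.

```python
# Hypothetical sketch of overlap-add (OLA) style chunked inference -- not the repo's script.
# Assumes `model` maps a (channels, T) tensor at 11.025 kHz to (channels, 4*T) at 44.1 kHz.
import torch

def chunked_super_resolve(model, lr_audio, lr_sr=11025, scale=4,
                          chunk_s=4.0, overlap_s=0.5):
    """Upsample (channels, T) low-res audio chunk by chunk, averaging overlaps."""
    chunk = int(chunk_s * lr_sr)              # chunk length in low-res samples
    hop = chunk - int(overlap_s * lr_sr)      # stride between chunk starts
    n = lr_audio.shape[-1]
    out = torch.zeros(lr_audio.shape[0], n * scale)
    weight = torch.zeros(n * scale)

    with torch.no_grad():
        for start in range(0, n, hop):
            lr_chunk = lr_audio[:, start:start + chunk]
            hr_chunk = model(lr_chunk)        # assumed shape: (channels, scale * len)
            lo = start * scale
            hi = lo + hr_chunk.shape[-1]
            out[:, lo:hi] += hr_chunk
            weight[lo:hi] += 1.0              # count overlapping contributions

    return out / weight.clamp(min=1.0)        # average where chunks overlap

# Usage (with `model` and `lr_audio` from Quick Start):
# hr_audio = chunked_super_resolve(model, lr_audio)
```

Simple averaging in the overlap regions keeps the sketch short; the project's own scripts may use windowed cross-fading instead.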