---
license: cc0-1.0
---
# AEROMamba: Efficient Audio Super-Resolution

*AI-generated README. Original project materials: GitHub | Demo*
## Model Overview

- **Architecture:** hybrid GAN + Mamba SSM
- **Task:** 11.025 kHz → 44.1 kHz audio super-resolution (4× upsampling)
- **Key improvements over AERO:**
  - 14× faster inference
  - 5× lower GPU memory usage
  - Subjective listening score of 66.47 (vs. 60.03 for AERO)
- **Checkpoint:** MUSDB18-HQ model
## Quick Start

```bash
# Installation: PyTorch with CUDA 11.3, then the Mamba dependencies
pip install torch==1.12.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install causal-conv1d==1.1.2 mamba-ssm==1.1.3
```

```python
# Inference
import torch
import torchaudio

from src.models.aeromamba import AEROMamba

model = AEROMamba.load_from_checkpoint("checkpoint.th")
lr_audio, sr = torchaudio.load("low_res.wav")   # 11.025 kHz input
with torch.no_grad():
    hr_audio = model(lr_audio)                  # 44.1 kHz output
```
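For full-length tracks, the repo provides overlap-add (OLA) inference scripts. The sketch below only illustrates the general chunk-and-crossfade idea; the `ola_super_resolve` helper, segment length, overlap, and Hann window are illustrative assumptions, not the repo's actual implementation.

```python
# Hypothetical overlap-add (OLA) inference sketch for long files.
# Segment/hop sizes and the Hann window are assumptions; see the repo's
# OLA scripts for the actual implementation.
import torch

def ola_super_resolve(model, lr_audio, sr_in=11025, scale=4,
                      seg_seconds=2.0, overlap=0.5):
    seg = int(seg_seconds * sr_in)           # low-res segment length
    hop = int(seg * (1.0 - overlap))         # low-res hop between segments
    out_len = lr_audio.shape[-1] * scale
    out = torch.zeros(lr_audio.shape[0], out_len)
    norm = torch.zeros(out_len)
    window = torch.hann_window(seg * scale)  # cross-fade window at 44.1 kHz

    for start in range(0, lr_audio.shape[-1], hop):
        chunk = lr_audio[..., start:start + seg]
        if chunk.shape[-1] < seg:            # zero-pad the final chunk
            chunk = torch.nn.functional.pad(chunk, (0, seg - chunk.shape[-1]))
        with torch.no_grad():
            hr_chunk = model(chunk)          # (channels, seg * scale)
        pos = start * scale
        n = min(hr_chunk.shape[-1], out_len - pos)
        out[..., pos:pos + n] += hr_chunk[..., :n] * window[:n]
        norm[pos:pos + n] += window[:n]

    return out / norm.clamp(min=1e-8)        # normalize overlapping windows
```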
## Performance (MUSDB18)

| Metric       | Low-Res | AERO  | AEROMamba |
|--------------|---------|-------|-----------|
| ViSQOL ↑     | 1.82    | 2.90  | 2.93      |
| LSD ↓        | 3.98    | 1.34  | 1.23      |
| Subjective ↑ | 38.22   | 60.03 | 66.47     |
Inference speed: 14× faster than AERO on an RTX 3090 (0.087 s vs. 1.246 s).
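For context, LSD (log-spectral distance) measures the RMS difference between log-power spectrograms of the reconstruction and the 44.1 kHz reference; lower is better. A minimal sketch of the standard formulation follows, with assumed STFT parameters (not necessarily those used in the paper):

```python
# Standard log-spectral distance (LSD); lower is better.
# STFT parameters here are illustrative, not the paper's exact setup.
import torch

def lsd(hr_ref, hr_est, n_fft=2048, hop=512):
    window = torch.hann_window(n_fft)
    S_ref = torch.stft(hr_ref, n_fft, hop, window=window, return_complex=True).abs()
    S_est = torch.stft(hr_est, n_fft, hop, window=window, return_complex=True).abs()
    log_ref = torch.log10(S_ref.clamp(min=1e-8) ** 2)
    log_est = torch.log10(S_est.clamp(min=1e-8) ** 2)
    # RMS over frequency bins, then mean over time frames
    return torch.sqrt(((log_ref - log_est) ** 2).mean(dim=-2)).mean()
```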
## Training Data

MUSDB18-HQ:
- 150 full-length music recordings
- 44.1 kHz originals paired with 11.025 kHz downsampled versions
- 87.5 / 12.5 train/validation split
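A minimal sketch of how such LR/HR training pairs can be produced with torchaudio; the file names and the use of `torchaudio.functional.resample` (rather than the original pipeline's exact decimation filter) are assumptions:

```python
# Create a low-res / high-res pair from a 44.1 kHz MUSDB18-HQ track.
# File names and the resampling method are assumptions; the original
# pipeline's downsampling filter may differ.
import torchaudio
import torchaudio.functional as F

hr_audio, sr = torchaudio.load("mixture.wav")                      # 44.1 kHz original
assert sr == 44100
lr_audio = F.resample(hr_audio, orig_freq=44100, new_freq=11025)   # 11.025 kHz input
torchaudio.save("mixture_lr.wav", lr_audio, 11025)
```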
## Citation

```bibtex
@inproceedings{Abreu2024lamir,
  author    = {Wallace Abreu and Luiz Wagner Pereira Biscainho},
  title     = {AEROMamba: Efficient Audio SR with GANs and SSMs},
  booktitle = {Proc. Latin American Music IR Workshop},
  year      = {2024}
}
```
This README was AI-generated based on original project materials. For training code and OLA inference scripts, visit the GitHub repo.