AEROMamba: Efficient Audio Super-Resolution
AI-Generated README - Original: GitHub | Demo
Model Overview
Architecture: Hybrid GAN + Mamba SSM
Task: 11.025 kHz → 44.1 kHz audio upsampling
Key Improvements:
- 14x faster inference vs AERO
- 5x less GPU memory usage
- 66.47 subjective score (vs AERO's 60.03)
Checkpoint: MUSDB18-HQ Model
Quick Start
# Installation
pip install torch==1.12.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install causal-conv1d==1.1.2 mamba-ssm==1.1.3
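# Inference
import torch
import torchaudio
from src.models.aeromamba import AEROMamba
model = AEROMamba.load_from_checkpoint("checkpoint.th")
model.eval()  # switch off training-only behavior
lr_audio, sr = torchaudio.load("low_res.wav")  # 11.025 kHz input; sr is its sample rate
with torch.no_grad():  # gradients are not needed for inference
    hr_audio = model(lr_audio)  # 44.1 kHz output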
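The closing note of this README points to OLA (overlap-add) inference scripts in the repo. Those scripts are not reproduced here; the following is only a minimal NumPy sketch of the general idea, for cases where a full track is too long to process in one pass. The function name `ola_upsample`, its parameters, and the stand-in "model" (which just repeats each sample four times) are all hypothetical.

```python
import numpy as np

def ola_upsample(signal, process_fn, chunk_len, overlap, scale=4):
    """Run process_fn on overlapping chunks and blend the results.

    Assumes process_fn maps a 1-D chunk of n samples to scale * n samples.
    """
    hop = chunk_len - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than chunk_len")
    out = np.zeros(len(signal) * scale)
    weight = np.zeros_like(out)
    for start in range(0, len(signal), hop):
        chunk = signal[start:start + chunk_len]
        y = np.asarray(process_fn(chunk), dtype=float)
        # Hann window without its zero endpoints, so every sample keeps a positive weight.
        w = np.hanning(len(y) + 2)[1:-1]
        o = start * scale
        out[o:o + len(y)] += w * y
        weight[o:o + len(y)] += w
        if start + chunk_len >= len(signal):
            break  # this chunk already reached the end of the signal
    # Normalize by the summed window weights to undo the overlap.
    return out / weight

# Stand-in for the model: repeat each sample 4 times (11.025 kHz -> 44.1 kHz length).
x = np.arange(100, dtype=float)
y = ola_upsample(x, lambda c: np.repeat(c, 4), chunk_len=32, overlap=8)
```

In practice `process_fn` would wrap the model call on one chunk. Normalizing by the accumulated window weights keeps the output level constant regardless of how many chunks overlap at a given sample.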
Performance (MUSDB18)
| Metric | Low-Res | AERO | AEROMamba |
|---|---|---|---|
| ViSQOL ↑ | 1.82 | 2.90 | 2.93 |
| LSD ↓ | 3.98 | 1.34 | 1.23 |
| Subjective ↑ | 38.22 | 60.03 | 66.47 |
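For readers unfamiliar with the LSD row, the sketch below computes log-spectral distance using one common definition: the per-frame root-mean-square difference of log power spectra, averaged over frames. The STFT settings used for the numbers in the table are not stated in this README, so `n_fft`, `hop`, the Hann window, and `eps` are assumptions, and the function names are hypothetical.

```python
import numpy as np

def stft_power(x, n_fft=2048, hop=512):
    """Power spectrogram of a 1-D signal, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def log_spectral_distance(reference, estimate, n_fft=2048, hop=512, eps=1e-10):
    """Mean over frames of the RMS difference between log10 power spectra."""
    p_ref = stft_power(reference, n_fft, hop)
    p_est = stft_power(estimate, n_fft, hop)
    diff = np.log10(p_ref + eps) - np.log10(p_est + eps)
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))
```

Both signals must have the same length and sample rate (here, the 44.1 kHz reference and the upsampled output). Identical signals give a distance of 0, which is why lower is better.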
Hardware: on an RTX 3090, inference takes 0.087 s with AEROMamba vs 1.246 s with AERO (about 14x faster)
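As a quick sanity check, the headline ratios can be recomputed from the figures reported in this section. This is plain arithmetic on the numbers above; the helper `relative_change` is a hypothetical name.

```python
# Timings and scores copied from the table and the hardware line in this README.
aero_time_s, aeromamba_time_s = 1.246, 0.087
speedup = aero_time_s / aeromamba_time_s

def relative_change(old, new):
    """Signed relative change from old to new, as a fraction of old."""
    return (new - old) / old

lsd_change = relative_change(1.34, 1.23)           # lower is better
subjective_change = relative_change(60.03, 66.47)  # higher is better

print(f"speed-up: {speedup:.1f}x")
print(f"LSD change: {lsd_change:+.1%}")
print(f"subjective score change: {subjective_change:+.1%}")
```

The timing ratio works out to roughly 14.3x, consistent with the rounded "14x" claim. Relative to AERO, LSD drops by about 8% and the subjective score rises by about 11%.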
Training Data
MUSDB18-HQ:
- 150 full-track music recordings
- 44.1 kHz originals → 11.025 kHz downsampled pairs
- 87.5/12.5 train-val split
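The README does not say which resampler was used to build the low-resolution inputs. As an illustration only, a 4x downsampled pair can be produced by low-pass filtering and keeping every fourth sample. The sketch below uses a windowed-sinc filter in NumPy; the function name `make_lr_hr_pair` and the `num_taps` value are hypothetical, and a library resampler such as `torchaudio.functional.resample` would be the more usual choice in a PyTorch pipeline.

```python
import numpy as np

def make_lr_hr_pair(hr, factor=4, num_taps=101):
    """Return (lr, hr), where lr is hr low-pass filtered and decimated by `factor`."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    # Windowed-sinc low-pass with cutoff at the new Nyquist frequency
    # (0.5 / factor cycles per sample), to limit aliasing before decimation.
    h = np.sinc(n / factor) * np.hamming(num_taps)
    h /= h.sum()  # unity gain at DC
    filtered = np.convolve(hr, h, mode="same")
    return filtered[::factor], hr

hr = np.ones(44100)  # one second of placeholder audio at 44.1 kHz
lr, _ = make_lr_hr_pair(hr)
```

One second at 44.1 kHz (44100 samples) becomes 11025 samples, matching the 11.025 kHz input rate the model expects.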
Citation
@inproceedings{Abreu2024lamir,
author = {Wallace Abreu and Luiz Wagner Pereira Biscainho},
title = {AEROMamba: Efficient Audio SR with GANs and SSMs},
booktitle = {Proc. Latin American Music IR Workshop},
year = {2024}
}
This README was AI-generated based on original project materials. For training code and OLA inference scripts, visit the GitHub repo.