# Whisper Base INT8 - Optimized for Intel iGPU
This is an INT8-quantized version of OpenAI's Whisper base model, optimized for Intel integrated GPUs with OpenVINO.
## Key Features
- ~3.7x smaller than FP32 (75 MB vs 280 MB)
- 2-4x faster inference on Intel iGPU
- INT8 asymmetric quantization
- 100% weights quantized to INT8
- OpenVINO 2024.0+ compatible (see the usage sketch below)
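
This card does not include a usage recipe, so the following is only a minimal sketch using `optimum-intel` on the OpenVINO backend. The model id, audio file, and `librosa` dependency are illustrative assumptions, not part of this card.

```python
# Minimal usage sketch (assumptions: optimum-intel with OpenVINO installed,
# 16 kHz mono input audio, and a placeholder model id; adjust to the actual repo).
import librosa
from optimum.intel.openvino import OVModelForSpeechSeq2Seq
from transformers import WhisperProcessor

model_id = "path/to/whisper-base-int8-openvino"  # placeholder, not the real repo id

processor = WhisperProcessor.from_pretrained(model_id)
model = OVModelForSpeechSeq2Seq.from_pretrained(model_id, device="GPU")  # Intel iGPU

# Whisper expects 16 kHz mono audio.
audio, _ = librosa.load("sample.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

generated_ids = model.generate(inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```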
## Performance
| Metric | Original | INT8 | Improvement |
|---|---|---|---|
| Model Size | 280 MB | 75 MB | 3.7x smaller |
| Inference Speed | 1.0x | 2-4x | 2-4x faster |
| Memory Bandwidth | 100% | 30-50% | 50-70% reduction |
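
The speedup you see depends on your hardware, drivers, and audio length. A rough way to sanity-check the latency on your own machine, reusing `model` and `inputs` from the loading sketch above (the run count is arbitrary):

```python
# Rough latency check on the iGPU; reuses `model` and `inputs` from the sketch above.
import time

model.generate(inputs.input_features)  # warm-up: the first run includes graph compilation

runs = 10  # arbitrary; increase for a more stable average
start = time.perf_counter()
for _ in range(runs):
    model.generate(inputs.input_features)
print(f"average decode latency: {(time.perf_counter() - start) / runs:.3f} s")
```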
## Optimized for Intel Hardware
- Intel Arc Graphics (A770, A750, A380)
- Intel Iris Xe Graphics (12th Gen+)
- Intel UHD Graphics (11th Gen+)
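
To confirm that OpenVINO can actually see your iGPU before loading the model, a quick check with the standard OpenVINO runtime API:

```python
# List the devices visible to OpenVINO and print the GPU's full name.
import openvino as ov

core = ov.Core()
print(core.available_devices)  # e.g. ['CPU', 'GPU'] once the iGPU driver is installed
if "GPU" in core.available_devices:
    print(core.get_property("GPU", "FULL_DEVICE_NAME"))
```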
## License
Apache 2.0
## Part of Unicorn Amanuensis
Professional STT suite: https://github.com/Unicorn-Commander/Unicorn-Amanuensis