Overview
🤗 Optimum provides an integration with BetterTransformer, a stable API from PyTorch that offers interesting speedups on CPU & GPU through sparsity and fused kernels.
Quickstart
Since version 1.13, PyTorch includes a stable version of BetterTransformer in its library. You can benefit from interesting speedups on most consumer-type devices, including CPUs and both older and more recent versions of NVIDIA GPUs.
You can now use this feature in 🤗 Optimum together with Transformers for the major models of the Hugging Face ecosystem.
Supported models
The list of supported models is below:
- ALBERT
- BART
- BERT
- BERT-generation
- CamemBERT
- CLIP
- Data2VecText
- DeiT
- DistilBERT
- Electra
- Ernie
- FSMT
- HuBERT
- LayoutLM
- MarkupLM
- MBart
- M2M100
- RemBERT
- RoBERTa
- Splinter
- Tapas
- ViLT
- ViT
- ViT-MAE
- ViT-MSN
- Wav2Vec2
- Whisper
- XLM-RoBERTa
- YOLOS
Note that for encoder-decoder models, only the encoder part is supported by PyTorch’s BetterTransformer for now.
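For instance, here is a minimal sketch of transforming an encoder-decoder model (using the facebook/bart-base checkpoint purely as an illustration): only the encoder layers end up converted.

```python
>>> from transformers import AutoModelForSeq2SeqLM
>>> from optimum.bettertransformer import BetterTransformer

>>> # BART is an encoder-decoder model: only its encoder layers are
>>> # swapped for their BetterTransformer equivalents, while the
>>> # decoder is left unchanged.
>>> model_hf = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
>>> model = BetterTransformer.transform(model_hf, keep_original_model=True)
```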
Let us know by opening an issue in 🤗 Optimum if you want more models to be supported, or check out the contribution guideline if you want to add support yourself!
Quick usage
To use the BetterTransformer API, just run the following snippet:
```python
>>> from transformers import AutoModelForSequenceClassification
>>> from optimum.bettertransformer import BetterTransformer

>>> model_hf = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
>>> model = BetterTransformer.transform(model_hf, keep_original_model=True)
```

You can set `keep_original_model=False` if you want to overwrite the current model with its BetterTransformer version.
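The transformed model is then used exactly like the original Transformers model. As a quick sanity check, here is a minimal sketch, assuming the same bert-base-cased checkpoint as above:

```python
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
>>> inputs = tokenizer("BetterTransformer is fast!", return_tensors="pt")
>>> # The forward pass is unchanged: same inputs, same outputs.
>>> outputs = model(**inputs)
```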
Check out the tutorials section for a deeper understanding of how to use it, or try the Google Colab demo!