This repository is a fork of philschmid/all-MiniLM-L6-v2-optimum-embeddings. My own ONNX conversion runs about 4x slower, with no discernible reason why: the quantized models seem roughly the same. The idea behind forking is that it also lets us, e.g., upgrade the Optimum library used.
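
To compare the two conversions head to head, a minimal latency sketch like the one below can help. It loads a quantized ONNX model through Optimum's `ORTModelForFeatureExtraction` and times repeated forward passes; the file name `model_quantized.onnx` and the run counts are assumptions, not confirmed details of this repo.

```python
# Minimal latency sketch (not from this repo): load the quantized ONNX model
# via Optimum and measure mean forward-pass latency. Swap model_id between the
# upstream repo and this fork to compare the two conversions.
import time

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "philschmid/all-MiniLM-L6-v2-optimum-embeddings"  # or this fork's id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# file_name is an assumption about how the quantized export is named.
model = ORTModelForFeatureExtraction.from_pretrained(
    model_id, file_name="model_quantized.onnx"
)

inputs = tokenizer("Benchmark sentence for latency measurement.", return_tensors="pt")

# Warm up so one-time initialization costs don't skew the measurement.
for _ in range(10):
    model(**inputs)

runs = 100
start = time.perf_counter()
for _ in range(runs):
    model(**inputs)
elapsed = time.perf_counter() - start
print(f"avg latency: {elapsed / runs * 1000:.2f} ms")
```

Running the same script against both repos on the same machine should make it clearer whether the slowdown comes from the exported graph itself or from something environmental (Optimum/onnxruntime versions, execution providers, thread settings).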
