The speed of obtaining embeddings on CPU/GPU
#3 opened by hiauiarau
Hello, could you please tell me the embedding generation speed on CPU and GPU in FP16 and FP32, and how it compares to BAAI/bge-m3? Also, is it possible to obtain the model in ONNX format?
In FP16 it's 10 times slower than BAAI/bge-m3.
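For anyone who wants to check a figure like this on their own hardware, here is a minimal timing sketch. It is not the method used for the number above. `benchmark_encode` and `dummy_encode` are hypothetical names made up for illustration, and the dummy encoder is only a stand-in so the snippet runs without downloading a model. In practice you would pass your real encode callable for each model (for example the `encode` method of a loaded embedding model) and compare the two throughputs.

```python
import time
from typing import Callable, Sequence


def benchmark_encode(
    encode: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    batch_size: int = 32,
    warmup: int = 1,
    repeats: int = 3,
) -> float:
    """Return the best observed throughput in texts per second."""
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    # Warm-up passes are not timed (lazy initialization, caches, kernel compilation).
    for _ in range(warmup):
        for batch in batches:
            encode(batch)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            encode(batch)
        best = min(best, time.perf_counter() - start)

    # Guard against a zero elapsed time on very fast stand-in encoders.
    return len(texts) / max(best, 1e-9)


def dummy_encode(batch: Sequence[str]):
    # Hypothetical stand-in for a real model: returns one fake 8-dim vector per text.
    return [[float(len(text))] * 8 for text in batch]


texts = [f"example sentence number {i}" for i in range(256)]
rate = benchmark_encode(dummy_encode, texts)
print(f"{rate:.0f} texts/s")
```

Two caveats when swapping in a real model: use the same texts, batch size, and precision for both models, and on GPU make sure the encode call has actually finished before the timer stops (with PyTorch, calling `torch.cuda.synchronize()` before reading the clock), otherwise asynchronous execution can make the numbers look better than they are.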