Update ONNX weights #24
Opened by Xenova (HF Staff)

This PR:
- Uses the external data format to save the model (so we no longer need the individual tensor files)
- Adds an fp16 version
- Slims model.onnx using onnxslim for a more optimized graph
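
The three steps above could be reproduced roughly as follows. This is only a sketch, not the script used for this PR: it assumes the `onnx`, `onnxslim` and `onnxconverter-common` packages are installed, and the helper names, the `_data` suffix for the weights file, and the `keep_io_types=True` choice are all assumptions made for illustration.

```python
from pathlib import Path


def external_data_name(onnx_path: str) -> str:
    """Sidecar weights file name, e.g. model.onnx -> model.onnx_data (assumed naming)."""
    return Path(onnx_path).name + "_data"


def save_with_external_data(model, onnx_path: str) -> None:
    """Write the graph to onnx_path and all large tensors to a single sidecar file."""
    import onnx  # imported lazily; assumed installed

    onnx.save_model(
        model,
        onnx_path,
        save_as_external_data=True,
        all_tensors_to_one_file=True,  # one weights file instead of one file per tensor
        location=external_data_name(onnx_path),  # relative to the .onnx file
        size_threshold=1024,  # tensors smaller than 1 KiB stay inside the graph file
    )


def convert(src: str, dst: str, dst_fp16: str) -> None:
    """Slim src, save it as dst (fp32), then save an fp16 copy as dst_fp16."""
    import onnx
    import onnxslim
    from onnxconverter_common import float16

    model = onnx.load(src)  # also pulls in any existing external tensor data
    slimmed = onnxslim.slim(model)
    save_with_external_data(slimmed, dst)

    # Reload so the fp16 step starts from a model with its tensor data in memory.
    fp32 = onnx.load(dst)
    fp16 = float16.convert_float_to_float16(
        fp32,
        keep_io_types=True,  # assumption: keep float32 inputs/outputs
        disable_shape_infer=True,  # shape inference is skipped for protos over 2 GB
    )
    save_with_external_data(fp16, dst_fp16)
```

A call might look like `convert("model.onnx", "out/model.onnx", "out/model_fp16.onnx")`, where the paths are hypothetical. Each `.onnx` file has to stay in the same directory as its `_data` file, because the graph refers to it by relative path.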
 
+--------------------+------------------------------------------+------------------------------------------+
|     Model Name     |                model.onnx                |                Op Set: 16                |
+--------------------+------------------------------------------+------------------------------------------+
|     Model Info     |              Original Model              |              Slimmed Model               |
+--------------------+------------------------------------------+------------------------------------------+
|   IN: input_ids    | int64: ('batch_size', 'sequence_length') | int64: ('batch_size', 'sequence_length') |
| IN: attention_mask | int64: ('batch_size', 'sequence_length') | int64: ('batch_size', 'sequence_length') |
|    IN: task_id     |               int64: None                |               int64: None                |
|  OUT: text_embeds  |         float32: ('batch_size',          |         float32: ('batch_size',          |
|                    |      'Addtext_embeds_dim_1', 1024)       |         'sequence_length', 1024)         |
|     OUT: 13049     |      float32: ('batch_size', 1024)       |      float32: ('batch_size', 1024)       |
+--------------------+------------------------------------------+------------------------------------------+
|        Add         |                   486                    |                   438                    |
|        Cast        |                   529                    |                    1                     |
|       Concat       |                   481                    |                   216                    |
|      Constant      |                   4047                   |                    0                     |
|  ConstantOfShape   |                   121                    |                    25                    |
|        Div         |                   337                    |                   121                    |
|       Einsum       |                    48                    |                    48                    |
|       Equal        |                    96                    |                    0                     |
|        Erf         |                    24                    |                    24                    |
|       Expand       |                    96                    |                    96                    |
|       Gather       |                   826                    |                   514                    |
|        Gemm        |                    1                     |                    1                     |
|       MatMul       |                   195                    |                   195                    |
|        Mul         |                   748                    |                   316                    |
|        Neg         |                    48                    |                    48                    |
|        Pow         |                    49                    |                    49                    |
|     ReduceMean     |                    98                    |                    98                    |
|      Reshape       |                   435                    |                   363                    |
|       Shape        |                   553                    |                   145                    |
|       Slice        |                   288                    |                   288                    |
|      Softmax       |                    24                    |                    24                    |
|       Split        |                    24                    |                    24                    |
|        Sqrt        |                    49                    |                    49                    |
|      Squeeze       |                    72                    |                    72                    |
|        Sub         |                    49                    |                    49                    |
|        Tanh        |                    1                     |                    1                     |
|     Transpose      |                    96                    |                    96                    |
|     Unsqueeze      |                   1057                   |                   409                    |
|       Where        |                   120                    |                    24                    |
+--------------------+------------------------------------------+------------------------------------------+
|     Model Size     |                 2.14 GB                  |            1.44 MB (2.14 GB)             |
+--------------------+------------------------------------------+------------------------------------------+
|    Elapsed Time    |                                       33.37 s                                       |
+--------------------+------------------------------------------+------------------------------------------+
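
To put the op counts above in perspective, the short sketch below totals the nodes before and after slimming. The numbers are copied from the table; the variable and function names are made up for illustration.

```python
# (original, slimmed) node counts copied from the onnxslim table above.
OP_COUNTS = {
    "Add": (486, 438),
    "Cast": (529, 1),
    "Concat": (481, 216),
    "Constant": (4047, 0),
    "ConstantOfShape": (121, 25),
    "Div": (337, 121),
    "Einsum": (48, 48),
    "Equal": (96, 0),
    "Erf": (24, 24),
    "Expand": (96, 96),
    "Gather": (826, 514),
    "Gemm": (1, 1),
    "MatMul": (195, 195),
    "Mul": (748, 316),
    "Neg": (48, 48),
    "Pow": (49, 49),
    "ReduceMean": (98, 98),
    "Reshape": (435, 363),
    "Shape": (553, 145),
    "Slice": (288, 288),
    "Softmax": (24, 24),
    "Split": (24, 24),
    "Sqrt": (49, 49),
    "Squeeze": (72, 72),
    "Sub": (49, 49),
    "Tanh": (1, 1),
    "Transpose": (96, 96),
    "Unsqueeze": (1057, 409),
    "Where": (120, 24),
}


def summarize(op_counts):
    """Return (original total, slimmed total, nodes removed, percent removed)."""
    original = sum(before for before, _ in op_counts.values())
    slimmed = sum(after for _, after in op_counts.values())
    removed = original - slimmed
    return original, slimmed, removed, 100.0 * removed / original


original, slimmed, removed, pct = summarize(OP_COUNTS)
print(f"{original} -> {slimmed} nodes ({removed} removed, {pct:.1f}% fewer)")

# Op types that no longer appear in the slimmed graph at all.
eliminated = sorted(op for op, (_, after) in OP_COUNTS.items() if after == 0)
print("fully removed:", ", ".join(eliminated))
```

More than half of the removed nodes are `Constant` nodes (4047 to 0), while the compute-heavy ops such as `MatMul`, `Einsum` and `Softmax` are unchanged.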
Hi @Xenova, thanks for your contribution!
Does the usage of the ONNX model change with this new format? We have an example in the README, so please update it if necessary. Also, how did you combine the external data into a single file? Could you please share the conversion code? I'd like to apply the same process to https://huggingface.co/jinaai/jina-colbert-v2
bwang0911 changed pull request status to merged