qaihm-bot committed e55e123 (verified) · Parent(s): 7d8eb66

Upload README.md with huggingface_hub

Files changed (1): README.md (+33 -10)
@@ -18,7 +18,7 @@ tags:
 
 VIT is a machine learning model that can classify images from the ImageNet dataset. It can also be used as a backbone for building more complex models for specific use cases.
 
-This model is an implementation of VIT found [here](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py).
 This repository provides scripts to run VIT on Qualcomm® devices.
 More details on model performance across various devices can be found
 [here](https://aihub.qualcomm.com/models/vit).
@@ -33,14 +33,23 @@ More details on model performance across various devices can be found
 - Number of parameters: 86.6M
 - Model size: 330 MB
 
-| Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model |
-|---|---|---|---|---|---|---|---|
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 19.822 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
-
-
 
 ## Installation
 
@@ -95,7 +104,17 @@ device. This script does the following:
 ```bash
 python -m qai_hub_models.models.vit.export
 ```
-
 
 ## How does this work?
@@ -193,15 +212,19 @@ provides instructions on how to use the `.so` shared library in an Android application
 Get more details on VIT's performance across various devices [here](https://aihub.qualcomm.com/models/vit).
 Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 
 ## License
-- The license for the original implementation of VIT can be found
-[here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
 
 ## References
 * [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929)
 * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py)
 
 ## Community
 * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
 * For questions or feedback please [reach out to us](mailto:[email protected]).
 
 
 VIT is a machine learning model that can classify images from the ImageNet dataset. It can also be used as a backbone for building more complex models for specific use cases.
 
+This model is an implementation of VIT found [here]({source_repo}).
 This repository provides scripts to run VIT on Qualcomm® devices.
 More details on model performance across various devices can be found
 [here](https://aihub.qualcomm.com/models/vit).
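The description above can be made concrete: ViT-B/16, the torchvision variant this model is based on, turns an image into a short sequence of patch tokens before classifying it. A minimal sketch of that tokenization arithmetic, assuming the standard 224×224 ImageNet input size (the hyperparameters below are the textbook ViT-B/16 values, not read from this repository's code):

```python
# Patch-tokenization arithmetic for ViT-B/16 (standard 224x224 ImageNet
# input assumed; hyperparameters are not extracted from this repository).
image_size = 224       # input height/width in pixels
patch_size = 16        # "an image is worth 16x16 words"
channels = 3           # RGB
hidden_dim = 768       # ViT-B embedding width

patches_per_side = image_size // patch_size        # 14
num_patches = patches_per_side ** 2                # 196 patch tokens
seq_len = num_patches + 1                          # plus one learnable class token
patch_values = patch_size * patch_size * channels  # raw values per patch

print(seq_len)       # 197 tokens enter the transformer encoder
print(patch_values)  # 768 values per patch, linearly projected to hidden_dim
```

The 197-token sequence (196 patches plus the class token) is what the encoder layers operate on; the class token's final embedding feeds the ImageNet classification head.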
 
 - Number of parameters: 86.6M
 - Model size: 330 MB
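The 86.6M figure above can be sanity-checked from the standard ViT-B/16 hyperparameters (hidden width 768, 12 encoder layers, 4x MLP ratio, 197 tokens, 1000 ImageNet classes). The layer shapes below are the textbook ViT-B/16 ones, assumed rather than read from this repository's model definition:

```python
# Rough parameter count for ViT-B/16 from its standard hyperparameters
# (assumed, not extracted from this repository).
hidden, layers, mlp, classes, tokens = 768, 12, 4 * 768, 1000, 197

params = 3 * 16 * 16 * hidden + hidden  # patch-embedding projection + bias
params += hidden                        # learnable class token
params += tokens * hidden               # position embeddings
per_layer = (
    2 * hidden                          # LayerNorm before attention
    + 3 * (hidden * hidden + hidden)    # Q, K, V projections
    + hidden * hidden + hidden          # attention output projection
    + 2 * hidden                        # LayerNorm before MLP
    + hidden * mlp + mlp                # MLP up-projection
    + mlp * hidden + hidden             # MLP down-projection
)
params += layers * per_layer            # 12 encoder layers
params += 2 * hidden                    # final LayerNorm
params += hidden * classes + classes    # classification head

print(round(params / 1e6, 1))  # 86.6
```

At FP32 (4 bytes per weight) this also roughly accounts for the stated model size: 86.6M weights × 4 bytes ≈ 346 MB, i.e. about 330 MiB.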
 
+| Model | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model |
+|---|---|---|---|---|---|---|---|---|
+| VIT | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | TFLITE | 19.821 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | ONNX | 15.505 ms | 0 - 193 MB | FP16 | NPU | [VIT.onnx](https://huggingface.co/qualcomm/VIT/blob/main/VIT.onnx) |
+| VIT | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | TFLITE | 16.903 ms | 0 - 382 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | ONNX | 11.372 ms | 0 - 149 MB | FP16 | NPU | [VIT.onnx](https://huggingface.co/qualcomm/VIT/blob/main/VIT.onnx) |
+| VIT | QCS8550 (Proxy) | QCS8550 Proxy | TFLITE | 19.788 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | SA8255 (Proxy) | SA8255P Proxy | TFLITE | 19.83 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | SA8775 (Proxy) | SA8775P Proxy | TFLITE | 20.031 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | SA8650 (Proxy) | SA8650P Proxy | TFLITE | 20.358 ms | 0 - 3 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | QCS8450 (Proxy) | QCS8450 Proxy | TFLITE | 24.972 ms | 0 - 368 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | TFLITE | 11.489 ms | 0 - 207 MB | FP16 | NPU | [VIT.tflite](https://huggingface.co/qualcomm/VIT/blob/main/VIT.tflite) |
+| VIT | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | ONNX | 9.01 ms | 1 - 112 MB | FP16 | NPU | [VIT.onnx](https://huggingface.co/qualcomm/VIT/blob/main/VIT.onnx) |
+| VIT | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 21.624 ms | 171 - 171 MB | FP16 | NPU | [VIT.onnx](https://huggingface.co/qualcomm/VIT/blob/main/VIT.onnx) |
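Since these are single-inference latencies, the table above translates directly into rough single-stream throughput (1000 ms per second divided by the per-inference latency). The figures below are copied from the table; the unit conversion is the only added logic:

```python
# Approximate single-stream throughput from the profiled latencies above.
latency_ms = {
    ("Samsung Galaxy S23", "TFLITE"): 19.821,
    ("Samsung Galaxy S23", "ONNX"): 15.505,
    ("Samsung Galaxy S24", "TFLITE"): 16.903,
    ("Samsung Galaxy S24", "ONNX"): 11.372,
    ("Snapdragon 8 Elite QRD", "ONNX"): 9.01,
}

# 1000 ms per second / latency per inference = inferences per second.
throughput = {k: 1000.0 / v for k, v in latency_ms.items()}
for (device, runtime), ips in sorted(throughput.items(), key=lambda kv: -kv[1]):
    print(f"{device} / {runtime}: {ips:.1f} inferences/s")
```

On the same Galaxy S23 run, for example, the ONNX runtime's 15.505 ms works out to roughly 1.28x the throughput of the 19.821 ms TFLite run.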
 
 ## Installation
 
 
 ```bash
 python -m qai_hub_models.models.vit.export
 ```
+```
+Profiling Results
+------------------------------------------------------------
+VIT
+Device                          : Samsung Galaxy S23 (13)
+Runtime                         : TFLITE
+Estimated inference time (ms)   : 19.8
+Estimated peak memory usage (MB): [0, 3]
+Total # Ops                     : 1579
+Compute Unit(s)                 : NPU (1579 ops)
+```
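The summary above is plain text printed by the export script. If you want the headline numbers programmatically, a small parser written against this exact sample works; the field names are assumptions about the general format, taken only from the output shown here:

```python
import re

# The profiling summary printed by the export script (sample from above).
summary = """\
Profiling Results
------------------------------------------------------------
VIT
Device                          : Samsung Galaxy S23 (13)
Runtime                         : TFLITE
Estimated inference time (ms)   : 19.8
Estimated peak memory usage (MB): [0, 3]
Total # Ops                     : 1579
Compute Unit(s)                 : NPU (1579 ops)
"""

# Collect "key : value" lines into a dict, splitting on the colon separator.
fields = {}
for line in summary.splitlines():
    m = re.match(r"(.+?)\s*:\s+(.+)", line)
    if m:
        fields[m.group(1).strip()] = m.group(2).strip()

latency_ms = float(fields["Estimated inference time (ms)"])
total_ops = int(fields["Total # Ops"])
print(latency_ms, total_ops)  # 19.8 1579
```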
 
 ## How does this work?
 
 Get more details on VIT's performance across various devices [here](https://aihub.qualcomm.com/models/vit).
 Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 
+
 ## License
+* The license for the original implementation of VIT can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
+* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+
+
 
 ## References
 * [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929)
 * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py)
 
+
+
 ## Community
 * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
 * For questions or feedback please [reach out to us](mailto:[email protected]).