Skriller0208 committed on
Commit
8dcd9f9
·
verified ·
1 Parent(s): 55d16cb

Delete README_sycl.md

  1. README_sycl.md +0 -249
README_sycl.md DELETED
@@ -1,249 +0,0 @@
1
- # whisper.cpp for SYCL
2
-
3
- [Background](#background)
4
-
5
- [OS](#os)
6
-
7
- [Intel GPU](#intel-gpu)
8
-
9
- [Linux](#linux)
10
-
11
- [Environment Variable](#environment-variable)
12
-
13
- [Known Issue](#known-issue)
14
-
15
- [Todo](#todo)
16
-
17
- ## Background
18
-
19
- SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators, such as CPUs, GPUs, and FPGAs. It is a single-source embedded domain-specific language based on pure C++17.
20
-
21
- oneAPI is a specification that is open and standards-based, supporting multiple architecture types including but not limited to GPU, CPU, and FPGA. The spec has both direct programming and API-based programming paradigms.
22
-
23
- Intel uses SYCL as its direct programming language to support CPUs, GPUs, and FPGAs.
24
-
25
- To avoid reinventing the wheel, this code follows other code paths in llama.cpp (such as OpenBLAS, cuBLAS, and CLBlast). We used the open-source tool [SYCLomatic](https://github.com/oneapi-src/SYCLomatic) (commercial release: [Intel® DPC++ Compatibility Tool](https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compatibility-tool.html)) to migrate the code to SYCL.
26
-
27
- whisper.cpp for SYCL is intended for Intel GPUs.
28
-
29
- For Intel CPUs, we recommend using whisper.cpp for x86 (Intel MKL build).
30
-
31
- ## OS
32
-
33
- |OS|Status|Verified|
34
- |-|-|-|
35
- |Linux|Supported|Ubuntu 22.04|
36
- |Windows|In progress| |
37
-
38
-
39
- ## Intel GPU
40
-
41
- |Intel GPU| Status | Verified Model|
42
- |-|-|-|
43
- |Intel Data Center Max Series| Supported| Max 1550|
44
- |Intel Data Center Flex Series| Supported| Flex 170|
45
- |Intel Arc Series| Supported| Arc A770|
46
- |Intel built-in Arc GPU| Supported| built-in Arc GPU in Meteor Lake|
47
- |Intel iGPU| Supported| iGPU in i5-1250P, i7-1165G7|
48
-
49
-
50
- ## Linux
51
-
52
- ### Setup Environment
53
-
54
- 1. Install the Intel GPU driver.
55
-
56
- a. Install the Intel GPU driver by following the official guide: [Install GPU Drivers](https://dgpu-docs.intel.com/driver/installation.html).
57
-
58
- Note: for iGPU, please install the client GPU driver.
59
-
60
- b. Add the user to the `video` and `render` groups.
61
-
62
- ```
63
- sudo usermod -aG render username
64
- sudo usermod -aG video username
65
- ```
66
-
67
- Note: log out and back in for the group change to take effect.
68
-
69
- c. Verify the driver installation:
70
-
71
- ```
72
- sudo apt install clinfo
73
- sudo clinfo -l
74
- ```
75
-
76
- Output (example):
77
-
78
- ```
79
- Platform #0: Intel(R) OpenCL Graphics
80
- `-- Device #0: Intel(R) Arc(TM) A770 Graphics
81
-
82
-
83
- Platform #0: Intel(R) OpenCL HD Graphics
84
- `-- Device #0: Intel(R) Iris(R) Xe Graphics [0x9a49]
85
- ```
86
-
87
- 2. Install the Intel® oneAPI Base Toolkit.
88
-
89
-
90
- a. Follow the procedure in [Get the Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html).
91
-
92
- We recommend installing to the default folder: **/opt/intel/oneapi**.
93
-
94
- The rest of this guide uses the default folder as the example. If you install to a different folder, adjust the paths in the following steps accordingly.
95
-
96
- b. Verify the installation:
97
-
98
- ```
99
- source /opt/intel/oneapi/setvars.sh
100
-
101
- sycl-ls
102
- ```
103
-
104
- There should be one or more Level Zero devices, such as **[ext_oneapi_level_zero:gpu:0]**.
105
-
106
- Output (example):
107
- ```
108
- [opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.10.0.17_160000]
109
- [opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i7-13700K OpenCL 3.0 (Build 0) [2023.16.10.0.17_160000]
110
- [opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO [23.30.26918.50]
111
- [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.26918]
112
-
113
- ```
114
-
115
- 3. Build locally:
116
-
117
- ```
118
- mkdir -p build
119
- cd build
120
- source /opt/intel/oneapi/setvars.sh
121
-
122
- #for FP16
123
- #cmake .. -DWHISPER_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DWHISPER_SYCL_F16=ON
124
-
125
- #for FP32
126
- cmake .. -DWHISPER_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
127
-
128
- #build example/main only
129
- #cmake --build . --config Release --target main
130
-
131
- #build all binaries
132
- cmake --build . --config Release -v
133
-
134
- ```
135
-
136
- or
137
-
138
- ```
139
- ./examples/sycl/build.sh
140
- ```
141
-
142
- Note:
143
-
144
- - By default, all binaries are built, which takes longer. To reduce build time, we recommend building **example/main** only.
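The `sycl-ls` check in the setup steps above can be scripted. The following is a minimal sketch, not part of whisper.cpp or oneAPI: the helper name `list_level_zero_gpus` is hypothetical, and the sample lines are abbreviated from the example output shown earlier. It prints the index of each Level Zero GPU entry found in `sycl-ls`-style text.

```shell
#!/bin/sh
# Hypothetical helper (not part of whisper.cpp or oneAPI): reads sycl-ls-style
# text on stdin and prints the index of every Level Zero GPU entry.
list_level_zero_gpus() {
    sed -n 's/^\[ext_oneapi_level_zero:gpu:\([0-9][0-9]*\)].*/\1/p'
}

# Sample input abbreviated from the example output in this README.
sample='[opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i7-13700K
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics'

printf '%s\n' "$sample" | list_level_zero_gpus
```

On a machine with oneAPI enabled, the real command can be piped in instead: `sycl-ls | list_level_zero_gpus`. Empty output means no Level Zero GPU entry was found.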
145
-
146
- ### Run
147
-
148
- 1. Put the model file in the **models** folder.
149
-
150
- 2. Enable the oneAPI runtime environment
151
-
152
- ```
153
- source /opt/intel/oneapi/setvars.sh
154
- ```
155
-
156
- 3. List the device IDs
157
-
158
- Run without parameters:
159
-
160
- ```
161
- ./build/bin/ls-sycl-device
162
-
163
- # or
164
-
165
- ./build/bin/main
166
- ```
167
-
168
- Check the device IDs in the startup log, for example:
169
-
170
- ```
171
- found 4 SYCL devices:
172
- Device 0: Intel(R) Arc(TM) A770 Graphics, compute capability 1.3,
173
- max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
174
- Device 1: Intel(R) FPGA Emulation Device, compute capability 1.2,
175
- max compute_units 24, max work group size 67108864, max sub group size 64, global mem size 67065057280
176
- Device 2: 13th Gen Intel(R) Core(TM) i7-13700K, compute capability 3.0,
177
- max compute_units 24, max work group size 8192, max sub group size 64, global mem size 67065057280
178
- Device 3: Intel(R) Arc(TM) A770 Graphics, compute capability 3.0,
179
- max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
180
-
181
- ```
182
-
183
- |Attribute|Note|
184
- |-|-|
185
- |compute capability 1.3|Level Zero runtime, recommended |
186
- |compute capability 3.0|OpenCL runtime, slower than Level Zero in most cases|
187
-
188
- 4. Set the device ID and run whisper.cpp
189
-
190
- Select device 0 with **GGML_SYCL_DEVICE=0**:
191
-
192
- ```
193
- GGML_SYCL_DEVICE=0 ./build/bin/main -m models/ggml-base.en.bin -f samples/jfk.wav
194
- ```
195
- or run the script:
196
-
197
- ```
198
- ./examples/sycl/run_whisper.sh
199
- ```
200
-
201
-
202
-
203
- 5. Check the device ID in the output
204
-
205
- For example:
206
- ```
207
- Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device
208
- ```
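The device list printed in step 3 can be filtered to find the recommended Level Zero entries. This is a minimal sketch under stated assumptions: the helper name `level_zero_device_ids` is hypothetical, and it assumes the log keeps the `Device N: ..., compute capability X.Y,` layout shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: reads the startup log on stdin and prints the ID of each
# device reported with compute capability 1.3 (Level Zero, per the table above).
level_zero_device_ids() {
    awk '$1 == "Device" && /compute capability 1\.3,/ { sub(":", "", $2); print $2 }'
}

# Sample log lines abbreviated from the example output in this README.
log='found 4 SYCL devices:
  Device 0: Intel(R) Arc(TM) A770 Graphics, compute capability 1.3,
  Device 1: Intel(R) FPGA Emulation Device, compute capability 1.2,
  Device 2: 13th Gen Intel(R) Core(TM) i7-13700K, compute capability 3.0,
  Device 3: Intel(R) Arc(TM) A770 Graphics, compute capability 3.0,'

printf '%s\n' "$log" | level_zero_device_ids
```

A printed ID can then be passed to whisper.cpp through `GGML_SYCL_DEVICE`, as in step 4.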
209
-
210
-
211
- ## Environment Variable
212
-
213
- #### Build
214
-
215
- |Name|Value|Function|
216
- |-|-|-|
217
- |WHISPER_SYCL|ON (mandatory)|Enable build with SYCL code path. <br>For FP32/FP16, WHISPER_SYCL=ON is mandatory.|
218
- |WHISPER_SYCL_F16|ON (optional)|Enable FP16 build with SYCL code path. For FP32, do not set it.|
219
- |CMAKE_C_COMPILER|icx|Use the icx compiler for the SYCL code path|
220
- |CMAKE_CXX_COMPILER|icpx|Use the icpx compiler for the SYCL code path|
221
-
222
- #### Running
223
-
224
-
225
- |Name|Value|Function|
226
- |-|-|-|
227
- |GGML_SYCL_DEVICE|0 (default) or 1|Set the ID of the device to use. Check the device IDs in the output of a default run|
228
- |GGML_SYCL_DEBUG|0 (default) or 1|Enable logging via the GGML_SYCL_DEBUG macro|
229
-
230
- ## Known Issue
231
-
232
- - Error: `error while loading shared libraries: libsycl.so.7: cannot open shared object file: No such file or directory`.
233
-
234
- Cause: the oneAPI runtime environment is not enabled.
235
-
236
- Solution: install the oneAPI Base Toolkit and enable it with `source /opt/intel/oneapi/setvars.sh`.
237
-
238
-
239
- - Hang during startup
240
-
241
- llama.cpp uses mmap by default to read the model file and copy it to the GPU. On some systems, memcpy misbehaves and blocks.
242
-
243
- Solution: add **--no-mmap**.
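For the first issue above (the missing `libsycl.so.7`), a quick pre-flight check can catch a shell where the oneAPI environment has not been enabled. This is a minimal sketch: the helper name `have_cmd` is hypothetical, and using the presence of `sycl-ls` on `PATH` as a proxy for a sourced `setvars.sh` is an assumption, not something this README guarantees.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: sycl-ls is only on PATH after setvars.sh has been sourced.
if have_cmd sycl-ls; then
    echo "oneAPI environment looks enabled"
else
    echo "sycl-ls not found; run: source /opt/intel/oneapi/setvars.sh"
fi
```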
244
-
245
- ## Todo
246
-
247
- - Support building on Windows.
248
-
249
- - Support multiple cards.