qqc1989 committed on
Commit ba565b0 · verified · 1 Parent(s): 62c7ab9

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
    - Int4
  ---
 
- # Qwen2.5-1.5B-Instruct-GPTQ-Int4
+ # Qwen2.5-3B-Instruct-GPTQ-Int4
 
  This version of Qwen2.5-3B-Instruct-GPTQ-Int4 has been converted to run on the Axera NPU using **w4a16** quantization.
 
@@ -76,10 +76,10 @@ http://localhost:12345
 
  #### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board
 
- Open another terminal and run `run_qwen2.5_1.5b_gptq_int4_ax650.sh`
+ Open another terminal and run `run_qwen2.5_3b_gptq_int4_ax650.sh`
 
  ```
- root@ax650:/mnt/qtang/llm-test/qwen2.5-1.5b# ./run_qwen2.5_3b_gptq_int4_ax650.sh
+ root@ax650:/mnt/qtang/llm-test/qwen2.5-3b# ./run_qwen2.5_3b_gptq_int4_ax650.sh
  [I][ Init][ 125]: LLM init start
  [I][ Init][ 26]: LLaMaEmbedSelector use mmap
  100% | ████████████████████████████████ | 39 / 39 [19.30s<19.30s, 2.02 count/s] init post axmodel ok,remain_cmm(1811 MB)