AXERA-TECH/Qwen2.5-3B-Instruct-GPTQ-Int4


qqc1989 committed on Feb 15

Commit ba565b0 (verified) · 1 parent: 62c7ab9

Update README.md

Files changed (1)

README.md +3 -3

@@ -11,7 +11,7 @@ tags:
 - Int4
 ---
-# Qwen2.5-1.5B-Instruct-GPTQ-Int4
 This version of Qwen2.5-3B-Instruct-GPTQ-Int4 has been converted to run on the Axera NPU using **w4a16** quantization.
@@ -76,10 +76,10 @@ http://localhost:12345
 #### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board
-Open another terminal and run `run_qwen2.5_1.5b_gptq_int4_ax650.sh`
 ```
-root@ax650:/mnt/qtang/llm-test/qwen2.5-1.5b# ./run_qwen2.5_3b_gptq_int4_ax650.sh
 [I][                            Init][ 125]: LLM init start
 [I][                            Init][  26]: LLaMaEmbedSelector use mmap
 100% | ████████████████████████████████ |  39 /  39 [19.30s<19.30s, 2.02 count/s] init post axmodel ok,remain_cmm(1811 MB)

 - Int4
 ---
+# Qwen2.5-3B-Instruct-GPTQ-Int4
 This version of Qwen2.5-3B-Instruct-GPTQ-Int4 has been converted to run on the Axera NPU using **w4a16** quantization.
 #### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board
+Open another terminal and run `run_qwen2.5_3b_gptq_int4_ax650.sh`
 ```
+root@ax650:/mnt/qtang/llm-test/qwen2.5-3b# ./run_qwen2.5_3b_gptq_int4_ax650.sh
 [I][                            Init][ 125]: LLM init start
 [I][                            Init][  26]: LLaMaEmbedSelector use mmap
 100% | ████████████████████████████████ |  39 /  39 [19.30s<19.30s, 2.02 count/s] init post axmodel ok,remain_cmm(1811 MB)
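
The last log line shown in the diff reports how many model parts were loaded and how much CMM memory remains after init. A minimal Python sketch for pulling those numbers out of such a line when checking a run is given below. The function name `parse_init_progress` and the regular expression are illustrative assumptions inferred from this single sample line; they are not part of the Axera tooling, and other log lines may use a different layout.

```python
import re

# Sample line copied from the log excerpt in the diff. The field layout is
# inferred from this one line only (assumption).
LINE = (
    "100% | ████████████████████████████████ |  39 /  39 "
    "[19.30s<19.30s, 2.02 count/s] init post axmodel ok,remain_cmm(1811 MB)"
)

# Hypothetical pattern: "<done> / <total> [timing] ... remain_cmm(<n> MB)"
PATTERN = re.compile(r"(\d+)\s*/\s*(\d+)\s*\[.*?\].*?remain_cmm\((\d+)\s*MB\)")


def parse_init_progress(line):
    """Return loaded/total counts and remaining CMM in MB, or None if absent."""
    match = PATTERN.search(line)
    if match is None:
        return None
    done, total, remain_mb = (int(group) for group in match.groups())
    return {"done": done, "total": total, "remain_cmm_mb": remain_mb}


if __name__ == "__main__":
    print(parse_init_progress(LINE))
```

Lines that do not carry a progress counter and a `remain_cmm(...)` field, such as `LLM init start`, yield `None`, so the function can be applied to every line of a captured log.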