vince62s committed on
Commit 5cdc95b · 1 Parent(s): d3b6ea2

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -23,14 +23,23 @@ Boston is a great city with many attractions to visit. Here are some popular one
 If you run with a batch size of 60 you can get a nice throughput even with GEMV:
 
 [2023-12-20 08:27:03,293 INFO] Loading checkpoint from /mnt/InternalCrucial4/dataAI/mistral-7B/mistral-instruct/mistral-onmt-awq.pt
+
 [2023-12-20 08:27:03,394 INFO] aawq_gemv compression of layer ['w_1', 'w_2', 'w_3', 'linear_values', 'linear_query', 'linear_keys', 'final_linear']
+
 [2023-12-20 08:27:08,346 INFO] Loading data into the model
+
 step0 time: 1.3734617233276367
+
 [2023-12-20 08:27:28,197 INFO] PRED SCORE: -0.2994, PRED PPL: 1.35 NB SENTENCES: 59
+
 [2023-12-20 08:27:28,197 INFO] Total translation time (s): 6.4
+
 [2023-12-20 08:27:28,197 INFO] Average translation time (ms): 109.1
+
 [2023-12-20 08:27:28,197 INFO] Tokens per second: 1835.8
+
 Time w/o python interpreter load/terminate: 24.914613008499146
 
 
 
+
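
The figures in the log can be cross-checked against each other. The sketch below is not part of the README or of any tool. It copies the logged values by hand and assumes that PRED SCORE is an average log-probability per token (so that PRED PPL is its negated exponential) and that the average translation time is the total time divided by the number of sentences. The variable names are illustrative only.

```python
import math

# Values copied by hand from the log above.
pred_score = -0.2994    # PRED SCORE (assumed: average log-probability per token)
pred_ppl = 1.35         # PRED PPL
nb_sentences = 59       # NB SENTENCES
total_time_s = 6.4      # Total translation time (s), as rounded in the log
avg_time_ms = 109.1     # Average translation time (ms)
tokens_per_s = 1835.8   # Tokens per second

# Perplexity recomputed from the score, under the assumption stated above.
ppl_from_score = math.exp(-pred_score)

# Total time recomputed from the per-sentence average (assumed: total / sentences).
time_from_avg_s = nb_sentences * avg_time_ms / 1000.0

# Rough estimate of the number of generated tokens; the log does not report it.
approx_tokens = tokens_per_s * time_from_avg_s

print(f"PPL from score: {ppl_from_score:.2f} (logged {pred_ppl})")
print(f"Total time from average: {time_from_avg_s:.1f} s (logged {total_time_s})")
print(f"Estimated generated tokens: {approx_tokens:.0f}")
```

If the assumptions hold, the recomputed perplexity and total time should agree with the logged values to the printed precision. The token count is only an estimate derived from rounded figures.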