Commit 4d53fc2 by carsonhxsu: update README
Parent(s): 19ac28c
README.md
CHANGED
@@ -12,7 +12,7 @@ lyraXVERSE is currently the **fastest XVERSE-13b** available. The inferen
 Among its main features are:
 - device: Nvidia GPU with Amperer architecture or Volta architecture (A100 or higher, V100).
 - batch_size: compiled with dynamic batch size, maximum depends on device.
-- MEMOPT mode: significantly optimized
+- MEMOPT mode: significantly optimized VRAM usage and increased speed
 
 We use the XVERSE-13B-Chat model for measurement, but this optimized inference is also applicable to XVERSE-13B model.
 