Update README.md
README.md (CHANGED)
@@ -71,14 +71,14 @@ Currently supports the following LLMs, including Hunyuan-Dense, Hunyuan-MoE, Qwe
### Speculative Decoding
The Eagle3 weights for the Qwen3 series models are now available.

-| Qwen3 Models |
-| ----------|
-| [Qwen3-1.7B](https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3) |
-| [Qwen3-4B](https://huggingface.co/AngelSlim/Qwen3-4B_eagle3) |
-| [Qwen3-8B](https://huggingface.co/AngelSlim/Qwen3-8B_eagle3) |
-| [Qwen3-14B](https://huggingface.co/AngelSlim/Qwen3-14B_eagle3) |
-| [Qwen3-32B](https://huggingface.co/AngelSlim/Qwen3-32B_eagle3) |
-| [Qwen3-30B-A3B](https://huggingface.co/AngelSlim/Qwen3-a3B_eagle3) |
+| Qwen3 Models | Hunyuan Models |
+| ----------|----------|
+| ✅ [Qwen3-1.7B](https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3) | ✅ [Hunyuan-1.8B-Instruct](https://huggingface.co/AngelSlim/Hunyuan-1.8B-Instruct_eagle3) |
+| ✅ [Qwen3-4B](https://huggingface.co/AngelSlim/Qwen3-4B_eagle3) | ✅ [Hunyuan-4B-Instruct](https://huggingface.co/AngelSlim/Hunyuan-4B-Instruct_eagle3) |
+| ✅ [Qwen3-8B](https://huggingface.co/AngelSlim/Qwen3-8B_eagle3) | ✅ [Hunyuan-7B-Instruct](https://huggingface.co/AngelSlim/Hunyuan-7B-Instruct_eagle3) |
+| ✅ [Qwen3-14B](https://huggingface.co/AngelSlim/Qwen3-14B_eagle3) |
+| ✅ [Qwen3-32B](https://huggingface.co/AngelSlim/Qwen3-32B_eagle3) |
+| ✅ [Qwen3-30B-A3B](https://huggingface.co/AngelSlim/Qwen3-a3B_eagle3) |

## 🛠️How to Use

@@ -277,6 +277,8 @@ Benchmark results for other models with `FP8-Static`, `FP8-Dynamic`, `INT4-GPTQ`
</table>

### (2) Speculative Decoding
+
+#### Qwen3 Series Models
Benchmark results for Qwen3 series models with the `Eagle3` speculative decoding algorithm on datasets including `MT-bench`, `HumanEval`, `GSM8K`, and `Alpaca`:

<table>
@@ -310,6 +312,34 @@ Benchmark results for Qwen3 series models with `Eagle3` speculative decoding alg
</tbody>
</table>

+#### Hunyuan Series Models
+Benchmark results for Hunyuan series models with the `Eagle3` speculative decoding algorithm on datasets including `MT-bench`, `HumanEval`, `GSM8K`, and `Alpaca`:
+
+<table>
+<thead>
+<tr>
+<th> </th><th> </th>
+<th colspan="2" style="text-align: center; vertical-align: middle;">MT-bench</th>
+<th colspan="2" style="text-align: center; vertical-align: middle;">HumanEval</th>
+<th colspan="2" style="text-align: center; vertical-align: middle;">GSM8K</th>
+<th colspan="2" style="text-align: center; vertical-align: middle;">Alpaca</th>
+<th colspan="2" style="text-align: center; vertical-align: middle;">Mean</th></tr>
+<tr><th>Temperature</th><th>Model</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th></tr>
+</thead>
+<tbody>
+<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=0</strong></td></tr> -->
+<tr><td rowspan="3"><strong>T=0</strong></td>
+<td>Hunyuan-1.8B-Instruct</td><td>1.97x</td><td>2.90</td><td>2.58x</td><td>3.73</td><td>2.61x</td><td>3.71</td><td>1.71x</td><td>2.43</td><td>2.22x</td><td>3.19</td></tr>
+<tr><td>Hunyuan-4B-Instruct</td><td>1.77x</td><td>2.60</td><td>2.64x</td><td>3.35</td><td>2.14x</td><td>3.17</td><td>1.72x</td><td>2.57</td><td>2.07x</td><td>2.92</td></tr>
+<tr><td>Hunyuan-7B-Instruct</td><td>2.22x</td><td>3.58</td><td>3.59x</td><td>5.47</td><td>2.96x</td><td>4.68</td><td>1.64x</td><td>2.56</td><td>2.60x</td><td>4.07</td></tr>
+<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=1</strong></td></tr> -->
+<tr><td rowspan="3"><strong>T=1</strong></td>
+<td>Hunyuan-1.8B-Instruct</td><td>1.58x</td><td>2.36</td><td>2.35x</td><td>3.56</td><td>2.23x</td><td>3.38</td><td>1.26x</td><td>1.87</td><td>1.86x</td><td>2.79</td></tr>
+<tr><td>Hunyuan-4B-Instruct</td><td>1.36x</td><td>2.05</td><td>1.97x</td><td>2.86</td><td>1.72x</td><td>2.68</td><td>1.14x</td><td>1.76</td><td>1.55x</td><td>2.34</td></tr>
+<tr><td>Hunyuan-7B-Instruct</td><td>1.90x</td><td>3.11</td><td>3.12x</td><td>5.09</td><td>2.74x</td><td>4.34</td><td>1.47x</td><td>2.39</td><td>2.31x</td><td>3.73</td></tr>
+</tbody>
+</table>
+
## 📝 License

The code for this project is open-sourced under the [License for AngelSlim](LICENSE).
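The links added in the first hunk point at ordinary Hugging Face model repositories, so the new Eagle3 draft weights can be fetched with the standard `huggingface_hub` client. Below is a minimal sketch of pulling one of the newly added Hunyuan checkpoints; the repo id comes from the table above, while the local directory name and the downstream serving step are illustrative assumptions, not part of this commit.

```python
# Minimal sketch: download one of the newly published Eagle3 draft checkpoints.
# Only assumes the repos listed in the table above are public Hugging Face model
# repos; the target directory and any serving setup are illustrative.
from huggingface_hub import snapshot_download

# Repo id taken from the table added in this commit.
draft_repo = "AngelSlim/Hunyuan-1.8B-Instruct_eagle3"

# Fetch the draft-model weights; returns the local path of the snapshot.
local_path = snapshot_download(repo_id=draft_repo, local_dir="eagle3/hunyuan-1.8b-instruct")

print(f"Eagle3 draft weights downloaded to: {local_path}")
# This path can then be handed to whichever Eagle3-capable inference engine you
# use for speculative decoding, alongside the corresponding base model
# (here Hunyuan-1.8B-Instruct).
```

The same call works for any repo in the table, e.g. `AngelSlim/Qwen3-8B_eagle3` from the Qwen3 column.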
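For readers scanning the new Hunyuan benchmark table, the `Mean` columns are consistent with an unweighted arithmetic average of the four per-dataset values; that interpretation is inferred from the numbers, not stated in the README. A small self-contained check against the T=0 rows added above:

```python
# Quick sanity check (assumption: the "Mean" column is the unweighted average of
# the four per-dataset values). Numbers copied from the T=0 rows in the new table.
rows = {
    "Hunyuan-1.8B-Instruct": {"speedup": [1.97, 2.58, 2.61, 1.71], "tau": [2.90, 3.73, 3.71, 2.43]},
    "Hunyuan-4B-Instruct":   {"speedup": [1.77, 2.64, 2.14, 1.72], "tau": [2.60, 3.35, 3.17, 2.57]},
    "Hunyuan-7B-Instruct":   {"speedup": [2.22, 3.59, 2.96, 1.64], "tau": [3.58, 5.47, 4.68, 2.56]},
}

for model, vals in rows.items():
    mean_speedup = sum(vals["speedup"]) / len(vals["speedup"])
    mean_tau = sum(vals["tau"]) / len(vals["tau"])
    # Reproduces the table's Mean column (2.22x/3.19, 2.07x/2.92, 2.60x/4.07) up to rounding.
    print(f"{model}: mean speedup {mean_speedup:.2f}x, mean tau {mean_tau:.2f}")
```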