Improve model card: title, metadata, project page
This PR makes several improvements to the model card:
- Corrects the main title to "Step-Audio 2 Technical Report".
- Adds `pipeline_tag: any-to-any` to the metadata, ensuring the model is discoverable under this modality.
- Includes `library_name: transformers` in the metadata, enabling direct integration and "how to use" widgets on the Hub.
- Adds clear, explicit links to the paper, official project page, and code repository for better visibility and user experience.
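
For reviewers who want to apply the same metadata change to other model cards, here is a minimal sketch in Python. It assumes the README begins with a simple YAML front matter block of flat `key: value` lines, as this one does. The helper name `add_front_matter_keys` is hypothetical and is not part of this PR or of any Hub library.

```python
def add_front_matter_keys(readme_text: str, new_keys: dict[str, str]) -> str:
    """Insert missing `key: value` pairs into a README's YAML front matter.

    Hypothetical helper: handles only flat `key: value` lines, which is all
    this model card uses. Keys that are already present are left untouched.
    """
    lines = readme_text.split("\n")

    # No front matter yet: create a new block at the top of the file.
    if not lines or lines[0].strip() != "---":
        header = ["---"] + [f"{k}: {v}" for k, v in new_keys.items()] + ["---", ""]
        return "\n".join(header + lines)

    # Locate the closing `---` delimiter of the existing block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not terminated by a closing '---'")

    # Collect keys already declared so nothing is duplicated.
    existing = {line.split(":", 1)[0].strip() for line in lines[1:end] if ":" in line}
    additions = [f"{k}: {v}" for k, v in new_keys.items() if k not in existing]

    # Append the new keys just before the closing delimiter.
    return "\n".join(lines[:end] + additions + lines[end:])


# Usage: the same two keys this PR adds to the metadata.
readme = '---\nlicense: apache-2.0\n---\n\n<div align="center">\n'
updated = add_front_matter_keys(
    readme,
    {"pipeline_tag": "any-to-any", "library_name": "transformers"},
)
```

Because keys that already exist are skipped, running the helper twice leaves the file unchanged.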
README.md
CHANGED
@@ -1,5 +1,7 @@
 ---
 license: apache-2.0
+pipeline_tag: any-to-any
+library_name: transformers
 ---
 
 <div align="center">
@@ -21,6 +23,12 @@ license: apache-2.0
 <a href="https://github.com/stepfun-ai/Step-Audio2/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Apache%202.0-blue?&color=blue"/></a>
 </div>
 
+# Step-Audio 2 Technical Report
+
+**Paper**: [Step-Audio 2 Technical Report](https://arxiv.org/abs/2507.16632)
+**Project Page**: [Step-Audio 2 Documentation](https://www.stepfun.com/docs/en/step-audio2)
+**Code**: [GitHub Repository](https://github.com/stepfun-ai/Step-Audio2)
+
 ## Introduction
 
 
@@ -133,7 +141,7 @@ CER for Chinese, Cantonese and Japanese and WER for Arabian and English. N/A ind
 <td align="center"><strong>2.71</strong></td>
 <td align="center">4.47</td>
 <td align="center">5.05</td>
-<
+<align="center">3.03</align>
 <td align="center">3.05</td>
 </tr>
 <tr>
@@ -190,6 +198,7 @@ CER for Chinese, Cantonese and Japanese and WER for Arabian and English. N/A ind
 <td align="center">7.01</td>
 <td align="center">2.68</td>
 <td align="center"><strong>2.53</strong></td>
+<td align="center">2.53</td>
 </tr>
 <tr>
 <td align="left">KeSpeech phase1</td>
@@ -854,3 +863,7 @@ The model and code in the repository is licensed under [Apache 2.0](LICENSE) Lic
 url={https://arxiv.org/abs/2507.16632},
 }
 ```
+
+## Star History
+
+[](https://star-history.com/#stepfun-ai/Step-Audio2&Date)