khanhld3 committed
Commit: 06f1304 · Parent(s): 7e37373
[test] init

Files changed:
- .gitignore +0 -2
- README.md +38 -10
.gitignore
CHANGED
@@ -1,2 +0,0 @@
-push_hf.py
-pytorch_model.pt
README.md
CHANGED
@@ -67,26 +67,28 @@ model-index:
 
 # **ChunkFormer-Large-Vie: Large-Scale Pretrained ChunkFormer for Vietnamese Automatic Speech Recognition**
 [](https://creativecommons.org/licenses/by-nc/4.0/)
+[](https://github.com/khanld/chunkformer)
 [](https://your-paper-link)
 
+### Table of contents
 1. [Model Description](#description)
-2. [Implementation](#implementation)
-3. [Benchmark
-4. [
-5. [Evaluation](#evaluation)
+2. [Documentation and Implementation](#implementation)
+3. [Benchmark Results](#benchmark)
+4. [Usage](#usage)
 6. [Citation](#citation)
 7. [Contact](#contact)
+---
 
 <a name = "description" ></a>
+### Model Description
+**ChunkFormer-Large-Vie** is a large-scale Vietnamese Automatic Speech Recognition (ASR) model based on the **ChunkFormer** architecture, introduced at **ICASSP 2025**. The model was fine-tuned on approximately **2000 hours** of Vietnamese speech data sourced from diverse datasets.
+
 <a name = "implementation" ></a>
 ### Documentation and Implementation
+The [documentation](#) and [implementation](#) of ChunkFormer are publicly available.
 
 <a name = "benchmark" ></a>
-### Benchmark
+### Benchmark Results
 | STT | Model | Vios | Common Voice | VLSP - Task 1 | Avg. |
 |-----|--------------|------|--------------|---------------|------|
 | 1 | ChunkFormer | x | x | x | x |
@@ -94,6 +96,9 @@ We provide the documentation and implementation of ChunkFormer, check it out [HE
 | 3 | X | x | x | x | x |
 | 4 | Y | x | x | x | x |
 
+---
+
+
 <a name = "usage" ></a>
 ### Usage
 
@@ -125,4 +130,27 @@ python decode.py \
 --right_context_size 128
 ```
 
+---
+
+<a name = "citation" ></a>
+### Citation
+If you use this work in your research, please cite:
+
+```bibtex
+@inproceedings{your_paper,
+  title={ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription},
+  author={Khanh Le and Tuan Vu Ho and Dung Tran and Duc Thanh Chau},
+  booktitle={ICASSP},
+  year={2025}
+}
+```
+
+<a name = "contact"></a>
+### Contact
+
+- [](https://github.com/)
+- [](https://www.linkedin.com/in/khanhld257/)
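Note on the Usage hunk: the diff shows only the tail of the decode command (`--right_context_size 128`); the rest of the invocation lies outside the hunk. Below is a minimal sketch of what a full long-form decoding call could look like. Everything other than `--right_context_size` is an assumption: `--model_checkpoint`, `--long_form_audio`, `--chunk_size`, and `--left_context_size` are hypothetical flag names chosen to illustrate ChunkFormer's chunked attention with configurable left and right context, so defer to the linked repository's decode.py for the actual interface.

```bash
# Sketch only. The sole flag confirmed by this diff is --right_context_size 128;
# every other flag and path below is an assumed, illustrative name.
python decode.py \
    --model_checkpoint path/to/chunkformer-large-vie \
    --long_form_audio path/to/long_audio.wav \
    --chunk_size 64 \
    --left_context_size 128 \
    --right_context_size 128
```

In a chunked-attention decoder, the right-context size bounds how many future frames each chunk may attend to, so larger values typically improve accuracy near chunk boundaries at the cost of added latency.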