init
README.md
CHANGED
@@ -19,39 +19,8 @@ additional postprocessing stacks integrated as [`pipeline`](https://huggingface.
These libraries are merged into Kotoba-Whisper-v2.1 via pipeline and will be applied seamlessly to the transcription predicted by [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0).
The pipeline has been developed through a collaboration between [Asahi Ushio](https://asahiushio.com) and [Kotoba Technologies](https://twitter.com/kotoba_tech).

-
-The following table presents the raw CER (unlike the usual CER, where punctuation is removed before computing the metric; see the evaluation script [here](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1/blob/main/run_short_form_eval.py)),
-along with the raw CER of the other Kotoba-Whisper releases and the OpenAI Whisper models for comparison.
-
-| model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
-|:-----------------------------------------------------------------------------------------------------------------|------:|------:|------:|
-| [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) | 17.6 | 15.4 | 17.4 |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) | 17.7 | 15.4 | 17.0 |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator + stable-ts) | 17.7 | 15.4 | 17.0 |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator) | 17.7 | 15.4 | 17.0 |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (stable-ts) | 17.7 | 15.4 | 17.0 |
-| [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 17.8 | 15.2 | 17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) | 17.9 | 15.0 | 17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator + stable-ts) | 17.9 | 15.0 | 17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator) | 17.9 | 15.0 | 17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (stable-ts) | 17.9 | 15.0 | 17.8 |
-| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 15.3 | 13.4 | 20.5 |
-| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 15.9 | 10.6 | 34.6 |
-| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 16.6 | 11.3 | 40.7 |
-| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 17.9 | 13.1 | 39.3 |
-| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 34.5 | 26.4 | 76.0 |
-| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 21.5 | 18.9 | 48.1 |
-| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 58.8 | 38.3 | 153.3 |
-
-Regarding the normalized CER: since the updates introduced in v2.1 are removed by the normalization, kotoba-tech/kotoba-whisper-v2.1 marks the same CER values as [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0).
-
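To make the distinction concrete: the raw CER keeps punctuation in both the reference and the prediction, while the normalized CER strips it first, which is why the v2.1 punctuator changes the former but not the latter. The sketch below is a simplified illustration (not the logic of `run_short_form_eval.py`) that computes both with the `evaluate` library's CER metric.

```python
# Illustrative only: raw CER vs. a simplified punctuation-stripped ("normalized") CER.
# Requires: pip install evaluate jiwer
import re

import evaluate

cer_metric = evaluate.load("cer")

reference = "こんにちは、世界。今日は良い天気です。"
prediction = "こんにちは世界今日は良い天気です"  # e.g. ASR output without punctuation

# Raw CER: punctuation stays in, so the missing "、" and "。" count as errors.
raw_cer = cer_metric.compute(predictions=[prediction], references=[reference])

def strip_punctuation(text: str) -> str:
    """Drop common Japanese/ASCII punctuation (a stand-in for the eval script's normalizer)."""
    return re.sub(r"[、。・「」,.!?！？\s]", "", text)

# Normalized CER: both sides are stripped, so punctuation differences vanish.
normalized_cer = cer_metric.compute(
    predictions=[strip_punctuation(prediction)],
    references=[strip_punctuation(reference)],
)

print(f"raw CER: {raw_cer:.3f}  normalized CER: {normalized_cer:.3f}")
```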
-### Latency
-Please refer to the latency section of kotoba-whisper-v1.1 [here](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1#latency).
-
## Transformers Usage
-Kotoba-Whisper-v2.1 is supported in the Hugging Face 🤗 Transformers library from version 4.39 onwards. To run the model, first
+Kotoba-Whisper-v2.2 is supported in the Hugging Face 🤗 Transformers library from version 4.39 onwards. To run the model, first
install the latest version of Transformers.

```bash
@@ -61,6 +30,8 @@ pip install "punctuators==0.0.5"
pip install "pyannote.audio"
pip install git+https://github.com/huggingface/diarizers.git
```
+Also,
+

### Transcription
The model can be used with the [`pipeline`](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline)
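As a rough sketch of such an automatic-speech-recognition `pipeline` call (using the underlying [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) checkpoint and a placeholder `audio.mp3`; this is not the model card's exact snippet):

```python
# Rough sketch of a generic ASR pipeline call; the model id and file name are placeholders.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

pipe = pipeline(
    "automatic-speech-recognition",
    model="kotoba-tech/kotoba-whisper-v2.0",
    torch_dtype=torch_dtype,
    device=device,
)

# chunk_length_s splits long audio into segments; return_timestamps adds per-chunk timestamps.
result = pipe(
    "audio.mp3",
    chunk_length_s=15,
    return_timestamps=True,
    generate_kwargs={"language": "ja", "task": "transcribe"},
)
print(result["text"])
```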