---
datasets:
- kresnik/zeroth_korean
metrics:
- bleu
- cer
base_model:
- microsoft/Phi-4-multimodal-instruct
model-index:
- name: Phi-4-mm-inst-zeroth-kor
results:
- task:
type: speech-to-text-translation
dataset:
type: seastar105/fleurs_ko_en_test
name: fleurs (ko-en test intersection)
metrics:
- type: bleu
name: ko2en
value: To-be-filled
- type: bleu
name: ko2en-cot
value: To-be-filled
- type: bleu
name: en2ko (ko-mecab)
value: To-be-filled
- type: bleu
name: en2ko-cot (ko-mecab)
value: To-be-filled
- task:
type: automatic-speech-recognition
dataset:
type: kresnik/zeroth_korean
name: zeroth_korean test
metrics:
- type: cer
name: test CER
value: To-be-filled
language:
- ko
---
This model is fine-tuned from microsoft/Phi-4-multimodal-instruct on the kresnik/zeroth_korean dataset for a single epoch.
The fine-tuning script is here, adapted from the Phi-4 repository example.
The model was trained for only 174 steps on the zeroth train set; the main purpose is to check whether Korean-only ASR training can transfer to other speech tasks (e.g. speech-to-text translation).
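For reference, Phi-4-multimodal-instruct uses a chat-style prompt with audio placeholder tokens. A minimal sketch of building an ASR prompt; the `<|user|>`/`<|assistant|>`/`<|end|>` and `<|audio_1|>` tokens are assumed from the base model's documented prompt format, so verify against the actual processor before use:

```python
# Sketch of the chat-style prompt string used with Phi-4-multimodal-instruct
# for ASR. Token strings are assumed from the base model's documented format;
# this is not the exact training script for this fine-tune.

USER_TAG, ASSISTANT_TAG, END_TAG = "<|user|>", "<|assistant|>", "<|end|>"

def asr_prompt(instruction: str = "Transcribe the audio clip into text.") -> str:
    # A single audio clip is referenced by the <|audio_1|> placeholder token;
    # the processor replaces it with audio embeddings at encode time.
    return f"{USER_TAG}<|audio_1|>{instruction}{END_TAG}{ASSISTANT_TAG}"

print(asr_prompt())
# -> <|user|><|audio_1|>Transcribe the audio clip into text.<|end|><|assistant|>
```

The same template with a translation instruction (e.g. "Translate the audio to English.") would cover the speech-to-text-translation probe described above.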
## Evaluation
Results for ASR on the zeroth test set and FLEURS ko ↔ en speech translation will be filled in.
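The ASR metric above is character error rate (CER): character-level edit distance divided by reference length. A minimal self-contained sketch (a hypothetical helper, not the exact evaluation script; in practice a library such as jiwer is typically used):

```python
# Minimal CER sketch: Levenshtein distance over characters, normalized by
# the reference length. Uses a rolling one-row dynamic-programming table.

def cer(reference: str, hypothesis: str) -> float:
    ref, hyp = list(reference), list(hypothesis)
    # dp[j] holds the edit distance between the processed reference prefix
    # and the first j characters of the hypothesis.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i  # prev = distance for the previous diagonal cell
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,      # insertion
                        prev + (r != h))    # substitution (0 cost if match)
            prev = cur
    return dp[-1] / max(len(ref), 1)

print(cer("안녕하세요", "안넝하세요"))  # 1 substitution / 5 chars -> 0.2
```

Korean CER is usually computed on the raw Hangul syllable string, so no tokenizer is needed; BLEU for the en2ko direction, by contrast, is tokenized with ko-mecab as noted in the metadata.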