---
datasets:
- kresnik/zeroth_korean
metrics:
- bleu
- cer
base_model:
- microsoft/Phi-4-multimodal-instruct
model-index:
- name: Phi-4-mm-inst-zeroth-kor
  results:
  - task:
      type: speech-to-text-translation
    dataset:
      type: seastar105/fleurs_ko_en_test
      name: fleurs (ko-en test intersection)
    metrics:
    - type: bleu
      name: ko2en
      value: To-be-filled
    - type: bleu
      name: ko2en-cot
      value: To-be-filled
    - type: bleu
      name: en2ko (ko-mecab)
      value: To-be-filled
    - type: bleu
      name: en2ko-cot (ko-mecab)
      value: To-be-filled
  - task:
      type: automatic-speech-recognition
    dataset:
      type: kresnik/zeroth_korean
      name: zeroth_korean test
    metrics:
    - type: cer
      name: test CER
      value: To-be-filled
---

# Model Card for Phi-4-mm-inst-zeroth-kor

This model was fine-tuned from [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct) on the [kresnik/zeroth_korean](https://huggingface.co/datasets/kresnik/zeroth_korean) dataset for a single epoch. The fine-tuning script is available [here](https://gist.github.com/seastar105/d1d8983b27611370528e3b194dcc5577#file-main-py), adapted from the example in the Phi-4 repository.

The model was trained for only 174 steps on the zeroth train set; the main purpose is to check whether Korean ASR training alone can transfer to other speech tasks (e.g. speech-to-text translation).

## Evaluation

Results for ASR on the zeroth test set and for fleurs ko <-> en speech translation will be filled in.
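For reference, the CER metric reported above is the character-level Levenshtein distance between hypothesis and reference, divided by the reference length. A minimal self-contained sketch (this is an illustration, not the evaluation script used for this card):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance / reference length."""
    r, h = list(reference), list(hypothesis)
    # Row-by-row dynamic-programming edit distance.
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(r, 1):
        cur = [i]
        for j, hc in enumerate(h, 1):
            cur.append(min(
                prev[j] + 1,                 # deletion
                cur[j - 1] + 1,              # insertion
                prev[j - 1] + (rc != hc),    # substitution (0 if match)
            ))
        prev = cur
    return prev[-1] / len(r)


if __name__ == "__main__":
    print(cer("안녕하세요", "안녕하세요"))  # perfect match -> 0.0
    print(cer("abc", "abd"))              # one substitution over 3 chars
```

In practice a library such as `evaluate` or `jiwer` would typically be used instead of hand-rolling the distance computation.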