daekeun-ml/Phi-4-multimodal-finetune-ko-speech

phi-4-multimodal

daekeun-ml committed 4 days ago

Commit 59813d5 · verified · 1 Parent(s): a780aa8

Update README.md

Files changed (1)

README.md +6 -1

@@ -69,14 +69,19 @@ Phi-4-multimodal model is strong in multimodal tasks, especially in speech-to-te
 ## Evaluation
-ASR (Automatic Speech Recognition) on zeroth-test set and Speech translation on fleurs ko <-> en speech translation result.
+Evaluation was done on the following datasets:
+- ASR (Automatic Speech Recognition): Evaluated with CER (Character Error Rate) on zeroth-test set (457 samples).
+- AST (Automatic Speech Translation): Evaluated with BLEU score on fleurs ko <-> en speech translation result (270 samples).
 Script is retrieved from [here](https://gist.github.com/seastar105/d1d8983b27611370528e3b194dcc5577#file-evaluate-py).
+Compared to [this fine-tuned model](https://huggingface.co/seastar105/Phi-4-mm-inst-zeroth-kor), ASR is significantly improved with more high-quality voice data and my own voice. However, the quality of AST deteriorates for fleurs-ko2en-cot, so appropriate data should be inserted in between to improve catastrophic forgetting.
 | Model                | zeroth-test | fleurs-ko2en | fleurs-ko2en-cot | fleurs-en2ko | fleurs-en2ko-cot |
 |----------------------|-------------|--------------|------------------|--------------|------------------|
 | original             |  198.32     | 5.63         | 2.42             | 6.86         | 4.17             |
 | finetune (this model)|  3.80       | 7.03         | 7.04             | 12.50        | 9.54             |
+| Phi-4-mm-inst-zeroth-kor |  7.02       | 7.07         | 9.19             | 13.08        | 9.35             |
 ## References
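
The ASR bullet in the diff names CER (Character Error Rate) as the metric for the zeroth-test column. The following is a minimal sketch of how such a score can be computed. It is not the linked `evaluate.py` script: the function names `edit_distance` and `cer` are hypothetical, text normalization (whitespace, punctuation) is left out, and it assumes the table reports CER as a percentage over the whole test set.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # One wrong character out of five.
    print(cer(["안녕하세요"], ["안녕하세용"]))  # 20.0
    # Four inserted characters against a two-character reference.
    print(cer(["ab"], ["abcdef"]))  # 200.0
```

Because inserted characters count as errors while the denominator is the reference length, CER is not capped at 100. An output that is much longer than the reference therefore scores above 100, which is one way a value such as the 198.32 in the `original` row can arise.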