SECap: Speech Emotion Captioning with Large Language Model
This repository contains the implementation of the paper "SECap: Speech Emotion Captioning with Large Language Model". It includes the model code, training and testing scripts, and a test dataset. The test dataset consists of 600 wav audio files and their corresponding emotion descriptions.
For more details, see the GitHub repo: https://github.com/xuyaoxun/SECaps
The model checkpoint is freely available for download from this repo.
Citation
If you use this repository in your research, please cite our paper:
@article{SECap,
  title={SECap: Speech Emotion Captioning with Large Language Model}
}