SECap: Speech Emotion Captioning with Large Language Model

This repository contains the implementation of the paper "SECap: Speech Emotion Captioning with Large Language Model". It includes the model code, training and testing scripts, and a test dataset. The test dataset consists of 600 wav audio files and their corresponding emotion descriptions.

For more details, see the GitHub repo: https://github.com/xuyaoxun/SECaps

The model checkpoint is freely available for download from this repo.

Citation

If you use this repository in your research, please kindly cite our paper:

@article{SECap,
  title={SECap: Speech Emotion Captioning with Large Language Model},
}