SECap
SECap is a project for speech emotion captioning: it uses a large language model to generate natural-language descriptions of the emotion expressed in speech. The repository contains the model code, scripts for training and testing, and a dataset of 600 audio files with emotion descriptions. It also provides pretrained model weights for inference and for evaluating how similar the generated descriptions are to the ground truth, making it a practical starting point for emotion analysis research.
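
To make the idea of scoring generated descriptions against ground truth concrete, here is a minimal sketch. It is not the repository's evaluation script: the metric used there is not stated in this description, so the sketch substitutes a simple bag-of-words cosine similarity, and the function names (`bag_of_words`, `cosine_similarity`, `mean_similarity`) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two captions using word-count vectors.

    Assumption: a stand-in metric for illustration, not SECap's own.
    """
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty caption has no direction, so treat its similarity as 0.
    return dot / norm if norm else 0.0


def mean_similarity(predictions, references):
    """Average similarity over paired generated and reference captions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    scores = [cosine_similarity(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Word overlap only rewards shared vocabulary, so two descriptions that express the same emotion in different words score low. A sentence-embedding model would likely be a better fit for real evaluation and could replace `cosine_similarity` without changing `mean_similarity`.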