license: apache-2.0

Whisper small audio captioning

This model is a fine-tuned whisper-small model, trained on 1M audio caption samples from the dataset mitermix/audiosnippets and 500K samples from an audio emotion dataset.
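The following is a minimal sketch of how a fine-tuned Whisper checkpoint like this one could be used to caption an audio file. It assumes the checkpoint loads with the standard `WhisperProcessor` and `WhisperForConditionalGeneration` classes from `transformers` and needs no custom generation code, which this README does not confirm. The function name `caption_audio`, the constant `WHISPER_SAMPLE_RATE`, and the use of `librosa` for loading audio are illustrative choices, and the model's repository id is not stated here, so it is left as a parameter.

```python
# Whisper feature extractors expect 16 kHz mono audio.
WHISPER_SAMPLE_RATE = 16_000


def caption_audio(audio_path: str, model_id: str) -> str:
    """Generate a text caption for one audio file.

    Hypothetical helper: `model_id` must be the Hub repository id of this
    model, which is not given in this README. Assumes the checkpoint is
    compatible with the standard Whisper classes.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import librosa
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_id)
    model = WhisperForConditionalGeneration.from_pretrained(model_id)

    # Load the file as a mono waveform resampled to 16 kHz.
    waveform, _ = librosa.load(audio_path, sr=WHISPER_SAMPLE_RATE, mono=True)

    # Convert the waveform to log-mel input features (padded or truncated to 30 s).
    inputs = processor(
        waveform, sampling_rate=WHISPER_SAMPLE_RATE, return_tensors="pt"
    )

    # Decode the generated token ids into text, dropping special tokens.
    generated_ids = model.generate(inputs.input_features)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call would look like `caption_audio("clip.wav", "<namespace>/<model-name>")`, with the placeholder replaced by the actual repository id. Because Whisper processes 30-second windows, longer recordings would need to be split into chunks first.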