Generate captions for images in various styles
Experiment with and compare different tokenizers
Enhance and clean audio files