Given text, return a matching image based on CLIP embeddings
Image classifier for various vehicles, built on a fine-tuned model
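
The text-to-image lookup in the first item can be sketched as a nearest-neighbor search over embeddings. This is a minimal sketch, not the project's actual code: it assumes the text and image embeddings have already been computed by a CLIP model, and the file names, toy vectors, and the helper names `cosine_similarity` and `retrieve_image` are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_image(text_embedding, image_embeddings):
    """Return the name of the image whose embedding is closest to the text embedding."""
    return max(
        image_embeddings,
        key=lambda name: cosine_similarity(text_embedding, image_embeddings[name]),
    )


# Hypothetical 3-dimensional stand-ins for real CLIP image embeddings,
# which would normally have hundreds of dimensions.
image_embeddings = {
    "car.jpg": [0.9, 0.1, 0.0],
    "boat.jpg": [0.1, 0.9, 0.2],
    "plane.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical stand-in for the CLIP embedding of a text query.
query_embedding = [0.8, 0.2, 0.1]

best_match = retrieve_image(query_embedding, image_embeddings)
print(best_match)
```

In a real setup the vectors would come from the CLIP text and image encoders, and a large image collection would likely use a vector index rather than a linear scan.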