# MedCLIP
## Description
A CLIP model fine-tuned on the [ROCO dataset](https://github.com/razorx89/roco-dataset).
## Dataset
Each image is accompanied by a text caption. Caption lengths range from a few characters (a single word) to 2,000 characters. During preprocessing we remove all images whose caption is shorter than 10 characters (a sketch of this filter follows the split statistics below).
- Training set: 57,780 images with their captions
- Validation set: 7,200 images
- Test set: 7,650 images
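
For reference, the caption-length filter can be written as a short preprocessing step. This is a minimal sketch that assumes the captions live in a CSV with `image_path` and `caption` columns; the actual ROCO preprocessing may use a different layout.

```python
import pandas as pd

MIN_CAPTION_LENGTH = 10  # captions shorter than this are dropped

# Hypothetical CSV with one row per image: image_path, caption.
df = pd.read_csv("roco_captions.csv")

# Keep only images whose caption is at least MIN_CAPTION_LENGTH characters long.
df = df[df["caption"].str.len() >= MIN_CAPTION_LENGTH]

df.to_csv("roco_captions_filtered.csv", index=False)
```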
## Training
Fine-tune the CLIP model by running `sh run_medclip.sh`.
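
For context, fine-tuning a CLIP model typically optimizes the standard CLIP objective: a symmetric cross-entropy over image-text similarity logits. The snippet below is a minimal sketch of that loss, not the exact code run by `run_medclip.sh`.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_embeds: torch.Tensor,
                          text_embeds: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric cross-entropy over image-text similarities (the standard CLIP loss)."""
    # Normalize so the dot product is a cosine similarity.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)

    # Pairwise similarity logits for a batch of N matched image-caption pairs.
    logits = image_embeds @ text_embeds.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Each image should match its own caption, and vice versa.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```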
Below is the validation loss curve we observed when training the model with the `run_medclip.sh` script.
![Validation loss](./assets/val_loss.png)
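
Once training finishes, the checkpoint can be used for image-text matching. This is a minimal sketch assuming the fine-tuned weights are saved in a local directory compatible with the Hugging Face `CLIPModel`/`CLIPProcessor` API; the path, image file, and captions are placeholders.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder path: wherever run_medclip.sh saved the fine-tuned checkpoint.
model = CLIPModel.from_pretrained("./medclip-checkpoint")
processor = CLIPProcessor.from_pretrained("./medclip-checkpoint")

image = Image.open("example_image.jpg")  # placeholder radiology image
captions = [
    "Chest X-ray showing a pleural effusion.",
    "Axial CT scan of the abdomen.",
]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Higher probability means the caption is a better match for the image.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```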
## Evaluation