Overview
BioMed-VITAL is a multimodal foundation model specifically tuned for biomedical applications. It leverages visual and textual data to improve understanding and reasoning within the biomedical domain.
Model Training
The training of BioMed-VITAL involved two key stages, both incorporating clinician preferences to ensure the relevance and quality of the training data:
Data Generation: During this stage, the GPT-4V generator was prompted with a diverse set of clinician-selected demonstrations. This approach facilitated the generation of domain-specific, preference-aligned data candidates, tailored to reflect real-world clinical scenarios and preferences.
Data Selection: A separate selection model was trained to explicitly incorporate clinician and policy-guided preferences. This model employed a sophisticated rating function to evaluate and select the highest quality data for further tuning of BioMed-VITAL. This selection process was critical in refining the dataset to ensure that only the most relevant and accurate instructional data was used.
Performance and Evaluation
The effectiveness of BioMed-VITAL was demonstrated through significant improvements in two key areas:
- Open Visual Chat: The model showed a relative improvement of 18.5%, indicating enhanced capabilities in engaging in visual dialogues pertinent to biomedical contexts.
- Medical Visual Question Answering (VQA): BioMed-VITAL achieved a win rate of up to 81.73% in this domain, showcasing its superior performance in interpreting and responding to complex medical imagery and queries.
- Downloads last month
- 100