# Adapting Multimodal Large Language Models to Domains via Post-Training

This repository provides an implementation preview of our paper, **On Domain-Specific Post-Training for Multimodal Large Language Models**. We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation. Our resulting model, **AdaMLLM**, consistently outperforms general MLLMs across various tasks in two domains: biomedicine and food.

### **Updates**

- **[2024/11/28]** Released our paper.

## About

**AdaMLLM** is our third effort to enhance **task generalization** by scaling synthetic supervised tasks from unsupervised contexts.

- **1st Work: [AdaptLLM](https://huggingface.co/papers/2309.09530)** We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scale, such as BloombergGPT-50B.
- **2nd Work: [Instruction Pre-Training](https://huggingface.co/instruction-pretrain)** We develop a general-purpose instruction synthesizer that significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.
- **3rd Work: AdaMLLM** We extend supervised task synthesis to multimodality, introducing a unified **visual instruction synthesizer** to extract task pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.

Looking ahead, we aim to further broaden the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.

## Citation

```bibtex
@article{instructPT,
  title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
  author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
  journal={arXiv preprint arXiv:2406.14491},
  year={2024}
}

@inproceedings{adaptllm,
  title={Adapting Large Language Models via Reading Comprehension},
  author={Daixuan Cheng and Shaohan Huang and Furu Wei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=y886UXPEZ0}
}
```