Commit 585929e by AdaptLLM (verified; parent: 5a1b4fb): Update README.md

Files changed: README.md (+23 -2)
  # Adapting Multimodal Large Language Models to Domains via Post-Training

This repo provides an implementation preview of our paper, **On Domain-Specific Post-Training for Multimodal Large Language Models**.

We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation. Our resulting model, **AdaMLLM**, consistently outperforms general MLLMs across various tasks in two domains: biomedicine and food.

<p align='center'>
<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/iklQIKW_6TyCT13BMq5-d.png" width="600">
</p>

******* **Updates** *********
- [2024/11/28] Released our paper.

## About

**AdaMLLM** represents our third effort to enhance **task generalization** of trained models by scaling synthetic supervised tasks from unsupervised contexts.

- **1st Work: [AdaptLLM](https://huggingface.co/papers/2309.09530) (ICLR 2024)**
  We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B model outperforms domain-specific models of much larger scale, such as BloombergGPT.

- **2nd Work: [Instruction Pretraining](https://huggingface.co/instruction-pretrain) (EMNLP 2024)**
  We develop a general-purpose instruction synthesizer that significantly increases task diversity. Instruction Pretraining outperforms vanilla pretraining in both general pretraining from scratch and domain-adaptive continual pretraining.

- **3rd Work: AdaMLLM (This Work)**
  We extend supervised task synthesis to multimodality, introducing a unified **visual instruction synthesizer** to extract task pairs from image-caption pairs. Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing domain-specific performance for MLLMs; a minimal sketch of the idea follows this list.
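
To make the synthesis step concrete, below is a minimal sketch of how a synthesizer MLLM could be prompted to turn one unlabeled image-caption pair into a supervised task pair. The checkpoint id, file name, and prompt format are illustrative assumptions, not our released interface:

```python
# Illustrative sketch: synthesize a supervised (instruction, response) pair
# from one unlabeled image-caption pair using a synthesizer MLLM.
# The model id, file name, and prompt format below are placeholders.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "your-org/visual-instruction-synthesizer"  # hypothetical checkpoint

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID, device_map="auto")

image = Image.open("tissue_slide.jpg")            # unlabeled domain image
caption = "H&E-stained tissue section at 40x."    # its raw caption

# Prompt the synthesizer to ground a new task pair in the image and caption.
prompt = (
    f"Image caption: {caption}\n"
    "Generate one instruction-response pair that tests understanding of this image."
)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```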

Looking ahead, we envision broadening the scope of supervised task synthesis to enhance the general capabilities of trained models.

<p align='center'>
<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/-5qzvcSj_PCYKmTS_ZMOS.png" width="1000">
</p>
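
For a quick look at intended usage, here is a hypothetical inference example with the `transformers` image-text-to-text pipeline; the checkpoint id and image URL are placeholders rather than a specific released model:

```python
# Hypothetical usage sketch for a domain-adapted AdaMLLM checkpoint;
# replace the placeholder model id and image URL with real ones.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="your-org/AdaMLLM-biomed")  # placeholder id

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/scan.png"},  # placeholder image
        {"type": "text", "text": "Describe the key findings in this image."},
    ],
}]
outputs = pipe(text=messages, max_new_tokens=128, return_full_text=False)
print(outputs[0]["generated_text"])
```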