AdaptLLM committed
Commit d167b68 · verified · 1 Parent(s): 06cdb86

Update README.md

Files changed (1):
  1. README.md +25 -12
README.md CHANGED
@@ -41,25 +41,28 @@ We investigate domain adaptation of MLLMs through post-training, focusing on dat
 
 **Code**: [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)
 
+
+## Contact
+Daixuan Cheng: `daixuancheng6@gmail.com`
+
 ## About
 
-AdaMLLM represents our latest advancement in building domain-specific foundation models through post-training on synthetic supervised tasks derived from unsupervised contexts.
+AdaMLLM is our latest effort to enhance task generalization of (M)LLMs by scaling synthetic supervised tasks based on unsupervised contexts.
 
 <p align='left'>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/2aPl6mKIyHeQp8SO4TXAk.png" width="700">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/HUN3Cr66w_xpj5_c7QQaI.png" width="1000">
 </p>
 
-
-- **[AdaptLLM](https://huggingface.co/papers/2309.09530): Adapt LLM to domains**
+- [AdaptLLM](https://huggingface.co/papers/2309.09530)
 We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
 
-- **[AdaMLLM](https://huggingface.co/papers/2411.19930): Adapt Multimodal LLM to domains**
-We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from domain-specific image-caption pairs. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.
-
+- [Instruction Pre-Training](https://huggingface.co/papers/2406.14491)
+We develop a general-purpose instruction synthesizer which significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.
 
-## Contact
-Daixuan Cheng: `daixuancheng6@gmail.com`
+- AdaMLLM
+We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.
 
+Looking ahead, we envision further broadening the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.
 
 ## Citation
 If you find our work helpful, please cite us.
@@ -74,14 +77,24 @@ If you find our work helpful, please cite us.
 }
 ```
 
-[AdaptLLM](https://huggingface.co/papers/2309.09530) (ICLR 2024)
+[Instruction Pre-Training](https://huggingface.co/papers/2406.14491) (EMNLP 2024)
+```bibtex
+@article{cheng2024instruction,
+title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
+author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
+journal={arXiv preprint arXiv:2406.14491},
+year={2024}
+}
+```
+
+[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530) (ICLR 2024)
 ```bibtex
 @inproceedings{
-adaptllm,
+cheng2024adapting,
 title={Adapting Large Language Models via Reading Comprehension},
 author={Daixuan Cheng and Shaohan Huang and Furu Wei},
 booktitle={The Twelfth International Conference on Learning Representations},
 year={2024},
 url={https://openreview.net/forum?id=y886UXPEZ0}
 }
-```
+```
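
As context for the diff above: the "rule-based methods ... reformatting them into reading comprehension tasks" line in the README can be pictured with a minimal sketch. Everything below is illustrative, assuming a single toy definition-mining regex; AdaptLLM's actual mining patterns and task formats are richer.

```python
import re

# Toy rule: mine "<Term> is a/an/the <definition>." sentences from a passage.
# AdaptLLM's real pipeline uses a larger set of patterns and task types.
DEFINITION_RULE = re.compile(
    r"\b([A-Z][A-Za-z0-9\- ]{2,40}?) is ((?:a|an|the) [^.]{10,120})\."
)

def to_reading_comprehension(passage: str) -> str:
    """Reformat a raw passage into a reading-comprehension example:
    the passage itself, followed by rule-mined question-answer pairs."""
    tasks = [
        f"Question: What is {term.lower()}?\nAnswer: {definition[0].upper() + definition[1:]}."
        for term, definition in DEFINITION_RULE.findall(passage)
    ]
    if not tasks:
        return passage  # no rule fired; keep the passage as plain pre-training text
    return passage + "\n\n" + "\n\n".join(tasks)

print(to_reading_comprehension(
    "Collateral is an asset that a lender accepts as security for a loan."
))
```

Running this prints the passage followed by a mined QA pair, matching the passage-plus-tasks layout the README describes for continued pre-training.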
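Similarly, the Instruction Pre-Training bullet is easiest to see at the data level: each raw passage becomes an instruction-augmented document (the passage, then synthesized instruction-response pairs), and training remains ordinary next-token prediction. A minimal sketch of that layout, with hand-written pairs standing in for synthesizer output:

```python
from dataclasses import dataclass

@dataclass
class AugmentedDoc:
    context: str                      # raw, unsupervised passage
    qa_pairs: list[tuple[str, str]]   # synthesized (instruction, response) pairs

def render(doc: AugmentedDoc) -> str:
    """Serialize one instruction-augmented document: the raw context,
    followed by its synthesized instruction-response pairs."""
    parts = [doc.context]
    parts += [f"Q: {q}\nA: {a}" for q, a in doc.qa_pairs]
    return "\n\n".join(parts)

# Hand-written stand-in for synthesizer output; the synthesizer described in
# the paper generates such pairs automatically from the raw context.
doc = AugmentedDoc(
    context="The yield curve plots interest rates of bonds across maturities.",
    qa_pairs=[("What does the yield curve plot?",
               "Interest rates of bonds across maturities.")],
)
corpus = [render(doc)]  # pre-train with plain next-token prediction over this
print(corpus[0])
```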
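For the AdaMLLM bullet, each image-caption pair yields instruction-response pairs that must then be packaged for visual instruction tuning. The sketch below uses the common LLaVA-style record format as a stand-in; the field names and the sample pair are assumptions, not the repo's actual schema.

```python
import json

def to_record(image_file: str, pairs: list[tuple[str, str]]) -> dict:
    """Package synthesized (instruction, response) pairs for one image into a
    LLaVA-style visual-instruction-tuning record (assumed format)."""
    conversations = []
    for i, (instruction, response) in enumerate(pairs):
        image_token = "<image>\n" if i == 0 else ""  # reference the image once
        conversations.append({"from": "human", "value": image_token + instruction})
        conversations.append({"from": "gpt", "value": response})
    return {"image": image_file, "conversations": conversations}

# Invented example pair, standing in for visual-instruction-synthesizer output.
pairs = [("What food is shown, and how is it plated?",
          "A grilled salmon fillet served on a bed of greens.")]
print(json.dumps(to_record("food_0001.jpg", pairs), indent=2))
```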