Update README.md
README.md CHANGED
````diff
@@ -41,25 +41,28 @@ We investigate domain adaptation of MLLMs through post-training, focusing on dat
 
 **Code**: [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)
 
+
+## Contact
+Daixuan Cheng: `[email protected]`
+
 ## About
 
-AdaMLLM
+AdaMLLM is our latest effort to enhance task generalization of (M)LLMs by scaling synthetic supervised tasks based on unsupervised contexts.
 
 <p align='left'>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/
+<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/HUN3Cr66w_xpj5_c7QQaI.png" width="1000">
 </p>
 
-
-- **[AdaptLLM](https://huggingface.co/papers/2309.09530): Adapt LLM to domains**
+- [AdaptLLM](https://huggingface.co/papers/2309.09530)
 We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
 
--
-We
-
+- [Instruction Pre-Training](https://huggingface.co/papers/2406.14491)
+We develop a general-purpose instruction synthesizer which significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.
 
-
-
+- AdaMLLM
+We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.
 
+Looking ahead, we envision further broadening the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.
 
 ## Citation
 If you find our work helpful, please cite us.
@@ -74,14 +77,24 @@ If you find our work helpful, please cite us.
 }
 ```
 
-[
+[Instruction Pre-Training](https://huggingface.co/papers/2406.14491) (EMNLP 2024)
+```bibtex
+@article{cheng2024instruction,
+title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
+author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
+journal={arXiv preprint arXiv:2406.14491},
+year={2024}
+}
+```
+
+[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530) (ICLR 2024)
 ```bibtex
 @inproceedings{
-
+cheng2024adapting,
 title={Adapting Large Language Models via Reading Comprehension},
 author={Daixuan Cheng and Shaohan Huang and Furu Wei},
 booktitle={The Twelfth International Conference on Learning Representations},
 year={2024},
 url={https://openreview.net/forum?id=y886UXPEZ0}
 }
-```
+```
````