Prompting BERT!
Zero-shot learning is the hottest thing about causal LLMs. You don't need to finetune them on each specific task; you can just prompt and get decent performance on unseen tasks.
Unfortunately, autoencoding LLMs - like our dear friend BERT - lack this ability, and you need a separate task-specific head for each task. But what if you could prompt all the BERTs in the world?!
🔥 Introducing Statement-Tuning 🔥
Now hold your horses! Don't go full-Llama on it yet. With this finetuning approach, we can get zero-shot behavior from encoders by recasting every task as a yes/no decision. Binary classification all the way down!
For example, a single three-way entailment problem (entailment / neutral / contradiction) is decomposed into three yes/no statements, one per candidate label, and the label whose statement scores highest wins.
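Roughly, inference could look like the sketch below. To be clear, this is my own minimal illustration, not the paper's code: the checkpoint name is a placeholder (you'd load an encoder that has actually been statement-tuned), the statement templates are guesses at the format, and I'm assuming a binary head where index 1 means "yes".

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint: swap in an encoder statement-tuned as a
# binary true/false classifier (roberta-base alone won't cut it).
model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model.eval()

premise = "A man is playing a guitar on stage."
hypothesis = "Someone is performing music."

# One natural-language statement per candidate label (templates are
# illustrative, not the paper's exact wording).
statements = [
    f'Premise: "{premise}" Hypothesis: "{hypothesis}" The hypothesis is entailed by the premise.',
    f'Premise: "{premise}" Hypothesis: "{hypothesis}" The hypothesis is neutral with respect to the premise.',
    f'Premise: "{premise}" Hypothesis: "{hypothesis}" The hypothesis contradicts the premise.',
]

with torch.no_grad():
    inputs = tokenizer(statements, padding=True, truncation=True, return_tensors="pt")
    # Probability that each statement is true (assumes label index 1 = "yes").
    yes_probs = model(**inputs).logits.softmax(dim=-1)[:, 1]

labels = ["entailment", "neutral", "contradiction"]
print(labels[int(yes_probs.argmax())])  # highest-scoring statement wins
```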
This is still not super useful on its own. But I like work that tries to carve out a little more space for encoders in the current autoregressive era!
Check the paper if interested: Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning (2404.12897)