Universal Assisted Generation: Faster Decoding with Any Assistant Model
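The title above refers to assisted generation (speculative decoding), where a small draft model proposes tokens and the large target model verifies them in one pass; the "universal" variant extends this to assistant models with a different tokenizer. As a rough illustration only, here is a toy sketch of the greedy accept/verify loop. The next-token functions are hypothetical stand-ins for real models, and the sketch covers the shared-vocabulary case, not the cross-tokenizer machinery of the universal method:

```python
# Toy sketch of the greedy assisted-generation (speculative decoding) loop.
# The "models" here are hypothetical deterministic next-token functions over
# integer token ids; real assisted generation plugs in a small draft LM and a
# large target LM (the universal variant additionally re-tokenizes between
# the two models' vocabularies).

def greedy_decode(next_fn, prompt, max_new_tokens):
    """Plain greedy decoding: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(next_fn(out))
    return out

def assisted_decode(target_next, draft_next, prompt, max_new_tokens, k=3):
    """Greedy assisted generation: the draft proposes k tokens, the target
    verifies them and keeps the longest matching prefix plus one token of
    its own.  With greedy decoding the output is identical to
    greedy_decode(target_next, ...), just produced in fewer target steps."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        # 1) Draft model proposes k candidate tokens autoregressively.
        ctx = list(out)
        drafted = []
        for _ in range(k):
            t = draft_next(ctx)
            drafted.append(t)
            ctx.append(t)
        # 2) Target model verifies the candidates position by position.
        ctx = list(out)
        accepted = []
        for t in drafted:
            t_target = target_next(ctx)
            if t_target == t:          # draft matched: accept for free
                accepted.append(t)
                ctx.append(t)
            else:                      # first mismatch: take target's token
                accepted.append(t_target)
                break
        else:
            # All candidates accepted: target adds one bonus token.
            accepted.append(target_next(ctx))
        out.extend(accepted)
    return out[: len(prompt) + max_new_tokens]
```

In practice one does not write this loop by hand: 🤗 Transformers exposes assisted generation through the `assistant_model` argument of `generate()` (with extra handling when the assistant's tokenizer differs from the target's).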
Welcome to the team @dvilasuero and Argilla! It's been really nice collaborating with you on various projects around LLM alignment and I'm excited to see what we'll build next together!
I am not aware of any public ablations that validate this, but I suspect it has become less important for chat models, where performance is judged more by human evaluation than by academic benchmarks like MMLU (which are fine for selecting base models, but less so for chat/instruct ones).