
Is Few-shots performance optimization possible? (keep initial prompt encoded state)

#101
by Saiyan - opened

Hey,
Is it possible with BLOOM to encode a large few-shot prompt once, and then 'append' the added text for the final completion?

For example, say the prompt "template" is (toy example, usually the prompt will be much bigger):
"
English => French
cheese => fromage
chair => chaise
<ADDED_TEXT> =>
"
So is it possible to encode all the text up to the <ADDED_TEXT> part only once, and then reuse that state with many different <ADDED_TEXT> examples to make the final predictions, instead of re-encoding the whole prompt every time?
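In code, what I have in mind is roughly the following. This is only a sketch using a tiny, randomly initialized GPT-2 model and raw token ids so it runs standalone; it assumes the `transformers` key/value cache API (`use_cache` / `past_key_values`), and with BLOOM you would instead load the real checkpoint via `AutoModelForCausalLM.from_pretrained(...)` and tokenize actual text:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny randomly initialized model purely for illustration; the same mechanics
# apply to BLOOM (load it with AutoModelForCausalLM and its tokenizer instead).
torch.manual_seed(0)
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100))
model.eval()

prefix = torch.tensor([[10, 11, 12, 13]])   # the shared few-shot "template"
suffix = torch.tensor([[42, 43]])           # one per-example <ADDED_TEXT> part

with torch.no_grad():
    # Encode the prefix once, keeping its key/value cache.
    cache = model(prefix, use_cache=True).past_key_values
    # Reuse the cache: only the suffix tokens are processed here.
    fast = model(suffix, past_key_values=cache, use_cache=True).logits
    # Reference: encode the full sequence from scratch.
    full = model(torch.cat([prefix, suffix], dim=1)).logits

# The last-token logits agree, so the prediction is identical either way.
print(torch.allclose(fast[0, -1], full[0, -1], atol=1e-5))
```

Note that in recent `transformers` versions the cache is a `Cache` object that is extended in place by the forward pass, so to reuse the same encoded prefix across many different suffixes you would need a fresh copy of the prefix cache per suffix.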

Thank you in advance
