Potentially incorrect Position of SYSTEM PROMPT

#54
by jianguozhang - opened

Has anyone else experienced the issue where Mixtral-8x22b-inst-v0.1 consistently appends the SYSTEM PROMPT to the last turn? Or is this normal?

from pprint import pprint
from transformers import AutoTokenizer


messages = [
    {"role": "system", "content": "You are a help AI assistant developed by the Mistral AI."},
    
    {"role": "user", "content": "My girlfriend Mary and I want to go to Disney in LA. Can you help us plan the trip?"},
    
    {"role": "assistant", "content": "Sure! I can help with booking flights, finding weather information, and checking Disney details. First, let's choose a good date. Do you want to check the weather to find a sunny and warm day?"},
    
    {"role": "user", "content": "Yes, we'd love that. We prefer sunny days with warm temperatures."},
    
    {"role": "assistant", "content": "Got it. Let me check the weather forecast for Los Angeles to help you find the best date."},
    
    {"role": "user", "content": "Sounds great! Thanks for your help."},
]

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

model_input = tokenizer.apply_chat_template(
                    messages,
                    tokenize=False
                )

pprint(model_input)

Output:

'[INST] My girlfriend Mary and I want to go to Disney in LA. Can you help '
'us plan the trip?[/INST] Sure! I can help with booking flights, finding '
"weather information, and checking Disney details. First, let's choose a good "
'date. Do you want to check the weather to find a sunny and warm '
"day?
[INST] Yes, we'd love that. We prefer sunny days with warm "
'temperatures.[/INST] Got it. Let me check the weather forecast for Los '
'Angeles to help you find the best date.[INST] You are a help AI '
'assistant developed by the Mistral AI.\n
'
'\n'
'Sounds great! Thanks for your help.[/INST]'

Mistral AI_ org

Hi! Yes, by default its how the system prompt is dealt with for this tokenizer, for more details I would recommend taking a look at https://github.com/mistralai/cookbook/blob/main/concept-deep-dive/tokenization/chat_templates.md

To summarize since there is currently no fixed way to deal with system prompts, its prepended either at the first user message or last user message, but you are free to change this!

@pandora-s Thanks for the discussions in the Discord channel and I have closed this issue!

jianguozhang changed discussion status to closed

Sign up or log in to comment