How to set context in multi-turn QA?

#14
by J22 - opened

According to the model card, multi-turn QA is performed on a single context, which is provided in {System}.

In real use cases, new context might be available for a new question.

NVIDIA org

Hi,
We use "\n\n" to join multiple contexts. In other words, the format would be {context_1}\n\n{context_2}\n\n{context_3} ...
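For readers who want to see the joining step spelled out, here is a minimal sketch. The helper name `join_contexts` is hypothetical (it is not part of any ChatQA code), and skipping empty passages is my own assumption rather than something stated above.

```python
def join_contexts(contexts):
    """Join retrieved passages with a blank line between them.

    Hypothetical helper: only the "\n\n" separator comes from the reply above.
    Empty passages are skipped (an assumption) so no stray separators appear.
    """
    cleaned = [c.strip() for c in contexts if c and c.strip()]
    return "\n\n".join(cleaned)


joined = join_contexts(["First passage.", "Second passage.", "Third passage."])
print(joined)
```

The result is a single string that can be placed wherever the prompt expects the context.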

I mean, how do I provide a new context when asking a new question?

I see. You just need to replace the old context with the new context when asking a new question.

So, I need to do this, which is not intuitive:

System: {System}

{Context1}

{Context2}

User: {Question1 relevant to Context1}

Assistant: {Response}

User: {Question2 relevant to Context2}

Assistant:

It would be much better if I could use it like this:

System: {System}

User: {Question1}

{Context1}

A: {Response}

User: {Question2}

{Context2}

Assistant:

Oh, not like that. You should remove {Context1} and do this instead:

System: {System}

{Context2}

User: {Question1 relevant to Context1}

Assistant: {Response}

User: {Question2 relevant to Context2}

Assistant:

The retriever considers both question1 and question2 in the conversation when retrieving the relevant context for question2 (i.e., {Context2}).
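The layout described in this reply can be sketched as below. `build_prompt` and the sample strings are hypothetical, and the blank-line separators simply mirror how the template is written in this thread; please check the exact whitespace against the model card before relying on it.

```python
def build_prompt(system, context, history, question):
    """Assemble a prompt that carries only the context for the latest question.

    history is a list of (user_question, assistant_response) pairs from
    earlier turns. Contexts used for earlier turns are deliberately dropped.
    """
    parts = [f"System: {system}", context]
    for user_q, assistant_r in history:
        parts.append(f"User: {user_q}")
        parts.append(f"Assistant: {assistant_r}")
    parts.append(f"User: {question}")
    parts.append("Assistant:")  # left open for the model to complete
    return "\n\n".join(parts)


system = "This is a chat between a user and an assistant."

# Turn 1: only Context1 is in the prompt.
turn1 = build_prompt(system, "Context for question 1.", [], "Question 1?")

# Turn 2: Context1 is replaced by Context2; the earlier Q/A pair is kept.
turn2 = build_prompt(
    system,
    "Context for question 2.",
    [("Question 1?", "Answer 1.")],
    "Question 2?",
)
print(turn2)
```

Note that the whole prompt is rebuilt on every turn rather than extended, since the context near the top changes.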

That makes sense once the Dragon-multiturn retriever is taken into account.

Anyway, this still looks odd to me. My assumption was that an LLM generates output incrementally: when a new question and its context are appended, generation simply continues from the existing state, which is compute-friendly. With ChatQA, every new question changes the context near the top of the prompt, so the LLM has to re-process the whole chat history before answering the latest question. That is fine for an LLM server, but not well suited to single-user, local inference.
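To make the compute concern concrete, here is a rough sketch that compares how much of the previous prompt is left unchanged in each layout. It measures the shared prefix in characters, which is only a proxy for the tokens a prefix (KV) cache could reuse; the sample strings and variable names are made up for illustration.

```python
import os


def shared_prefix_len(previous_prompt, next_prompt):
    """Length (in characters) of the common prefix of two prompts."""
    return len(os.path.commonprefix([previous_prompt, next_prompt]))


system = "System: You answer questions from the given context."

# Layout A: the context sits after the system text and is replaced each turn.
replace_turn1 = system + "\n\nalpha context\n\nUser: Q1\n\nAssistant:"
replace_turn2 = (
    system
    + "\n\nbeta context\n\nUser: Q1\n\nAssistant: R1\n\nUser: Q2\n\nAssistant:"
)

# Layout B: each context travels with its question, so turns are only appended.
append_turn1 = system + "\n\nUser: Q1\n\nalpha context\n\nAssistant:"
append_turn2 = append_turn1 + " R1\n\nUser: Q2\n\nbeta context\n\nAssistant:"

replace_shared = shared_prefix_len(replace_turn1, replace_turn2)
append_shared = shared_prefix_len(append_turn1, append_turn2)

print("replace layout shares", replace_shared, "of", len(replace_turn1), "chars")
print("append layout shares", append_shared, "of", len(append_turn1), "chars")
```

In the replace layout the shared prefix stops where the context begins, so everything after the system text has to be processed again. In the append layout the whole previous prompt is a prefix of the next one.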

I suggest updating the model card to clarify how context should be handled in multi-turn conversations.
