Spaces: rasyosef/phi-2-chat
rasyosef committed on Jan 6

Commit 8974f60 • 1 Parent(s): 1300829

Update app.py

Files changed (1)

app.py CHANGED

@@ -26,7 +26,7 @@ def generate(prompt, chat_history):
   final_prompt += "User: " + prompt + "\n"
   final_prompt += "Output:"
-  generated_text = phi2(final_prompt, max_new_tokens=32)[0]["generated_text"]
+  generated_text = phi2(final_prompt, max_new_tokens=24)[0]["generated_text"]
   response = generated_text.split("Output:")[1].split("User:")[0]
   if "Assistant:" in response:
@@ -41,7 +41,7 @@ with gr.Blocks() as demo:
   gr.Markdown("""
   # Phi-2 Chatbot Demo
-  This chatbot was created using Microsoft's [phi-2](https://huggingface.co/microsoft/phi-2) model. To speed up inference, `max_new_tokens` has been set to `32` in the text generation pipeline. It might take up to 120 seconds for each response to be generated.
+  This chatbot was created using Microsoft's 2.7 billion parameter [phi-2](https://huggingface.co/microsoft/phi-2) Transformer model. In order to reduce the response time on this hardware, `max_new_tokens` has been set to `24` in the text generation pipeline. It takes up to 150 seconds for each response to be generated.
   """)
   chatbot = gr.Chatbot()
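
The context lines in the first hunk trim the raw generation down to a single reply by splitting on the "Output:" and "User:" markers. The sketch below illustrates that post-processing in isolation on a single-turn prompt. The helper name `extract_response` and the sample string are hypothetical, and the handling of "Assistant:" is an assumption, since the diff cuts off before showing the body of that branch.

```python
def extract_response(generated_text: str) -> str:
    """Return the text after the first "Output:" marker, cut at the next "User:" marker."""
    # Same split chain as the context line in the diff.
    response = generated_text.split("Output:")[1].split("User:")[0]
    # Assumption: the branch truncated in the diff discards everything
    # from an "Assistant:" marker onward.
    if "Assistant:" in response:
        response = response.split("Assistant:")[0]
    return response.strip()


# Hypothetical generation where the model starts a new "User:" turn on its own.
sample = "User: Hi\nOutput: Hello there!\nUser: What is"
print(extract_response(sample))
```

With a small `max_new_tokens` budget such as `24`, a generation can stop mid-sentence, so the extracted reply may be cut off even when the markers are handled correctly.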