|
--- |
|
license: mit |
|
license_link: https://huggingface.co/rhysjones/Phi-3-mini-mango-1/resolve/main/LICENSE |
|
|
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
tags: |
|
- nlp |
|
- code |
|
widget: |
|
- messages: |
|
- role: user |
|
content: Can you provide ways to eat combinations of bananas and dragonfruits? |
|
--- |
|
|
|
## Model Summary |
|
|
|
The Phi-3-mini-4k-mango-2 is a finetune of [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) with 4K context and 3.8B parameters. |
|
|
|
It is a continuaton of finetuning Phi-3 (which is a great model!) to explore its properties and behaviour. More to follow. |
|
|
|
This version of the model has had its weight layers converted to Mistral format, allowing it to run within a Mistral toolset ecosystem without change or trust_remote_code. |
|
It seems to offer better performance than the eqivalent conversion to Llama format, which could be of interest to those using finetune toolsets yet to encompass the phi-3 model. |
|
|
|
The process was first to convert the model weight names and config to Mistral, followed by a finetune of those weights. |