---
base_model: benhaotang/phi4-qwq-sky-t1
library_name: transformers
tags:
- mergekit
- merge
- llama-cpp
- gguf-my-repo
datasets:
- NovaSky-AI/Sky-T1_data_17k
---

# benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF

This model was converted to GGUF format from [`benhaotang/phi4-qwq-sky-t1`](https://huggingface.co/benhaotang/phi4-qwq-sky-t1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/benhaotang/phi4-qwq-sky-t1) for more details on the model.

## Use with Ollama

- Via Hugging Face, without a system prompt:

```
ollama run hf.co/benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF
```

- Via a Modelfile, with suggested settings:

```
FROM ./phi4-qwq-sky-t1-Q4_K_M-GGUF/phi4-qwq-sky-t1-q4_k_m.gguf
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}<|im_sep|>
{{ .Content }}{{ if not $last }}<|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_end|>
<|im_start|>assistant<|im_sep|>
{{ end }}
{{- end }}"""
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
PARAMETER stop <|im_sep|>
PARAMETER num_ctx 16384
SYSTEM """Your role as an assistant involves thoroughly exploring questions through a systematic long thinking process before providing the final precise and accurate solutions. This requires engaging in a comprehensive cycle of analysis, summarizing, exploration, reassessment, reflection, backtracking, and iteration to develop a well-considered thinking process.

Please structure your response into two main sections: Thought and Solution.

In the Thought section, detail your reasoning process using the specified format:

<|begin_of_thought|>
{thought with steps separated with "\n\n"}
<|end_of_thought|>

Each step should include detailed considerations such as analyzing the question, summarizing relevant findings, brainstorming new ideas, verifying the accuracy of the current steps, refining any errors, and revisiting previous steps.

In the Solution section, based on various attempts, explorations, and reflections from the Thought section, systematically present the final solution that you deem correct. The solution should maintain a logical, accurate, and concise style and detail the necessary steps to reach the conclusion, formatted as follows:

<|begin_of_solution|>
{final formatted, precise, and clear solution}
<|end_of_solution|>

Now, try to solve the following question through the above guidelines:
"""
LICENSE """Copyright (c) Microsoft Corporation.

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE."""
```
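
With the Modelfile saved locally (for example as `Modelfile`, next to the downloaded GGUF file), you can build and run a model from it; the name `phi4-qwq-sky-t1` below is only an example:

```bash
# Register a local Ollama model from the Modelfile above
ollama create phi4-qwq-sky-t1 -f Modelfile

# Interactive chat; the template, stop tokens, context size, and system prompt all apply
ollama run phi4-qwq-sky-t1

# A one-shot prompt can also be passed directly on the command line
ollama run phi4-qwq-sky-t1 "Why does ice float on water?"
```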

## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
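
If the install succeeded, the CLI and server binaries should now be on your PATH; a quick sanity check:

```bash
# Should print the build info of the installed llama.cpp
llama-cli --version
```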

Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF --hf-file phi4-qwq-sky-t1-q4_k_m.gguf -p "The meaning to life and the universe is"
```
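
The `-p` flag runs a single completion. For interactive, multi-turn use, llama-cli also offers a conversation mode; a sketch that mirrors the 16k context suggested in the Modelfile above:

```bash
# Interactive chat: -cnv enables conversation mode, -c sets the context size
llama-cli --hf-repo benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF \
  --hf-file phi4-qwq-sky-t1-q4_k_m.gguf \
  -cnv -c 16384
```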

### Server:
```bash
llama-server --hf-repo benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF --hf-file phi4-qwq-sky-t1-q4_k_m.gguf -c 2048
```
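
Once running, llama-server exposes an OpenAI-compatible HTTP API, by default on port 8080; a minimal request assuming that default:

```bash
# Send a chat request to the local llama-server instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Briefly explain what GGUF is."}]}'
```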

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g. `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
```
cd llama.cpp && LLAMA_CURL=1 make
```
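
Note that recent llama.cpp releases have replaced the Makefile build with CMake, so `make` may fail on a current checkout; under that assumption, the equivalent CMake invocation is:

```bash
# CMake equivalent of the Make build above, with curl support enabled
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release
```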

Step 3: Run inference through the main binary.
```
./llama-cli --hf-repo benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF --hf-file phi4-qwq-sky-t1-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```
./llama-server --hf-repo benhaotang/phi4-qwq-sky-t1-Q4_K_M-GGUF --hf-file phi4-qwq-sky-t1-q4_k_m.gguf -c 2048
```