Quantization made by Richard Erkhov.

llama3-8b-stargate-m1 - GGUF

Model creator: https://huggingface.co/scandukuri/
Original model: https://huggingface.co/scandukuri/llama3-8b-stargate-m1/

Name	Quant method	Size
llama3-8b-stargate-m1.Q2_K.gguf	Q2_K	2.96GB
llama3-8b-stargate-m1.IQ3_XS.gguf	IQ3_XS	3.28GB
llama3-8b-stargate-m1.IQ3_S.gguf	IQ3_S	3.43GB
llama3-8b-stargate-m1.Q3_K_S.gguf	Q3_K_S	3.41GB
llama3-8b-stargate-m1.IQ3_M.gguf	IQ3_M	3.52GB
llama3-8b-stargate-m1.Q3_K.gguf	Q3_K	3.74GB
llama3-8b-stargate-m1.Q3_K_M.gguf	Q3_K_M	3.74GB
llama3-8b-stargate-m1.Q3_K_L.gguf	Q3_K_L	4.03GB
llama3-8b-stargate-m1.IQ4_XS.gguf	IQ4_XS	4.18GB
llama3-8b-stargate-m1.Q4_0.gguf	Q4_0	4.34GB
llama3-8b-stargate-m1.IQ4_NL.gguf	IQ4_NL	4.38GB
llama3-8b-stargate-m1.Q4_K_S.gguf	Q4_K_S	4.37GB
llama3-8b-stargate-m1.Q4_K.gguf	Q4_K	2.38GB
llama3-8b-stargate-m1.Q4_K_M.gguf	Q4_K_M	4.58GB
llama3-8b-stargate-m1.Q4_1.gguf	Q4_1	4.78GB
llama3-8b-stargate-m1.Q5_0.gguf	Q5_0	5.21GB
llama3-8b-stargate-m1.Q5_K_S.gguf	Q5_K_S	5.21GB
llama3-8b-stargate-m1.Q5_K.gguf	Q5_K	5.34GB
llama3-8b-stargate-m1.Q5_K_M.gguf	Q5_K_M	5.34GB
llama3-8b-stargate-m1.Q5_1.gguf	Q5_1	5.65GB
llama3-8b-stargate-m1.Q6_K.gguf	Q6_K	6.14GB
llama3-8b-stargate-m1.Q8_0.gguf	Q8_0	7.95GB

Original model description:

license: mit

STaR-GATE

This repository contains the iteration 1 meta-llama/Meta-Llama-3-8B-Instruct model from an additional experiment for STaR-GATE: Teaching Language Models to Ask Clarifying Questions. Note that this experiment is an extension and is not yet included in the most recent revision of the linked preprint. The weights contained in this repository are represented by the blue line in the left-side win-rate graph below. Note that this repository contains the weights for iteration t=1, i.e. only one iteration of self-improvement.

When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions-a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained language model-the Questioner-and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching a language model to ask better questions leads to better personalized responses.

fig_3

Usage

Reference the paper appendix sections A.5.2 (Figure 14: Questioner Elicitation Prompt) and A.6.2 (Figure 17: Questioner Win-Rate Response Prompt.) to see how you can prompt the model for elicitation or for final responses. All code and data for the project can be found here.