doberst committed on
Commit
201449a
·
1 Parent(s): 804187d

Update README.md

  1. README.md +28 -5
README.md CHANGED
@@ -82,14 +82,14 @@ Note: this was the smallest model that we were able to train to consistently re
 
  The fastest way to get started with BLING is through direct import in transformers:
 
- from transformers import AutoTokenizer, AutoModelForCausalLM
- tokenizer = AutoTokenizer.from_pretrained("llmware/bling-1b-0.1")
- model = AutoModelForCausalLM.from_pretrained("llmware/bling-1b-0.1")
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ tokenizer = AutoTokenizer.from_pretrained("llmware/bling-1b-0.1")
+ model = AutoModelForCausalLM.from_pretrained("llmware/bling-1b-0.1")
 
 
  The BLING model was fine-tuned with a simple "\<human> and \<bot> wrapper", so to get the best results, wrap inference entries as:
 
- full_prompt = "\<human>\: " + my_prompt + "\n" + "\<bot>\:"
+ full_prompt = "\<human>\: " + my_prompt + "\n" + "\<bot>\:"
 
  The BLING model was fine-tuned with closed-context samples, which assume generally that the prompt consists of two sub-parts:
 
@@ -98,8 +98,31 @@ The BLING model was fine-tuned with closed-context samples, which assume general
 
  To get the best results, package "my_prompt" as follows:
 
- my_prompt = {{text_passage}} + "\n" + {{question/instruction}}
+ my_prompt = {{text_passage}} + "\n" + {{question/instruction}}
 
+ If you are using a HuggingFace generation script:
+
+ # prepare prompt packaging used in fine-tuning process
+ new_prompt = "<human>: " + entries["context"] + "\n" + entries["query"] + "\n" + "<bot>:"
+
+ inputs = tokenizer(new_prompt, return_tensors="pt")
+ start_of_output = len(inputs.input_ids[0])
+
+ # temperature: set at 0.3 for consistency of output
+ # max_new_tokens: set at 100 - may prematurely stop a few of the summaries
+
+ outputs = model.generate(
+ inputs.input_ids.to(device),
+ eos_token_id=tokenizer.eos_token_id,
+ pad_token_id=tokenizer.eos_token_id,
+ do_sample=True,
+ temperature=0.3,
+ max_new_tokens=100,
+ )
+
+ output_only = tokenizer.decode(outputs[0][start_of_output:],skip_special_tokens=True)
+
+ Please also refer to two sample test scripts in the files repository for full examples.
 
  ## Citation [optional]
 
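
The generation snippet added in this commit uses `device` and `entries` without defining them. The following is a minimal sketch of how the pieces might fit together, not the authors' own script. The function names `build_prompt` and `generate_answer` are hypothetical, the device selection is an assumption (GPU if available, otherwise CPU), and the attention mask is passed explicitly, which the README snippet does not do. The sampling settings are the ones shown in the diff.

```python
def build_prompt(context: str, query: str) -> str:
    """Package a text passage and a question in the <human>/<bot> wrapper."""
    return "<human>: " + context + "\n" + query + "\n" + "<bot>:"


def generate_answer(context: str, query: str,
                    model_name: str = "llmware/bling-1b-0.1",
                    max_new_tokens: int = 100) -> str:
    """Hypothetical helper: load the model, run one prompt, return the answer."""
    # Imported lazily so build_prompt works without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: use a GPU when one is available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

    inputs = tokenizer(build_prompt(context, query), return_tensors="pt").to(device)
    # Number of prompt tokens, used to slice the prompt off the generated ids.
    start_of_output = inputs.input_ids.shape[1]

    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.eos_token_id,
            do_sample=True,
            temperature=0.3,   # low temperature for more consistent output
            max_new_tokens=max_new_tokens,
        )

    # Decode only the newly generated tokens.
    return tokenizer.decode(outputs[0][start_of_output:], skip_special_tokens=True)
```

A call such as `generate_answer(passage, "What is the total amount due?")` would download the model on first use. Because `do_sample=True`, repeated calls may return different text.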