artemsnegirev committed
Commit
d1eefe9
1 Parent(s): 832c721

Create README.md

Files changed (1)
  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - artemsnegirev/ru-word-games
+ language:
+ - ru
+ metrics:
+ - exact_match
+ pipeline_tag: text2text-generation
+ ---
+
+ The model was trained on the companion [dataset](artemsnegirev/ru-word-games). Minibob guesses a word from its description, modeling the well-known Alias word game.
+
+ ```python
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ prefix = "guess word:"
+
+ def predict_word(prompt, model, tokenizer):
+     prompt = prompt.replace("...", "<extra_id_0>")  # map a "..." gap to the T5 sentinel token
+     prompt = f"{prefix} {prompt}"
+
+     input_ids = tokenizer([prompt], return_tensors="pt").input_ids
+
+     outputs = model.generate(
+         input_ids.to(model.device),
+         num_beams=5,
+         max_new_tokens=8,
+         do_sample=False,
+         num_return_sequences=5  # return every beam, not just the best one
+     )
+
+     candidates = set()  # a set drops duplicate guesses
+
+     for tokens in outputs:
+         candidate = tokenizer.decode(tokens, skip_special_tokens=True)
+         candidate = candidate.strip().lower()
+
+         candidates.add(candidate)
+
+     return candidates
+
+ model_name = "artemsnegirev/minibob"
+
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+
+ prompt = "это животное с копытами на нем ездят"
+
+ print(predict_word(prompt, model, tokenizer))
+ # {'верблюд', 'конь', 'коня', 'лошадь', 'пони'}
+ ```
+
+ A detailed GitHub-based [tutorial](https://github.com/artemsnegirev/minibob) covers the pipeline and source code for building Minibob.
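
The string handling inside `predict_word` can be tried without downloading the model. The sketch below is a minimal illustration, not part of the committed README: it copies the prefix and the `"..."` to `<extra_id_0>` substitution from the snippet in the diff, while the helper names `build_prompt`, `normalize_candidates`, and `is_exact_match` are hypothetical. In particular, treating a prediction as correct when the target appears anywhere in the candidate set is an assumption about how `exact_match` might be applied here, not something the README states.

```python
from __future__ import annotations

PREFIX = "guess word:"        # same task prefix as in the README snippet
MASK_TOKEN = "<extra_id_0>"   # T5 sentinel token used by the README snippet


def build_prompt(description: str) -> str:
    """Replace a "..." gap with the sentinel token and prepend the prefix."""
    return f"{PREFIX} {description.replace('...', MASK_TOKEN)}"


def normalize_candidates(decoded: list[str]) -> set[str]:
    """Strip, lowercase, and deduplicate decoded beam outputs."""
    return {text.strip().lower() for text in decoded}


def is_exact_match(candidates: set[str], target: str) -> bool:
    """Assumed scoring rule: the normalized target is among the candidates."""
    return target.strip().lower() in candidates


# The description from the README has no "..." gap, so only the prefix is added.
print(build_prompt("это животное с копытами на нем ездят"))
# guess word: это животное с копытами на нем ездят
```

Keeping these steps as separate pure functions makes it possible to unit-test the prompt format and the deduplication independently of `model.generate`.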