JacopoAbate committed
Commit 4fd66b1
1 Parent(s): 9567dce

Update README.md

Files changed (1): README.md (+89, -3)

README.md CHANGED (the previous version contained only the MIT license front matter; the full updated file follows):
---
license: mit
language:
- en
library_name: transformers
tags:
- kto
- phi3
- chatml
---

# Model Information

Phi3-instruct-128k-KTO is an updated version of [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), aligned with KTO and QLoRA.

- It was trained on [distilabel-intel-orca-kto](https://huggingface.co/datasets/argilla/distilabel-intel-orca-kto); a sketch of this kind of training setup is shown below.
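
For readers who want a concrete picture of such an alignment run, the following is a minimal, hypothetical sketch of KTO + QLoRA fine-tuning with TRL's `KTOTrainer` and PEFT (2024-era library versions, where the trainer still takes a `tokenizer` argument). It is not the script used to train this model: the hyperparameters, quantization settings, and `output_dir` are illustrative assumptions, and the dataset is assumed to already be in the prompt/completion/label format that `KTOTrainer` expects.

```python
# Hypothetical sketch of KTO + QLoRA alignment with TRL/PEFT; not the exact script used for this model.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import KTOConfig, KTOTrainer

base_model = "microsoft/Phi-3-mini-128k-instruct"

# 4-bit (QLoRA-style) quantization of the base model; settings are illustrative.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# LoRA adapters trained on top of the frozen 4-bit weights.
# Target modules are left to PEFT's defaults here; a real run would likely set them explicitly.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Assumed to already contain "prompt", "completion" and boolean "label" columns (KTO format).
dataset = load_dataset("argilla/distilabel-intel-orca-kto", split="train")

training_args = KTOConfig(
    output_dir="phi3-kto-qlora",    # illustrative
    per_device_train_batch_size=2,  # illustrative
    learning_rate=5e-5,             # illustrative
    beta=0.1,                       # KTO regularization strength
    max_length=1024,
    max_prompt_length=512,
)

trainer = KTOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```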

# Evaluation

We evaluated the model on the same test sets used for the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).

| hellaswag acc_norm | arc_challenge acc_norm | m_mmlu 5-shot acc | Average |
|:-------------------|:-----------------------|:------------------|:--------|
| 0.7915             | 0.5606                 | 0.6939            | 0.6357  |
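
As a pointer for reproducing numbers in this style, the snippet below is a hypothetical run of EleutherAI's lm-evaluation-harness (v0.4 Python API). The task list, few-shot setting, and batch size are assumptions for illustration, not necessarily the configuration used for the table above; in particular, the multilingual `m_mmlu` task comes from a community task set and may require extra task definitions.

```python
# Hypothetical evaluation sketch with lm-evaluation-harness (pip install lm-eval);
# not the exact setup used to produce the table above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                            # Hugging Face transformers backend
    model_args="pretrained=MoxoffSpA/Phi3-instruct-128k-KTO",
    tasks=["hellaswag", "arc_challenge"],                  # standard harness tasks; m_mmlu needs a community task set
    num_fewshot=0,                                         # illustrative; the mmlu column above is reported 5-shot
    batch_size=8,                                          # illustrative
)

# Per-task metrics (e.g. acc_norm) live under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```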

## Usage

Be sure to install these dependencies before running the program:

```python
!pip install transformers torch sentencepiece
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cpu"  # to use the GPU, make sure the CUDA toolkit is installed and change this to "cuda"

model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/Phi3-instruct-128k-KTO")
tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/Phi3-instruct-128k-KTO")

question = """Quanto è alta la torre di Pisa?"""  # "How tall is the Tower of Pisa?"
context = """
La Torre di Pisa è un campanile del XII secolo, famoso per la sua inclinazione. Alta circa 56 metri.
"""  # "The Tower of Pisa is a 12th-century bell tower, famous for its lean. It is about 56 metres tall."

prompt = f"Domanda: {question}, contesto: {context}"  # "Question: ..., context: ..."

messages = [
    {"role": "user", "content": prompt}
]

# Build the model input with the chat template, appending the assistant turn marker
encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(
    model_inputs,                        # the tokenized chat prompt
    max_new_tokens=128,                  # limit the number of newly generated tokens
    do_sample=True,                      # enable sampling to introduce randomness in the generation
    temperature=0.1,                     # low temperature keeps the output close to deterministic
    top_p=0.95,                          # nucleus sampling for more coherent generation
    eos_token_id=tokenizer.eos_token_id  # token that marks the end of a sequence
)

decoded_output = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
trimmed_output = decoded_output.strip()
print(trimmed_output)
```
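
As a shorter alternative (untested here), recent transformers releases let the `text-generation` pipeline consume chat-style message lists directly; a sketch with the same question might look like this, with generation settings mirroring the example above:

```python
# Alternative sketch using the text-generation pipeline; requires a recent transformers
# release that accepts chat-style message lists. Settings mirror the example above.
from transformers import pipeline

pipe = pipeline("text-generation", model="MoxoffSpA/Phi3-instruct-128k-KTO")

messages = [{"role": "user", "content": "Domanda: Quanto è alta la torre di Pisa?"}]
out = pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.1, top_p=0.95)

# The pipeline returns the full conversation; the last message is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```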

## Bias, Risks and Limitations

Phi3-instruct-128k-KTO has not been aligned to human preferences for safety through an RLHF phase, nor is it deployed with in-the-loop filtering of responses the way ChatGPT is, so the model can produce problematic outputs (especially when prompted to do so). The size and composition of the corpus used to train the base model are also unknown, but it likely included a mix of web data and technical sources such as books and code.

## Links to resources

- distilabel-intel-orca-kto dataset: https://huggingface.co/datasets/argilla/distilabel-intel-orca-kto
- Phi-3-mini-128k-instruct model: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
- Open LLM Leaderboard: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard

## The Moxoff Team

Jacopo Abate, Marco D'Ambra, Dario Domanin, Luigi Simeone, Gianpaolo Francesco Trotta