MoxoffSpA/Moxoff-Phi3Mini-DPO

JacopoAbate committed on Jun 27

Commit 4a46175 • 1 Parent(s): 7f42902

Update README.md

Files changed (1)

README.md +4 -4

@@ -17,7 +17,7 @@ metrics:
 # Model Information
-Phi-3-mini-128k-instruct-DPO is an updated version of [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), aligned with DPO and QLora.
+Moxoff-Phi3Mini-DPO is an updated version of [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), aligned with DPO and QLora.
 - It's trained on [ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned).
@@ -43,8 +43,8 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "cpu" # if you want to use the gpu make sure to have cuda toolkit installed and change this to "cuda"
-model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/Phi-3-mini-128k-instruct-DPO ")
-tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/Phi-3-mini-128k-instruct-DPO ")
+model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/Moxoff-Phi3Mini-DPO")
+tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/Moxoff-Phi3Mini-DPO")
 question = """Quanto è alta la torre di Pisa?"""
 context = """
@@ -78,7 +78,7 @@ print(trimmed_output)
 ## Bias, Risks and Limitations
-Phi-3-mini-128k-instruct-DPO has not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of
+Moxoff-Phi3Mini-DPO has not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of
 responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). It is also unknown what the size and composition
 of the corpus was used to train the base model, however it is likely to have included a mix of Web data and technical sources
 like books and code.
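
The removed `from_pretrained` lines passed a repository ID ending in a stray space, which this commit drops along with the rename. Below is a minimal sketch of a check that would catch that class of typo before the ID reaches `from_pretrained`. The helper name `clean_repo_id` is hypothetical (it is not part of `transformers` or of this repository), and the pattern is a simplified assumption about the `namespace/name` form, not the Hub's full validation rules.

```python
import re

# Simplified assumption: "namespace/name" built from letters, digits, '.', '_' and '-'.
_REPO_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def clean_repo_id(repo_id: str) -> str:
    """Strip surrounding whitespace and reject IDs that are not 'namespace/name'."""
    cleaned = repo_id.strip()
    if not _REPO_ID.fullmatch(cleaned):
        raise ValueError(f"unexpected repository ID: {repo_id!r}")
    return cleaned


# Trailing space, like the IDs in the removed lines.
model_id = clean_repo_id("MoxoffSpA/Moxoff-Phi3Mini-DPO ")
print(model_id)
```

The cleaned string can then be passed to `AutoModelForCausalLM.from_pretrained` and `AutoTokenizer.from_pretrained` as in the README snippet.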