joshuasundance
/

phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc-full

@@ -1,41 +1,51 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
@@ -77,38 +87,36 @@ Use the code below to get started with the model.
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 [More Information Needed]
 ## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
 [More Information Needed]
@@ -192,8 +200,8 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 ## Model Card Authors [optional]
-[More Information Needed]
 ## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+tags:
+- phi3
+- python
+- dpo
+- mypo
+license: mit
+datasets:
+- joshuasundance/mypo-4k-rfc
+language:
+- en
 ---
+**This is a merged version of `joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc`**
 # Model Card for Model ID
+* **Base Model**: https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k
+* **Preference Dataset**: https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc
+* **Training Code**: https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
+* **Training Metrics**: [trainer_state.json](trainer_state.json)
+This is an experimental model made by using `joshuasundance/mypo-4k-rfc` for DPO training of `edumunozsala/phi3-mini-4k-qlora-python-code-20k`.
+The goal is to learn about model training and potentially get the base model to reliably produce Python with type hints. I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
 ## Model Details
 ### Model Description
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** Joshua Sundance Bailey
+- **Model type:** phi 3 qlora DPO
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model [optional]:** `edumunozsala/phi3-mini-4k-qlora-python-code-20k`
 ### Model Sources [optional]
+- **Training Code:** https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
 ## Uses
+For evaluation and testing only. Do not expect great results, and do not use this model for anything important. It has not been evaluated in any way after training.
 ### Direct Use
 ### Training Data
+* Original qlora: `iamtarun/python_code_instructions_18k_alpaca`
+* DPO: `joshuasundance/mypo-4k-rfc`
 ### Training Procedure
+See training code using `peft`, `transformers`, and `trl`
 #### Preprocessing [optional]
+See training code using `peft`, `transformers`, and `trl`
 #### Training Hyperparameters
+See training code using `peft`, `transformers`, and `trl`
 #### Speeds, Sizes, Times [optional]
+See [trainer_state.json](trainer_state.json) in this repo
 [More Information Needed]
 ## Evaluation
+See [trainer_state.json](trainer_state.json) in this repo
 ### Testing Data, Factors & Metrics
 #### Testing Data
+20% of DPO dataset (see training code)
 [More Information Needed]
 ## Model Card Authors [optional]
+Joshua Sundance Bailey
 ## Model Card Contact
+Joshua Sundance Bailey