Shotaro30678/response_generator_DPO

Text Generation

4-bit precision

Shotaro30678 committed on Aug 26

Commit 2910e15 • 1 Parent(s): 4adb4e3

Update README.md

Files changed (1)

README.md +26 -27

@@ -27,33 +27,32 @@ Use dpo trainer to do the RLHF so that the model can be more precise and consist
 ## Model performance
-#### DPO Trained
-- **[sentiment score](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)**
-  - Accuracy: 0.851
-  - F1-score: 0.85637
-- **[gibberish detection](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)**
-  | **Clean** | **Mild Gibberish** | **Word Salad** | **Noise** |
-  |-----------|--------------------|----------------|-----------|
-  | 882       | 94                 | 21             | 3         |
-- **cut-off output**
-  - complete output: 985
-  - incomplete output: 15
-#### SFT model (ref)
-- **[sentiment score](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)**
-  - Accuracy: 0.788
-  - F1-score: 0.79749
-- **[gibberish detection](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)**
-| **Clean** | **Mild Gibberish** | **Word Salad** | **Noise** |
-|-----------|--------------------|----------------|-----------|
-| 898       | 58                 | 33             | 11         |
-- **cut-off output**
-  - complete output: 975
-  - incomplete output: 25
 on [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG) test split.

 ## Model performance
+### Model Comparison
+**Sentiment Score:**
+**[Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)**
+| **Metric**   | **DPO Trained Model** | **SFT Model (Reference)** |
+|--------------|-----------------------|---------------------------|
+| **Accuracy** | 0.851                 | 0.788                     |
+| **F1-score** | 0.8564                | 0.7975                    |
+**Gibberish Distribution:**
+**[madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)**
+| **Category**        | **DPO Trained Model** | **SFT Model (Reference)** |
+|---------------------|-----------------------|---------------------------|
+| **Clean**           | 882                   | 898                       |
+| **Mild Gibberish**  | 94                    | 58                        |
+| **Word Salad**      | 21                    | 33                        |
+| **Noise**           | 3                     | 11                        |
+**Cut-Off Output:**
+| **Output Type**     | **DPO Trained Model** | **SFT Model (Reference)** |
+|---------------------|-----------------------|---------------------------|
+| **Complete Output** | 985                   | 975                       |
+| **Incomplete Output** | 15                  | 25                        |
 on [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG) test split.
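
The counts in each column of the gibberish and cut-off tables sum to 1000, so they can be read as rates directly. As a minimal sketch (the dictionary layout and the `to_rates` helper are illustrative and not part of the README), the proportions can be derived like this:

```python
# Counts copied from the comparison tables in the diff above.
# Keys and the helper name are hypothetical, chosen only for illustration.
gibberish = {
    "DPO": {"Clean": 882, "Mild Gibberish": 94, "Word Salad": 21, "Noise": 3},
    "SFT": {"Clean": 898, "Mild Gibberish": 58, "Word Salad": 33, "Noise": 11},
}
cutoff = {
    "DPO": {"Complete Output": 985, "Incomplete Output": 15},
    "SFT": {"Complete Output": 975, "Incomplete Output": 25},
}


def to_rates(counts):
    """Convert raw counts into fractions of the column total."""
    total = sum(counts.values())
    return {label: value / total for label, value in counts.items()}


gibberish_rates = {model: to_rates(c) for model, c in gibberish.items()}
cutoff_rates = {model: to_rates(c) for model, c in cutoff.items()}

for model in ("DPO", "SFT"):
    print(model)
    for label, rate in {**gibberish_rates[model], **cutoff_rates[model]}.items():
        print(f"  {label}: {rate:.1%}")
```

Read this way, the DPO model produces fewer incomplete, word-salad, and noise outputs than the SFT reference, while the SFT reference has slightly more outputs classified as clean.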