qgallouedec HF staff committed on
Commit
00dc223
1 Parent(s): e4af263

Training in progress, epoch 1

Files changed (3)
  1. README.md +1 -2
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -9,7 +9,6 @@ tags:
  model-index:
  - name: online-dpo-qwen2-2
  results: []
- datasets: trl-lib/ultrafeedback-prompt
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # online-dpo-qwen2-2
 
- This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the https://huggingface.co/datasets/trl-lib/ultrafeedback-prompt dataset.
+ This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the trl-lib/ultrafeedback-prompt dataset.
 
  ## Model description
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cc30bd7cffc88b143ac34a77a3fe10ece13020674edc657b48a79f832d0af553
+ oid sha256:444040b9c85172fac0a3f53a2832ab8098ff0841854c638a600dcb57e9311378
  size 1976163472
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e459bdf9c4e6a0da7c2a4e9f5cc66532e6cce964b78dd05d35c5cd8191d60176
+ oid sha256:4bb16a66c3613e679403f9fa00edfd1ce7eb179a9b342ce70e788e50de1fc7fd
  size 5432