mkurman committed · Commit 823a1aa · verified · 1 Parent(s): a95dc87

Update README.md

Files changed (1)
  1. README.md +9 -12
README.md CHANGED
@@ -1,20 +1,17 @@
-I have updated the model card to reflect the fine-tuning of the base model `mkurman/llama-3.2-MEDIT-3B-o1` using the GRPO-LLM-Evaluator method for 1500 steps, as specified. The new model is named `mkurman/llama-3.2-MEDIT-3B-o1-GRPO-LLM-Eval`. Below is the updated model card with all necessary changes, including the updated model name, base model information, fine-tuning details, and relevant tags.
-
 ---
-
-**license:** llama3.2
-**datasets:**
-- O1-OPEN/OpenO1-SFT
-**language:**
-- en
-**base_model:**
+license: llama3.2
+datasets:
+- O1-OPEN/OpenO1-SFT
+language:
+- en
+base_model:
+- meta-llama/Llama-3.2-3B-Instruct
 - mkurman/llama-3.2-MEDIT-3B-o1
-**library_name:** transformers
-**tags:**
+library_name: transformers
+tags:
 - reasoning
 - o1
 - GRPO-LLM-Evaluator
-
 ---
 
 # Model Card: mkurman/llama-3.2-MEDIT-3B-o1-GRPO-LLM-Eval
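
Most of this diff replaces front-matter keys wrapped in Markdown bold (for example `**license:**`) with plain YAML keys and drops the blank lines inside the `---` block. The sketch below shows one way such a cleanup could be automated. It is not part of the commit: the function name `normalize_front_matter` and the regular expression are hypothetical, and it deliberately leaves any text before the first `---` untouched, so the leading prose line removed in this commit would still need to be deleted by hand.

```python
import re

# Hypothetical pattern: a key wrapped in Markdown bold, e.g. "**license:** llama3.2".
BOLD_KEY = re.compile(r"^\*\*([A-Za-z_]+):\*\*\s*(.*)$")


def normalize_front_matter(text: str) -> str:
    """Rewrite bold-wrapped keys inside the first '---' block as plain YAML keys.

    Blank lines inside the block are dropped; everything outside it is kept as-is.
    """
    out = []
    fences_seen = 0
    for line in text.splitlines():
        if line.strip() == "---":
            fences_seen += 1
            out.append(line)
            continue
        if fences_seen == 1:
            # We are between the opening and closing '---'.
            if not line.strip():
                continue  # skip blank lines inside the front matter
            match = BOLD_KEY.match(line.strip())
            if match:
                key, value = match.groups()
                line = f"{key}: {value}".rstrip()
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    old = "---\n\n**license:** llama3.2\n**tags:**\n- reasoning\n\n---\n\n# Model Card"
    print(normalize_front_matter(old))
```

List items such as `- reasoning` pass through unchanged because they do not match the bold-key pattern, which mirrors the context lines in the diff above.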