Update README.md
README.md
@@ -120,11 +120,11 @@ base_model:

 **DeepAutoAI/Explore_Llama-3.2-1B-Inst** is developed by **deepAuto.ai** by learning the distribution of llama-3.2-1B-instruct.
 Our approach leverages the base model’s pretrained weights and optimizes them for the **Winogrande** and **ARC-Challenge** datasets by
-training a latent diffusion model on the pretrained weights. specifically , this model is based on learning the distrinution of
+training a latent diffusion model on the pretrained weights. specifically , this model is based on learning the distrinution of the top 2 layer of layer in feed forward
+or attention layers based on spectrum based optimum layer selection.

-
-We
-These weights are merged using linear interpolation to create the final model weights for **DeepAutoAI/Explore_Llama-3.1-1B-Inst**.
+
+We directly transfer the weights of the best model on both winogrande and arc-challenge for **DeepAutoAI/Explore_Llama-3.1-1B-Inst**.

 This approach has led to improved performance on previously unseen leaderboard tasks, all without any additional task-specific training.

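
The change above replaces "merged using linear interpolation" with "directly transfer the weights of the best model". As a rough illustration of the difference between those two ideas, here is a minimal sketch in plain Python. It is not the authors' implementation: the `Weights` type, the function names `interpolate` and `transfer`, the parameter names, and the toy values are all hypothetical, and a real model would use tensors rather than lists of floats.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def interpolate(a: Weights, b: Weights, alpha: float) -> Weights:
    """Linear interpolation per parameter: (1 - alpha) * a + alpha * b."""
    if a.keys() != b.keys():
        raise ValueError("both weight sets must have the same parameter names")
    return {
        name: [(1.0 - alpha) * x + alpha * y for x, y in zip(a[name], b[name])]
        for name in a
    }


def transfer(base: Weights, best: Weights, selected: Iterable[str]) -> Weights:
    """Copy only the selected parameters from `best` into a copy of `base`."""
    merged = {name: list(values) for name, values in base.items()}
    for name in selected:
        if name not in merged or name not in best:
            raise KeyError(f"unknown parameter: {name}")
        merged[name] = list(best[name])
    return merged


# Toy values, chosen only to make the arithmetic easy to follow.
base = {"mlp.weight": [0.0, 2.0], "attn.weight": [1.0, 1.0]}
best = {"mlp.weight": [1.0, 4.0], "attn.weight": [3.0, 5.0]}

blended = interpolate(base, best, alpha=0.5)            # every parameter is averaged
copied = transfer(base, best, selected=["mlp.weight"])  # only one layer is replaced
```

With interpolation every parameter moves part of the way toward the second weight set, while a direct transfer overwrites only the chosen layers and leaves the rest of the base model untouched.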