daqc committed
Commit d0ac341
1 Parent(s): 8ffb3b0

Update README.md

Files changed (1):
  1. README.md +23 -9
README.md CHANGED
@@ -54,7 +54,7 @@ The kuntur-peru-legal-es-gemma-2b-it-merged model is a state-of-the-art language
  + [QLoRA Configuration 🧮](#qlora-configuration)
  + [Model Merging and Saving 💾](#model-merging-and-saving)
  * [Logging with Wandb 📊](#logging-with-wandb)
- * [Impacto Ambiental 🌳](#impacto-ambiental)
+ * [Environmental impact 🌳](#environmental-impact)



@@ -102,7 +102,7 @@ The dataset encompasses a wide range of topics and provisions within the Peruvia
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64461026e1fd8d65b27e6187/m3yAx86LN-xLEZ4Mz1ALQ.png" alt="Train Graph" width="900">
  </p>

- ## Val Progress
+ ## Eval Progress
  <p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64461026e1fd8d65b27e6187/nuk-TgiEH8IRDjmP6_luR.png" alt="Val Graph" width="900">
  </p>
@@ -139,7 +139,7 @@ QLoRA (Quantization LoRA) was employed to optimize the model's computational eff
  - **bias:** Set to "none" to exclude bias terms from adaptation, simplifying the model architecture.
  - **lora_dropout:** Reduced to 0.025 from the default 0.05, controlling the dropout rate during adaptation.
  - **task_type:** Configured as "CAUSAL_LM" to indicate the task type of the language model.
- -
+
  ```python
  config = LoraConfig(
  r=8,
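The hunk above only shows the first lines of the configuration block. For reference, a minimal sketch of the full `LoraConfig` implied by the bullets in this section, assuming the Hugging Face `peft` library (`lora_alpha` and `target_modules` are illustrative placeholders, not values taken from the commit):

```python
from peft import LoraConfig

# Sketch of the adapter configuration described in the bullets above.
# r, lora_dropout, bias and task_type come from the README text shown in
# the hunk; lora_alpha and target_modules are illustrative assumptions.
config = LoraConfig(
    r=8,                                    # low-rank dimension
    lora_alpha=16,                          # assumption: not shown in the diff
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.025,                     # reduced from the default 0.05
    bias="none",                            # no bias terms adapted
    task_type="CAUSAL_LM",                  # causal language modeling
)
```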
@@ -154,18 +154,32 @@ QLoRA (Quantization LoRA) was employed to optimize the model's computational eff
  These configurations were crucial for optimizing the model's performance and resource utilization during training and inference, ensuring efficient deployment.


- ## Model Merging and Saving 💾
+ ## Model Merging and Saving

  After fine-tuning, the LoRA-adjusted weights were merged back with the base Gemma model to create the final kuntur-peru-legal-es-gemma-2b-it-merged. The model was then saved and made available through Hugging Face for easy access and further development.


- ## Logging with Wandb 📊
+ ## Logging with Wandb

  During the training process, Wandb (Weights & Biases) was used for comprehensive logging and visualization of key metrics. Wandb's powerful tracking capabilities enabled real-time monitoring of training progress, evaluation metrics, and model performance. Through interactive dashboards and visualizations, Wandb facilitated deep insights into the training dynamics, allowing for efficient model optimization and debugging. This logging integration with Wandb enhances transparency, reproducibility, and collaboration among researchers and practitioners.
-
-
-
- ## Environmental impact 🌳
+
+ - eval/loss:1.1386919021606443
+ - eval/runtime:44.2153
+ - eval/samples_per_second:8.707
+ - eval/steps_per_second:8.707
+ - train/epoch:49.62
+ - train/global_step:4,850
+ - train/grad_norm:3.5548949241638184
+ - train/learning_rate:0
+ - train/loss:0.8596
+ - train/total_flos:236,149,029,419,876,350
+ - train/train_loss:1.105836234535139
+ - train/train_runtime:13,237.4947
+ - train/train_samples_per_second:5.9
+ - train/train_steps_per_second:0.366
+
+
+ ## Environmental impact

  The training of `kuntur-peru-legal-es-gemma-2b-it-merged` was conducted optimizing the computational expenditure required.

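The merging and saving step described in the hunk above is typically done with `peft`'s `merge_and_unload`. A minimal sketch, assuming the adapter was trained with `peft` on top of `google/gemma-2b-it` (model ids and paths are illustrative, not taken from the commit):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base Gemma model and attach the fine-tuned LoRA adapter.
base = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")   # assumed base model
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")     # illustrative adapter path

# Fold the LoRA weights into the base weights to obtain a standalone model,
# then save it (with its tokenizer) so it can be pushed to the Hub.
merged = model.merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
merged.save_pretrained("kuntur-peru-legal-es-gemma-2b-it-merged")
tokenizer.save_pretrained("kuntur-peru-legal-es-gemma-2b-it-merged")
```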
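The Wandb logging described above is usually enabled through the `transformers` Trainer's `report_to` option. A minimal sketch of such a setup (project and run names are illustrative assumptions):

```python
import wandb
from transformers import TrainingArguments

# Start a Weights & Biases run; the Trainer then streams its metrics
# (train/loss, eval/loss, grad_norm, learning_rate, ...) to the dashboard.
wandb.init(project="kuntur-peru-legal-es", name="gemma-2b-it-qlora")  # illustrative names

args = TrainingArguments(
    output_dir="outputs",
    report_to="wandb",   # route Trainer logging to Weights & Biases
    logging_steps=10,    # illustrative logging frequency
)
```

With this in place, the Trainer reports the same kind of run metrics listed in the hunk above (train/loss, eval/loss, train/grad_norm, train/learning_rate, and so on).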