finetune-NLLB-600M-on-opus100-Ar2En-with-Dora
- README.md +13 -14
- adapter_model.safetensors +1 -1
- tokenizer.json +2 -2
- tokenizer_config.json +0 -3
- training_args.bin +2 -2
README.md
CHANGED
@@ -15,15 +15,14 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/FinalProject_/NLLB_2/runs/wpd875tt)
 # NLLB_DoRA
 
 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Bleu: 32.
-- Rouge: 0.
-- Gen Len: 17.
+- Loss: 1.2708
+- Bleu: 32.802
+- Rouge: 0.6028
+- Gen Len: 17.4444
 
 ## Model description
 
@@ -43,11 +42,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 1
+- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3
@@ -56,15 +55,15 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu    | Rouge  | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|
-| 1.
-| 1.
-| 1.
+| 1.3937        | 1.0   | 2000 | 1.3115          | 32.2196 | 0.5954 | 17.6569 |
+| 1.3309        | 2.0   | 4000 | 1.2781          | 32.6752 | 0.6011 | 17.4931 |
+| 1.3234        | 3.0   | 6000 | 1.2708          | 32.802  | 0.6028 | 17.4444 |
 
 
 ### Framework versions
 
 - PEFT 0.12.0
-- Transformers 4.
-- Pytorch 2.
-- Datasets 2.
+- Transformers 4.44.0
+- Pytorch 2.4.0
+- Datasets 2.21.0
 - Tokenizers 0.19.1
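The completed card now fully identifies the model: a DoRA adapter for [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M), reaching BLEU 32.802 on its evaluation set. Below is a minimal inference sketch, assuming the adapter in this commit is published as a standard PEFT adapter repo; the repo id `your-username/NLLB_DoRA` and the input sentence are placeholders, and `arb_Arab`/`eng_Latn` are NLLB's FLORES-200 codes for the Arabic-to-English direction named in the commit title.

```python
# Sketch only: load the DoRA adapter on top of the frozen NLLB base model.
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

BASE = "facebook/nllb-200-distilled-600M"

tokenizer = AutoTokenizer.from_pretrained(BASE, src_lang="arb_Arab")
base = AutoModelForSeq2SeqLM.from_pretrained(BASE)

# Attach the fine-tuned DoRA weights (adapter_model.safetensors in this repo).
# "your-username/NLLB_DoRA" is a placeholder adapter id.
model = PeftModel.from_pretrained(base, "your-username/NLLB_DoRA")

inputs = tokenizer("صباح الخير", return_tensors="pt")  # placeholder Arabic input
out = model.generate(
    **inputs,
    # NLLB expects the target language code as the forced first decoder token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True)[0])
```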
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0b5fd1449e34d1864823c7733416e774bc49f2ea7b6da0bb720fd43c9f6c1d06
 size 5044160
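Note that this entry (like `tokenizer.json` and `training_args.bin` below) is a Git LFS pointer file rather than the binary itself: the commit swaps the `oid sha256:` content hash while the adapter's blob size stays at 5044160 bytes.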
tokenizer.json
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:2dde13bc0ee889a9225a407b8f8ede4db6eb7baa4da336ce0091f4f2a4351138
+size 17331373
tokenizer_config.json
CHANGED
@@ -1864,9 +1864,6 @@
 "bos_token": "<s>",
 "clean_up_tokenization_spaces": true,
 "cls_token": "<s>",
-"device_map": {
-  "": 0
-},
 "eos_token": "</s>",
 "legacy_behaviour": false,
 "load_in_8bit": true,
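The three deleted lines drop a `device_map` entry that had leaked into the tokenizer config; it is a model-loading argument, not a tokenizer setting (the remaining `load_in_8bit` key is the same kind of stray kwarg). A hedged sketch of where these options normally go:

```python
# Sketch only: device_map and 8-bit quantization are from_pretrained()
# arguments for the *model*, not keys in tokenizer_config.json.
from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    device_map={"": 0},  # place the whole model on GPU 0
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # needs bitsandbytes
)
```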
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5ccc9d592c44930eb6d26a94c0ab38bf33e8e48e25cc537c44befbefb45c5252
+size 5368
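Putting the pieces of the commit together, here is a hedged reconstruction of the training setup. The hyperparameters come from the README hunks above; the adapter rank, scaling, and target modules are illustrative assumptions, not values read from `training_args.bin`.

```python
# Hedged training sketch. Hyperparameters match the model card; r, lora_alpha,
# and target_modules are assumptions for an NLLB encoder-decoder.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

dora_config = LoraConfig(
    use_dora=True,                        # DoRA: weight-decomposed LoRA (PEFT >= 0.9)
    r=8,                                  # assumed adapter rank
    lora_alpha=16,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, dora_config)

args = Seq2SeqTrainingArguments(
    output_dir="NLLB_DoRA",
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # effective total train batch size: 4
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    predict_with_generate=True,     # needed for BLEU/ROUGE/Gen Len during eval
)
```

Under this reading, the trained adapter is what `adapter_model.safetensors` above stores (about 5 MB), and `training_args.bin` is the Trainer's serialized `Seq2SeqTrainingArguments`.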