atlasia
/

Terjman-Ultra

BounharAbdelaziz

BounharAbdelaziz/Terjman-Ultra

f9cf741 verified 6 months ago

	---
	license: cc-by-nc-4.0
	base_model: facebook/nllb-200-1.3B
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: Terjman-Ultra
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Terjman-Ultra

	This model is a fine-tuned version of [facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B). The training dataset is not documented in this card.
	At the end of training (step 56050), it achieves the following results on the evaluation set:
	- Loss: 2.7070
	- Bleu: 4.6998
	- Gen Len: 35.6088

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.03
	- num_epochs: 25
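
The derived quantities behind these settings can be sketched in a few lines of plain Python. This is a rough illustration, not the training script: it assumes a single device, assumes 8969 micro-batches per epoch (a figure inferred from the results table, where step 8969 falls at epoch 4.0, and not stated anywhere in this card), and assumes the usual linear warmup followed by linear decay to zero. The function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03
NUM_EPOCHS = 25

# Assumptions (not stated in the card):
NUM_DEVICES = 1                 # single-device training
MICRO_BATCHES_PER_EPOCH = 8969  # inferred from the results table


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def total_optimizer_steps() -> int:
    """Whole optimizer steps per epoch, times the number of epochs."""
    return (MICRO_BATCHES_PER_EPOCH // GRAD_ACCUM_STEPS) * NUM_EPOCHS


def warmup_steps(total_steps: int) -> int:
    """Number of warmup steps implied by the warmup ratio (rounded up)."""
    return math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int, total_steps: int, warmup: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < warmup:
        return LEARNING_RATE * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup)


if __name__ == "__main__":
    total = total_optimizer_steps()
    warmup = warmup_steps(total)
    print("effective batch size:", effective_batch_size())
    print("total optimizer steps:", total)
    print("warmup steps:", warmup)
    print("lr halfway through warmup:", linear_lr(warmup // 2, total, warmup))
```

Under these assumptions the total step count agrees with the last step reported in the training results (56050), which is the main reason for choosing 8969 as the inferred micro-batch count.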

	### Training results

	| Training Loss | Epoch   | Step  | Validation Loss | Bleu   | Gen Len |
	|:-------------:|:-------:|:-----:|:---------------:|:------:|:-------:|
	| 3.203         | 0.9999  | 2242  | 2.9015          | 4.3057 | 36.7548 |
	| 2.9175        | 1.9998  | 4484  | 2.7602          | 4.4286 | 35.708  |
	| 2.8558        | 2.9997  | 6726  | 2.7303          | 4.629  | 35.562  |
	| 2.8696        | 4.0     | 8969  | 2.7195          | 4.6537 | 35.562  |
	| 2.8604        | 4.9999  | 11211 | 2.7144          | 4.6905 | 35.5702 |
	| 2.8509        | 5.9998  | 13453 | 2.7112          | 4.599  | 35.5427 |
	| 2.853         | 6.9997  | 15695 | 2.7098          | 4.6625 | 35.5317 |
	| 2.8475        | 8.0     | 17938 | 2.7081          | 4.6901 | 35.6419 |
	| 2.8192        | 8.9999  | 20180 | 2.7082          | 4.5474 | 35.6391 |
	| 2.8395        | 9.9998  | 22422 | 2.7077          | 4.722  | 35.6088 |
	| 2.8395        | 10.9997 | 24664 | 2.7076          | 4.752  | 35.5868 |
	| 2.8362        | 12.0    | 26907 | 2.7074          | 4.6664 | 35.562  |
	| 2.8673        | 12.9999 | 29149 | 2.7072          | 4.7004 | 35.6639 |
	| 2.8465        | 13.9998 | 31391 | 2.7076          | 4.6715 | 35.5923 |
	| 2.8281        | 14.9997 | 33633 | 2.7075          | 4.7045 | 35.5647 |
	| 2.8191        | 16.0    | 35876 | 2.7068          | 4.7487 | 35.6253 |
	| 2.874         | 16.9999 | 38118 | 2.7076          | 4.71   | 35.6006 |
	| 2.8666        | 17.9998 | 40360 | 2.7069          | 4.6047 | 35.6281 |
	| 2.8645        | 18.9997 | 42602 | 2.7063          | 4.6664 | 35.6088 |
	| 2.8458        | 20.0    | 44845 | 2.7070          | 4.6552 | 35.5813 |
	| 2.8501        | 20.9999 | 47087 | 2.7074          | 4.6919 | 35.5647 |
	| 2.8309        | 21.9998 | 49329 | 2.7074          | 4.623  | 35.6226 |
	| 2.854         | 22.9997 | 51571 | 2.7072          | 4.6495 | 35.5978 |
	| 2.8407        | 24.0    | 53814 | 2.7070          | 4.6879 | 35.5482 |
	| 2.8129        | 24.9972 | 56050 | 2.7070          | 4.6998 | 35.6088 |
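
The reported evaluation numbers come from the last row of this table, which is not the row with the best BLEU or the lowest validation loss. A small sketch for locating those rows is shown below; the tuples are copied from the table, while the variable and function names are illustrative only.

```python
# (epoch, step, validation_loss, bleu) copied from the training results table.
ROWS = [
    (0.9999, 2242, 2.9015, 4.3057),
    (1.9998, 4484, 2.7602, 4.4286),
    (2.9997, 6726, 2.7303, 4.629),
    (4.0, 8969, 2.7195, 4.6537),
    (4.9999, 11211, 2.7144, 4.6905),
    (5.9998, 13453, 2.7112, 4.599),
    (6.9997, 15695, 2.7098, 4.6625),
    (8.0, 17938, 2.7081, 4.6901),
    (8.9999, 20180, 2.7082, 4.5474),
    (9.9998, 22422, 2.7077, 4.722),
    (10.9997, 24664, 2.7076, 4.752),
    (12.0, 26907, 2.7074, 4.6664),
    (12.9999, 29149, 2.7072, 4.7004),
    (13.9998, 31391, 2.7076, 4.6715),
    (14.9997, 33633, 2.7075, 4.7045),
    (16.0, 35876, 2.7068, 4.7487),
    (16.9999, 38118, 2.7076, 4.71),
    (17.9998, 40360, 2.7069, 4.6047),
    (18.9997, 42602, 2.7063, 4.6664),
    (20.0, 44845, 2.7070, 4.6552),
    (20.9999, 47087, 2.7074, 4.6919),
    (21.9998, 49329, 2.7074, 4.623),
    (22.9997, 51571, 2.7072, 4.6495),
    (24.0, 53814, 2.7070, 4.6879),
    (24.9972, 56050, 2.7070, 4.6998),
]


def best_bleu_row(rows):
    """Row with the highest BLEU score."""
    return max(rows, key=lambda row: row[3])


def lowest_loss_row(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("best BLEU:", best_bleu_row(ROWS))
    print("lowest validation loss:", lowest_loss_row(ROWS))
    print("final row:", ROWS[-1])
```

Both metrics move very little after roughly epoch 5, so the differences between these rows are small; whether they are meaningful cannot be judged from this card alone.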


	### Framework versions

	- Transformers 4.40.2
	- Pytorch 2.2.1+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1
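
To compare a local environment against the versions listed above, something like the following may help. It is a minimal sketch using only the standard library; the mapping from the names in the list to pip distribution names (for example `torch` for Pytorch) and the helper names are assumptions made for illustration.

```python
from importlib import metadata

# Versions copied from the list above, keyed by pip distribution name.
PINNED = {
    "transformers": "4.40.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(PINNED).items():
        print(f"{package}: card lists {wanted}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded here.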