Rzoro/checkpoints_3_14

Tags: Multiple Choice, Generated from Trainer


	---
	license: mit
	base_model: microsoft/deberta-v3-large
	tags:
	- generated_from_trainer
	model-index:
	- name: checkpoints_3_14
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# checkpoints_3_14

	This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unspecified dataset (no dataset name was recorded by the Trainer).
	It achieves the following results on the evaluation set:
	- Loss: 0.9950
	- Map@3: 0.7360
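
The card does not define Map@3. A minimal sketch of the usual definition follows, assuming each question has exactly one correct choice and the model submits up to three ranked guesses; the function names `map_at_3` and `top3_from_logits` are illustrative and not taken from the training script.

```python
from typing import List, Sequence


def top3_from_logits(logits: Sequence[float]) -> List[int]:
    """Return the indices of the three highest-scoring choices, best first."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:3]


def map_at_3(ranked_predictions: Sequence[Sequence[int]], labels: Sequence[int]) -> float:
    """Mean average precision at 3 for questions with a single correct answer.

    A question scores 1, 1/2 or 1/3 when the correct choice is ranked
    first, second or third, and 0 otherwise.
    """
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    total = 0.0
    for ranked, label in zip(ranked_predictions, labels):
        for rank, choice in enumerate(ranked[:3], start=1):
            if choice == label:
                total += 1.0 / rank
                break
    return total / len(labels)


if __name__ == "__main__":
    # Hypothetical logits for two five-choice questions (not real model output).
    logits = [[0.1, 2.0, 0.3, -1.0, 0.5], [1.5, 0.2, 0.9, 0.0, -0.3]]
    ranked = [top3_from_logits(row) for row in logits]
    print(map_at_3(ranked, labels=[1, 2]))
```

Under this definition a score of 0.7360 means the correct choice is, on average, ranked close to the top of the three guesses.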

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 0
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 1
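
To make the interaction of these settings concrete, the sketch below derives the total batch size and evaluates a cosine schedule with linear warmup in plain Python. It mirrors the common half-cosine formulation rather than calling the Trainer itself. `NUM_DEVICES = 1` and `TOTAL_STEPS = 4700` are assumptions: the first is merely consistent with a total batch size of 8, and the second is a round figure for illustration, since the card does not state the exact number of optimizer steps.

```python
import math

LEARNING_RATE = 2e-5
WARMUP_RATIO = 0.1
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 8
NUM_DEVICES = 1     # assumption: not stated in the card
TOTAL_STEPS = 4700  # assumption: illustrative round number


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation * devices


def cosine_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("total batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
    for step in (0, 235, 470, 2350, 4700):
        print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```

With a warmup ratio of 0.1, the learning rate climbs to 2e-05 over roughly the first tenth of the run and then decays smoothly towards zero by the final step.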

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Map@3  |
	|:-------------:|:-----:|:----:|:---------------:|:------:|
	| 1.6075        | 0.04  | 200  | 1.6074          | 0.6027 |
	| 1.5444        | 0.08  | 400  | 1.3248          | 0.6428 |
	| 1.4506        | 0.13  | 600  | 1.2670          | 0.6707 |
	| 1.3635        | 0.17  | 800  | 1.1671          | 0.6850 |
	| 1.3478        | 0.21  | 1000 | 1.0909          | 0.7003 |
	| 1.3021        | 0.25  | 1200 | 1.0701          | 0.6923 |
	| 1.3284        | 0.29  | 1400 | 1.0627          | 0.7085 |
	| 1.2869        | 0.34  | 1600 | 1.0645          | 0.7003 |
	| 1.2735        | 0.38  | 1800 | 1.1617          | 0.7043 |
	| 1.3019        | 0.42  | 2000 | 1.0272          | 0.7120 |
	| 1.2824        | 0.46  | 2200 | 1.0781          | 0.7123 |
	| 1.2882        | 0.51  | 2400 | 1.0454          | 0.7178 |
	| 1.2699        | 0.55  | 2600 | 1.0439          | 0.7225 |
	| 1.2165        | 0.59  | 2800 | 1.0208          | 0.7260 |
	| 1.2419        | 0.63  | 3000 | 1.0166          | 0.7292 |
	| 1.2395        | 0.67  | 3200 | 1.0065          | 0.7310 |
	| 1.2368        | 0.72  | 3400 | 1.0429          | 0.7275 |
	| 1.2232        | 0.76  | 3600 | 1.0105          | 0.7353 |
	| 1.1969        | 0.8   | 3800 | 1.0017          | 0.7370 |
	| 1.2451        | 0.84  | 4000 | 0.9982          | 0.7383 |
	| 1.2088        | 0.88  | 4200 | 0.9977          | 0.7372 |
	| 1.2229        | 0.93  | 4400 | 0.9953          | 0.7367 |
	| 1.2612        | 0.97  | 4600 | 0.9950          | 0.7360 |
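
The headline results match the last evaluation (step 4600), which has the lowest validation loss, while the highest Map@3 in the table occurs earlier. A small sketch for locating both rows is shown here; the tuples are copied from the table above, and the names `EVAL_LOG`, `best_by_map` and `best_by_loss` are illustrative.

```python
# (step, validation_loss, map_at_3) rows copied from the training results table.
EVAL_LOG = [
    (200, 1.6074, 0.6027), (400, 1.3248, 0.6428), (600, 1.2670, 0.6707),
    (800, 1.1671, 0.6850), (1000, 1.0909, 0.7003), (1200, 1.0701, 0.6923),
    (1400, 1.0627, 0.7085), (1600, 1.0645, 0.7003), (1800, 1.1617, 0.7043),
    (2000, 1.0272, 0.7120), (2200, 1.0781, 0.7123), (2400, 1.0454, 0.7178),
    (2600, 1.0439, 0.7225), (2800, 1.0208, 0.7260), (3000, 1.0166, 0.7292),
    (3200, 1.0065, 0.7310), (3400, 1.0429, 0.7275), (3600, 1.0105, 0.7353),
    (3800, 1.0017, 0.7370), (4000, 0.9982, 0.7383), (4200, 0.9977, 0.7372),
    (4400, 0.9953, 0.7367), (4600, 0.9950, 0.7360),
]

# Row with the highest Map@3 and row with the lowest validation loss.
best_by_map = max(EVAL_LOG, key=lambda row: row[2])
best_by_loss = min(EVAL_LOG, key=lambda row: row[1])

if __name__ == "__main__":
    print("best Map@3:", best_by_map)
    print("lowest validation loss:", best_by_loss)
```

The two criteria pick different steps (4000 and 4600), although the Map@3 gap between them is only 0.0023.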


	### Framework versions

	- Transformers 4.33.2
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.5
	- Tokenizers 0.13.3