kasohrab/electra-distilled-qa

Question Answering

Generated from Trainer


Last commit: 4c9a448 ("End of training"), about 1 year ago


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	metrics:
	- f1
	model-index:
	- name: electra-distilled-qa
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# electra-distilled-qa

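	This model is a fine-tuned version of [google/electra-small-discriminator](https://huggingface.co/google/electra-small-discriminator). The Trainer did not record the dataset name, so the dataset is listed as unknown; the evaluation counts below (11873 questions, of which 5928 are answerable and 5945 are unanswerable) match the SQuAD 2.0 development set, which suggests, but does not confirm, that SQuAD 2.0 was used.
	The model achieves the following results on the evaluation set: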
	- Exact: 68.1799
	- F1: 71.7591
	- Total: 11873
	- HasAns Exact: 70.3441
	- HasAns F1: 77.5129
	- HasAns Total: 5928
	- NoAns Exact: 66.0219
	- NoAns F1: 66.0219
	- NoAns Total: 5945
	- Best Exact: 68.1799
	- Best Exact Thresh: 0.0
	- Best F1: 71.7591
	- Best F1 Thresh: 0.0
	- Loss: not logged (no validation loss was recorded)
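
The overall scores are the question-count-weighted averages of the answerable (HasAns) and unanswerable (NoAns) subsets, because the two subsets partition the evaluation set. The sketch below re-derives the overall Exact and F1 from the subset values listed above as a consistency check. The helper name `weighted_score` is illustrative and not part of any library.

```python
def weighted_score(has_ans_score, has_ans_total, no_ans_score, no_ans_total):
    """Combine per-subset percentages into one overall percentage."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total


# Subset sizes and scores copied from the results list above.
HAS_ANS_TOTAL, NO_ANS_TOTAL = 5928, 5945

overall_exact = weighted_score(70.3441, HAS_ANS_TOTAL, 66.0219, NO_ANS_TOTAL)
overall_f1 = weighted_score(77.5129, HAS_ANS_TOTAL, 66.0219, NO_ANS_TOTAL)

print(f"Exact: {overall_exact:.2f}, F1: {overall_f1:.2f}")
```

Both values agree with the reported overall Exact (68.1799) and F1 (71.7591) up to rounding of the inputs. NoAns Exact and NoAns F1 are identical because an unanswerable question scores either 0 or 1 under both metrics.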

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
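
The card does not document usage. As a minimal sketch, and assuming the repository loads with the standard `transformers` question-answering pipeline (not confirmed by the card), inference might look like the following. The helper name `answer_question` is hypothetical, and passing `handle_impossible_answer=True` is an assumption based on the evaluation set containing unanswerable questions.

```python
def answer_question(question, context, model_id="kasohrab/electra-distilled-qa"):
    """Run extractive QA with the checkpoint and return the pipeline's result dict."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id)
    # handle_impossible_answer=True allows an empty answer when the
    # "no answer" option scores highest (assumed to suit this model).
    return qa(question=question, context=context, handle_impossible_answer=True)


# Example call (downloads the checkpoint on first use):
# result = answer_question(
#     "Which model was fine-tuned?",
#     "electra-distilled-qa is a fine-tuned version of google/electra-small-discriminator.",
# )
# print(result["answer"], result["score"])
```

The returned dictionary contains `answer`, `score`, `start`, and `end`; an empty `answer` string indicates that the pipeline judged the question unanswerable from the given context.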

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 4.244429373516175e-05
	- train_batch_size: 128
	- eval_batch_size: 128
	- seed: 33
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 12
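
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step, assuming linear decay from the base rate to zero with no warmup. The card lists no warmup setting, so `warmup_steps=0` is an assumption. The total of 12360 steps comes from the training results table (1030 steps per epoch for 12 epochs). The function name `linear_lr` is illustrative.

```python
BASE_LR = 4.244429373516175e-05  # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 1030           # from the Step column of the training results
TOTAL_STEPS = 12 * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 6, 12):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is half the base value at the end of epoch 6 and reaches zero at the final step.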

	### Training results

	| Training Loss | Epoch | Step | Total | Exact | F1 | NoAns Total | Best Exact Thresh | Best F1 Thresh | Validation Loss |
	|:-------------:|:-----:|:-----:|:-----:|:-------:|:-------:|:-----------:|:-----------------:|:--------------:|:---------------:|
	| 1.9086 | 1.0 | 1030 | 11873 | 57.9719 | 62.0421 | 5945 | 0.0 | 0.0 | No log |
	| 1.2919 | 2.0 | 2060 | 11873 | 66.8155 | 70.0115 | 5945 | 0.0 | 0.0 | No log |
	| 1.1194 | 3.0 | 3090 | 11873 | 66.8070 | 70.1755 | 5945 | 0.0 | 0.0 | No log |
	| 1.0051 | 4.0 | 4120 | 11873 | 68.9632 | 72.4292 | 5945 | 0.0 | 0.0 | No log |
	| 0.9191 | 5.0 | 5150 | 11873 | 67.9609 | 71.3639 | 5945 | 0.0 | 0.0 | No log |
	| 0.8562 | 6.0 | 6180 | 11873 | 69.5949 | 72.9986 | 5945 | 0.0 | 0.0 | No log |
	| 0.8017 | 7.0 | 7210 | 11873 | 68.6095 | 72.2303 | 5945 | 0.0 | 0.0 | No log |
	| 0.7554 | 8.0 | 8240 | 11873 | 67.4556 | 71.0028 | 5945 | 0.0 | 0.0 | No log |
	| 0.7196 | 9.0 | 9270 | 11873 | 68.0788 | 71.6887 | 5945 | 0.0 | 0.0 | No log |
	| 0.6914 | 10.0 | 10300 | 11873 | 68.6431 | 72.1849 | 5945 | 0.0 | 0.0 | No log |
	| 0.6687 | 11.0 | 11330 | 11873 | 68.2473 | 71.7832 | 5945 | 0.0 | 0.0 | No log |
	| 0.6517 | 12.0 | 12360 | 11873 | 68.1799 | 71.7591 | 5945 | 0.0 | 0.0 | No log |
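
The headline metrics at the top of the card match the final row (epoch 12), while the per-epoch scores peak earlier. The sketch below, with values copied from the table, finds the epoch with the highest F1 and compares it with the final epoch. Whether intermediate checkpoints were saved is not stated in the card, so this is only a reading of the logged numbers.

```python
# (epoch, exact, f1) copied from the training results table.
RESULTS = [
    (1, 57.9719, 62.0421),
    (2, 66.8155, 70.0115),
    (3, 66.8070, 70.1755),
    (4, 68.9632, 72.4292),
    (5, 67.9609, 71.3639),
    (6, 69.5949, 72.9986),
    (7, 68.6095, 72.2303),
    (8, 67.4556, 71.0028),
    (9, 68.0788, 71.6887),
    (10, 68.6431, 72.1849),
    (11, 68.2473, 71.7832),
    (12, 68.1799, 71.7591),
]

# Pick the row with the highest F1 and compare it with the last epoch.
best_epoch, best_exact, best_f1 = max(RESULTS, key=lambda row: row[2])
final_epoch, final_exact, final_f1 = RESULTS[-1]

print(f"best F1 {best_f1} at epoch {best_epoch}; final F1 {final_f1} at epoch {final_epoch}")
print(f"F1 gap: {best_f1 - final_f1:.4f}")
```

Epoch 6 has both the highest F1 (72.9986) and the highest Exact (69.5949). Training loss keeps falling after that point while the evaluation scores do not improve, a pattern that is consistent with mild overfitting in the later epochs.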


	### Framework versions

	- Transformers 4.28.1
	- PyTorch 2.2.1+cu121
	- Datasets 2.19.0
	- Tokenizers 0.13.3
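
When reproducing the results, it can help to compare the local environment with the versions listed above. The sketch below uses only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and `compare_versions` is an illustrative helper rather than part of any library.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by the assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.28.1",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (wanted, installed_or_None, exact_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

The comparison is a strict string match, so a different CUDA build suffix on `torch` is reported as a difference even when the base version is the same.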