|
--- |
|
library_name: transformers |
|
tags: |
|
- generated_from_trainer |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: poison-distill |
|
results: [] |
|
--- |
|
|
|
|
|
|
# poison-distill |
|
|
|
This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: -113.4181 |
|
- Accuracy: 0.6917 |
|
|
|
## Model description |
|
|
|
More information needed |
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
|
|
|
## Training and evaluation data |
|
|
|
More information needed |
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 5e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
|
- lr_scheduler_type: linear |
|
- num_epochs: 50 |
|
- mixed_precision_training: Native AMP |
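
For reference, the hyperparameters above can be collected into a plain configuration mapping like the one below. This is a minimal sketch, not the actual `TrainingArguments` object used during training; the field names loosely follow `transformers.TrainingArguments` and are illustrative.

```python
# Illustrative training configuration mirroring the hyperparameters listed
# above. A plain dict is used here so the values are easy to inspect; the
# real run was configured through transformers.TrainingArguments.
training_config = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
    "fp16": True,  # "Native AMP" mixed-precision training
}
```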
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Accuracy | |
|
|:-------------:|:-----:|:----:|:---------------:|:--------:| |
|
| -0.7533 | 1.0 | 130 | -7.9549 | 0.5263 | |
|
| -8.2193 | 2.0 | 260 | -15.5418 | 0.4662 | |
|
| -14.3197 | 3.0 | 390 | -32.2167 | 0.4737 | |
|
| -18.5547 | 4.0 | 520 | -18.9202 | 0.5489 | |
|
| -22.6905 | 5.0 | 650 | -55.1682 | 0.4361 | |
|
| -27.5336 | 6.0 | 780 | -32.4679 | 0.3459 | |
|
| -29.5975 | 7.0 | 910 | -48.1715 | 0.3985 | |
|
| -34.1837 | 8.0 | 1040 | -67.7293 | 0.6165 | |
|
| -37.6123 | 9.0 | 1170 | -52.1341 | 0.4662 | |
|
| -40.7694 | 10.0 | 1300 | -49.0945 | 0.6767 | |
|
| -43.3691 | 11.0 | 1430 | -37.0478 | 0.5489 | |
|
| -47.6433 | 12.0 | 1560 | -73.0523 | 0.4511 | |
|
| -51.0141 | 13.0 | 1690 | -110.8840 | 0.4812 | |
|
| -54.6 | 14.0 | 1820 | -81.2219 | 0.3308 | |
|
| -57.2133 | 15.0 | 1950 | -80.8684 | 0.5113 | |
|
| -58.3442 | 16.0 | 2080 | -66.5341 | 0.4060 | |
|
| -64.7089 | 17.0 | 2210 | -75.7059 | 0.5564 | |
|
| -64.26 | 18.0 | 2340 | -77.7801 | 0.5263 | |
|
| -67.8509 | 19.0 | 2470 | -61.1841 | 0.6316 | |
|
| -71.9371 | 20.0 | 2600 | -118.1544 | 0.5038 | |
|
| -75.9672 | 21.0 | 2730 | -179.2044 | 0.4812 | |
|
| -78.0096 | 22.0 | 2860 | -129.4854 | 0.4436 | |
|
| -80.3581 | 23.0 | 2990 | -100.0687 | 0.4286 | |
|
| -84.623 | 24.0 | 3120 | -82.5292 | 0.3835 | |
|
| -86.5363 | 25.0 | 3250 | -84.6636 | 0.4211 | |
|
| -90.8566 | 26.0 | 3380 | -96.3337 | 0.5489 | |
|
| -92.2054 | 27.0 | 3510 | -110.3293 | 0.4737 | |
|
| -97.6982 | 28.0 | 3640 | -195.6973 | 0.4135 | |
|
| -95.8944 | 29.0 | 3770 | -101.9933 | 0.3609 | |
|
| -99.491 | 30.0 | 3900 | -99.8199 | 0.6541 | |
|
| -103.0877 | 31.0 | 4030 | -94.2175 | 0.6767 | |
|
| -102.7123 | 32.0 | 4160 | -98.6300 | 0.4887 | |
|
| -105.2087 | 33.0 | 4290 | -152.7768 | 0.4962 | |
|
| -105.3795 | 34.0 | 4420 | -198.8245 | 0.5263 | |
|
| -108.9734 | 35.0 | 4550 | -105.7644 | 0.4286 | |
|
| -111.1308 | 36.0 | 4680 | -121.4677 | 0.4962 | |
|
| -115.0085 | 37.0 | 4810 | -75.3733 | 0.3083 | |
|
| -114.714 | 38.0 | 4940 | -115.4598 | 0.6617 | |
|
| -117.5734 | 39.0 | 5070 | -108.3964 | 0.4135 | |
|
| -115.1971 | 40.0 | 5200 | -123.7679 | 0.3835 | |
|
| -117.5617 | 41.0 | 5330 | -69.2224 | 0.2932 | |
|
| -118.2803 | 42.0 | 5460 | -104.5906 | 0.6541 | |
|
| -119.6297 | 43.0 | 5590 | -187.3416 | 0.5188 | |
|
| -121.6325 | 44.0 | 5720 | -221.8878 | 0.5113 | |
|
| -120.9663 | 45.0 | 5850 | -176.6644 | 0.3759 | |
|
| -122.3583 | 46.0 | 5980 | -142.5218 | 0.4361 | |
|
| -126.6614 | 47.0 | 6110 | -271.1018 | 0.4962 | |
|
| -122.1615 | 48.0 | 6240 | -240.8323 | 0.3985 | |
|
| -125.4207 | 49.0 | 6370 | -103.5760 | 0.6466 | |
|
| -127.0661 | 50.0 | 6500 | -113.2718 | 0.6842 | |
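
The validation accuracy in the table above is noisy across epochs, so it can be useful to scan the rows programmatically for the best per-epoch checkpoint. The snippet below is a small illustration; the `(epoch, accuracy)` pairs are copied directly from the table.

```python
# (epoch, validation accuracy) pairs copied from the training-results table.
eval_accuracy = [
    (1, 0.5263), (2, 0.4662), (3, 0.4737), (4, 0.5489), (5, 0.4361),
    (6, 0.3459), (7, 0.3985), (8, 0.6165), (9, 0.4662), (10, 0.6767),
    (11, 0.5489), (12, 0.4511), (13, 0.4812), (14, 0.3308), (15, 0.5113),
    (16, 0.4060), (17, 0.5564), (18, 0.5263), (19, 0.6316), (20, 0.5038),
    (21, 0.4812), (22, 0.4436), (23, 0.4286), (24, 0.3835), (25, 0.4211),
    (26, 0.5489), (27, 0.4737), (28, 0.4135), (29, 0.3609), (30, 0.6541),
    (31, 0.6767), (32, 0.4887), (33, 0.4962), (34, 0.5263), (35, 0.4286),
    (36, 0.4962), (37, 0.3083), (38, 0.6617), (39, 0.4135), (40, 0.3835),
    (41, 0.2932), (42, 0.6541), (43, 0.5188), (44, 0.5113), (45, 0.3759),
    (46, 0.4361), (47, 0.4962), (48, 0.3985), (49, 0.6466), (50, 0.6842),
]

# Best per-epoch checkpoint by validation accuracy.
best_epoch, best_acc = max(eval_accuracy, key=lambda pair: pair[1])
print(best_epoch, best_acc)  # epoch 50, accuracy 0.6842
```

Note that the headline accuracy of 0.6917 at the top of this card comes from a final evaluation run and does not match any single per-epoch row here.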
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.46.3 |
|
- Pytorch 2.5.1+cu121 |
|
- Datasets 3.1.0 |
|
- Tokenizers 0.20.3 |
|
|