	---
	license: mit
	base_model: emilyalsentzer/Bio_ClinicalBERT
	tags:
	- generated_from_trainer
	model-index:
	- name: clinical_bert
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# clinical_bert

	This model is a fine-tuned version of [emilyalsentzer/Bio_ClinicalBERT](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.6020
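
A raw loss value can be hard to interpret on its own. The sketch below converts it to perplexity under the assumption that the reported loss is a mean token-level cross-entropy in nats; the card does not state the training objective, so treat this as an illustration rather than a confirmed metric.

```python
import math

# Evaluation loss reported on this card.
eval_loss = 1.6020

# Assumption: the loss is a mean cross-entropy measured in nats,
# in which case perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")  # perplexity = 4.96
```

If the objective was something other than cross-entropy, this conversion does not apply.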

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- lr_scheduler_warmup_steps: 100
	- training_steps: 5000
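
The list above gives both a warmup ratio and a warmup step count. In Transformers, a nonzero `warmup_steps` takes precedence over `warmup_ratio`, so the effective warmup here is most likely 100 steps rather than 10% of 5000. The sketch below reproduces a linear warmup followed by linear decay with these values in plain Python; the function name `linear_schedule_lr` is hypothetical and this is not the Trainer's own code.

```python
def linear_schedule_lr(step, base_lr=5e-4, warmup_steps=100, total_steps=5000):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        scale = step / max(1, warmup_steps)
    else:
        # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale


for step in (0, 50, 100, 2550, 5000):
    print(step, linear_schedule_lr(step))
```

With these settings the learning rate peaks at 0.0005 at step 100, is back to half that value at step 2550, and reaches zero at step 5000.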

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| No log | 0.78 | 100 | 1.9485 |
	| No log | 1.56 | 200 | 1.8681 |
	| No log | 2.34 | 300 | 1.8152 |
	| No log | 3.12 | 400 | 1.7886 |
	| 1.9285 | 3.91 | 500 | 1.7309 |
	| 1.9285 | 4.69 | 600 | 1.6810 |
	| 1.9285 | 5.47 | 700 | 1.7065 |
	| 1.9285 | 6.25 | 800 | 1.7067 |
	| 1.9285 | 7.03 | 900 | 1.7312 |
	| 1.6644 | 7.81 | 1000 | 1.7006 |
	| 1.6644 | 8.59 | 1100 | 1.6736 |
	| 1.6644 | 9.38 | 1200 | 1.6846 |
	| 1.6644 | 10.16 | 1300 | 1.6621 |
	| 1.6644 | 10.94 | 1400 | 1.6381 |
	| 1.5247 | 11.72 | 1500 | 1.6281 |
	| 1.5247 | 12.5 | 1600 | 1.6605 |
	| 1.5247 | 13.28 | 1700 | 1.6770 |
	| 1.5247 | 14.06 | 1800 | 1.6666 |
	| 1.5247 | 14.84 | 1900 | 1.6620 |
	| 1.4334 | 15.62 | 2000 | 1.6677 |
	| 1.4334 | 16.41 | 2100 | 1.6311 |
	| 1.4334 | 17.19 | 2200 | 1.6743 |
	| 1.4334 | 17.97 | 2300 | 1.6586 |
	| 1.4334 | 18.75 | 2400 | 1.6086 |
	| 1.3423 | 19.53 | 2500 | 1.6229 |
	| 1.3423 | 20.31 | 2600 | 1.6475 |
	| 1.3423 | 21.09 | 2700 | 1.6388 |
	| 1.3423 | 21.88 | 2800 | 1.6275 |
	| 1.3423 | 22.66 | 2900 | 1.6372 |
	| 1.2712 | 23.44 | 3000 | 1.6345 |
	| 1.2712 | 24.22 | 3100 | 1.6442 |
	| 1.2712 | 25.0 | 3200 | 1.6864 |
	| 1.2712 | 25.78 | 3300 | 1.6139 |
	| 1.2712 | 26.56 | 3400 | 1.6161 |
	| 1.215 | 27.34 | 3500 | 1.6491 |
	| 1.215 | 28.12 | 3600 | 1.6442 |
	| 1.215 | 28.91 | 3700 | 1.6409 |
	| 1.215 | 29.69 | 3800 | 1.6539 |
	| 1.215 | 30.47 | 3900 | 1.6052 |
	| 1.1652 | 31.25 | 4000 | 1.6459 |
	| 1.1652 | 32.03 | 4100 | 1.6362 |
	| 1.1652 | 32.81 | 4200 | 1.6413 |
	| 1.1652 | 33.59 | 4300 | 1.6377 |
	| 1.1652 | 34.38 | 4400 | 1.6344 |
	| 1.1213 | 35.16 | 4500 | 1.6406 |
	| 1.1213 | 35.94 | 4600 | 1.6113 |
	| 1.1213 | 36.72 | 4700 | 1.6410 |
	| 1.1213 | 37.5 | 4800 | 1.6378 |
	| 1.1213 | 38.28 | 4900 | 1.6341 |
	| 1.0939 | 39.06 | 5000 | 1.6020 |
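
The table above carries more information than the final loss. As a rough sketch, the arithmetic below estimates the number of optimizer steps per epoch and the approximate training set size from the Epoch and Step columns, then computes the gap between training and validation loss at the last row. It assumes a single device, no gradient accumulation, and a partial final batch that is kept; none of these are stated on the card.

```python
# Values copied from the last row of the training results table.
last_epoch, last_step = 39.06, 5000
train_loss_final, val_loss_final = 1.0939, 1.6020
train_batch_size = 64

# Epoch values are rounded to two decimals, so the last row gives the tightest estimate.
steps_per_epoch = round(last_step / last_epoch)

# Assumption: one device and no gradient accumulation, so each step consumes one batch.
# The last batch of an epoch may be partial, hence a range instead of a single number.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# A positive gap means the model fits the training data better than held-out data.
generalization_gap = val_loss_final - train_loss_final

print(f"steps per epoch: {steps_per_epoch}")
print(f"training examples: between {min_examples} and {max_examples}")
print(f"validation minus training loss at step {last_step}: {generalization_gap:.4f}")
```

Under these assumptions the run covers about 128 steps per epoch, which points to a training set of roughly 8,100 to 8,200 examples. Training loss keeps falling to 1.0939 while validation loss stays between about 1.60 and 1.69 from step 1400 onward, a pattern that is consistent with overfitting on a small dataset.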


	### Framework versions

	- Transformers 4.33.2
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.5
	- Tokenizers 0.13.3
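
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` and the `EXPECTED` mapping are hypothetical, and the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names rather than something the card specifies.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card. The "+cu118" suffix is a local build tag
# for the CUDA 11.8 build of PyTorch.
EXPECTED = {
    "transformers": "4.33.2",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def find_mismatches(expected, lookup=version):
    """Return {package: (expected, installed_or_None)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

An exact match is not always required to load the model, but differences in library versions are a common source of small numerical discrepancies.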
