|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: question-answering |
|
--- |
|
[bioformer-8L](https://huggingface.co/bioformers/bioformer-8L) fine-tuned on the [SQuAD1](https://rajpurkar.github.io/SQuAD-explorer) dataset for 3 epochs.
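The model can be used with the standard `question-answering` pipeline from Transformers. A minimal usage sketch is shown below; the repository id and the example question/context are assumptions for illustration.

```python
# Minimal usage sketch with the Transformers question-answering pipeline.
# The model id below is assumed to be this repository's id; adjust it if it differs.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="bioformers/bioformer-8L-squad1",  # assumed repository id
)

result = qa(
    question="Which gene is mutated in cystic fibrosis?",
    context="Cystic fibrosis is caused by mutations in the CFTR gene.",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'CFTR'}
```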
|
|
|
The fine-tuning process was performed on a single P100 GPU (16 GB). The hyperparameters are:
|
|
|
```
max_seq_length=512
per_device_train_batch_size=16
gradient_accumulation_steps=1
total train batch size (w. parallel, distributed & accumulation) = 16
learning_rate=3e-5
num_train_epochs=3
```
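As a hedged sketch, these hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows; the output directory name is an assumption, and `max_seq_length` is applied during tokenization rather than through `TrainingArguments`.

```python
# Sketch of the training configuration above expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bioformer-8L-squad1",   # hypothetical output path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,      # total train batch size = 16
    learning_rate=3e-5,
    num_train_epochs=3,
)
# max_seq_length=512 is enforced in the SQuAD preprocessing/tokenization step.
```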
|
|
|
## Evaluation results |
|
|
|
```
"eval_exact_match": 78.55250709555345
"eval_f1": 85.91482799690257
```
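These are the standard SQuAD exact-match and F1 scores. As a sketch, they can be recomputed with the official SQuAD metric from the `evaluate` library; the prediction and reference entries below are placeholders, not data from this evaluation.

```python
# Sketch: recomputing exact match and F1 with the official SQuAD metric.
import evaluate

squad_metric = evaluate.load("squad")

# Placeholder prediction/reference pair in the format the metric expects.
predictions = [{"id": "example-1", "prediction_text": "CFTR"}]
references = [{
    "id": "example-1",
    "answers": {"text": ["CFTR"], "answer_start": [45]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```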
|
|
|
Bioformer's performance is on par with [DistilBERT](https://arxiv.org/pdf/1910.01108.pdf) (EM/F1: 77.7/85.8), |
|
although Bioformer was pretrained only on biomedical texts. |
|
|
|
|
|
## Speed |
|
In our experiments, Bioformer's inference is 3x as fast as BERT-base/BioBERT/PubMedBERT and 40% faster than DistilBERT.
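For illustration only (not the authors' benchmark setup), one way to compare average per-question latency between two models is a simple timing loop; the model ids and inputs below are assumptions.

```python
# Illustrative latency comparison between question-answering pipelines.
import time
from transformers import pipeline

def mean_latency(model_id: str, question: str, context: str, n_runs: int = 50) -> float:
    """Return the average per-question latency in seconds for a given model."""
    qa = pipeline("question-answering", model=model_id)
    qa(question=question, context=context)  # warm-up run
    start = time.perf_counter()
    for _ in range(n_runs):
        qa(question=question, context=context)
    return (time.perf_counter() - start) / n_runs

question = "Which gene is mutated in cystic fibrosis?"
context = "Cystic fibrosis is caused by mutations in the CFTR gene."
# Assumed repository id, used only to illustrate the measurement.
print(mean_latency("bioformers/bioformer-8L-squad1", question, context))
```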