---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- private
base_model: t5-large
model-index:
- name: ner-news-t5-large
results: []
---
# T5 encoder (T5-large) fine-tuned on a very small dataset for token classification
A simple experimental model: the encoder of T5-large with a token-classification head, fine-tuned for 3 epochs on a very small dataset.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, NerPipeline

# trust_remote_code=True is required because the repository ships a custom
# T5-encoder token-classification head
model = AutoModelForTokenClassification.from_pretrained("imvladikon/t5-english-ner", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("imvladikon/t5-english-ner", trust_remote_code=True)

pipe = NerPipeline(model=model, tokenizer=tokenizer, aggregation_strategy="max")
print(pipe("London is the capital city of England and the United Kingdom"))
"""
[{'entity_group': 'LOCATION',
'score': 0.84536326,
'word': 'London',
'start': 0,
'end': 6},
{'entity_group': 'LOCATION',
'score': 0.8957489,
'word': 'England',
'start': 30,
'end': 37},
{'entity_group': 'LOCATION',
'score': 0.73186326,
'word': 'UnitedKingdom',
'start': 46,
'end': 60}]
"""
```
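The same model can also be loaded through the generic `pipeline` factory. A minimal sketch, assuming a `transformers` version whose `pipeline()` accepts `trust_remote_code`; it is equivalent to the explicit construction above:
```python
from transformers import pipeline

# "token-classification" (alias "ner") dispatches to the same pipeline class;
# trust_remote_code=True is still needed for the custom model code
pipe = pipeline(
    "token-classification",
    model="imvladikon/t5-english-ner",
    aggregation_strategy="max",
    trust_remote_code=True,
)
print(pipe("London is the capital city of England and the United Kingdom"))
```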
## Usage in spaCy
```bash
pip install spacy transformers git+https://github.com/explosion/spacy-huggingface-pipelines -q
```
```python
import spacy
from spacy import displacy
text = "My name is Sarah and I live in London"
nlp = spacy.blank("en")
nlp.add_pipe("hf_token_pipe", config={"model": "imvladikon/t5-english-ner", "kwargs": {"trust_remote_code":True}})
doc = nlp(text)
print(doc.ents)
# (Sarah, London)
```
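Since the pipeline fills `doc.ents`, the already-imported `displacy` can visualize the result. A minimal sketch: `displacy.render` targets notebooks, while `displacy.serve` starts a local web server:
```python
# render the recognized entity spans inline (e.g. in a Jupyter notebook)
displacy.render(doc, style="ent")

# or serve the visualization at http://localhost:5000
# displacy.serve(doc, style="ent")
```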
This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on a private English dataset.
It achieves the following results on the evaluation set (see the metric-computation sketch after this list):
- Loss: 0.1956
- Commercial Item Precision: 0.0
- Commercial Item Recall: 0.0
- Commercial Item F1: 0.0
- Commercial Item Number: 1
- Date Precision: 0.8125
- Date Recall: 0.9286
- Date F1: 0.8667
- Date Number: 14
- Location Precision: 0.7143
- Location Recall: 0.75
- Location F1: 0.7317
- Location Number: 20
- Organization Precision: 0.8588
- Organization Recall: 0.9125
- Organization F1: 0.8848
- Organization Number: 80
- Other Precision: 0.3684
- Other Recall: 0.3333
- Other F1: 0.35
- Other Number: 21
- Person Precision: 0.8182
- Person Recall: 0.9310
- Person F1: 0.8710
- Person Number: 29
- Quantity Precision: 0.8
- Quantity Recall: 0.8571
- Quantity F1: 0.8276
- Quantity Number: 14
- Title Precision: 0.0
- Title Recall: 0.0
- Title F1: 0.0
- Title Number: 7
- Overall Precision: 0.75
- Overall Recall: 0.7903
- Overall F1: 0.7696
- Overall Accuracy: 0.9534
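The per-entity and overall scores above are the kind reported by `seqeval` for token classification. A minimal sketch of how such metrics are computed; the IOB2 label sequences below are illustrative, not taken from the evaluation data:
```python
from seqeval.metrics import classification_report, f1_score, precision_score, recall_score

# illustrative gold and predicted IOB2 label sequences, one list per sentence
y_true = [["B-PERSON", "O", "O", "B-LOCATION", "I-LOCATION"]]
y_pred = [["B-PERSON", "O", "O", "B-LOCATION", "O"]]

print(precision_score(y_true, y_pred))
print(recall_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
print(classification_report(y_true, y_pred))  # per-entity precision/recall/F1/support
```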
## Model description
An encoder-only token-classification model: the encoder of T5-large with a classification head on top, trained for English named-entity recognition over the entity types listed in the evaluation results (commercial item, date, location, organization, other, person, quantity, title).
## Intended uses & limitations
This is an experimental model trained on a very small dataset and is intended for experimentation rather than production use. In particular, the commercial-item and title entity types reach an F1 of 0.0 on the evaluation set and are effectively not learned.
## Training and evaluation data
A small private English dataset that is not publicly released.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
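A minimal sketch of how these hyperparameters map to Hugging Face `TrainingArguments`; the output directory is an assumption, and the dataset/model wiring of the original training script is omitted:
```python
from transformers import TrainingArguments

# hypothetical output directory; the remaining values mirror the list above.
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer.
training_args = TrainingArguments(
    output_dir="ner-news-t5-large",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)
```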
### Training results
| Training Loss | Epoch | Step | Validation Loss | Commercial Item Precision | Commercial Item Recall | Commercial Item F1 | Commercial Item Number | Date Precision | Date Recall | Date F1 | Date Number | Location Precision | Location Recall | Location F1 | Location Number | Organization Precision | Organization Recall | Organization F1 | Organization Number | Other Precision | Other Recall | Other F1 | Other Number | Person Precision | Person Recall | Person F1 | Person Number | Quantity Precision | Quantity Recall | Quantity F1 | Quantity Number | Title Precision | Title Recall | Title F1 | Title Number | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------:|:----------------------:|:------------------:|:----------------------:|:--------------:|:-----------:|:-------:|:-----------:|:------------------:|:---------------:|:-----------:|:---------------:|:----------------------:|:-------------------:|:---------------:|:-------------------:|:---------------:|:------------:|:--------:|:------------:|:----------------:|:-------------:|:---------:|:-------------:|:------------------:|:---------------:|:-----------:|:---------------:|:---------------:|:------------:|:--------:|:------------:|:-----------------:|:--------------:|:----------:|:----------------:|
| 0.8868 | 1.0 | 708 | 0.2725 | 0.0 | 0.0 | 0.0 | 1 | 0.8125 | 0.9286 | 0.8667 | 14 | 0.4167 | 0.75 | 0.5357 | 20 | 0.8272 | 0.8375 | 0.8323 | 80 | 1.0 | 0.0476 | 0.0909 | 21 | 0.8438 | 0.9310 | 0.8852 | 29 | 0.6667 | 0.7143 | 0.6897 | 14 | 0.0 | 0.0 | 0.0 | 7 | 0.7348 | 0.7151 | 0.7248 | 0.9446 |
| 0.2984 | 2.0 | 1416 | 0.2121 | 0.0 | 0.0 | 0.0 | 1 | 0.8667 | 0.9286 | 0.8966 | 14 | 0.5 | 0.8 | 0.6154 | 20 | 0.8375 | 0.8375 | 0.8375 | 80 | 0.3077 | 0.1905 | 0.2353 | 21 | 0.8182 | 0.9310 | 0.8710 | 29 | 0.7333 | 0.7857 | 0.7586 | 14 | 0.0 | 0.0 | 0.0 | 7 | 0.7077 | 0.7419 | 0.7244 | 0.9481 |
| 0.1729 | 3.0 | 2124 | 0.1956 | 0.0 | 0.0 | 0.0 | 1 | 0.8125 | 0.9286 | 0.8667 | 14 | 0.7143 | 0.75 | 0.7317 | 20 | 0.8588 | 0.9125 | 0.8848 | 80 | 0.3684 | 0.3333 | 0.35 | 21 | 0.8182 | 0.9310 | 0.8710 | 29 | 0.8 | 0.8571 | 0.8276 | 14 | 0.0 | 0.0 | 0.0 | 7 | 0.75 | 0.7903 | 0.7696 | 0.9534 |
### Framework versions
- Transformers 4.21.1
- Pytorch 1.12.0+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
## Weights & Biases
[Training logs and reports](https://wandb.ai/imvladikon/huggingface/runs/uyl6ihl1)