	---
	library_name: setfit
	tags:
	- setfit
	- sentence-transformers
	- text-classification
	- generated_from_setfit_trainer
	metrics:
	- metric
	widget:
	- text: Damn, my condolences to you bro
	- text: No Friday Im booked all day
	- text: Im sorry.
	- text: Hiding in the bush
	- text: '*"The conservative party is a cult." Says the group that bans words and follows
	    socialism.??*'
	pipeline_tag: text-classification
	inference: false
	base_model: sentence-transformers/paraphrase-mpnet-base-v2
	model-index:
	- name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Unknown
	      type: unknown
	      split: test
	    metrics:
	    - type: metric
	      value: 0.7340375623557441
	      name: Metric
	---

	# SetFit with sentence-transformers/paraphrase-mpnet-base-v2

	This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model and a ClassifierChain instance as the classification head.

	The model has been trained using an efficient few-shot learning technique that involves:

	1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
	2. Training a classification head with features from the fine-tuned Sentence Transformer.
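
To make step 1 concrete, here is a minimal sketch of how contrastive training pairs can be built from a few labeled texts: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The helper `make_pairs` and the toy labels are hypothetical, and the sketch uses single-label examples for simplicity; SetFit's own sampler is more involved and is not reproduced here.

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    examples = list(zip(texts, labels))
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Texts are taken from the widget examples; the labels are made up for illustration.
texts = ["Im sorry.", "Damn, my condolences to you bro", "No Friday Im booked all day"]
labels = ["sympathy", "sympathy", "scheduling"]

pairs = make_pairs(texts, labels)
for pair in pairs:
    print(pair)
```

A loss such as CosineSimilarityLoss then pushes the cosine similarity of each pair's embeddings toward its target.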

	## Model Details

	### Model Description
	- Model Type: SetFit
	- Sentence Transformer body: [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
	- Classification head: a ClassifierChain instance
	- Maximum Sequence Length: 512 tokens
	<!-- - Number of Classes: Unknown -->
	<!-- - Training Dataset: [Unknown](https://huggingface.co/datasets/unknown) -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Repository: [SetFit on GitHub](https://github.com/huggingface/setfit)
	- Paper: [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
	- Blogpost: [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

	## Evaluation

	### Metrics
	| Label | Metric |
	|:--------|:-------|
	| all | 0.7340 |

	## Uses

	### Direct Use for Inference

	First install the SetFit library:

	```bash
	pip install setfit
	```

	Then you can load this model and run inference.

	```python
	from setfit import SetFitModel

	# Download from the 🤗 Hub
	model = SetFitModel.from_pretrained("CrisisNarratives/setfit-8classes-multi_label")
	# Run inference
	preds = model("Im sorry.")
	```
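
The repository name suggests a multi-label model with 8 classes, in which case a prediction would be a multi-hot vector rather than a single label. The sketch below shows one way to turn such a vector into label names. It rests on assumptions: the label names are placeholders (this card does not list the real ones), the example vector is made up, and `decode_multi_hot` is a hypothetical helper that is not part of SetFit.

```python
# Placeholder names: the model card does not document the actual labels.
LABEL_NAMES = [f"label_{i}" for i in range(8)]


def decode_multi_hot(pred, label_names=LABEL_NAMES):
    """Map a multi-hot prediction (a sequence of 0/1 values) to active label names."""
    values = [int(v) for v in pred]
    if len(values) != len(label_names):
        raise ValueError(
            f"expected {len(label_names)} values, got {len(values)}"
        )
    return [name for name, v in zip(label_names, values) if v == 1]


# Made-up prediction vector, used only to demonstrate the helper.
example_pred = [0, 1, 0, 0, 1, 0, 0, 0]
active = decode_multi_hot(example_pred)
print(active)
```

With the real model, `pred` would be the output of `model("Im sorry.")`, converted to a plain list first if it comes back as a tensor or array.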

	<!--
	### Downstream Use

	List how someone could finetune this model on their own dataset.
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Set Metrics
	| Training set | Min | Median | Max |
	|:-------------|:----|:--------|:-----|
	| Word count | 1 | 25.3789 | 1681 |

	### Training Hyperparameters
	- batch_size: (16, 16)
	- num_epochs: (3, 3)
	- max_steps: -1
	- sampling_strategy: oversampling
	- num_iterations: 40
	- body_learning_rate: (1.752e-05, 1.752e-05)
	- head_learning_rate: 1.752e-05
	- loss: CosineSimilarityLoss
	- distance_metric: cosine_distance
	- margin: 0.25
	- end_to_end: False
	- use_amp: False
	- warmup_proportion: 0.1
	- seed: 30
	- eval_max_steps: -1
	- load_best_model_at_end: False
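
Several values above are tuples. In SetFit, a tuple setting gives one value for the embedding fine-tuning phase and one for the classifier training phase, in that order. The sketch below splits a few of the listed values into per-phase settings. The dictionary and the helper `split_phases` are illustrative only and do not call SetFit.

```python
# A subset of the hyperparameters listed above, copied as plain Python values.
hyperparameters = {
    "batch_size": (16, 16),
    "num_epochs": (3, 3),
    "body_learning_rate": (1.752e-05, 1.752e-05),
    "head_learning_rate": 1.752e-05,
    "num_iterations": 40,
    "seed": 30,
}


def split_phases(params):
    """Split settings into (embedding phase, classifier phase, shared) dicts.

    Tuple values are read as (embedding phase, classifier phase);
    scalar values are kept in the shared dict.
    """
    embedding, classifier, shared = {}, {}, {}
    for name, value in params.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            shared[name] = value
    return embedding, classifier, shared


embedding_phase, classifier_phase, shared = split_phases(hyperparameters)
print("embedding phase: ", embedding_phase)
print("classifier phase:", classifier_phase)
print("shared:          ", shared)
```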

	### Training Results
	| Epoch | Step | Training Loss | Validation Loss |
	|:------:|:----:|:-------------:|:---------------:|
	| 0.0004 | 1 | 0.4024 | - |
	| 0.0185 | 50 | 0.2502 | - |
	| 0.0370 | 100 | 0.2222 | - |
	| 0.0555 | 150 | 0.2279 | - |
	| 0.0739 | 200 | 0.2556 | - |
	| 0.0924 | 250 | 0.2444 | - |
	| 0.1109 | 300 | 0.2441 | - |
	| 0.1294 | 350 | 0.2538 | - |
	| 0.1479 | 400 | 0.2245 | - |
	| 0.1664 | 450 | 0.2111 | - |
	| 0.1848 | 500 | 0.1554 | - |
	| 0.2033 | 550 | 0.1361 | - |
	| 0.2218 | 600 | 0.1712 | - |
	| 0.2403 | 650 | 0.1506 | - |
	| 0.2588 | 700 | 0.1175 | - |
	| 0.2773 | 750 | 0.0695 | - |
	| 0.2957 | 800 | 0.0916 | - |
	| 0.3142 | 850 | 0.0884 | - |
	| 0.3327 | 900 | 0.0412 | - |
	| 0.3512 | 950 | 0.1189 | - |
	| 0.3697 | 1000 | 0.0485 | - |
	| 0.3882 | 1050 | 0.1098 | - |
	| 0.4067 | 1100 | 0.0303 | - |
	| 0.4251 | 1150 | 0.0244 | - |
	| 0.4436 | 1200 | 0.0429 | - |
	| 0.4621 | 1250 | 0.034 | - |
	| 0.4806 | 1300 | 0.0725 | - |
	| 0.4991 | 1350 | 0.0438 | - |
	| 0.5176 | 1400 | 0.0124 | - |
	| 0.5360 | 1450 | 0.1603 | - |
	| 0.5545 | 1500 | 0.1134 | - |
	| 0.5730 | 1550 | 0.098 | - |
	| 0.5915 | 1600 | 0.0343 | - |
	| 0.6100 | 1650 | 0.0354 | - |
	| 0.6285 | 1700 | 0.0892 | - |
	| 0.6470 | 1750 | 0.0137 | - |
	| 0.6654 | 1800 | 0.071 | - |
	| 0.6839 | 1850 | 0.0317 | - |
	| 0.7024 | 1900 | 0.0285 | - |
	| 0.7209 | 1950 | 0.0311 | - |
	| 0.7394 | 2000 | 0.0755 | - |
	| 0.7579 | 2050 | 0.09 | - |
	| 0.7763 | 2100 | 0.0565 | - |
	| 0.7948 | 2150 | 0.0099 | - |
	| 0.8133 | 2200 | 0.0236 | - |
	| 0.8318 | 2250 | 0.0663 | - |
	| 0.8503 | 2300 | 0.1391 | - |
	| 0.8688 | 2350 | 0.0176 | - |
	| 0.8872 | 2400 | 0.0645 | - |
	| 0.9057 | 2450 | 0.0318 | - |
	| 0.9242 | 2500 | 0.0186 | - |
	| 0.9427 | 2550 | 0.0514 | - |
	| 0.9612 | 2600 | 0.0261 | - |
	| 0.9797 | 2650 | 0.0535 | - |
	| 0.9982 | 2700 | 0.018 | - |
	| 1.0166 | 2750 | 0.0218 | - |
	| 1.0351 | 2800 | 0.0351 | - |
	| 1.0536 | 2850 | 0.0704 | - |
	| 1.0721 | 2900 | 0.0251 | - |
	| 1.0906 | 2950 | 0.0156 | - |
	| 1.1091 | 3000 | 0.0821 | - |
	| 1.1275 | 3050 | 0.0273 | - |
	| 1.1460 | 3100 | 0.0719 | - |
	| 1.1645 | 3150 | 0.0496 | - |
	| 1.1830 | 3200 | 0.0124 | - |
	| 1.2015 | 3250 | 0.0576 | - |
	| 1.2200 | 3300 | 0.0453 | - |
	| 1.2384 | 3350 | 0.0236 | - |
	| 1.2569 | 3400 | 0.013 | - |
	| 1.2754 | 3450 | 0.0909 | - |
	| 1.2939 | 3500 | 0.024 | - |
	| 1.3124 | 3550 | 0.0264 | - |
	| 1.3309 | 3600 | 0.0397 | - |
	| 1.3494 | 3650 | 0.0484 | - |
	| 1.3678 | 3700 | 0.0301 | - |
	| 1.3863 | 3750 | 0.0512 | - |
	| 1.4048 | 3800 | 0.0625 | - |
	| 1.4233 | 3850 | 0.0583 | - |
	| 1.4418 | 3900 | 0.0506 | - |
	| 1.4603 | 3950 | 0.0561 | - |
	| 1.4787 | 4000 | 0.0295 | - |
	| 1.4972 | 4050 | 0.1352 | - |
	| 1.5157 | 4100 | 0.0101 | - |
	| 1.5342 | 4150 | 0.0221 | - |
	| 1.5527 | 4200 | 0.057 | - |
	| 1.5712 | 4250 | 0.0389 | - |
	| 1.5896 | 4300 | 0.0173 | - |
	| 1.6081 | 4350 | 0.0605 | - |
	| 1.6266 | 4400 | 0.0187 | - |
	| 1.6451 | 4450 | 0.0401 | - |
	| 1.6636 | 4500 | 0.0571 | - |
	| 1.6821 | 4550 | 0.0612 | - |
	| 1.7006 | 4600 | 0.03 | - |
	| 1.7190 | 4650 | 0.0299 | - |
	| 1.7375 | 4700 | 0.0583 | - |
	| 1.7560 | 4750 | 0.0279 | - |
	| 1.7745 | 4800 | 0.027 | - |
	| 1.7930 | 4850 | 0.0343 | - |
	| 1.8115 | 4900 | 0.0634 | - |
	| 1.8299 | 4950 | 0.0748 | - |
	| 1.8484 | 5000 | 0.0699 | - |
	| 1.8669 | 5050 | 0.0678 | - |
	| 1.8854 | 5100 | 0.0724 | - |
	| 1.9039 | 5150 | 0.0211 | - |
	| 1.9224 | 5200 | 0.037 | - |
	| 1.9409 | 5250 | 0.0891 | - |
	| 1.9593 | 5300 | 0.0235 | - |
	| 1.9778 | 5350 | 0.0339 | - |
	| 1.9963 | 5400 | 0.029 | - |
	| 2.0148 | 5450 | 0.1292 | - |
	| 2.0333 | 5500 | 0.0457 | - |
	| 2.0518 | 5550 | 0.0577 | - |
	| 2.0702 | 5600 | 0.063 | - |
	| 2.0887 | 5650 | 0.0198 | - |
	| 2.1072 | 5700 | 0.0367 | - |
	| 2.1257 | 5750 | 0.0475 | - |
	| 2.1442 | 5800 | 0.0368 | - |
	| 2.1627 | 5850 | 0.0401 | - |
	| 2.1811 | 5900 | 0.0353 | - |
	| 2.1996 | 5950 | 0.0387 | - |
	| 2.2181 | 6000 | 0.0325 | - |
	| 2.2366 | 6050 | 0.046 | - |
	| 2.2551 | 6100 | 0.03 | - |
	| 2.2736 | 6150 | 0.0338 | - |
	| 2.2921 | 6200 | 0.0374 | - |
	| 2.3105 | 6250 | 0.0206 | - |
	| 2.3290 | 6300 | 0.031 | - |
	| 2.3475 | 6350 | 0.0493 | - |
	| 2.3660 | 6400 | 0.0182 | - |
	| 2.3845 | 6450 | 0.0352 | - |
	| 2.4030 | 6500 | 0.0622 | - |
	| 2.4214 | 6550 | 0.0682 | - |
	| 2.4399 | 6600 | 0.0227 | - |
	| 2.4584 | 6650 | 0.0401 | - |
	| 2.4769 | 6700 | 0.0348 | - |
	| 2.4954 | 6750 | 0.0417 | - |
	| 2.5139 | 6800 | 0.0232 | - |
	| 2.5323 | 6850 | 0.0603 | - |
	| 2.5508 | 6900 | 0.0981 | - |
	| 2.5693 | 6950 | 0.0433 | - |
	| 2.5878 | 7000 | 0.0187 | - |
	| 2.6063 | 7050 | 0.0099 | - |
	| 2.6248 | 7100 | 0.0276 | - |
	| 2.6433 | 7150 | 0.0516 | - |
	| 2.6617 | 7200 | 0.0211 | - |
	| 2.6802 | 7250 | 0.0191 | - |
	| 2.6987 | 7300 | 0.1152 | - |
	| 2.7172 | 7350 | 0.0442 | - |
	| 2.7357 | 7400 | 0.0226 | - |
	| 2.7542 | 7450 | 0.0429 | - |
	| 2.7726 | 7500 | 0.0313 | - |
	| 2.7911 | 7550 | 0.0601 | - |
	| 2.8096 | 7600 | 0.0156 | - |
	| 2.8281 | 7650 | 0.039 | - |
	| 2.8466 | 7700 | 0.0239 | - |
	| 2.8651 | 7750 | 0.1159 | - |
	| 2.8835 | 7800 | 0.0223 | - |
	| 2.9020 | 7850 | 0.0442 | - |
	| 2.9205 | 7900 | 0.0254 | - |
	| 2.9390 | 7950 | 0.0268 | - |
	| 2.9575 | 8000 | 0.0415 | - |
	| 2.9760 | 8050 | 0.0235 | - |
	| 2.9945 | 8100 | 0.0177 | - |
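
The Epoch and Step columns together imply the length of one epoch. Here is a small sketch that estimates it from three rows copied from the table. It assumes each step processes one batch of 16 pairs (the first `batch_size` value under Training Hyperparameters) with no gradient accumulation; that assumption is not stated in this card.

```python
# (epoch, step) pairs copied from the training log table.
log_rows = [(0.9982, 2700), (1.9963, 5400), (2.9945, 8100)]

# Assumed: one optimizer step consumes one batch of this many sentence pairs.
BATCH_SIZE = 16

# Each row gives an estimate of steps per epoch; average them to smooth rounding.
estimates = [step / epoch for epoch, step in log_rows]
steps_per_epoch = round(sum(estimates) / len(estimates))
pairs_per_epoch = steps_per_epoch * BATCH_SIZE

print(f"steps per epoch: {steps_per_epoch}")
print(f"pairs per epoch: {pairs_per_epoch}")
```

Under these assumptions the log corresponds to roughly 2,705 steps per epoch, or about 43,000 sentence pairs per epoch.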

	### Framework Versions
	- Python: 3.9.16
	- SetFit: 1.0.1
	- Sentence Transformers: 2.2.2
	- Transformers: 4.35.0
	- PyTorch: 2.1.0+cu121
	- Datasets: 2.14.6
	- Tokenizers: 0.14.1

	## Citation

	### BibTeX
	```bibtex
	@article{https://doi.org/10.48550/arxiv.2209.11055,
	    doi = {10.48550/ARXIV.2209.11055},
	    url = {https://arxiv.org/abs/2209.11055},
	    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
	    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
	    title = {Efficient Few-Shot Learning Without Prompts},
	    publisher = {arXiv},
	    year = {2022},
	    copyright = {Creative Commons Attribution 4.0 International}
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->