	---
	base_model: akhooli/sbert_ar_nli_500k_norm
	library_name: setfit
	metrics:
	- accuracy
	pipeline_tag: text-classification
	tags:
	- setfit
	- sentence-transformers
	- text-classification
	- generated_from_setfit_trainer
	widget:
	- text: يا زلمة يلي بيصنع معنا معروف بنتشكره شو ما كان يكون وانتم ادعياء الاخوة العرب
	  هول مش ايرانيين ولا عجم عرب متلنا متلهم
	- text: لعمي
	- text: هلق رجع لمن قلو الريس تبعو هش قلو مشمو على عيني ؟
	- text: مثل الكليشيه وبشكل يومي في حدا بده يعاير التاني بيقوم بيشبهه بالكلب والله
	  اذا حدا شبهني بالكلب بعتبرها مدح شديد
	- text: الله لا يحرمك من الهبل ان شاء الله
	inference: true
	model-index:
	- name: SetFit with akhooli/sbert_ar_nli_500k_norm
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Unknown
	      type: unknown
	      split: test
	    metrics:
	    - type: accuracy
	      value: 0.8497652582159625
	      name: Accuracy
	---

	# SetFit with akhooli/sbert_ar_nli_500k_norm

	This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [akhooli/sbert_ar_nli_500k_norm](https://huggingface.co/akhooli/sbert_ar_nli_500k_norm) as the Sentence Transformer embedding model and a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance as the classification head.

	The model was trained with an efficient few-shot learning technique that involves two steps:

	1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
	2. Training a classification head with features from the fine-tuned Sentence Transformer.
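The contrastive step can be illustrated with a minimal, library-free sketch. It assumes the simplest possible scheme: every two training examples form a pair whose target is 1.0 when their labels match and 0.0 otherwise, and the loss for a pair is the squared difference between the cosine similarity of the two embeddings and that target. SetFit itself samples pairs rather than enumerating all of them, and the embeddings come from the Sentence Transformer body; the texts, labels, and vectors below are placeholders.

```python
from itertools import combinations
from math import sqrt


def make_pairs(texts, labels):
    """Pair every two examples; target is 1.0 if labels match, else 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(emb_a, emb_b, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(emb_a, emb_b) - target) ** 2


# Toy data: placeholder texts and labels, not taken from the training set.
texts = ["text a", "text b", "text c"]
labels = ["positive", "positive", "negative"]
pairs = make_pairs(texts, labels)
```

Minimizing this loss pulls same-label embeddings together and pushes different-label embeddings apart, which is what makes the embeddings useful features for the classification head in step 2.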

	## Model Details

	### Model Description
	- Model Type: SetFit
	- Sentence Transformer body: [akhooli/sbert_ar_nli_500k_norm](https://huggingface.co/akhooli/sbert_ar_nli_500k_norm)
	- Classification head: a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
	- Maximum Sequence Length: 512 tokens
	- Number of Classes: 2

	### Model Sources

	- Repository: [SetFit on GitHub](https://github.com/huggingface/setfit)
	- Paper: [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
	- Blogpost: [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

	### Model Labels
	| Label | Examples |
	|:---------|:---------|
	| negative | <ul><li>'الف تحية لشيخ العقل ومشايخنا الكرام'</li><li>'بتحبو او بتكرهو انشط وزير و رئيس تيار و ديبلوماسيتو بتتدرّس'</li><li>'نعم معاليك ستظل دمشق المدينة التي تغنى بها الشعراء وهذه الكلمات خير شاهد فرشت فوق ثراك الطاهرالهدبا'</li></ul> |
	| positive | <ul><li>'لسانك حصانك وحسنا فعلت قطر لتلغي مركز الأبحاث لا مرحبا بكم انتم ولا تستاهلون اي عمل لكم ناكرين المعروف'</li><li>'ارنب وبضلك ارنب ابكي بترتاح يا صرماية'</li><li>'سليمان فرنجية عبارة عن كلب مسعور لديه حاسة شم قوية جداً شم ريحة كرسي الرئاسة ولكنه لن يجلس عليها ابداً وتصبحو على خير'</li></ul> |

	## Evaluation

	### Metrics
	| Label | Accuracy |
	|:--------|:---------|
	| all | 0.8498 |
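For reference, accuracy is the fraction of test examples whose predicted label equals the reference label. The sketch below is a plain-Python version of that definition; the toy labels are hypothetical and only exercise the function. It also shows that the table value is the accuracy from the card metadata rounded to four decimals.

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical toy labels, only to exercise the function.
toy_predictions = ["positive", "negative", "positive", "negative"]
toy_references = ["positive", "negative", "negative", "negative"]
toy_accuracy = accuracy(toy_predictions, toy_references)

# The table value is the metadata accuracy rounded to four decimals.
reported = round(0.8497652582159625, 4)
```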

	## Uses

	### Direct Use for Inference

	First install the SetFit library:

	```bash
	pip install setfit
	```

	Then you can load this model and run inference.

	```python
	from setfit import SetFitModel

	# Download from the 🤗 Hub
	model = SetFitModel.from_pretrained("akhooli/setfit_ar_hs")
	# Run inference
	preds = model("لعمي")
	```
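To classify several texts at once, a small helper can pair each input with its prediction. The sketch below is an assumption-laden convenience wrapper, not part of SetFit: `label_texts` and `KeywordStub` are hypothetical names, and the helper only assumes an object with a `predict` method that takes a list of strings and returns one prediction per input, which `SetFitModel` provides. The stub stands in for the real model so the snippet runs offline.

```python
def label_texts(model, texts):
    """Return (text, prediction) pairs for a list of input strings.

    `model` is any object with a `predict(list_of_str)` method that returns
    one prediction per input, in the same order.
    """
    texts = list(texts)
    predictions = model.predict(texts)
    return list(zip(texts, predictions))


class KeywordStub:
    """Stand-in for a real model so the helper can be tried offline."""

    def predict(self, texts):
        return ["positive" if "bad" in text else "negative" for text in texts]


results = label_texts(KeywordStub(), ["a bad example", "a neutral example"])
```

With the model loaded as in the snippet above, `label_texts(model, list_of_texts)` would work the same way. The exact type of each prediction (string label, integer, or tensor element) may depend on the SetFit version and on how the labels were stored with the model.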

	## Training Details

	### Training Set Metrics
	| Training set | Min | Median | Max |
	|:-------------|:----|:--------|:----|
	| Word count | 1 | 12.2323 | 52 |

	| Label | Training Sample Count |
	|:---------|:----------------------|
	| negative | 1995 |
	| positive | 2500 |
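The class balance implied by these counts can be worked out directly. The short sketch below computes each label's share of the training set and the majority class. A classifier that always predicted the majority class would be right about 55.6% of the time on this training set; the test-set distribution is not stated in this card, so that figure is only a rough point of comparison for the reported accuracy.

```python
# Training sample counts copied from the table above.
train_counts = {"negative": 1995, "positive": 2500}

total = sum(train_counts.values())
shares = {label: count / total for label, count in train_counts.items()}
majority_label = max(train_counts, key=train_counts.get)

for label, share in shares.items():
    print(f"{label}: {train_counts[label]} samples ({share:.1%})")
print(f"majority class: {majority_label}")
```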

	### Training Hyperparameters
	- batch_size: (32, 32)
	- num_epochs: (1, 1)
	- max_steps: 10000
	- sampling_strategy: undersampling
	- body_learning_rate: (2e-05, 1e-05)
	- head_learning_rate: 0.01
	- loss: CosineSimilarityLoss
	- distance_metric: cosine_distance
	- margin: 0.25
	- end_to_end: False
	- use_amp: False
	- warmup_proportion: 0.1
	- l2_weight: 0.01
	- seed: 42
	- run_name: setfit_hate_25kv
	- eval_max_steps: -1
	- load_best_model_at_end: False
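Several of these values are tuples. In SetFit's training arguments, a tuple gives one value for the embedding fine-tuning phase and one for the classifier training phase, in that order. The sketch below separates the tuple-valued settings by phase and estimates the number of warmup steps; `split_phases` is a hypothetical helper, and the warmup figure assumes that `warmup_proportion` is applied to `max_steps`.

```python
# A subset of the hyperparameters listed above.
hyperparameters = {
    "batch_size": (32, 32),
    "num_epochs": (1, 1),
    "body_learning_rate": (2e-05, 1e-05),
    "max_steps": 10000,
    "warmup_proportion": 0.1,
}


def split_phases(params):
    """Return per-phase dicts for the tuple-valued settings only."""
    embedding = {k: v[0] for k, v in params.items() if isinstance(v, tuple)}
    classifier = {k: v[1] for k, v in params.items() if isinstance(v, tuple)}
    return embedding, classifier


embedding_phase, classifier_phase = split_phases(hyperparameters)

# Assumption: warmup is counted against max_steps.
warmup_steps = round(
    hyperparameters["max_steps"] * hyperparameters["warmup_proportion"]
)
```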

	### Training Results
	| Epoch | Step | Training Loss | Validation Loss |
	|:------:|:-----:|:-------------:|:---------------:|
	| 0.0002 | 1 | 0.3185 | - |
	| 0.02 | 100 | 0.2901 | - |
	| 0.04 | 200 | 0.2441 | - |
	| 0.06 | 300 | 0.2209 | - |
	| 0.08 | 400 | 0.1715 | - |
	| 0.1 | 500 | 0.1304 | - |
	| 0.12 | 600 | 0.0891 | - |
	| 0.14 | 700 | 0.0604 | - |
	| 0.16 | 800 | 0.0436 | - |
	| 0.18 | 900 | 0.0408 | - |
	| 0.2 | 1000 | 0.0265 | - |
	| 0.22 | 1100 | 0.0239 | - |
	| 0.24 | 1200 | 0.0235 | - |
	| 0.26 | 1300 | 0.0232 | - |
	| 0.28 | 1400 | 0.0241 | - |
	| 0.3 | 1500 | 0.019 | - |
	| 0.32 | 1600 | 0.0168 | - |
	| 0.34 | 1700 | 0.0172 | - |
	| 0.36 | 1800 | 0.0136 | - |
	| 0.38 | 1900 | 0.0099 | - |
	| 0.4 | 2000 | 0.0117 | - |
	| 0.42 | 2100 | 0.0091 | - |
	| 0.44 | 2200 | 0.0067 | - |
	| 0.46 | 2300 | 0.0074 | - |
	| 0.48 | 2400 | 0.0055 | - |
	| 0.5 | 2500 | 0.0053 | - |
	| 0.52 | 2600 | 0.0054 | - |
	| 0.54 | 2700 | 0.0058 | - |
	| 0.56 | 2800 | 0.0059 | - |
	| 0.58 | 2900 | 0.0055 | - |
	| 0.6 | 3000 | 0.0043 | - |
	| 0.62 | 3100 | 0.0045 | - |
	| 0.64 | 3200 | 0.0055 | - |
	| 0.66 | 3300 | 0.0042 | - |
	| 0.68 | 3400 | 0.0024 | - |
	| 0.7 | 3500 | 0.0025 | - |
	| 0.72 | 3600 | 0.0047 | - |
	| 0.74 | 3700 | 0.0036 | - |
	| 0.76 | 3800 | 0.0029 | - |
	| 0.78 | 3900 | 0.0043 | - |
	| 0.8 | 4000 | 0.0036 | - |
	| 0.82 | 4100 | 0.0025 | - |
	| 0.84 | 4200 | 0.0033 | - |
	| 0.86 | 4300 | 0.0018 | - |
	| 0.88 | 4400 | 0.0016 | - |
	| 0.9 | 4500 | 0.0018 | - |
	| 0.92 | 4600 | 0.0023 | - |
	| 0.94 | 4700 | 0.0027 | - |
	| 0.96 | 4800 | 0.0023 | - |
	| 0.98 | 4900 | 0.0012 | - |
	| 1.0 | 5000 | 0.0021 | - |
	| 1.02 | 5100 | 0.0026 | - |
	| 1.04 | 5200 | 0.0019 | - |
	| 1.06 | 5300 | 0.002 | - |
	| 1.08 | 5400 | 0.0022 | - |
	| 1.1 | 5500 | 0.0025 | - |
	| 1.12 | 5600 | 0.0033 | - |
	| 1.14 | 5700 | 0.001 | - |
	| 1.16 | 5800 | 0.0016 | - |
	| 1.18 | 5900 | 0.0015 | - |
	| 1.2 | 6000 | 0.0008 | - |
	| 1.22 | 6100 | 0.0011 | - |
	| 1.24 | 6200 | 0.0012 | - |
	| 1.26 | 6300 | 0.0009 | - |
	| 1.28 | 6400 | 0.0012 | - |
	| 1.3 | 6500 | 0.001 | - |
	| 1.32 | 6600 | 0.0014 | - |
	| 1.34 | 6700 | 0.0002 | - |
	| 1.36 | 6800 | 0.0005 | - |
	| 1.38 | 6900 | 0.0003 | - |
	| 1.4 | 7000 | 0.0001 | - |
	| 1.42 | 7100 | 0.0007 | - |
	| 1.44 | 7200 | 0.0003 | - |
	| 1.46 | 7300 | 0.0002 | - |
	| 1.48 | 7400 | 0.0005 | - |
	| 1.5 | 7500 | 0.0001 | - |
	| 1.52 | 7600 | 0.0003 | - |
	| 1.54 | 7700 | 0.001 | - |
	| 1.56 | 7800 | 0.0003 | - |
	| 1.58 | 7900 | 0.0 | - |
	| 1.6 | 8000 | 0.0002 | - |
	| 1.62 | 8100 | 0.0 | - |
	| 1.64 | 8200 | 0.0002 | - |
	| 1.66 | 8300 | 0.0002 | - |
	| 1.68 | 8400 | 0.0 | - |
	| 1.7 | 8500 | 0.0 | - |
	| 1.72 | 8600 | 0.0002 | - |
	| 1.74 | 8700 | 0.0002 | - |
	| 1.76 | 8800 | 0.0002 | - |
	| 1.78 | 8900 | 0.0002 | - |
	| 1.8 | 9000 | 0.0 | - |
	| 1.82 | 9100 | 0.0004 | - |
	| 1.84 | 9200 | 0.0 | - |
	| 1.86 | 9300 | 0.0002 | - |
	| 1.88 | 9400 | 0.0002 | - |
	| 1.9 | 9500 | 0.0 | - |
	| 1.92 | 9600 | 0.0003 | - |
	| 1.94 | 9700 | 0.0 | - |
	| 1.96 | 9800 | 0.0 | - |
	| 1.98 | 9900 | 0.0 | - |
	| 2.0 | 10000 | 0.0 | - |
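The epoch and step columns of this log fix how many steps the trainer counted as one epoch. The sketch below derives that rate from two rows of the table and, assuming each step processes one batch of 32 sentence pairs (the embedding-phase `batch_size` listed under Training Hyperparameters), the number of pairs seen per logged epoch. The run ends at step 10000 and epoch 2.0 even though `num_epochs` is 1, which is what one would expect if `max_steps` takes precedence over `num_epochs`.

```python
# Two (epoch, step) rows copied from the training log.
first_row = (0.02, 100)
last_row = (2.0, 10000)

batch_size = 32  # embedding-phase batch size from the hyperparameters

steps_per_epoch = round(last_row[1] / last_row[0])
# Assumption: each step processes one batch of `batch_size` sentence pairs.
pairs_per_epoch = steps_per_epoch * batch_size

# The early row should agree with the same steps-per-epoch rate.
consistent = round(first_row[1] / first_row[0]) == steps_per_epoch
```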

	### Framework Versions
	- Python: 3.10.14
	- SetFit: 1.2.0.dev0
	- Sentence Transformers: 3.2.1
	- Transformers: 4.45.1
	- PyTorch: 2.4.0
	- Datasets: 3.0.1
	- Tokenizers: 0.20.0

	## Citation

	### BibTeX
	```bibtex
	@article{https://doi.org/10.48550/arxiv.2209.11055,
	    doi = {10.48550/ARXIV.2209.11055},
	    url = {https://arxiv.org/abs/2209.11055},
	    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
	    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
	    title = {Efficient Few-Shot Learning Without Prompts},
	    publisher = {arXiv},
	    year = {2022},
	    copyright = {Creative Commons Attribution 4.0 International}
	}
	```
