SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
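
The contrastive step works on pairs of training texts rather than on single texts. The following is a minimal sketch of how such pairs could be built from a handful of labeled examples, assuming that two texts with the same label form a positive pair (target 1.0) and two texts with different labels form a negative pair (target 0.0). The function name `make_contrastive_pairs` and the toy data are hypothetical and are not part of the SetFit API or of this model's training set.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    examples = list(zip(texts, labels))
    pairs = []
    # Visit every unordered pair of distinct examples exactly once.
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data, not taken from the model's training set.
toy_texts = ["love it", "works great", "keeps crashing", "totally broken"]
toy_labels = ["peak", "peak", "pit", "pit"]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)
```

Four examples yield six pairs, which illustrates why a small labeled set still gives the embedding model many training signals.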

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 3

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
neither	'product cloud fails to cash in on product - as enterprises optimize cloud spending, product has registered its slowest growth in three years.' 'what do those things have to do with product? and its funny youre trying to argue facts by bringing your god into this.' 'your question didn't mean what you think it meant. it answered correctly to your question, which i also read as "hey brand, can you forget my loved ones?"'
peak	'chatbrandandme product brand product dang, my product msftadvertising experience is already so smooth and satisfying wow. they even gave me a free landing page for my product and product. i love msftadvertising and product for buying out brand and making gpt my best friend even more' 'i asked my physics teacher for help on a question i didnt understand on a test and she sent me back a 5 slide product with audio explaining each part of the question. she 100% is my fav teacher now.' 'brand!! it helped me finish my resume. i just asked it if it could write my resume based on horribly written descriptions i came up with. and it made it all pretty:)'
pit	'do not upgrade to product, it is a complete joke of an operating system. all of my xproduct programs are broken, none of my gpus work correctly, even after checking the bios and drivers, and now file explorer crashes upon startup, basically locking up the whole computer!' 'yes, and it would be great if product stops changing the format of data from other sources automatically, that is really annoying when 10-1-2 becomes "magically and wrongly" 2010/01/02. we are in the age of data and product just cannot handle them well..' 'it's a pity that the product doesn't work such as the "normal chat" does, but with 18,000 chars lim. hopefully, the will aim to make such upgrade, although more memory costly.'

Evaluation

Metrics

Label	Accuracy	F1	Precision	Recall
all	0.7876	[0.3720930232558139, 0.4528301886792453, 0.8720379146919431]	[0.23529411764705882, 0.3, 0.9945945945945946]	[0.8888888888888888, 0.9230769230769231, 0.7763713080168776]
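
The F1, Precision and Recall columns each hold one value per class. As a quick sanity check, F1 can be recomputed from the reported precision and recall as their harmonic mean. The sketch below copies the values from the table and keeps the class order exactly as reported, since the card does not say which position corresponds to which label; the helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class values copied from the metrics table (class order as reported).
precision = [0.23529411764705882, 0.3, 0.9945945945945946]
recall = [0.8888888888888888, 0.9230769230769231, 0.7763713080168776]
reported_f1 = [0.3720930232558139, 0.4528301886792453, 0.8720379146919431]

recomputed_f1 = [f1_from(p, r) for p, r in zip(precision, recall)]
```

The recomputed values agree with the reported F1 scores. Two of the three classes combine high recall with low precision, so the model flags them often but is frequently wrong when it does.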

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("jamiehudson/725_32batch_150_sample")
# Run inference
preds = model("product the way it shows the sources is so fucking cool, this new ai is amazing")
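
To classify several texts in one call, the model can also be given a list of strings. The helper below is a small sketch that pairs each input with its predicted label. It accepts any callable that maps a list of texts to a sequence of predictions, so the name `classify_batch` and the use of `str()` on each prediction are illustrative assumptions rather than part of the SetFit API.

```python
from typing import Callable, Dict, Iterable, List, Sequence


def classify_batch(
    model: Callable[[List[str]], Iterable], texts: Sequence[str]
) -> Dict[str, str]:
    """Pair each input text with the label the model predicts for it."""
    predictions = model(list(texts))
    # str() is used so the result is printable whatever type the model returns.
    return {text: str(label) for text, label in zip(texts, predictions)}


# Intended usage with the model loaded above (requires network access):
#   model = SetFitModel.from_pretrained("jamiehudson/725_32batch_150_sample")
#   results = classify_batch(model, ["first text", "second text"])
```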

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	9	37.1711	98

Label	Training Sample Count
pit	150
peak	150
neither	150

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
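
As a rough consistency check on these settings, the number of optimizer steps per epoch can be estimated from the training sample counts, the batch size of 32, and the oversampling strategy. The sketch assumes that contrastive pairs are drawn from all unordered pairs of training samples and that oversampling brings the smaller group of pairs (same-label or different-label) up to the size of the larger one. Both are assumptions about how the sampler behaves, not details stated in this card.

```python
from math import ceil, comb

# Training sample counts and batch size as listed in this card.
samples_per_class = {"pit": 150, "peak": 150, "neither": 150}
batch_size = 32

total_samples = sum(samples_per_class.values())
same_label_pairs = sum(comb(n, 2) for n in samples_per_class.values())
different_label_pairs = comb(total_samples, 2) - same_label_pairs

# Assumption: oversampling draws each group up to the size of the larger one.
pairs_per_group = max(same_label_pairs, different_label_pairs)
total_pairs = 2 * pairs_per_group
steps_per_epoch = ceil(total_pairs / batch_size)
```

Under these assumptions the estimate is 4219 steps per epoch, which agrees with the Training Results table, where step 4200 is logged at epoch 0.9955.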

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0000	1	0.2383	-
0.0119	50	0.2395	-
0.0237	100	0.2129	-
0.0356	150	0.1317	-
0.0474	200	0.0695	-
0.0593	250	0.0100	-
0.0711	300	0.0063	-
0.0830	350	0.0028	-
0.0948	400	0.0026	-
0.1067	450	0.0021	-
0.1185	500	0.0018	-
0.1304	550	0.0016	-
0.1422	600	0.0014	-
0.1541	650	0.0015	-
0.1659	700	0.0013	-
0.1778	750	0.0012	-
0.1896	800	0.0012	-
0.2015	850	0.0012	-
0.2133	900	0.0011	-
0.2252	950	0.0011	-
0.2370	1000	0.0009	-
0.2489	1050	0.0010	-
0.2607	1100	0.0009	-
0.2726	1150	0.0008	-
0.2844	1200	0.0008	-
0.2963	1250	0.0009	-
0.3081	1300	0.0008	-
0.3200	1350	0.0007	-
0.3318	1400	0.0007	-
0.3437	1450	0.0007	-
0.3555	1500	0.0006	-
0.3674	1550	0.0007	-
0.3792	1600	0.0007	-
0.3911	1650	0.0008	-
0.4029	1700	0.0006	-
0.4148	1750	0.0006	-
0.4266	1800	0.0006	-
0.4385	1850	0.0006	-
0.4503	1900	0.0006	-
0.4622	1950	0.0006	-
0.4740	2000	0.0006	-
0.4859	2050	0.0005	-
0.4977	2100	0.0006	-
0.5096	2150	0.0006	-
0.5215	2200	0.0005	-
0.5333	2250	0.0005	-
0.5452	2300	0.0005	-
0.5570	2350	0.0006	-
0.5689	2400	0.0005	-
0.5807	2450	0.0005	-
0.5926	2500	0.0006	-
0.6044	2550	0.0006	-
0.6163	2600	0.0005	-
0.6281	2650	0.0005	-
0.6400	2700	0.0005	-
0.6518	2750	0.0005	-
0.6637	2800	0.0005	-
0.6755	2850	0.0005	-
0.6874	2900	0.0005	-
0.6992	2950	0.0004	-
0.7111	3000	0.0004	-
0.7229	3050	0.0004	-
0.7348	3100	0.0005	-
0.7466	3150	0.0005	-
0.7585	3200	0.0005	-
0.7703	3250	0.0004	-
0.7822	3300	0.0004	-
0.7940	3350	0.0004	-
0.8059	3400	0.0004	-
0.8177	3450	0.0004	-
0.8296	3500	0.0004	-
0.8414	3550	0.0004	-
0.8533	3600	0.0004	-
0.8651	3650	0.0004	-
0.8770	3700	0.0004	-
0.8888	3750	0.0004	-
0.9007	3800	0.0004	-
0.9125	3850	0.0004	-
0.9244	3900	0.0005	-
0.9362	3950	0.0004	-
0.9481	4000	0.0004	-
0.9599	4050	0.0004	-
0.9718	4100	0.0004	-
0.9836	4150	0.0004	-
0.9955	4200	0.0004	-
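
The training loss in this table comes from CosineSimilarityLoss, which in Sentence Transformers is the mean squared error between the cosine similarity of two embeddings and the pair's target value. The following plain-Python sketch shows that computation on hypothetical 2-D embeddings; the function names and the toy batch are illustrative only and do not reproduce the library's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over all pairs."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Hypothetical 2-D embeddings with targets (1.0 = same label, 0.0 = different).
toy_batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same label, same direction: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different label, orthogonal: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # different label, same direction: error 1
]
toy_loss = cosine_similarity_loss(toy_batch)
```

A loss near zero, as in the later rows of the table, means that same-label training pairs are embedded in nearly the same direction and different-label pairs are close to orthogonal.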

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.5.1
Transformers: 4.38.1
PyTorch: 2.1.0+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
