Model Card for yeniguno/bert-uncased-intent-classification
This is a fine-tuned BERT-based model for intent classification, capable of categorizing user utterances into 82 distinct intent labels. It was trained on data consolidated from several public intent datasets, some of them multilingual.
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("yeniguno/bert-uncased-intent-classification")
tokenizer = AutoTokenizer.from_pretrained("yeniguno/bert-uncased-intent-classification")

# Build a text-classification pipeline for intent prediction
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

text = "Play the song, Sam."
prediction = pipe(text)
print(prediction)
# [{'label': 'play_music', 'score': 0.9997674822807312}]
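To inspect alternative intents rather than only the top prediction, you can use the pipeline's `top_k` parameter. This is a standard option of the transformers text-classification pipeline, not something specific to this model; the sketch below assumes the checkpoint loads as shown above.

```python
from transformers import pipeline

# Loading by name downloads the checkpoint from the Hub; top_k=3
# returns the three highest-scoring intent labels per input.
pipe = pipeline(
    "text-classification",
    model="yeniguno/bert-uncased-intent-classification",
    top_k=3,
)

# Passing a list yields one list of (label, score) dicts per input,
# sorted by descending score.
results = pipe(["Play the song, Sam."])
for item in results[0]:
    print(item["label"], round(item["score"], 4))
```

The second- and third-ranked labels can be useful for fallback logic in a dialogue system, e.g. asking a clarifying question when the top two scores are close.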
Uses
This model is intended for Natural Language Understanding (NLU) tasks, specifically classifying user intents for applications such as:
- Voice assistants
- Chatbots
- Customer support automation
- Conversational AI systems
Bias, Risks, and Limitations
The model's performance may degrade on intents that are underrepresented in the training data. It is not optimized for languages other than English. Domain-specific intents not covered by the training data may require additional fine-tuning.
Training Details
Training Data
This model was trained on a combination of intent datasets from various sources:
Datasets Used:
- mteb/amazon_massive_intent
- mteb/mtop_intent
- sonos-nlu-benchmark/snips_built_in_intents
- Mozilla/smart_intent_dataset
- Bhuvaneshwari/intent_classification
- clinc/clinc_oos
Each dataset was preprocessed, and intent labels were consolidated into 82 unique classes.
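The consolidation step can be pictured as a mapping from each dataset's native label names onto one shared label set. The sketch below is purely illustrative: the actual preprocessing script and mapping used for this model are not published, and the example pairs shown are hypothetical.

```python
# Hypothetical label-consolidation sketch: each source dataset uses its
# own label names, so they are mapped onto a shared set of intent labels
# before training. These example pairs are NOT the actual mapping.
LABEL_MAP = {
    ("snips_built_in_intents", "PlayMusic"): "play_music",
    ("mtop_intent", "PLAY_MUSIC"): "play_music",
    ("clinc_oos", "weather"): "get_weather",
}

def consolidate(dataset_name: str, raw_label: str) -> str:
    """Map a dataset-specific label to the shared label set,
    falling back to the raw label when no mapping is defined."""
    return LABEL_MAP.get((dataset_name, raw_label), raw_label)

print(consolidate("snips_built_in_intents", "PlayMusic"))  # play_music
```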
Dataset Sizes:
- Train size: 138228
- Validation size: 17279
- Test size: 17278
Training Procedure
The model was fine-tuned with the following hyperparameters:
- Base Model: bert-base-uncased
- Learning Rate: 3e-5
- Batch Size: 32
- Epochs: 4
- Weight Decay: 0.01
- Evaluation Strategy: per epoch
- Precision: FP32 (mixed precision disabled)
- Hardware: NVIDIA A100
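The hyperparameters above map naturally onto the Hugging Face Trainer API. The exact training script is not published; the argument names below are real transformers parameters, but pairing them with this model's run (and the `output_dir` name) is an assumption based on the listed values.

```python
# Configuration sketch only: how the listed hyperparameters would be
# expressed as transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-uncased-intent-classification",  # assumed name
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=4,
    weight_decay=0.01,
    eval_strategy="epoch",  # evaluate once per epoch
    fp16=False,             # FP32 training, no mixed precision
)
```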
Evaluation
Results
Training and Validation:
Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall |
---|---|---|---|---|---|---|
1 | 0.1143 | 0.1014 | 97.38% | 97.33% | 97.36% | 97.38% |
2 | 0.0638 | 0.0833 | 97.78% | 97.79% | 97.83% | 97.78% |
3 | 0.0391 | 0.0946 | 97.98% | 97.98% | 97.99% | 97.98% |
4 | 0.0122 | 0.1013 | 98.04% | 98.04% | 98.05% | 98.04% |
Test Results:
Metric | Value |
---|---|
Loss | 0.0814 |
Accuracy | 98.37% |
F1 Score | 98.37% |
Precision | 98.38% |
Recall | 98.37% |