---
tags:
- multi-label-classification
- multi-intent-detection
- huggingface
- deberta-v3
- transformers
library_name: transformers
task:
- text-classification
license: apache-2.0
---

# Multi-Intent Detection (MID) Model

This model is fine-tuned for **Multi-Intent Detection (MID)**, a form of multi-label classification in which a single input can carry several labels at once. The fine-tuning dataset is a deliberately simplified version of the task: each instance is limited to two labels.

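As a rough illustration of what "multiple labels per input" means at prediction time, the sketch below scores each label independently with a sigmoid and keeps every label above a cutoff. This card does not state the decoding rule, so the 0.5 threshold, the intent names, and the logits are all assumptions made for the example.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_intents(logits, labels, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold.

    Each label is scored on its own (no softmax across labels), so zero,
    one, or several intents can be returned for the same input.
    """
    return [
        label
        for label, logit in zip(labels, logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical intent names and logits, for illustration only.
LABELS = ["book_flight", "check_weather", "play_music"]
example_logits = [2.1, 0.4, -3.0]

predicted = decode_intents(example_logits, LABELS)
print(predicted)
```

In a real pipeline the logits would come from the fine-tuned classification head rather than a hard-coded list.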

## Model Details

- **Base Model:** DeBERTa-v3-base
- **Task:** Multi-label classification
- **Labels per Instance:** 2
- **Fine-tuning Framework:** Hugging Face Transformers

## Training Configuration

- **Training Arguments:**
  - **Learning Rate:** 2e-5
  - **Batch Size (Train):** 16
  - **Batch Size (Eval):** 16
  - **Gradient Accumulation Steps:** 2
  - **Number of Epochs:** 8
  - **Weight Decay:** 0.01
  - **Warmup Ratio:** 10%
  - **Learning Rate Scheduler Type:** Cosine
  - **Mixed Precision Training:** Enabled (FP16)
  - **Logging Steps:** 50

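The settings listed above map onto keyword arguments of `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: the output directory is a hypothetical placeholder, and reading the listed batch sizes as per-device values is an assumption.

```python
# Values copied from the training configuration list; the keys are
# TrainingArguments parameter names.
training_kwargs = {
    "output_dir": "./mid-deberta-v3",   # hypothetical path, not given in this card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes the listed size is per device
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 8,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,                # 10% of the training steps
    "lr_scheduler_type": "cosine",
    "fp16": True,                       # needs a GPU that supports FP16
    "logging_steps": 50,
}

# Gradient accumulation multiplies the per-device batch size seen by
# each optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_size)


def build_training_arguments():
    """Build the TrainingArguments object (requires transformers and torch)."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)
```

With two accumulation steps, each optimizer update sees 32 examples per device, which is worth keeping in mind when comparing the learning rate against other setups.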

## Performance Metrics

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 Score | Accuracy |
|-------|---------------|-----------------|-----------|--------|----------|----------|
| 0 | 0.069100 | 0.069115 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 2 | 0.024100 | 0.022929 | 0.952334 | 0.316920 | 0.475576 | 0.078652 |
| 4 | 0.009200 | 0.010799 | 0.959768 | 0.819894 | 0.884334 | 0.653668 |
| 6 | 0.006300 | 0.008773 | 0.963243 | 0.883344 | 0.921565 | 0.770654 |
| 7 | 0.006200 | 0.008707 | 0.961635 | 0.886319 | 0.922442 | 0.775281 |

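As a quick sanity check on the table, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. The sketch below does only that arithmetic; it makes no claim about how the metrics were averaged across labels during evaluation, which this card does not specify.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1) copied from the table above.
rows = [
    (0, 0.000000, 0.000000, 0.000000),
    (2, 0.952334, 0.316920, 0.475576),
    (4, 0.959768, 0.819894, 0.884334),
    (6, 0.963243, 0.883344, 0.921565),
    (7, 0.961635, 0.886319, 0.922442),
]

for epoch, p, r, reported in rows:
    computed = f1_score(p, r)
    print(f"epoch {epoch}: computed F1 = {computed:.6f}, reported = {reported:.6f}")

# Largest absolute difference between recomputed and reported F1.
max_gap = max(abs(f1_score(p, r) - reported) for _, p, r, reported in rows)
```

A small `max_gap` indicates the reported precision, recall, and F1 values are mutually consistent.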

### Final Evaluation Metrics (Epoch 7, the Eighth and Final Epoch):

- **Validation Loss:** 0.0087
- **Precision:** 0.9616
- **Recall:** 0.8863
- **F1 Score:** 0.9224
- **Accuracy:** 0.7753

## Limitations

- **Simplified Multi-Label Setting:** The model assumes a fixed two labels per instance, so it may not generalize to datasets where the number of labels per instance varies.
- **Performance on Unseen Data:** Performance may degrade on data whose distribution differs significantly from the training dataset.
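
To make the first limitation concrete, the sketch below shows what happens if a downstream pipeline hard-codes the two-label assumption by always keeping the two highest-scoring labels. This decoding rule is an assumption made for illustration, not something the card specifies, and the intent names and probabilities are hypothetical.

```python
def top_k_intents(scores: dict, k: int = 2) -> list:
    """Always return the k highest-scoring labels, whatever their scores are."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical per-label probabilities for an utterance with one clear intent.
single_intent = {"book_flight": 0.97, "check_weather": 0.08, "play_music": 0.03}

# Hypothetical probabilities for an utterance with three plausible intents.
triple_intent = {"book_flight": 0.91, "check_weather": 0.88, "play_music": 0.85}

one = top_k_intents(single_intent)    # keeps a low-confidence second label
three = top_k_intents(triple_intent)  # drops the third plausible intent
print(one)
print(three)
```

In both cases exactly two labels come back, which over-predicts for the single-intent input and under-predicts for the three-intent one.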