# Model Card: DistilBERT with LoRA for Text Classification
## Model Details
- Model Name: DistilBERT with LoRA for Text Classification
- Model Type: Transformer-based Language Model
- Base Model: distilbert-base-multilingual-cased
- Fine-tuning Framework: LoRA (Low-Rank Adaptation of Large Language Models)
- Trained By: ABODO Brice Donald
- License: Apache 2.0
This model is a fine-tuned version of distilbert-base-multilingual-cased on a Russian-language news dataset. It achieves the following results on the evaluation set:
- Loss: 1.0019
- Accuracy: 0.8276
- F1: 0.8284
- Precision: 0.8317
- Recall: 0.8276
## Model description
This model is a fine-tuned version of distilbert-base-multilingual-cased for text classification tasks. It was adapted with LoRA (Low-Rank Adaptation), which trains only a small set of low-rank adapter weights on top of the frozen base model, so the target dataset can be fit with far fewer trainable parameters and lower computational cost.
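The exact LoRA configuration used for this checkpoint (rank, alpha, dropout, target modules) is not recorded in this card. The sketch below only illustrates how such an adapter can be attached to the base model with PEFT; all adapter hyperparameter values in it are assumptions.

```python
# Illustrative sketch: attaching a LoRA adapter to DistilBERT with PEFT.
# The rank, alpha, dropout, and target modules below are assumed values,
# not the configuration actually used for this checkpoint.
from transformers import DistilBertForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

base_model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased",
    num_labels=3,
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,          # sequence classification head
    r=8,                                 # assumed low-rank dimension
    lora_alpha=16,                       # assumed scaling factor
    lora_dropout=0.1,                    # assumed dropout on adapter layers
    target_modules=["q_lin", "v_lin"],   # DistilBERT attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports how few parameters are trainable
```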
## Intended uses & limitations
The model was trained and evaluated on a Russian-language news dataset consisting of news texts labeled as positive, negative, or neutral. The dataset is split into training and test sets for evaluation purposes.
### Intended Use
This model is intended for text classification tasks, particularly multi-class sentiment analysis of news texts (negative, neutral, positive). It can be adapted to other classification tasks by fine-tuning on an appropriate dataset and adjusting the number of labels.
### Limitations and Risks
- Bias: The model may inherit biases present in the training data.
- Generalization: Performance may vary on datasets with different distributions from the training data.
- Resource Usage: Although more efficient than larger models, fine-tuning and inference still require significant computational resources.
## Training and evaluation data
The model was evaluated using the following metrics (a sketch of a metric-computation helper follows the list):
- Accuracy: Measures the fraction of correct predictions.
- F1 Score: Harmonic mean of precision and recall.
- Precision: Proportion of positive identifications that are actually correct.
- Recall: Proportion of actual positives that are correctly identified.
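A minimal sketch of a `compute_metrics` helper producing these four values is shown below. It assumes weighted averaging for F1, precision, and recall, since the averaging mode is not stated in this card.

```python
# Sketch of a compute_metrics helper for the Hugging Face Trainer.
# Weighted averaging for F1/precision/recall is an assumption.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```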
## Training procedure
### Preprocessing
- Tokenization: The text data was tokenized using the `DistilBertTokenizer` with a maximum length of 512 tokens (see the snippet after this list).
- Padding and Truncation: Applied to ensure uniform input size.
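The snippet below sketches this preprocessing step; the example text is arbitrary and only illustrates the tokenizer call.

```python
# Sketch of the preprocessing described above: tokenization with the
# DistilBERT tokenizer, padded and truncated to 512 tokens.
from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-multilingual-cased")

encoded = tokenizer(
    "Пример новостного текста.",   # arbitrary example text
    padding="max_length",          # pad every example to the same length
    truncation=True,               # cut texts longer than max_length
    max_length=512,
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([1, 512])
```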
### Training hyperparameters
The following hyperparameters were used during training (mapped onto `TrainingArguments` in the sketch after the list):
- learning_rate: 0.0009143508688456378
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7
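For reference, the sketch below maps these hyperparameters onto Hugging Face `TrainingArguments`. The output directory and the per-epoch evaluation setting are assumptions; the Adam betas and epsilon listed above match the Trainer defaults, so they need no explicit argument.

```python
# Sketch mapping the hyperparameters above onto TrainingArguments.
# output_dir and evaluation_strategy are assumptions, not values from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora-distilbert-runews",   # hypothetical output directory
    learning_rate=0.0009143508688456378,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    num_train_epochs=7,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",           # assumed: metrics reported once per epoch
)
```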
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 91 | 0.5987 | 0.7634 | 0.7621 | 0.7648 | 0.7634 |
| No log | 2.0 | 182 | 0.3768 | 0.8693 | 0.8698 | 0.8767 | 0.8693 |
| No log | 3.0 | 273 | 0.2620 | 0.9065 | 0.9063 | 0.9093 | 0.9065 |
| No log | 4.0 | 364 | 0.2427 | 0.9202 | 0.9203 | 0.9220 | 0.9202 |
| No log | 5.0 | 455 | 0.2244 | 0.9367 | 0.9369 | 0.9387 | 0.9367 |
| 0.3641 | 6.0 | 546 | 0.2385 | 0.9491 | 0.9491 | 0.9495 | 0.9491 |
| 0.3641 | 7.0 | 637 | 0.2560 | 0.9464 | 0.9464 | 0.9465 | 0.9464 |
## How to Use
```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from peft import PeftConfig, PeftModel

# Load the tokenizer and the LoRA adapter configuration
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model_id = 'pyteach237/multilabel_lora_distilbert_runews_classifier_tuned'
config = PeftConfig.from_pretrained(model_id)

# Load the base model and attach the LoRA adapter weights
model = DistilBertForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3
)
model = PeftModel.from_pretrained(model, model_id, config=config)
model.eval()

text = "Your text here :)"

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

# Make predictions
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Convert predictions to labels
labels = ['negative', 'neutral', 'positive']
predicted_label = labels[predictions.item()]
print(f'Predicted label: {predicted_label}')
```
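Continuing from the snippet above, the logits can also be turned into class probabilities if score outputs are preferred over a single label:

```python
# Optional follow-up to the snippet above: per-class probabilities via softmax
probs = torch.softmax(outputs.logits, dim=-1)
for label, p in zip(labels, probs.squeeze().tolist()):
    print(f'{label}: {p:.3f}')
```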
## Acknowledgements
This model card template was inspired by the Hugging Face model cards. Special thanks to the contributors of the Hugging Face `transformers` library and the LoRA adaptation framework (PEFT).
## Contact Information
For further information, please contact Brice Donald at [email protected].
## Framework versions
- PEFT 0.11.1
- Transformers 4.41.2
- PyTorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1