Model Card for distilbert-base-uncased-finetuned-amazon-reviews
Table of Contents
- Model Details
- Uses
- Fine-tuning hyperparameters
- Accuracy
- Framework versions
Model Details
Model Description
This model is a fine-tuned version of distilbert-base-uncased on the amazon_reviews_multi dataset. It reaches an exact-match accuracy of 56.96% on the dev set.
- Model type: Language model
- Language(s) (NLP): en
- License: apache-2.0
- Parent Model: distilbert-base-uncased. For more details about DistilBERT, see its model card.
Uses
You can use this model directly with a pipeline for text classification.
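from transformers import pipeline
# Hub ID of the fine-tuned checkpoint
checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
# Build a text-classification pipeline (loads the model and tokenizer from the Hub)
classifier = pipeline("text-classification", model=checkpoint)
# Returns one predicted label and score per input string
classifier(["Replace me by any text you'd like."])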
Here is how to use the model directly in TensorFlow:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
# Load the tokenizer and the TensorFlow classification model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)
text = "Replace me by any text you'd like."
# Tokenize the text into TensorFlow tensors
encoded_input = tokenizer(text, return_tensors="tf")
# Forward pass; output.logits holds one raw score per class
output = model(encoded_input)
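The TensorFlow snippet above stops at the raw model output. The following is a minimal sketch of turning one row of logits into a star rating. It assumes the model has five classes and that class index k corresponds to k + 1 stars; neither is stated in this card, so check the checkpoint's `id2label` mapping before relying on it. The helper names and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_stars(logits):
    """Return (stars, probability).

    Assumption: class index k means k + 1 stars. Verify this against
    the checkpoint's id2label mapping.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best + 1, probs[best]


# Hypothetical logits for one review; with the snippet above they could
# come from output.logits[0].numpy().tolist()
example_logits = [-1.2, -0.3, 0.4, 2.1, 1.5]
stars, confidence = predicted_stars(example_logits)
print(stars, round(confidence, 3))
```

The helpers use only the standard library, so the same code can post-process logits exported from any framework.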
Training Details
Training and Evaluation Data
The model was fine-tuned on the raw amazon_reviews_multi dataset. The data contains 200,000 reviews in the training set and 5,000 each in the dev and test sets.
Fine-tuning hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
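To make the schedule concrete, here is a rough sketch of what the hyperparameters above imply for the number of optimizer steps and for the learning rate over time. It assumes single-device training, no gradient accumulation, no warmup, and linear decay to zero; the card states only `lr_scheduler_type: linear`, so these details are assumptions, and `linear_lr` is a hypothetical helper.

```python
import math

TRAIN_EXAMPLES = 200_000  # training-set size stated in this card
BATCH_SIZE = 16           # train_batch_size
NUM_EPOCHS = 5            # num_epochs
INITIAL_LR = 2e-5         # learning_rate

# Number of optimizer updates per epoch and over the whole run
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, initial_lr=INITIAL_LR, total=total_steps):
    """Learning rate at `step` under linear decay to zero (no warmup assumed)."""
    remaining = max(0.0, 1.0 - step / total)
    return initial_lr * remaining


print(steps_per_epoch, total_steps)
print(linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions the run takes 12,500 steps per epoch and 62,500 steps in total, with the learning rate halved at the midpoint.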
Accuracy
The fine-tuned model was evaluated on the dev and test sets of amazon_reviews_multi.
- Accuracy (exact) is the percentage of reviews where the predicted number of stars exactly matches the number given by the human reviewer.
- Accuracy (off-by-1) is the percentage of reviews where the predicted number of stars differs by at most 1 from the number given by the human reviewer.
Split | Accuracy (exact) | Accuracy (off-by-1) |
---|---|---|
Dev set | 56.96% | 85.50% |
Test set | 57.36% | 85.58% |
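The two metrics defined above can be sketched as follows. This is an illustrative implementation, not the evaluation script used for this card; the function names and the toy ratings are hypothetical.

```python
def exact_accuracy(predicted, actual):
    """Fraction of reviews where the predicted stars equal the true stars."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def off_by_one_accuracy(predicted, actual):
    """Fraction of reviews where the prediction is within 1 star of the truth."""
    return sum(abs(p - a) <= 1 for p, a in zip(predicted, actual)) / len(actual)


# Toy example with made-up star ratings (1 to 5)
predicted = [5, 4, 1, 3, 2, 5]
actual = [5, 5, 3, 3, 1, 2]

print(f"exact:    {exact_accuracy(predicted, actual):.2%}")
print(f"off-by-1: {off_by_one_accuracy(predicted, actual):.2%}")
```

In the toy data, 2 of 6 predictions match exactly and 4 of 6 fall within one star. Off-by-1 accuracy is always at least as high as exact accuracy, because every exact match also counts as within one star.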
Framework versions
- Transformers 4.26.1
- TensorFlow 2.11.0
- Datasets 2.1.0
- Tokenizers 0.13.2
Evaluation results
- Accuracy top2 on the amazon_reviews_multi test set (self-reported): 0.856
- Loss on the amazon_reviews_multi test set (self-reported): 1.234