Update README.md
This repository features sentiment analysis projects that leverage BERT, a leading NLP model.
This project involves pre-processing, tokenization, and BERT customization for airline tweet sentiment classification.
The tasks use the pretrained "BERT base model (uncased)" together with the dataset https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment; results were produced in several stages, and the evaluation is summarized below.

Accuracy: 0.8203551912568307

Colab notebook for improvements: https://colab.research.google.com/drive/1IQen2iNXkjOgdzjyi7PQyLFqHyqHTF3A?usp=sharing
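The pre-processing step itself is not shown in this README; as a rough illustration only (not necessarily the repository's actual pipeline), airline tweets are typically stripped of @handles and URLs before tokenization:

```python
import re

def clean_tweet(text: str) -> str:
    """Minimal tweet cleanup: drop @mentions and URLs, collapse whitespace.

    Illustrative sketch only -- the project's real pre-processing may differ.
    """
    text = re.sub(r"@\w+", "", text)          # remove @airline handles
    text = re.sub(r"https?://\S+", "", text)  # remove links
    return re.sub(r"\s+", " ", text).strip()  # normalize whitespace

print(clean_tweet("@united flight delayed again... https://t.co/xyz so frustrating"))
# -> flight delayed again... so frustrating
```

The cleaned string is what would then be passed to the BERT tokenizer.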
|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| macro avg    | 0.76      | 0.76   | 0.76     | 1464    |
| weighted avg | 0.82      | 0.82   | 0.82     | 1464    |
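For context on the table: the macro average weights all three sentiment classes equally, while the weighted average weights each class by its support, which is why the two rows differ on an imbalanced test set. A minimal sketch of both averages, using hypothetical per-class F1 scores and supports (the README does not report the per-class breakdown, so these exact numbers are illustrative):

```python
# Hypothetical per-class F1 scores and supports (illustrative values only;
# chosen so the averages land near the README's 0.76 macro / 0.82 weighted).
f1 = {"negative": 0.88, "neutral": 0.66, "positive": 0.74}
support = {"negative": 950, "neutral": 280, "positive": 234}  # sums to 1464

macro_f1 = sum(f1.values()) / len(f1)                                   # equal weight per class
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())  # weight by support

print(round(macro_f1, 2))     # pulled down by the smaller classes
print(round(weighted_f1, 2))  # dominated by the large negative class

# Sanity check: the reported accuracy 0.8203551912568307 is consistent with
# 1201 correct predictions out of the 1464 test examples.
assert abs(1201 / 1464 - 0.8203551912568307) < 1e-12
```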
The sentiment classification model achieved a promising overall accuracy of 82.04%, built on BertForSequenceClassification and trained for 10 epochs with the AdamW optimizer. The model exhibited stable performance, with validation accuracy consistently between 0.79 and 0.81, indicating effective learning. It also showed high precision for negative sentiment (0.88), with moderate scores for neutral (0.68) and positive (0.72) sentiments. These results were supported by recall and F1-score metrics, providing a comprehensive picture of performance across sentiment classes. Analysis of the confusion matrix revealed strong alignment between model predictions and actual labels, albeit with room for improvement, such as addressing overfitting or adjusting parameters, given the performance fluctuations across epochs.
Developed by: Mastika