# NLP-Sentiment-Analysis-Airline-Tweets-with-BERT-V2

This repository features sentiment analysis projects that leverage BERT, a leading NLP model. This project covers pre-processing, tokenization, and fine-tuning BERT for airline tweet sentiment classification.

The model starts from the original "BERT base model (uncased)" and uses this dataset: https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment. Training goes through several stages; the results on the evaluation set are below.

Accuracy: 0.8203551912568307

Colab notebook for improvements: https://colab.research.google.com/drive/1IQen2iNXkjOgdzjyi7PQyLFqHyqHTF3A?usp=sharing

# Classification report for more detailed evaluation

|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| negative     | 0.88      | 0.90   | 0.89     | 959     |
| neutral      | 0.68      | 0.58   | 0.62     | 293     |
| positive     | 0.72      | 0.81   | 0.76     | 212     |
| accuracy     |           |        | 0.82     | 1464    |
| macro avg    | 0.76      | 0.76   | 0.76     | 1464    |
| weighted avg | 0.82      | 0.82   | 0.82     | 1464    |

The sentiment classification model achieved an overall accuracy of 82.04%. It is built on BertForSequenceClassification and was trained for 10 epochs with the AdamW optimizer. Validation accuracy stayed between 0.79 and 0.81 throughout training, indicating effective learning.

Precision was highest for negative sentiment (0.88) and moderate for neutral (0.68) and positive (0.72) sentiment. The recall and F1-score columns complete the picture of performance across sentiment classes: the neutral class is the weakest, with a recall of 0.58.

The confusion matrix showed strong agreement between model predictions and actual labels. There is still room for improvement, for example by addressing overfitting or adjusting parameters, as suggested by the performance fluctuations across epochs.

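
The macro and weighted averages in the classification report can be re-derived from the per-class rows. The following is a minimal sketch in plain Python, not code taken from the notebook: the `report` dictionary and the helper names `macro_avg` and `weighted_avg` are illustrative assumptions. The inputs are the rounded values from the table, so the results should only be expected to agree to two decimals.

```python
# Per-class scores copied from the classification report (already rounded).
report = {
    "negative": {"precision": 0.88, "recall": 0.90, "f1": 0.89, "support": 959},
    "neutral":  {"precision": 0.68, "recall": 0.58, "f1": 0.62, "support": 293},
    "positive": {"precision": 0.72, "recall": 0.81, "f1": 0.76, "support": 212},
}

# Accuracy as reported for the evaluation set.
accuracy = 0.8203551912568307


def macro_avg(report, metric):
    """Unweighted mean of a metric over classes (hypothetical helper)."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(report, metric):
    """Mean of a metric over classes, weighted by class support (hypothetical helper)."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


total_support = sum(row["support"] for row in report.values())

# Number of correctly classified tweets implied by the reported accuracy.
correct = round(accuracy * total_support)

for metric in ("precision", "recall", "f1"):
    print(
        f"{metric:<9} macro={macro_avg(report, metric):.2f} "
        f"weighted={weighted_avg(report, metric):.2f}"
    )
print(f"correct predictions: {correct} of {total_support}")
```

The macro average treats all three classes equally, so the weaker neutral class pulls it down to 0.76. The weighted average is dominated by the 959 negative examples, which is why it sits close to the overall accuracy of 0.82.
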
Developed by: Mastika