---
tags:
- pytorch
- sentiment-analysis
- yoruba
- cnn
- afriberta
---

# Yoruba Sentiment Analysis with CNN and AfriBERTa

This repository contains a PyTorch model for sentiment analysis of Yoruba text. The model applies a Convolutional Neural Network (CNN) on top of a pre-trained AfriBERTa model, specifically `Davlan/naija-twitter-sentiment-afriberta-large`.

## Model Description

The model consists of the following components:

- **AfriBERTa Encoder:** The pre-trained AfriBERTa model serves as the feature extractor for Yoruba text.
- **CNN Layers:** Multiple 1D convolutional layers with varying kernel sizes capture local patterns and n-gram features from the AfriBERTa embeddings.
- **Max Pooling:** Max pooling layers extract the most salient features from the convolutional outputs.
- **Dropout:** Dropout regularization helps prevent overfitting.
- **Fully Connected Layer:** A final fully connected layer maps the concatenated pooled features to sentiment classes.
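
The components above can be sketched as a small PyTorch module. This is a minimal illustration under stated assumptions, not the repository's actual `SentimentCNNModel`: the class name `CNNSentimentHead`, the hidden size of 768, the kernel sizes, the filter count, the dropout rate, and the three output classes are all hypothetical. The module expects encoder token embeddings of shape `(batch, seq_len, hidden_size)`, for example an encoder's last hidden state.

```python
import torch
import torch.nn as nn


class CNNSentimentHead(nn.Module):
    """Hypothetical CNN head over encoder token embeddings."""

    def __init__(self, hidden_size=768, num_filters=100,
                 kernel_sizes=(3, 4, 5), num_classes=3, dropout=0.5):
        super().__init__()
        # One 1D convolution per kernel size; each acts as an n-gram detector.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden_size, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        # The classifier sees the pooled features of all kernel sizes, concatenated.
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, hidden_states):
        # Conv1d expects (batch, channels, seq_len), so swap the last two axes.
        x = hidden_states.transpose(1, 2)
        # Max-pool over the sequence axis to keep the strongest activation per filter.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)  # unnormalized class scores (logits)
```

The sequence length must be at least as large as the largest kernel size, otherwise the corresponding convolution has no valid window.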

## Intended Uses & Limitations

This model is designed for sentiment analysis of Yoruba text and can be applied to use cases such as:

- **Social Media Monitoring:** Analyze sentiment expressed in Yoruba tweets or social media posts.
- **Customer Feedback Analysis:** Understand customer sentiment towards products or services in Yoruba.
- **Opinion Mining:** Extract opinions and sentiments from Yoruba text data.

**Limitations:**

- The model's performance may be limited by the size and quality of the training data.
- It may not generalize well to domains significantly different from the training data.
- As with any language model, there is a risk of bias and potential for misuse.

## Training and Evaluation Data

The model was trained on a dataset of Yoruba tweets annotated with sentiment labels. The dataset was split into training, validation, and test sets.

## Training Procedure

The model was trained using the following steps:

1. **Data Preprocessing:** Text data was tokenized using the AfriBERTa tokenizer.
2. **Model Initialization:** The `SentimentCNNModel` was initialized with the pre-trained AfriBERTa model and CNN layers.
3. **Optimization:** The model was trained using the Adam optimizer and cross-entropy loss.
4. **Early Stopping:** Training was stopped early based on validation loss to prevent overfitting.
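
The optimization and early-stopping steps can be sketched as follows. This is an illustrative loop, not the training script used for this model: the function name `train_with_early_stopping`, the learning rate, the patience value, and the epoch limit are assumptions, and tokenization is left out, so the batches are assumed to be already prepared `(features, labels)` tensor pairs.

```python
import copy

import torch
import torch.nn as nn


def train_with_early_stopping(model, train_batches, val_batches,
                              lr=1e-3, patience=2, max_epochs=20):
    """Train with Adam and cross-entropy; stop when validation loss stalls."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    best_loss, best_state, bad_epochs = float("inf"), None, 0
    history = []  # validation loss per epoch

    for _ in range(max_epochs):
        model.train()
        for features, labels in train_batches:
            optimizer.zero_grad()
            loss = criterion(model(features), labels)
            loss.backward()
            optimizer.step()

        # Average validation loss over all validation batches.
        model.eval()
        with torch.no_grad():
            val_loss = sum(
                criterion(model(f), y).item() for f, y in val_batches
            ) / len(val_batches)
        history.append(val_loss)

        if val_loss < best_loss:
            # Improvement: remember these weights and reset the counter.
            best_loss, bad_epochs = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # no improvement for `patience` consecutive epochs

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best checkpoint
    return best_loss, history
```

Restoring the best checkpoint at the end is a design choice of this sketch; it keeps the weights from the epoch with the lowest validation loss rather than those from the last epoch run.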

## Evaluation Results

The model achieved the following performance on the test set:

- **Test Loss:** 0.6707
- **F1-Score:** 0.8095
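
The card does not state how the F1-score was averaged across classes. As one possibility, a macro-averaged F1 can be computed as in the sketch below; the function name `macro_f1` and the choice of macro averaging are assumptions made for illustration only.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes
```

Macro averaging weights every class equally, so it is sensitive to poor performance on rare classes; a weighted average would instead follow the class frequencies.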

## How to Use

1. **Install Dependencies:** Ensure you have PyTorch and Transformers installed:

```bash
pip install torch transformers
```

2. **Load the Model:** You can load the model using the Hugging Face `transformers` library:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "Testys/cnn_sent_yor"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

3. **Make Predictions:** Use the tokenizer to prepare your input text and the model to get predictions:

```python
import torch

def predict(text):
    # Tokenize the input, truncating it to at most 128 tokens.
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
    with torch.no_grad():  # inference only, so gradients are not needed
        outputs = model(**inputs)
    # Softmax over the class dimension; return the probabilities as a list of floats.
    return torch.softmax(outputs.logits, dim=1).squeeze(0).tolist()

sample_text = "Your Yoruba text here"
prediction = predict(sample_text)
print("Sentiment probabilities:", prediction)
```
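
To turn class probabilities into a single sentiment label, one option is to pick the highest-scoring class and look it up in a label mapping. The sketch below assumes the prediction is available as a list of probabilities. The helper `top_label` and the mapping `ID2LABEL` are hypothetical; the real index-to-label order should be taken from the model's configuration (for example `model.config.id2label`, if the checkpoint defines it).

```python
def top_label(probabilities, id2label):
    """Return (label, probability) for the highest-scoring class."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return id2label[best_index], probabilities[best_index]


# Hypothetical mapping for illustration; verify it against the model config.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

label, score = top_label([0.1, 0.2, 0.7], ID2LABEL)
print(f"Sentiment: {label} ({score:.2f})")
```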

## Citing the Model

If you use this model in your research, please cite it using the following format:

```bibtex
@misc{your_model_name,
  author = {Your Name},
  title = {Yoruba Sentiment Analysis with CNN and AfriBERTa},
  year = {2024},
  publisher = {Hugging Face's Model Hub},
  journal = {Hugging Face's Model Hub},
  howpublished = {\url{https://huggingface.co/your_model_name}}
}
```

## License

This model is released under the MIT License, which permits commercial use, modification, distribution, and private use.

## Contact Information

For any queries regarding the model, feel free to reach out via GitHub or direct email:

- **GitHub:** https://github.com/dev-tyta
- **Email:** [email protected]