---
tags:
- pytorch
- sentiment-analysis
- yoruba
- cnn
- afriberta
---
# Yoruba Sentiment Analysis with CNN and Afriberta
This repository contains a PyTorch model for sentiment analysis of Yoruba text. The model places a Convolutional Neural Network (CNN) on top of a pre-trained Afriberta model, specifically `Davlan/naija-twitter-sentiment-afriberta-large`.
## Model Description
The model consists of the following components:
- **Afriberta Base:** The pre-trained Afriberta model serves as a powerful feature extractor for Yoruba text.
- **CNN Layers:** Multiple 1D convolutional layers with varying kernel sizes capture local patterns and n-gram features from the Afriberta embeddings.
- **Max Pooling:** Max pooling layers extract the most salient features from the convolutional outputs.
- **Dropout:** Dropout regularization helps prevent overfitting.
- **Fully Connected Layer:** A final fully connected layer maps the concatenated pooled features to sentiment classes.
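The components listed above can be sketched as a small PyTorch module. This is a minimal illustration, not the released `SentimentCNNModel` code: the class name `TextCNNHead`, the hidden size of 768, the filter count, the kernel sizes, the dropout rate, and the three output classes are all assumptions chosen for the example. The encoder itself is replaced by a random tensor standing in for Afriberta token embeddings.

```python
import torch
import torch.nn as nn


class TextCNNHead(nn.Module):
    """Hypothetical CNN head over encoder token embeddings (illustrative only)."""

    def __init__(self, hidden_size=768, num_filters=100,
                 kernel_sizes=(2, 3, 4), num_classes=3, dropout=0.5):
        super().__init__()
        # One 1D convolution per kernel size; each acts like an n-gram detector.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden_size, num_filters, kernel_size=k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        # The classifier sees the pooled features of all kernel sizes concatenated.
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, hidden_size) -> (batch, hidden_size, seq_len)
        x = embeddings.transpose(1, 2)
        # Max pooling over the sequence dimension keeps the strongest activation.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)


head = TextCNNHead()
dummy_embeddings = torch.randn(2, 16, 768)  # stand-in for encoder output
logits = head(dummy_embeddings)
print(logits.shape)
```

In a full model, `dummy_embeddings` would be replaced by the last hidden state of the pre-trained encoder.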
## Intended Uses & Limitations
This model is designed for sentiment analysis of Yoruba text and can be applied to various use cases, such as:
- **Social Media Monitoring:** Analyze sentiment expressed in Yoruba tweets or social media posts.
- **Customer Feedback Analysis:** Understand customer sentiment towards products or services in Yoruba.
- **Opinion Mining:** Extract opinions and sentiments from Yoruba text data.
**Limitations:**
- The model's performance may be limited by the size and quality of the training data.
- It may not generalize well to domains significantly different from the training data.
- As with any language model, there's a risk of bias and potential for misuse.
## Training and Evaluation Data
The model was trained on a dataset of Yoruba tweets annotated with sentiment labels. The dataset was split into training, validation, and test sets.
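The split ratios are not stated here. As a rough sketch, a shuffled 80/10/10 split could look like the following; the function name `split_dataset`, the fractions, the seed, and the toy examples are assumptions for illustration only.

```python
import random


def split_dataset(examples, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle examples and split them into train, validation, and test lists."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Toy (text, label) pairs standing in for annotated tweets.
data = [(f"tweet {i}", i % 3) for i in range(100)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```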
## Training Procedure
The model was trained using the following steps:
1. **Data Preprocessing:** Text data was tokenized using the Afriberta tokenizer.
2. **Model Initialization:** The SentimentCNNModel was initialized with the pre-trained Afriberta model and CNN layers.
3. **Optimization:** The model was trained using the Adam optimizer and cross-entropy loss.
4. **Early Stopping:** Training was stopped early based on validation loss to prevent overfitting.
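The optimization and early-stopping steps above can be sketched as follows. This is a toy illustration under stated assumptions, not the actual training script: a linear layer and random tensors stand in for the real model and data, and the learning rate, patience, and epoch limit are made-up values.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic features and labels standing in for the tokenized tweets.
X_train, y_train = torch.randn(64, 8), torch.randint(0, 3, (64,))
X_val, y_val = torch.randn(32, 8), torch.randint(0, 3, (32,))

model = nn.Linear(8, 3)  # placeholder for the real sentiment model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()

patience, bad_epochs = 3, 0
best_val_loss, best_state = float("inf"), None

for epoch in range(50):
    # One full-batch training step.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)
    loss.backward()
    optimizer.step()

    # Validation loss drives early stopping.
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val), y_val).item()

    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
        best_state = copy.deepcopy(model.state_dict())
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # no improvement for `patience` consecutive epochs

# Restore the weights from the best validation epoch.
model.load_state_dict(best_state)
```

Keeping a copy of the best weights means the final model reflects the epoch with the lowest validation loss rather than the last epoch run.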
## Evaluation Results
The model achieved the following performance on the test set:
- **Test Loss:** 0.6707
- **F1-Score:** 0.8095
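The averaging method behind the reported F1-score is not specified. For reference, a macro-averaged F1 can be computed from per-class precision and recall as in this sketch; the function names and the toy labels are illustrative assumptions, and the result is unrelated to the numbers above.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # also avoids division by zero when precision or recall is undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy labels for three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred)
print(f"macro F1: {score:.4f}")
```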
## How to Use
1. **Install Dependencies:** Ensure you have PyTorch and Transformers installed:
```bash
pip install torch transformers
```
2. **Load the Model:** You can load the model using the Hugging Face `transformers` library:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_name = "Testys/cnn_sent_yor"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```
3. **Make Predictions:** Use the tokenizer to prepare your input text and the model to get predictions:
```python
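import torch

def predict(text):
    # Tokenize the input and return PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
    with torch.no_grad():  # inference only, so gradients are not needed
        outputs = model(**inputs)
    # Convert logits to class probabilities for the single input.
    return torch.softmax(outputs.logits, dim=1).squeeze(0).tolist()

sample_text = "Your Yoruba text here"
prediction = predict(sample_text)
print("Class probabilities:", prediction)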
```
## Citing the Model
If you use this model in your research, please cite it using the following format:
```bibtex
@misc{your_model_name,
  author = {Your Name},
  title = {Yoruba Sentiment Analysis with CNN and Afriberta},
  year = {2024},
  publisher = {Hugging Face's Model Hub},
  journal = {Hugging Face's Model Hub},
  howpublished = {\url{https://huggingface.co/your_model_name}}
}
```
## License
This model is open-sourced under the MIT license. The license allows commercial use, modification, distribution, and private use.
## Contact Information
For any queries regarding the model, feel free to reach out via GitHub or direct email:
- **GitHub:** [https://github.com/dev-tyta](https://github.com/dev-tyta)
- **Email:** [[email protected]]