Update README.md
README.md CHANGED
---
library_name: transformers
tags: []
---

# Wiki Toxic Comment Classification Model Card

## Model Details

| Model Name | Wiki Toxic |
| --- | --- |
| License | apache-2.0 |
| Dataset | wiki_toxic |
| Language | English |

## Model Metrics

| Metric | Value | Description |
| --- | --- | --- |
| Accuracy | 0.87 | Overall accuracy on the test set |
| Precision | 0.85 (0), 0.89 (1) | Precision for the non-toxic and toxic classes |
| Recall | 0.90 (0), 0.85 (1) | Recall for the non-toxic and toxic classes |
| F1-Score | 0.87 (0), 0.87 (1) | F1-score for the non-toxic and toxic classes |
| Macro Avg | Precision: 0.87 <br> Recall: 0.87 <br> F1-Score: 0.87 | Macro-averaged values across classes |
| Weighted Avg | Precision: 0.87 <br> Recall: 0.87 <br> F1-Score: 0.87 | Weighted-averaged values across classes |
| Support | 0: 175 <br> 1: 175 <br> Total: 350 | Number of test examples per class |
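
The per-class figures above follow the layout of scikit-learn's `classification_report`. As a rough illustration only, the sketch below shows how a report of this shape could be reproduced; the Hub model ID, the dataset identifier, the column names, and the `LABEL_0`/`LABEL_1` output names are assumptions, not confirmed details of this repository.

```python
from datasets import load_dataset
from sklearn.metrics import classification_report
from transformers import pipeline

# Placeholder model ID -- substitute the actual repository name for this model.
clf = pipeline("text-classification", model="your-username/wiki-toxic-classifier")

# Assumed Hub identifier and column names ("comment_text", "label") for wiki_toxic.
test_set = load_dataset("OxAISH-AL-LLM/wiki_toxic", split="test")
sample = test_set.shuffle(seed=42).select(range(350))  # small sample; the card reports 350 test examples

# Map pipeline outputs such as "LABEL_1" back to integer class IDs (assumed label naming).
preds = [int(out["label"].split("_")[-1]) for out in clf(sample["comment_text"], truncation=True)]

print(classification_report(sample["label"], preds, digits=2))
```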

## Model Description

This model has been trained on the wiki_toxic dataset, comprising comments from Wikipedia talk pages labeled as toxic or non-toxic. Its performance is evaluated on a held-out test set, with results indicating balanced performance across both classes.

Achieving an overall accuracy of 0.87, the model demonstrates a strong ability to distinguish toxic from non-toxic comments. For the non-toxic class (0), recall of 0.90 shows that most benign comments are correctly recognized, while for the toxic class (1), precision of 0.89 indicates a low rate of false positives when flagging toxicity.

While the model performs well, there is room for enhancement. Raising recall for the toxic class and precision for the non-toxic class (both 0.85) could further boost its performance. This may involve fine-tuning the model, incorporating additional features, or expanding the dataset to cover a broader range of toxic comment variations.
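
As a loose illustration of the fine-tuning route mentioned above, the sketch below continues training with the `transformers` `Trainer`; the model ID, dataset ID, split names, and column names are assumptions rather than details confirmed by this card.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "your-username/wiki-toxic-classifier"  # placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

dataset = load_dataset("OxAISH-AL-LLM/wiki_toxic")  # assumed Hub identifier

def tokenize(batch):
    # "comment_text" is the assumed text column name.
    return tokenizer(batch["comment_text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="wiki-toxic-finetune",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
)
trainer.train()
```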

## Intended Uses & Limitations

The Wiki Toxic model is designed for comment classification tasks, specifically identifying toxic behavior in online discussions. It can be employed in moderation systems to flag potentially harmful comments, fostering a healthier online environment.
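
For illustration, a minimal moderation-style usage sketch; the Hub model ID and the `LABEL_0`/`LABEL_1` output names are assumptions, and the flagging threshold is arbitrary.

```python
from transformers import pipeline

# Placeholder model ID -- substitute the actual repository name for this model.
toxicity = pipeline("text-classification", model="your-username/wiki-toxic-classifier")

comments = [
    "Thanks for the thoughtful edit, this section reads much better now.",
    "You are an idiot and your edits are garbage.",
]

for comment, result in zip(comments, toxicity(comments, truncation=True)):
    # Assumes LABEL_1 is the toxic class; tune the threshold to your moderation policy.
    flagged = result["label"] == "LABEL_1" and result["score"] >= 0.5
    verdict = "FLAG" if flagged else "OK"
    print(f"{verdict:<4} {result['score']:.2f}  {comment}")
```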

However, it's crucial to acknowledge that the model's performance is tied to the data it was trained on. As such, its effectiveness may vary with different datasets or comment styles. Additionally, the model doesn't consider context, user relationships, or nuances of language, which could impact its accuracy in real-world applications.

## Training Data

The wiki_toxic dataset serves as the training data for this model. It contains comments from Wikipedia talk pages, manually labeled as toxic or non-toxic by human annotators, and offers a diverse range of comments that helps the model learn to identify toxic behavior.
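
A quick look at the dataset with the `datasets` library; the Hub identifier and field names shown here are assumptions, since the card only names the dataset as wiki_toxic.

```python
from datasets import load_dataset

# Assumed Hub identifier for the wiki_toxic dataset; adjust if the actual source differs.
wiki_toxic = load_dataset("OxAISH-AL-LLM/wiki_toxic")

print(wiki_toxic)              # available splits and their sizes
print(wiki_toxic["train"][0])  # expected fields (assumed): a comment_text string and a binary label
```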

## Ethical Considerations

It is important to note that the model's performance depends on the quality and representativeness of the training data. As such, it may reflect biases present in the data, potentially leading to unfair or inaccurate predictions. Careful monitoring and ongoing evaluation are necessary to ensure the model's responsible use and to address any ethical concerns.

## Acknowledgements

We would like to acknowledge the contributors who curated the wiki_toxic dataset and made it publicly available. Their efforts have significantly advanced the development of toxic comment classification models, fostering a safer online community.