---
license: mit
language:
- en
metrics:
- accuracy
- matthews_correlation
widget:
- text: "Highway work zones create potential risks for both traffic and workers in addition to traffic congestion and delays that result in increased road user delay."
- text: "A circular economy is a way of achieving sustainable consumption and production, as well as nature positive outcomes."
---
# sadickam/sdgBERT (previously - sadickam/sdg-classification-bert)
<!-- Provide a quick summary of what the model is/does. -->
sdgBERT (previously named "sdg-classification-bert") is an NLP model for classifying text with respect to the United Nations Sustainable Development Goals (SDGs).

Source: https://www.un.org/development/desa/disabilities/about-us/sustainable-development-goals-sdgs-and-disability.html
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This text classification model was developed by fine-tuning the bert-base-uncased pre-trained model. The training data for this fine-tuned model was sourced from the publicly available OSDG Community Dataset (OSDG-CD) Version 2023.10 at https://zenodo.org/records/8397907.
This model was developed as part of academic research at Deakin University. The goal was to provide a transformer-based SDG text classification model that anyone can use. Only the first 16 UN SDGs are supported. The primary model details are listed below:
- **Model type:** Text classification
- **Language(s) (NLP):** English
- **License:** mit
- **Finetuned from model:** bert-base-uncased
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://huggingface.co/sadickam/sdg-classification-bert
- **Demo:** option 1 (copy/paste text and CSV): https://sadickam-sdg-text-classifier.hf.space/; option 2 (PDF documents): https://sadickam-document-sdg-app-cpu.hf.space
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
This is a fine-tuned model and therefore requires no further training.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("sadickam/sdg-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdg-classification-bert")
```
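The snippet above only loads the tokenizer and the model. Below is a minimal inference sketch, assuming PyTorch is installed and that label names can be read from `model.config.id2label` (this card does not list them). The helpers `softmax`, `top_prediction`, and `classify` are hypothetical names introduced for illustration, not part of the model's API.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to the class index if the config has no name for it.
    return id2label.get(best, str(best)), probs[best]


def classify(text, model_id="sadickam/sdg-classification-bert"):
    """Download the model and classify one text (needs network access)."""
    # Local imports keep the helpers above usable without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # BERT accepts at most 512 tokens, so longer inputs are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_prediction(logits, model.config.id2label)
```

Calling `classify("A circular economy is a way of achieving sustainable consumption and production.")` returns a label and its probability; the exact label string depends on what the model's configuration stores.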
## Training Data
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The training data includes text from a wide range of industries and academic research fields. Hence, this fine-tuned model is not specific to any one industry.
See the training data here: https://zenodo.org/records/8397907
## Training Hyperparameters
- Num_epoch = 3
- Learning rate = 5e-5
- Batch size = 16
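
As a rough sketch, the three values above could be passed to the `transformers` `TrainingArguments` class as shown below. Only the epoch count, learning rate, and batch size come from this card. The output directory, the reading of batch size as per-device, and all remaining defaults (optimizer, scheduler, sequence length) are assumptions, and the helper names are hypothetical.

```python
import math

# The three values listed in this card; everything else is an assumption.
HYPERPARAMS = {"num_train_epochs": 3, "learning_rate": 5e-5, "batch_size": 16}


def total_optimizer_steps(num_examples, batch_size=16, num_epochs=3):
    """Optimizer updates, assuming one device and no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * num_epochs


def build_training_args(output_dir="sdg-bert-finetune"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is hypothetical)."""
    # Local import so the step calculation works without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=HYPERPARAMS["num_train_epochs"],
        learning_rate=HYPERPARAMS["learning_rate"],
        # Assumes the reported batch size was the per-device batch size.
        per_device_train_batch_size=HYPERPARAMS["batch_size"],
    )
```

For example, `total_optimizer_steps(1000)` gives the number of updates for a hypothetical training set of 1,000 examples under these settings.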
## Evaluation
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
- Accuracy = 0.90
- Matthews correlation = 0.89
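
For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how accuracy and the multiclass Matthews correlation coefficient can be computed from true and predicted labels. It is not the evaluation script used for this model, and the function names are hypothetical. scikit-learn's `matthews_corrcoef` computes the same multiclass statistic.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    MCC = (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    with c = correct predictions, s = samples, t_k = times class k is true,
    and p_k = times class k is predicted.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in t_counts)
    var_true = s * s - sum(v * v for v in t_counts.values())
    var_pred = s * s - sum(v * v for v in p_counts.values())
    denom = math.sqrt(var_true * var_pred)
    # By convention the score is 0 when either side has only one class.
    return numerator / denom if denom else 0.0


# Toy labels for illustration only; these are not the model's results.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 1]
print(accuracy(y_true, y_pred), multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, the Matthews correlation accounts for how predictions are distributed across classes, which makes it more informative when classes are imbalanced.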
## Citation
Will be provided soon. Paper currently under review.
<!-- Sadick, A.M. (2023). SDG classification with BERT. https://huggingface.co/sadickam/sdg-classification-bert -->
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
## Model Card Contact
[email protected]