kelingwang committed
Commit e3e6403
Parent(s): 8ead19d
Update README.md

README.md CHANGED
@@ -41,7 +41,7 @@ datasets:
 # Model description
 This `bert-causation-rating-dr1` model is a [biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) model fine-tuned on a small set of manually annotated texts with causation labels. The model classifies a sentence into levels of the strength of causation it expresses.
 
-This `dr1` version is tuned on the set of sentences rated by Rater 1.
+The sentences in the dataset were rated independently by two researchers. This `dr1` version is tuned on the set of sentences with labels rated by Rater 1.
 
 # Intended use and limitations
 
@@ -69,7 +69,7 @@ This performance is achieved with the following hyperparameters:
 * Weight decay: 0.111616
 * Warmup ratio: 0.301057
 * Power of polynomial learning rate scheduler: 2.619975
-* Power to the distance measure used in the loss function
+* Power to the distance measure used in the loss function α: 2.0
 
 
 ## Hyperparameter tuning metrics
@@ -82,13 +82,13 @@ The following training configurations apply:
 * `batch_size`: 128
 * `epoch`: 8
 * `max_length` in `torch.utils.data.Dataset`: 128
-* Loss function: the [OLL loss](https://aclanthology.org/2022.coling-1.407/) with a tunable hyperparameter
+* Loss function: the [OLL loss](https://aclanthology.org/2022.coling-1.407/) with a tunable hyperparameter α (the power to the distance measure used in the loss function)
 * `lr`: 7.94278e-05
 * `weight_decay`: 0.111616
 * `warmup_ratio`: 0.301057
 * `lr_scheduler_type`: polynomial
 * `lr_scheduler_kwargs`: `{"power": 2.619975, "lr_end": 1e-8}`
-* Power to the distance measure used in the loss function
+* Power to the distance measure used in the loss function α: 2.0
 
 # Framework versions and devices
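For readers unfamiliar with the OLL loss referenced in the diff, the idea is that a wrong prediction is penalized more the farther its class is from the true ordinal label, with the label distance raised to the tunable power α. A minimal per-example sketch (plain Python for clarity; the actual training code presumably uses a batched PyTorch implementation, and the number of ordinal classes here is an assumption):

```python
import math

def oll_loss(probs, true_class, alpha=2.0):
    """OLL-alpha loss for a single example: each wrong class j contributes
    d(j, y)**alpha * -log(1 - p_j), where d is the absolute distance between
    class index j and the true label y, and alpha is the tuned power (2.0
    per the config above)."""
    return sum(
        (abs(j - true_class) ** alpha) * -math.log(max(1e-12, 1.0 - p))
        for j, p in enumerate(probs)
        if j != true_class
    )

# Probability mass near the true class is penalized far less than mass
# concentrated on a distant class.
near = oll_loss([0.1, 0.8, 0.1, 0.0, 0.0], true_class=1)
far = oll_loss([0.0, 0.0, 0.1, 0.1, 0.8], true_class=1)
```

With α = 0 this reduces to penalizing all misclassifications equally; larger α pushes the model harder to keep errors ordinally close to the true rating.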
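The scheduler settings in the diff (`polynomial` with `power` and `lr_end`, plus a warmup ratio) match the shape of the polynomial-decay-with-warmup schedule provided by `transformers`. A small sketch of how the learning rate would evolve under the listed values, assuming linear warmup followed by polynomial decay (an illustration of the schedule shape, not the repository's training code):

```python
def polynomial_lr(step, total_steps, lr_init=7.94278e-05, lr_end=1e-8,
                  power=2.619975, warmup_ratio=0.301057):
    """Learning rate at a given step: linear warmup over the first
    warmup_ratio fraction of training, then polynomial decay of the
    given power down to lr_end."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to lr_init.
        return lr_init * step / max(1, warmup_steps)
    # Polynomial decay from lr_init to lr_end.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return lr_end + (lr_init - lr_end) * (1.0 - progress) ** power
```

Because the power is well above 1, the rate drops off steeply early in the decay phase and then flattens as it approaches `lr_end`.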