Update README.md
README.md CHANGED
@@ -61,6 +61,36 @@ We evaluated the _roberta-base-ca-cased-sts_ on the STS-ca test set against stan
 
 For more details, check the fine-tuning and evaluation scripts in the official [GitHub repository](https://github.com/projecte-aina/berta).
 
+## How to use
+To get the model's correct<sup>1</sup> prediction scores, with values between 0.0 and 5.0, use the following code:
+
+```python
+from transformers import pipeline, AutoTokenizer
+from scipy.special import logit
+
+model = 'projecte-aina/roberta-base-ca-cased-sts'
+tokenizer = AutoTokenizer.from_pretrained(model)
+pipe = pipeline('text-classification', model=model, tokenizer=tokenizer)
+
+def prepare(sentence_pairs):
+    sentence_pairs_prep = []
+    for s1, s2 in sentence_pairs:
+        sentence_pairs_prep.append(f"{tokenizer.cls_token} {s1}{tokenizer.sep_token}{tokenizer.sep_token} {s2}{tokenizer.sep_token}")
+    return sentence_pairs_prep
+
+sentence_pairs = [("El llibre va caure per la finestra.", "El llibre va sortir volant."),
+                  ("M'agrades.", "T'estimo."),
+                  ("M'agrada el sol i la calor", "A la Garrotxa plou molt.")]
+
+predictions = pipe(prepare(sentence_pairs), add_special_tokens=False)
+
+# convert the scores back to the original 0.0-5.0 interval
+for prediction in predictions:
+    prediction['score'] = logit(prediction['score'])
+print(predictions)
+```
+
+1: Avoid using the widget scores, since they are normalized and do not reflect the original annotation values.
 ## Citing
 If you use any of these resources (datasets or models) in your work, please cite our latest paper:
 ```bibtex
@@ -84,7 +114,3 @@ If you use any of these resources (datasets or models) in your work, please cite
 pages = "4933--4946",
 }
 ```
-## Funding
-TODO
-## Disclaimer
-TODO
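The `prepare` helper in the added code hand-builds RoBERTa's sentence-pair encoding, which is why the pipeline is then called with `add_special_tokens=False`. A standalone sketch of that formatting, with RoBERTa's usual special tokens hardcoded as an assumption (the README reads them from the loaded tokenizer instead):

```python
# RoBERTa-style special tokens, hardcoded here as an assumption;
# the actual values normally come from the loaded tokenizer.
cls_token = "<s>"
sep_token = "</s>"

def prepare(sentence_pairs):
    # Build "<s> A</s></s> B</s>" by hand, so the pipeline can be
    # called with add_special_tokens=False without adding them twice.
    return [f"{cls_token} {s1}{sep_token}{sep_token} {s2}{sep_token}"
            for s1, s2 in sentence_pairs]

print(prepare([("M'agrades.", "T'estimo.")]))
# → ["<s> M'agrades.</s></s> T'estimo.</s>"]
```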
|
|
|
|
|
|
|
|
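The final loop in the added code relies on `scipy.special.logit` being the exact inverse of the sigmoid that the `text-classification` pipeline applies to the model's single regression output. A minimal round-trip check of that step, with no model download needed (`3.7` is an arbitrary hypothetical score on the 0.0-5.0 scale):

```python
from scipy.special import expit, logit  # expit is the logistic sigmoid

raw_score = 3.7                      # hypothetical raw similarity on the 0.0-5.0 scale
pipeline_score = expit(raw_score)    # what the pipeline would report (sigmoid-squashed)
recovered = logit(pipeline_score)    # the README's conversion step undoes the sigmoid

assert abs(recovered - raw_score) < 1e-9
print(round(recovered, 6))
# → 3.7
```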