davebulaval committed · commit 46b176e · 1 parent: c5be21b
Update README.md

README.md CHANGED
@@ -29,4 +29,61 @@ tags:
- text-simplification
- meaning
- assess
---

# Here is MeaningBERT

MeaningBERT is an automatic and trainable metric for assessing meaning preservation between sentences. MeaningBERT was proposed in our article [MeaningBERT: assessing meaning preservation between sentences](https://www.frontiersin.org/articles/10.3389/frai.2023.1223924/full). Its goal is to produce meaning preservation scores between two sentences that correlate highly with human judgments and pass our sanity checks. For more details, refer to our publicly available article.
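
For illustration only, here is a minimal sketch of how a sentence pair could be scored with the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hub as a sequence-classification model with a single regression output, and that the identifier and roughly [0, 100] score range below are correct; treat all of these as assumptions rather than the official API.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed Hub identifier; check the model card for the actual checkpoint name.
MODEL_ID = "davebulaval/MeaningBERT"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

source = "The cat sat quietly on the warm windowsill."
simplification = "The cat sat on the windowsill."

# Encode the two sentences as a single pair so the model can compare them.
inputs = tokenizer(source, simplification, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()

print(f"Meaning preservation score: {score:.2f}")  # assumed to lie roughly in [0, 100]
```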

## Sanity Check

Correlation to human judgment is one way to evaluate the quality of a meaning preservation metric. However, it is inherently subjective, since it uses human judgment as a gold standard, and expensive, since it requires a large dataset annotated by several humans. As an alternative, we designed two automated tests: evaluating meaning preservation between identical sentences (which should be 100% preserving) and between unrelated sentences (which should be 0% preserving). In these tests, the meaning preservation target value is not subjective and does not require human annotation to measure. They represent a trivial and minimal threshold that a good automatic meaning preservation metric should be able to achieve. Namely, a metric should minimally be able to return a perfect score (i.e., 100%) when two identical sentences are compared and a null score (i.e., 0%) when two sentences are completely unrelated.

### Identical sentences

The first test evaluates meaning preservation between identical sentences. To analyze a metric's ability to pass this test, we count the number of times its rating is greater than or equal to a threshold value X ∈ [95, 99] and divide that count by the number of sentences, giving the ratio of cases in which the metric returns the expected rating. To account for computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 100%.
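
For illustration, here is a minimal sketch of how this ratio could be computed. The `score_pairs` function is a placeholder (not part of this repository) standing in for any metric that returns one rating in [0, 100] per sentence pair.

```python
# Sketch of the identical-sentences check. `score_pairs` is a placeholder for any
# function returning one 0-100 meaning preservation rating per sentence pair.
def identical_sentences_ratio(sentences, score_pairs, threshold=95):
    """Fraction of identical pairs rated at or above `threshold` (X in [95, 99])."""
    ratings = score_pairs([(s, s) for s in sentences])  # compare each sentence with itself
    rounded = [round(r) for r in ratings]                # absorb floating-point inaccuracy
    return sum(r >= threshold for r in rounded) / len(rounded)
```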

### Unrelated sentences

Our second test evaluates meaning preservation between a source sentence and an unrelated sentence generated by a large language model. The idea is to verify that the metric returns a meaning preservation rating of 0 when given a completely irrelevant sentence mainly composed of irrelevant words (also known as word soup). Since this test's expected rating is 0, we check that the metric rating is less than or equal to a threshold value X ∈ [1, 5]. Again, to account for computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 0%.
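
The same counting sketch applies here with the comparison flipped; `score_pairs` is the same placeholder, and the word-soup sentences are assumed to have been generated beforehand.

```python
# Sketch of the unrelated-sentences check: ratings should now fall at or below a
# small threshold (X in [1, 5]). `pairs` holds (source, word-soup) sentence tuples.
def unrelated_sentences_ratio(pairs, score_pairs, threshold=5):
    """Fraction of unrelated pairs rated at or below `threshold`."""
    rounded = [round(r) for r in score_pairs(pairs)]  # absorb floating-point inaccuracy
    return sum(r <= threshold for r in rounded) / len(rounded)
```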

## Cite

Use the following citation to cite MeaningBERT:

```
@ARTICLE{10.3389/frai.2023.1223924,
  AUTHOR={Beauchemin, David and Saggion, Horacio and Khoury, Richard},
  TITLE={MeaningBERT: assessing meaning preservation between sentences},
  JOURNAL={Frontiers in Artificial Intelligence},
  VOLUME={6},
  YEAR={2023},
  URL={https://www.frontiersin.org/articles/10.3389/frai.2023.1223924},
  DOI={10.3389/frai.2023.1223924},
  ISSN={2624-8212},
}
```

## License

MeaningBERT is MIT licensed, as found in the [LICENSE file](https://github.com/GRAAL-Research/risc/blob/main/LICENSE).