davebulaval committed · commit 46b176e · 1 parent: c5be21b
Update README.md

README.md CHANGED
@@ -29,4 +29,61 @@ tags:
- text-simplification
- meaning
- assess
---

# Here is MeaningBERT

MeaningBERT is an automatic and trainable metric for assessing meaning preservation between sentences. MeaningBERT was proposed in our article [MeaningBERT: assessing meaning preservation between sentences](https://www.frontiersin.org/articles/10.3389/frai.2023.1223924/full). Its goal is to produce meaning preservation scores between two sentences that correlate highly with human judgments and pass our sanity checks. For more details, refer to our publicly available article.
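
For illustration only, here is a minimal sketch of how a sentence pair could be scored with the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hub as a sequence-classification model with a single regression output, and that the identifier and roughly [0, 100] score range below are correct; treat all of these as assumptions rather than the official API.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed Hub identifier; check the model card for the actual checkpoint name.
MODEL_ID = "davebulaval/MeaningBERT"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

source = "The cat sat quietly on the warm windowsill."
simplification = "The cat sat on the windowsill."

# Encode the two sentences as a single pair so the model can compare them.
inputs = tokenizer(source, simplification, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()

print(f"Meaning preservation score: {score:.2f}")  # assumed to lie roughly in [0, 100]
```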

## Sanity Check

Correlation to human judgment is one way to evaluate the quality of a meaning preservation metric. However, it is inherently subjective, since it uses human judgment as a gold standard, and expensive, since it requires a large dataset annotated by several humans. As an alternative, we designed two automated tests: evaluating meaning preservation between identical sentences (which should be 100% preserving) and between unrelated sentences (which should be 0% preserving). In these tests, the meaning preservation target value is not subjective and does not require human annotation to measure. They represent a trivial and minimal threshold that a good automatic meaning preservation metric should be able to achieve. Namely, a metric should minimally be able to return a perfect score (i.e., 100%) when two identical sentences are compared and a null score (i.e., 0%) when two sentences are completely unrelated.

### Identical sentences

The first test evaluates meaning preservation between identical sentences. To analyze a metric's ability to pass this test, we count the number of times its rating is greater than or equal to a threshold value X ∈ [95, 99] and divide that count by the number of sentences, giving the ratio of cases in which the metric returns the expected rating. To account for computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 100%.
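
For illustration, here is a minimal sketch of how this ratio could be computed. The `score_pairs` function is a placeholder (not part of this repository) standing in for any metric that returns one rating in [0, 100] per sentence pair.

```python
# Sketch of the identical-sentences check. `score_pairs` is a placeholder for any
# function returning one 0-100 meaning preservation rating per sentence pair.
def identical_sentences_ratio(sentences, score_pairs, threshold=95):
    """Fraction of identical pairs rated at or above `threshold` (X in [95, 99])."""
    ratings = score_pairs([(s, s) for s in sentences])  # compare each sentence with itself
    rounded = [round(r) for r in ratings]                # absorb floating-point inaccuracy
    return sum(r >= threshold for r in rounded) / len(rounded)
```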

### Unrelated sentences

Our second test evaluates meaning preservation between a source sentence and an unrelated sentence generated by a large language model. The idea is to verify that the metric returns a meaning preservation rating of 0 when given a completely irrelevant sentence mainly composed of irrelevant words (also known as word soup). Since this test's expected rating is 0, we check that the metric rating is less than or equal to a threshold value X ∈ [1, 5]. Again, to account for computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 0%.
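
The same counting sketch applies here with the comparison flipped; `score_pairs` is the same placeholder, and the word-soup sentences are assumed to have been generated beforehand.

```python
# Sketch of the unrelated-sentences check: ratings should now fall at or below a
# small threshold (X in [1, 5]). `pairs` holds (source, word-soup) sentence tuples.
def unrelated_sentences_ratio(pairs, score_pairs, threshold=5):
    """Fraction of unrelated pairs rated at or below `threshold`."""
    rounded = [round(r) for r in score_pairs(pairs)]  # absorb floating-point inaccuracy
    return sum(r <= threshold for r in rounded) / len(rounded)
```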

## Cite

Use the following citation to cite MeaningBERT:

```
@ARTICLE{10.3389/frai.2023.1223924,
  AUTHOR={Beauchemin, David and Saggion, Horacio and Khoury, Richard},
  TITLE={MeaningBERT: assessing meaning preservation between sentences},
  JOURNAL={Frontiers in Artificial Intelligence},
  VOLUME={6},
  YEAR={2023},
  URL={https://www.frontiersin.org/articles/10.3389/frai.2023.1223924},
  DOI={10.3389/frai.2023.1223924},
  ISSN={2624-8212},
}
```

## License

MeaningBERT is MIT licensed, as found in the [LICENSE file](https://github.com/GRAAL-Research/risc/blob/main/LICENSE).