jgrosjean-mathesis/sentence-swissbert


jgrosjean committed on Mar 18

Commit ab82a4b • 1 Parent(s): c6948ed

Update README.md

Files changed (1)

README.md +5 -5

@@ -142,7 +142,7 @@ The fine-tuning script can be accessed [here](https://github.com/jgrosjean-mathe
 The two evaluation tasks make use of the [20 Minuten dataset](https://www.zora.uzh.ch/id/eprint/234387/) compiled by Kew et al. (2023), which contains Swiss news articles with topic tags and summaries. Parts of the dataset were automatically translated to French and Italian using a Google Cloud API and to Romansh via a [Textshuttle](https://textshuttle.com/en) API.
-#### Evaluation via Semantic Textual Similarity
+#### Evaluation via Document Retrieval
 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
@@ -166,10 +166,10 @@ Sentence SwissBERT achieves comparable or better results as the best-performing
 | Evaluation task |Swissbert | |Sentence Swissbert| |Sentence-BERT| |
 |------------------------|----------|-----------|------------------|-----------|-------------|-----------|
 | |accuracy |f1-score |accuracy |f1-score |accuracy |f1-score |
-| Semantic Similarity DE | 87.20 % | -- | **93.40 %** | -- | 91.80 % | -- |
-| Semantic Similarity FR | 84.97 % | -- | **93.99 %** | -- | 93.19 % | -- |
-| Semantic Similarity IT | 84.17 % | -- | **92.18 %** | -- | 91.58 % | -- |
-| Semantic Similarity RM | 83.17 % | -- | **91.58 %** | -- | 73.35 % | -- |
+| Document Retrieval DE | 87.20 % | -- | **93.40 %** | -- | 91.80 % | -- |
+| Document Retrieval FR | 84.97 % | -- | **93.99 %** | -- | 93.19 % | -- |
+| Document Retrieval IT | 84.17 % | -- | **92.18 %** | -- | 91.58 % | -- |
+| Document Retrieval RM | 83.17 % | -- | **91.58 %** | -- | 73.35 % | -- |
 | Text Classification DE | -- | 77.93 % | -- |**78.49 %**| -- | 77.23 % |
 | Text Classification FR | -- | 69.62 % | -- |**77.18 %**| -- | 76.83 % |
 | Text Classification IT | -- | 67.09 % | -- | 76.65 % | -- |**76.90 %**|