davidheineman committed
Commit
4bec69e
1 Parent(s): b547a07
README.md CHANGED
@@ -1,3 +1,47 @@
  ---
+ language:
+ - en
+ datasets:
+ - simpeval
+ tags:
+ - simplification
  license: apache-2.0
  ---
+
+ This contains the trained checkpoint for LENS-SALSA, as introduced in [**Dancing Between Success and Failure: Edit-level Simplification Evaluation using SALSA**](https://arxiv.org/abs/2305.14458). For more information, please refer to the [**SALSA repository**](https://github.com/davidheineman/salsa).
+
+ ```bash
+ pip install lens-metric
+ ```
+
+ ```python
+ from lens import download_model
+ from lens.lens_salsa import LENS_SALSA
+
+ # Download the LENS-SALSA checkpoint from the Hugging Face Hub
+ model_path = download_model("davidheineman/lens-salsa")
+ lens_salsa = LENS_SALSA(model_path)
+
+ # Reference-free scoring: compare a simplification directly to its source
+ score = lens_salsa.score(
+     complex=[
+         "They are culturally akin to the coastal peoples of Papua New Guinea."
+     ],
+     simple=[
+         "They are culturally similar to the people of Papua New Guinea."
+     ],
+ )
+ ```
+
+ ## Intended uses
+
+ Our model is intended to be used for **reference-free simplification evaluation**. Given a source text and its simplification, the model outputs a single score between 0 and 1, where 1 represents a perfect simplification and 0 a random simplification. LENS-SALSA was trained on edit annotations of the SimpEval dataset, which covers manually-written, complex Wikipedia simplifications. We have not evaluated our model on non-English languages or non-Wikipedia domains.
+
+ ## Cite SALSA
+
+ If you find our paper, code or data helpful, please consider citing [**our work**](https://arxiv.org/abs/2305.14458):
+
+ ```tex
+ @article{heineman2023dancing,
+   title={Dancing {B}etween {S}uccess and {F}ailure: {E}dit-level {S}implification {E}valuation using {SALSA}},
+   author={Heineman, David and Dou, Yao and Xu, Wei},
+   journal={arXiv preprint arXiv:2305.14458},
+   year={2023}
+ }
+ ```
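As a minimal usage sketch beyond the README's own example (assuming, as that example implies, that `lens_salsa.score` returns one number in [0, 1] per complex/simple pair; this is an assumption about the `lens-metric` API, not documented here), one can score several candidate simplifications of the same source and keep the best:

```python
from lens import download_model
from lens.lens_salsa import LENS_SALSA

model_path = download_model("davidheineman/lens-salsa")
lens_salsa = LENS_SALSA(model_path)

source = "They are culturally akin to the coastal peoples of Papua New Guinea."
candidates = [
    "They are culturally similar to the people of Papua New Guinea.",
    "They are like the coastal peoples of Papua New Guinea.",
]

# Score every candidate against the same source sentence (reference-free)
scores = lens_salsa.score(
    complex=[source] * len(candidates),
    simple=candidates,
)

# Higher is better: 1 is a perfect simplification, 0 a random one
best_score, best_candidate = max(zip(scores, candidates))
print(best_score, best_candidate)
```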
checkpoints/epoch=3-step=1460-val_kendall=0.409.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:971a9c705c90bb97fe85e73211aa8ca2beff7e7f438395d2ac86403a4960c0b3
+ size 1419010479
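The three lines above are a Git LFS pointer: `oid` is the SHA-256 digest of the actual checkpoint file and `size` is its length in bytes. A minimal sketch for checking a downloaded copy against this pointer (the local path below simply reuses the file name from the listing; adjust as needed):

```python
import hashlib

# Expected values taken from the LFS pointer above
EXPECTED_OID = "971a9c705c90bb97fe85e73211aa8ca2beff7e7f438395d2ac86403a4960c0b3"
EXPECTED_SIZE = 1419010479

path = "checkpoints/epoch=3-step=1460-val_kendall=0.409.ckpt"

sha256 = hashlib.sha256()
size = 0
with open(path, "rb") as f:
    # Hash in 1 MiB chunks so the ~1.4 GB file never sits in memory at once
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)
        size += len(chunk)

assert size == EXPECTED_SIZE, f"unexpected size: {size}"
assert sha256.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("checkpoint matches its LFS pointer")
```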
hparams.yaml ADDED
@@ -0,0 +1,40 @@
+ activations: Tanh
+ batch_size: 4
+ class_identifier: unified_metric
+ continuous_word_labels: false
+ dropout: 0.15
+ encoder_learning_rate: 1.0e-05
+ encoder_model: RoBERTa
+ final_activation: null
+ hidden_sizes:
+ - 384
+ initalize_pretrained_unified_weights: true
+ input_segments:
+ - edit_id_simplified
+ - edit_id_original
+ keep_embeddings_frozen: true
+ layer: mix
+ layer_norm: true
+ layer_transformation: sparsemax
+ layerwise_decay: 0.95
+ learning_rate: 3.1e-05
+ load_pretrained_weights: true
+ loss: mse
+ loss_lambda: 0.9
+ nr_frozen_epochs: 0.3
+ optimizer: AdamW
+ pool: avg
+ pretrained_model: roberta-large
+ score_target: lens_score
+ sent_layer: mix
+ span_targets:
+ - edit_id_simplified
+ - edit_id_original
+ span_tokens:
+ - bad
+ warmup_steps: 0
+ word_layer: 24
+ word_level_training: true
+ word_weights:
+ - 0.1
+ - 0.9
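These are COMET-style `unified_metric` training hyperparameters saved alongside the checkpoint: a `roberta-large` encoder with frozen embeddings, fine-tuned with both sentence-level and word-level objectives. A minimal sketch for inspecting them with PyYAML (the file name follows the listing above):

```python
import yaml  # requires pyyaml

# Load the hyperparameters shipped next to the checkpoint
with open("hparams.yaml") as f:
    hparams = yaml.safe_load(f)

print(hparams["pretrained_model"])  # roberta-large
print(hparams["class_identifier"])  # unified_metric
print(hparams["word_weights"])      # [0.1, 0.9]
```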