mcse-flickr-bert-base-uncased / eval_results.log
2021-10-03 08:30:09,513 : ***** Transfer task : STS12 *****
2021-10-03 08:30:12,822 : MSRpar : pearson = 0.6552, spearman = 0.6487, align_loss = 0.2364, uniform_loss = -2.7071
2021-10-03 08:30:14,166 : MSRvid : pearson = 0.8936, spearman = 0.8875, align_loss = 0.2080, uniform_loss = -2.4933
2021-10-03 08:30:15,306 : SMTeuroparl : pearson = 0.4988, spearman = 0.5867, align_loss = 0.2905, uniform_loss = -1.7737
2021-10-03 08:30:17,420 : surprise.OnWN : pearson = 0.7522, spearman = 0.6933, align_loss = 0.2701, uniform_loss = -2.5032
2021-10-03 08:30:18,534 : surprise.SMTnews : pearson = 0.6894, spearman = 0.5845, align_loss = 0.2694, uniform_loss = -1.9022
2021-10-03 08:30:18,548 : ALL : Pearson = 0.8190, Spearman = 0.7163, align_loss = 0.2510, uniform_loss = -2.3433
2021-10-03 08:30:18,548 : ALL (weighted average) : Pearson = 0.7174, Spearman = 0.6997, align_loss = 0.2499, uniform_loss = -2.3651
2021-10-03 08:30:18,548 : ALL (average) : Pearson = 0.6978, Spearman = 0.6802, align_loss = 0.2549, uniform_loss = -2.2759
2021-10-03 08:30:18,555 : ***** Transfer task : STS13 (-SMT) *****
2021-10-03 08:30:19,558 : FNWN : pearson = 0.6301, spearman = 0.6294, align_loss = 0.3905, uniform_loss = -2.2757
2021-10-03 08:30:21,091 : headlines : pearson = 0.8155, spearman = 0.8165, align_loss = 0.2408, uniform_loss = -2.5272
2021-10-03 08:30:22,268 : OnWN : pearson = 0.8329, spearman = 0.8242, align_loss = 0.2874, uniform_loss = -2.2479
2021-10-03 08:30:22,272 : ALL : Pearson = 0.8145, Spearman = 0.8213, align_loss = 0.2817, uniform_loss = -2.3811
2021-10-03 08:30:22,272 : ALL (weighted average) : Pearson = 0.7987, Spearman = 0.7958, align_loss = 0.2771, uniform_loss = -2.3911
2021-10-03 08:30:22,272 : ALL (average) : Pearson = 0.7595, Spearman = 0.7567, align_loss = 0.3062, uniform_loss = -2.3503
2021-10-03 08:30:22,274 : ***** Transfer task : STS14 *****
2021-10-03 08:30:23,499 : deft-forum : pearson = 0.5769, spearman = 0.5598, align_loss = 0.3282, uniform_loss = -2.5534
2021-10-03 08:30:24,679 : deft-news : pearson = 0.8023, spearman = 0.7736, align_loss = 0.1958, uniform_loss = -2.4303
2021-10-03 08:30:26,339 : headlines : pearson = 0.7991, spearman = 0.7902, align_loss = 0.2404, uniform_loss = -2.5019
2021-10-03 08:30:27,948 : images : pearson = 0.8699, spearman = 0.8319, align_loss = 0.2295, uniform_loss = -2.6572
2021-10-03 08:30:29,636 : OnWN : pearson = 0.8668, spearman = 0.8523, align_loss = 0.2953, uniform_loss = -2.3135
2021-10-03 08:30:31,795 : tweet-news : pearson = 0.7786, spearman = 0.6952, align_loss = 0.4321, uniform_loss = -2.4954
2021-10-03 08:30:31,808 : ALL : Pearson = 0.7931, Spearman = 0.7594, align_loss = 0.2930, uniform_loss = -2.4940
2021-10-03 08:30:31,808 : ALL (weighted average) : Pearson = 0.7963, Spearman = 0.7630, align_loss = 0.2945, uniform_loss = -2.4944
2021-10-03 08:30:31,808 : ALL (average) : Pearson = 0.7823, Spearman = 0.7505, align_loss = 0.2869, uniform_loss = -2.4920
2021-10-03 08:30:31,813 : ***** Transfer task : STS15 *****
2021-10-03 08:30:33,333 : answers-forums : pearson = 0.7507, spearman = 0.7556, align_loss = 0.5185, uniform_loss = -2.6958
2021-10-03 08:30:35,056 : answers-students : pearson = 0.7498, spearman = 0.7557, align_loss = 0.3069, uniform_loss = -1.7109
2021-10-03 08:30:36,623 : belief : pearson = 0.8234, spearman = 0.8440, align_loss = 0.4319, uniform_loss = -2.5455
2021-10-03 08:30:38,401 : headlines : pearson = 0.8250, spearman = 0.8296, align_loss = 0.2501, uniform_loss = -2.5245
2021-10-03 08:30:40,203 : images : pearson = 0.9113, spearman = 0.9161, align_loss = 0.2386, uniform_loss = -2.6473
2021-10-03 08:30:40,208 : ALL : Pearson = 0.8383, Spearman = 0.8463, align_loss = 0.3177, uniform_loss = -2.3758
2021-10-03 08:30:40,208 : ALL (weighted average) : Pearson = 0.8183, Spearman = 0.8253, align_loss = 0.3177, uniform_loss = -2.3758
2021-10-03 08:30:40,208 : ALL (average) : Pearson = 0.8121, Spearman = 0.8202, align_loss = 0.3492, uniform_loss = -2.4248
2021-10-03 08:30:40,212 : ***** Transfer task : STS16 *****
2021-10-03 08:30:40,999 : answer-answer : pearson = 0.6759, spearman = 0.6784, align_loss = 0.3704, uniform_loss = -2.1641
2021-10-03 08:30:41,653 : headlines : pearson = 0.8141, spearman = 0.8351, align_loss = 0.2300, uniform_loss = -2.5482
2021-10-03 08:30:42,387 : plagiarism : pearson = 0.8314, spearman = 0.8398, align_loss = 0.2420, uniform_loss = -2.1775
2021-10-03 08:30:43,652 : postediting : pearson = 0.8652, spearman = 0.8798, align_loss = 0.1522, uniform_loss = -2.5769
2021-10-03 08:30:44,240 : question-question : pearson = 0.6909, spearman = 0.6944, align_loss = 0.2678, uniform_loss = -2.2807
2021-10-03 08:30:44,243 : ALL : Pearson = 0.7664, Spearman = 0.7750, align_loss = 0.2525, uniform_loss = -2.3495
2021-10-03 08:30:44,243 : ALL (weighted average) : Pearson = 0.7767, Spearman = 0.7868, align_loss = 0.2531, uniform_loss = -2.3528
2021-10-03 08:30:44,243 : ALL (average) : Pearson = 0.7755, Spearman = 0.7855, align_loss = 0.2525, uniform_loss = -2.3495
2021-10-03 08:30:44,245 : ***** Transfer task : STSBenchmark*****
2021-10-03 08:31:03,739 : train : pearson = 0.8231, spearman = 0.8058, align_loss = 0.2532, uniform_loss = -2.6089
2021-10-03 08:31:09,180 : dev : pearson = 0.8537, spearman = 0.8558, align_loss = 0.2799, uniform_loss = -2.6759
2021-10-03 08:31:13,943 : test : pearson = 0.8001, spearman = 0.7996, align_loss = 0.2558, uniform_loss = -2.5835
2021-10-03 08:31:13,953 : ALL : Pearson = 0.8260, Spearman = 0.8164, align_loss = 0.2583, uniform_loss = -2.6166
2021-10-03 08:31:13,954 : ALL (weighted average) : Pearson = 0.8247, Spearman = 0.8135, align_loss = 0.2582, uniform_loss = -2.6165
2021-10-03 08:31:13,954 : ALL (average) : Pearson = 0.8256, Spearman = 0.8204, align_loss = 0.2630, uniform_loss = -2.6228
2021-10-03 08:31:13,966 : ***** Transfer task : SICKRelatedness*****
2021-10-03 08:31:26,378 : train : pearson = 0.8149, spearman = 0.7307, align_loss = 0.2282, uniform_loss = -2.5580
2021-10-03 08:31:27,937 : dev : pearson = 0.8175, spearman = 0.7571, align_loss = 0.2344, uniform_loss = -2.7867
2021-10-03 08:31:41,491 : test : pearson = 0.8030, spearman = 0.7212, align_loss = 0.2274, uniform_loss = -2.5465
2021-10-03 08:31:41,514 : ALL : Pearson = 0.8092, Spearman = 0.7273, align_loss = 0.2281, uniform_loss = -2.5639
2021-10-03 08:31:41,514 : ALL (weighted average) : Pearson = 0.8091, Spearman = 0.7273, align_loss = 0.2281, uniform_loss = -2.5638
2021-10-03 08:31:41,514 : ALL (average) : Pearson = 0.8118, Spearman = 0.7364, align_loss = 0.2300, uniform_loss = -2.6304
2021-10-03 08:31:41,515 : ------ test ------
2021-10-03 08:31:41,517 : +--------+--------+--------+--------+--------+--------------+-----------------+--------+
| STS12  | STS13  | STS14  | STS15  | STS16  | STSBenchmark | SICKRelatedness |  Avg.  |
+--------+--------+--------+--------+--------+--------------+-----------------+--------+
| 71.63  | 82.13  | 75.94  | 84.63  | 77.50  |    79.96     |      72.12      | 77.70  |
| 0.251  | 0.282  | 0.293  | 0.318  | 0.252  |    0.256     |      0.227      | 0.268  |
| -2.343 | -2.381 | -2.494 | -2.376 | -2.349 |    -2.583    |      -2.547     | -2.439 |
+--------+--------+--------+--------+--------+--------------+-----------------+--------+
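Note on the summary table above: its first row appears to hold Spearman correlations scaled by 100. The STS12 to STS16 entries match the "ALL" Spearman lines and the STSBenchmark and SICKRelatedness entries match the "test" Spearman lines earlier in this log. The "Avg." cell is consistent with a plain unweighted mean over the seven tasks. The sketch below only re-checks that arithmetic; the dictionary and the helper name `unweighted_mean` are illustrative and not part of the evaluation script.

```python
# Spearman correlations (x100) copied from the first row of the summary table.
spearman = {
    "STS12": 71.63,
    "STS13": 82.13,
    "STS14": 75.94,
    "STS15": 84.63,
    "STS16": 77.50,
    "STSBenchmark": 79.96,
    "SICKRelatedness": 72.12,
}


def unweighted_mean(scores):
    """Arithmetic mean over tasks, counting every task equally (assumed)."""
    return sum(scores.values()) / len(scores)


avg = unweighted_mean(spearman)
print(f"{avg:.2f}")  # prints 77.70, matching the Avg. cell
```

The same check on the second and third rows (align_loss and uniform_loss) gives values that round to 0.268 and -2.439.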
2021-10-03 08:31:41,518 : +------+------+------+------+------+------+------+------+
|  MR  |  CR  | SUBJ | MPQA | SST2 | TREC | MRPC | Avg. |
+------+------+------+------+------+------+------+------+
| 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
+------+------+------+------+------+------+------+------+
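Note on reading this log programmatically: each result line follows the shape `timestamp : name : metric = value, ...`. A minimal sketch of a parser for such lines is shown below. The regular expression and the function name `parse_line` are assumptions derived from the lines in this file, not something the evaluation toolkit provides.

```python
import re

# Hypothetical pattern inferred from the result lines in this log, e.g.
# "2021-10-03 08:30:12,822 : MSRpar : pearson = 0.6552, spearman = 0.6487, ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) : "
    r"(?P<name>.+?) : "
    r"(?P<metrics>\w+ = -?\d+\.\d+(?:, \w+ = -?\d+\.\d+)*)$"
)


def parse_line(line):
    """Return (name, metrics) for a result line, or None for any other line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    metrics = {}
    for pair in match.group("metrics").split(", "):
        key, value = pair.split(" = ")
        # Lower-case keys so "Pearson" (ALL lines) and "pearson" agree.
        metrics[key.lower()] = float(value)
    return match.group("name"), metrics


sample = (
    "2021-10-03 08:30:12,822 : MSRpar : pearson = 0.6552, "
    "spearman = 0.6487, align_loss = 0.2364, uniform_loss = -2.7071"
)
name, metrics = parse_line(sample)
print(name, metrics["spearman"])
```

Task header lines and table rows do not match the pattern, so `parse_line` returns `None` for them and they can be skipped in a loop over the file.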