finosfoundation/Open-Financial-LLM-Leaderboard
Jimin Huang committed on May 22, 2024

Commit 75fb8ba • 1 Parent(s): 527b158

feat: modify about.py

Files changed (1)

src/about.py

@@ -121,15 +121,15 @@ Our evaluation metrics include, but are not limited to, Accuracy, F1 Score, ROUG
 - **BigData22**: Acc
 - **ACL18**: Acc
 - **CIKM18**: Acc
-- **German**: F1
-- **Australian**: F1
-- **LendingClub**: F1
-- **ccf**: F1
-- **ccfraud**: F1
-- **polish**: F1
-- **taiwan**: F1
-- **portoseguro**: F1
-- **travelinsurance**: F1
+- **German**: MCC
+- **Australian**: MCC
+- **LendingClub**: MCC
+- **ccf**: MCC
+- **ccfraud**: MCC
+- **polish**: MCC
+- **taiwan**: MCC
+- **portoseguro**: MCC
+- **travelinsurance**: MCC
 To ensure a fair and unbiased assessment of the models' true capabilities, all evaluations are conducted in zero-shot settings (0-shots). This approach eliminates any potential advantage from task-specific fine-tuning, providing a clear indication of how well the models can generalize to new tasks.
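
Note on the change: the commit only edits the metric labels in the documentation text and does not state why F1 was replaced by MCC (Matthews correlation coefficient) for these nine datasets. As a rough illustration of how the two metrics can diverge, here is a minimal sketch in plain Python. The helper names (`confusion_counts`, `f1_score`, `mcc`) and the toy labels are hypothetical and are not taken from the leaderboard's evaluation code, which is not part of this diff.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """F1 of the positive class: 2*TP / (2*TP + FP + FN). Ignores true negatives."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; uses all four confusion-matrix cells."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 90 positives, 10 negatives,
# and a degenerate "model" that always predicts the positive class.
y_true = [1] * 90 + [0] * 10
y_pred = [1] * 100

print(f"F1  = {f1_score(y_true, y_pred):.3f}")
print(f"MCC = {mcc(y_true, y_pred):.3f}")
```

On this toy input the always-positive predictor gets a high F1 but an MCC of zero, because MCC also accounts for true negatives. Whether this was the motivation for the change is an assumption; the commit message says only "feat: modify about.py".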