---
base_model:
- Kaoeiri/Magnum-v4-Cydonia-v1.3-22B-2
---

vllm (pretrained=/root/autodl-tmp/Magnum-v4-Cydonia-v1.3-22B-2,add_bos_token=true,tensor_parallel_size=4,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.832|±  |0.0237|
|     |       |strict-match    |     5|exact_match|↑  |0.808|±  |0.0250|

vllm (pretrained=/root/autodl-tmp/Magnum-v4-Cydonia-v1.3-22B-2,add_bos_token=true,tensor_parallel_size=4,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.846|±  |0.0162|
|     |       |strict-match    |     5|exact_match|↑  |0.814|±  |0.0174|

vllm (pretrained=/root/autodl-tmp/Magnum-v4-Cydonia-v1.3-22B-2-89,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.848|±  |0.0228|
|     |       |strict-match    |     5|exact_match|↑  |0.820|±  |0.0243|

vllm (pretrained=/root/autodl-tmp/Magnum-v4-Cydonia-v1.3-22B-2-89,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.850|±  |0.0160|
|     |       |strict-match    |     5|exact_match|↑  |0.812|±  |0.0175|
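
The Stderr column can be sanity-checked from the Value and limit fields alone. The sketch below is not taken from the evaluation harness; it assumes that each exact_match score is the mean of `limit` independent 0/1 outcomes and that the reported figure is the sample standard error of that mean (with the `n - 1` correction). The helper name `binary_mean_stderr` is hypothetical.

```python
import math


def binary_mean_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 outcomes.

    Assumption: uses the sample variance (n - 1 denominator),
    i.e. sqrt(p * (1 - p) / (n - 1)).
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


# (limit, reported accuracy, reported stderr), copied from the tables above.
REPORTED = [
    (250, 0.832, 0.0237),
    (250, 0.808, 0.0250),
    (500, 0.846, 0.0162),
    (500, 0.814, 0.0174),
    (250, 0.848, 0.0228),
    (250, 0.820, 0.0243),
    (500, 0.850, 0.0160),
    (500, 0.812, 0.0175),
]

for n, acc, reported in REPORTED:
    recomputed = binary_mean_stderr(acc, n)
    print(f"n={n} acc={acc:.3f} reported={reported:.4f} recomputed={recomputed:.4f}")
```

Under that assumption the recomputed values agree with the reported Stderr to four decimal places for all eight rows. This also makes it visible that doubling the limit from 250 to 500 shrinks the standard error by roughly a factor of √2, so the 500-sample runs give the tighter estimate.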