Surprise result!

by sometimesanotion - opened 4 days ago

Owner 4 days ago

@sthenno, @CultriX, I think you'll want to see this. I made this merge because I felt Lamarck hadn't integrated DeepSeek R1 enough, and I expected a model_stock merge to make MUSR pop. That's not what happened. Most scores fell slightly towards the average, but look at MATH.

It appears that R1 and Qwenvergence v9 (hence DRT) are clashing on MUSR, but a model_stock shows where they are synergistic on MATH.
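
For anyone wondering why a model_stock merge would pull scores toward an average, here is a toy sketch of the interpolation as described in the Model Stock paper. It is not mergekit's actual code: the flat lists stand in for a single weight tensor, the function names are made up for illustration, and averaging the pairwise cosines is a simplifying assumption.

```python
import math

def model_stock_ratio(cos_theta: float, n_models: int) -> float:
    """Interpolation ratio t = N*cos / (1 + (N-1)*cos) from the Model Stock paper."""
    return n_models * cos_theta / (1 + (n_models - 1) * cos_theta)

def cosine(a, b):
    """Cosine similarity between two flat weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def model_stock_merge(base, finetuned):
    """Toy merge of one tensor: blend the fine-tuned average with the base.

    Assumption: the angle is estimated as the mean pairwise cosine between
    the fine-tuned deltas (fine-tuned weights minus base weights).
    """
    n = len(finetuned)
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)
    t = model_stock_ratio(cos_theta, n)
    avg = [sum(ft[k] for ft in finetuned) / n for k in range(len(base))]
    # t near 1 keeps the plain average; t near 0 falls back to the base.
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

In this sketch, models whose deltas point the same way keep their average, while models that disagree get pulled back toward the base for that tensor.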

sthenno

4 days ago

Amazing! But I've run into a lot of confusion around MATH. See: https://huggingface.co/bamec66557/Qwen-2.5-14B-MINUS/discussions/1#6792f65509f4f9090f0c62bd
