Benchmarks are 💩

#6
by MrDevolver - opened

Benchmarks are 💩. For the love of God, please stop claiming that small models like this one can compete with big models such as ChatGPT or Claude when this one can't even fix a small issue like missing paddle movement logic in a simple Pong game written in JavaScript!

Agreed. For my use case, the benchmarks and leaderboards seem very misleading most of the time.
