Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
- AWQ 4-bit version of Nexusflow/Athene-V2-Chat
- Quantization code
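
The quantization code itself is linked rather than reproduced on this card. As a rough illustration only, a typical 4-bit AWQ run with the AutoAWQ library might look like the sketch below. The quantization settings (group size 128, zero point, GEMM kernels) are the values commonly used in AutoAWQ examples and are an assumption here, not the settings confirmed for this checkpoint; the output directory `Athene-V2-Chat-AWQ` and the function name `quantize_to_awq` are likewise hypothetical.

```python
# Minimal sketch of AWQ 4-bit quantization with AutoAWQ.
# ASSUMPTION: these settings mirror common AutoAWQ examples; the card does not
# state which settings were actually used for this checkpoint.
QUANT_CONFIG = {
    "zero_point": True,    # asymmetric quantization with a zero point
    "q_group_size": 128,   # weights are quantized in groups of 128
    "w_bit": 4,            # 4-bit weights, matching the "AWQ 4-bit" description
    "version": "GEMM",     # kernel variant used at inference time
}


def quantize_to_awq(
    model_path: str = "Nexusflow/Athene-V2-Chat",
    quant_path: str = "Athene-V2-Chat-AWQ",  # hypothetical output directory
) -> None:
    """Quantize `model_path` to AWQ and save the result to `quant_path`."""
    # Imported lazily so the config above can be inspected without
    # installing the heavy dependencies.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Runs calibration and replaces linear layers with quantized ones.
    model.quantize(tokenizer, quant_config=QUANT_CONFIG)

    # Persist quantized weights together with the tokenizer files.
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

Calling `quantize_to_awq()` starts the run. For a 72B model this needs substantial GPU memory and disk space, so the call is deliberately left out of the block.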
Evaluation of the AWQ version
Evaluation results on ZebraLogic
| Model | Mode | N_Mode | N_Size | Puzzle Acc | Easy Puzzle Acc | Hard Puzzle Acc | Cell Acc | No answer | Total Puzzles | Reason Lens |
|---|---|---|---|---|---|---|---|---|---|---|
| o1-preview-2024-09-12 | greedy | single | 1 | 71.4 | 98.57 | 60.83 | 75.14 | 0.3 | 1000 | 1565.88 |
| claude-3-5-sonnet-20241022 | greedy | single | 1 | 36.2 | 91.07 | 14.86 | 54.27 | 0 | 1000 | 861.18 |
| Llama-3.1-405B-Inst-fp8@together | greedy | single | 1 | 32.6 | 87.14 | 11.39 | 45.8 | 12.5 | 1000 | 314.66 |
| Athene-V2-Chat-AWQ | greedy | single | 1 | 27.8 | 77.14 | 8.61 | 45.83 | 6.4 | 1000 | 1785.7 |
| Qwen2.5-72B-Instruct | greedy | single | 1 | 26.6 | 76.43 | 7.22 | 40.92 | 11.9 | 1000 | 1795.9 |
| Qwen2.5-32B-Instruct | greedy | single | 1 | 26.1 | 77.5 | 6.11 | 43.39 | 6.3 | 1000 | 1333.07 |
| Athene-70B | greedy | single | 1 | 16.7 | 52.5 | 2.78 | 32.98 | 21.1 | 1000 | 391.19 |
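
To read the three accuracy columns together, it can help to check how Puzzle Acc relates to the easy and hard scores. The sketch below treats Puzzle Acc as a weighted average of Easy Puzzle Acc and Hard Puzzle Acc and solves for the weight. The resulting split of roughly 28% easy and 72% hard puzzles is inferred from the numbers in the table and is an assumption, not something stated on this card; the function names are hypothetical.

```python
# (model, puzzle_acc, easy_acc, hard_acc), copied from the table above.
ROWS = [
    ("o1-preview-2024-09-12", 71.4, 98.57, 60.83),
    ("claude-3-5-sonnet-20241022", 36.2, 91.07, 14.86),
    ("Llama-3.1-405B-Inst-fp8@together", 32.6, 87.14, 11.39),
    ("Athene-V2-Chat-AWQ", 27.8, 77.14, 8.61),
    ("Qwen2.5-72B-Instruct", 26.6, 76.43, 7.22),
    ("Qwen2.5-32B-Instruct", 26.1, 77.5, 6.11),
    ("Athene-70B", 16.7, 52.5, 2.78),
]

# ASSUMPTION: inferred from the table, not given in the source.
EASY_FRACTION = 0.28


def infer_easy_fraction(puzzle_acc: float, easy_acc: float, hard_acc: float) -> float:
    """Solve puzzle_acc = f * easy_acc + (1 - f) * hard_acc for f."""
    return (puzzle_acc - hard_acc) / (easy_acc - hard_acc)


def weighted_acc(easy_acc: float, hard_acc: float, easy_fraction: float) -> float:
    """Rebuild the overall accuracy from the easy and hard accuracies."""
    return easy_fraction * easy_acc + (1.0 - easy_fraction) * hard_acc


for model, puzzle, easy, hard in ROWS:
    rebuilt = weighted_acc(easy, hard, EASY_FRACTION)
    implied = infer_easy_fraction(puzzle, easy, hard)
    print(f"{model:34s} reported={puzzle:5.1f} rebuilt={rebuilt:6.2f} implied_f={implied:.3f}")
```

Under this assumed split, the hard puzzles dominate the overall score, which is why Athene-V2-Chat-AWQ and Qwen2.5-32B-Instruct end up 1.7 points apart on Puzzle Acc even though the 32B model is slightly ahead on easy puzzles.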