---
base_model:
- h2oai/h2o-danube3-500m-base
- appvoid/arco
library_name: transformers
tags:
- mergekit
- merge

---

# arco lite

arco lite is a passthrough merge of arco that takes its output layer from danube to keep generality. Although its performance decreased, it's still competitive with qwen2: it scores higher on most benchmarks, and mmlu is the only reason qwen2 is better on average. Note that arco-lite is still untrained after the merge; I'm expecting it to get better after some iterations.

#### benchmarks

All evaluations are zero-shot. As the table shows, arco-lite beats qwen2 on ARC, HellaSwag, PIQA, and Winogrande but scores far lower on MMLU, which suggests it reasons better but lacks world knowledge, so don't use it for tasks that need factual output.

| Parameters | Model     | MMLU      | ARC       | HellaSwag | PIQA      | Winogrande | Average   |
| ---------- | --------- | --------- | --------- | --------- | --------- | ---------- | --------- |
| 488m       | arco-lite | 23.22     | **33.45** | **56.55** | **69.70** | **59.19**  | 48.46     |
| 494m       | qwen2     | **44.13** | 28.92     | 49.05     | 69.31     | 56.99      | **49.68** |

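
As a quick sanity check on the table, the sketch below recomputes an unweighted mean over the five benchmarks and lists the tasks where arco-lite leads. The helper names (`mean_score`, `wins`) are made up for illustration, and treating the Average column as a plain mean is an assumption, since the card does not say how it was computed. On the listed scores, the plain mean matches qwen2's 49.68 and comes out at about 48.42 for arco-lite, slightly below the listed 48.46, so the original average may have been rounded or computed differently.

```python
# Scores copied from the benchmark table (zero-shot, in percent).
scores = {
    "arco-lite": {"MMLU": 23.22, "ARC": 33.45, "HellaSwag": 56.55,
                  "PIQA": 69.70, "Winogrande": 59.19},
    "qwen2": {"MMLU": 44.13, "ARC": 28.92, "HellaSwag": 49.05,
              "PIQA": 69.31, "Winogrande": 56.99},
}


def mean_score(results):
    """Unweighted mean over all benchmarks (assumed averaging method)."""
    return round(sum(results.values()) / len(results), 2)


def wins(model, other):
    """Benchmarks where `model` scores strictly higher than `other`."""
    return sorted(task for task in scores[model]
                  if scores[model][task] > scores[other][task])


for name, results in scores.items():
    print(f"{name}: mean = {mean_score(results)}")

print("arco-lite leads on:", wins("arco-lite", "qwen2"))
print("qwen2 leads on:", wins("qwen2", "arco-lite"))
```
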
#### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
    - model: appvoid/arco
      layer_range: [0, 14]
  - sources:
    - model: h2oai/h2o-danube3-500m-base
      layer_range: [15, 16]

merge_method: passthrough
dtype: float16
```
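
To make the layer selection concrete, here is a minimal sketch that expands the two slices into the resulting layer stack. It assumes that mergekit treats `layer_range` as half-open (`[start, end)`, like Python's `range`); check the mergekit documentation before relying on this. The `expand_slices` helper is hypothetical and is not part of mergekit, and the nested `sources` lists are flattened because each slice here has a single source.

```python
# The slices from the YAML configuration, written as plain Python data.
slices = [
    {"model": "appvoid/arco", "layer_range": (0, 14)},
    {"model": "h2oai/h2o-danube3-500m-base", "layer_range": (15, 16)},
]


def expand_slices(slices):
    """Return (source model, source layer index) for each merged layer.

    Assumption: layer_range is half-open, i.e. [start, end).
    """
    stack = []
    for entry in slices:
        start, end = entry["layer_range"]
        stack.extend((entry["model"], index) for index in range(start, end))
    return stack


stack = expand_slices(slices)
for position, (model, source_layer) in enumerate(stack):
    print(f"merged layer {position:2d} <- {model} layer {source_layer}")
print("total layers:", len(stack))
```

Under that assumption, the merged model stacks 14 layers from arco (0 to 13) followed by one layer from danube (15), for 15 layers in total, and source layer 14 is not used from either model.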