GEB-1.3B

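GEB-1.3B is a lightweight large language model released by Beijing GEB Technology Co., Ltd. (北京集异璧科技有限公司). It has 1.3 billion parameters and was trained on 550B tokens of Chinese and English text. Training uses several recent techniques, including RoPE positional encoding, grouped-query attention, and FlashAttention-2, to speed up training while preserving model quality. In addition, we fine-tuned the model on 10 million instruction samples to improve alignment, and applied DPO to bring its outputs in line with human preferences. GEB-1.3B performs well on common benchmarks such as MMLU, C-Eval, and CMMLU, outperforming models of comparable size such as TinyLlama-1.1B. Notably, the FP32 version of GEB-1.3B achieves satisfactory inference times on CPU, and we are working on further speedups through advanced quantization techniques.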
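As a rough guide to why a 1.3B-parameter model is practical on CPU, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes the nominal 1.3 billion parameter count and standard bytes-per-parameter sizes; the int8 row is purely illustrative, since the quantization scheme under development is not specified here. Activations, the KV cache, and runtime overhead are ignored.

```python
# Weight-only memory estimate for a 1.3B-parameter model.
# Assumption: nominal parameter count; real checkpoints differ slightly.
PARAMS = 1.3e9

# Standard storage sizes per parameter. int8 is an illustrative
# quantized format, not a statement about the released model.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.1f} GB")
```

Under these assumptions the FP32 weights take about 5.2 GB, which fits in the RAM of an ordinary desktop machine.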

Evaluation Results

Model	MMLU	C-Eval	CMMLU	Average
Baichuan-7B	42.30	42.80	44.02	43.04
ChatGLM-6B	40.63	38.90	-	39.77
GEB-1.3B	31.20	33.30	32.20	32.23
Llama-7B	35.10	27.10	26.75	29.65
Falcon-7B	28.00	-	-	28.00
MPT-7B	27.93	27.15	26.00	27.03
MindLLM-1.3B	26.20	26.10	25.33	25.88
MindLLM-3B	26.20	25.70	25.00	25.63
TinyLlama-1.1B	25.34	25.02	24.03	24.80
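The Average column is consistent with a mean taken over only the benchmarks each model reports, with "-" entries skipped. The sketch below reproduces that calculation for three rows of the table; the helper name `average` and the use of `Decimal` with half-up rounding are choices made for this illustration, not something stated by the authors.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the table above, in the order MMLU, C-Eval, CMMLU.
# None marks a benchmark that is not reported ("-").
SCORES = {
    "ChatGLM-6B": ["40.63", "38.90", None],
    "GEB-1.3B": ["31.20", "33.30", "32.20"],
    "Falcon-7B": ["28.00", None, None],
}


def average(scores):
    """Mean over the reported scores only, rounded half-up to two decimals."""
    present = [Decimal(s) for s in scores if s is not None]
    mean = sum(present) / len(present)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model, scores in SCORES.items():
    print(model, average(scores))
```

Because models with missing entries are averaged over fewer benchmarks, their Average values are not strictly comparable with rows that report all three.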

Running the Model

Inference with the transformers backend:

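from transformers import AutoTokenizer, AutoModel
import torch

# trust_remote_code=True lets transformers load the custom model code from the repository.
model = AutoModel.from_pretrained("GEB-AGI/geb-1.3b", trust_remote_code=True).bfloat16().cuda()
tokenizer = AutoTokenizer.from_pretrained("GEB-AGI/geb-1.3b", trust_remote_code=True)

# Single-turn chat starting from an empty history; the updated history is returned.
query = "你好"  # "Hello"
response, history = model.chat(tokenizer, query, history=[])
print(response)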

If the download fails, clone the repository manually to get the model files locally, then replace the model and tokenizer paths with the local path.
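The local-path fallback can be sketched as follows. The directory `./geb-1.3b` and the helper names `resolve_model_path` and `load` are hypothetical, chosen for this example; the check for `config.json` is an assumption about what a usable clone contains. `from_pretrained` accepts either a Hub ID or a local directory.

```python
from pathlib import Path

REPO_ID = "GEB-AGI/geb-1.3b"     # Hub ID used in the snippet above
LOCAL_DIR = Path("./geb-1.3b")   # hypothetical location of a manual clone


def resolve_model_path(local_dir: Path = LOCAL_DIR, repo_id: str = REPO_ID) -> str:
    """Prefer a local clone that contains config.json, else fall back to the Hub ID."""
    if (local_dir / "config.json").is_file():
        return str(local_dir)
    return repo_id


def load(path: str):
    """Load tokenizer and model from a Hub ID or a local directory."""
    # Imported lazily so the path logic can be used without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
    model = AutoModel.from_pretrained(path, trust_remote_code=True)
    return tokenizer, model


print("Loading from:", resolve_model_path())
# tokenizer, model = load(resolve_model_path())
```

When cloning manually, make sure the large weight files were actually fetched (for example via Git LFS) rather than left as pointer files.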

Inference Speed

Hardware	Speed (tokens/s)
CPU	12
3090	45
4090	50
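To put these throughput figures in practical terms, the sketch below converts them into an approximate wall-clock time for a response of a given length. The 256-token response length is an arbitrary example, and the estimate assumes constant throughput with no prompt-processing or startup cost.

```python
# Throughput figures copied from the table above (tokens per second).
THROUGHPUT_TOKENS_PER_S = {"CPU": 12, "3090": 45, "4090": 50}


def generation_seconds(num_tokens: int, tokens_per_s: float) -> float:
    """Approximate time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_s


RESPONSE_TOKENS = 256  # arbitrary example length
for hardware, tps in THROUGHPUT_TOKENS_PER_S.items():
    seconds = generation_seconds(RESPONSE_TOKENS, tps)
    print(f"{hardware}: {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
```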

License

Use of the GEB-1.3B model weights is subject to the LICENSE.

Citation

@article{geb-1.3b,
  title={GEB-1.3B: Open Lightweight Large Language Model},
  author={Jie Wu and Yufeng Zhu and Lei Shen and Xuqing Lu},
  journal={arXiv preprint arXiv:2406.09900},
  year={2024}
}