Model Card for Gemma Self-Attention Merged

Model Details

Model Description

The Gemma Self-Attention Merged model is a large language model built by merging the self-attention layers of two Gemma 7B models, one English-based and one Korean-based. The merge is intended to let the model draw on the capabilities of both source models, so that it can handle tasks involving English text, Korean text, or a mix of the two.

The key features of this merged model include:

  • Increased self-attention capacity: the number of attention heads is doubled
  • Ability to handle both English and Korean language input
  • Potential for improved performance on a wide range of natural language processing tasks
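
As a rough illustration of what doubling the attention heads means for model size, the sketch below counts parameters for a Gemma-7B-style decoder before and after the head count is doubled. The configuration values are assumptions taken from the commonly published Gemma 7B settings, not from this card, and the sketch assumes the merge keeps both models' heads side by side while leaving the embeddings, MLP blocks, and norms at their original size. The card does not confirm that this is how the merge was done.

```python
# Assumed configuration for the base Gemma 7B model (not stated in this card).
GEMMA_7B = {
    "vocab_size": 256_000,
    "hidden_size": 3072,
    "intermediate_size": 24_576,
    "num_layers": 28,
    "num_heads": 16,   # assumes key/value heads equal query heads
    "head_dim": 256,
}


def count_params(cfg, num_heads):
    """Count parameters of a Gemma-style decoder with the given head count."""
    hidden = cfg["hidden_size"]
    attn_width = num_heads * cfg["head_dim"]

    attention = 4 * hidden * attn_width          # q, k, v, o projections, no biases
    mlp = 3 * hidden * cfg["intermediate_size"]  # gate, up, down projections
    norms = 2 * hidden                           # two norm layers per block
    per_layer = attention + mlp + norms

    embedding = cfg["vocab_size"] * hidden       # assumed tied with the output head
    final_norm = hidden
    return cfg["num_layers"] * per_layer + embedding + final_norm


base = count_params(GEMMA_7B, GEMMA_7B["num_heads"])
merged = count_params(GEMMA_7B, 2 * GEMMA_7B["num_heads"])

print(f"base:   {base / 1e9:.2f}B")    # base:   8.54B
print(f"merged: {merged / 1e9:.2f}B")  # merged: 9.95B
```

Under these assumptions the doubled-head variant comes to roughly 9.95B parameters, which would be consistent with the size reported for this model.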

Chat template

system: system message...
B: user message...
A: assistant message...
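
A minimal sketch of how a prompt in this format might be assembled. The helper name `build_prompt`, the use of newlines between turns, and the trailing `A:` cue for generation are assumptions for illustration; the card only specifies the `system:`, `B:`, and `A:` prefixes.

```python
def build_prompt(system, turns, add_generation_cue=True):
    """Format a conversation using the system / B / A prefixes.

    turns: list of (role, text) pairs, where role is "user" or "assistant".
    """
    prefixes = {"user": "B", "assistant": "A"}
    lines = [f"system: {system}"]
    for role, text in turns:
        if role not in prefixes:
            raise ValueError(f"unknown role: {role!r}")
        lines.append(f"{prefixes[role]}: {text}")
    if add_generation_cue:
        # Assumption: ending with the assistant prefix prompts the next reply.
        lines.append("A:")
    return "\n".join(lines)


prompt = build_prompt(
    "You are a helpful bilingual assistant.",
    [("user", "Translate 'hello' into Korean.")],
)
print(prompt)
```

The resulting string can be passed to the tokenizer as ordinary text; whether the model expects extra whitespace or special tokens around each turn is not stated in the card.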

Model Sources

Model size: 9.95B params
Tensor type: BF16
Format: Safetensors