--- |
|
|
|
library_name: transformers |
|
tags: |
|
- gemma2 |
|
- instruct |
|
- bggpt |
|
- insait |
|
license: gemma |
|
language: |
|
- bg |
|
- en |
|
base_model: |
|
- google/gemma-2-2b-it |
|
- google/gemma-2-2b |
|
pipeline_tag: text-generation |
|
|
|
--- |
|
|
|
[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory) |
|
|
|
|
|
# QuantFactory/BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF |
|
This is a quantized version of [INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0](https://huggingface.co/INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0), created using llama.cpp.
|
|
|
# Original Model Card |
|
|
|
|
|
# INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0 |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/637e1f8cf7e01589cc17bf7e/p6d0YFHjWCQ3S12jWqO1m.png) |
|
|
|
INSAIT introduces **BgGPT-Gemma-2-2.6B-IT-v1.0**, a state-of-the-art Bulgarian language model based on **google/gemma-2-2b** and **google/gemma-2-2b-it**. |
|
BgGPT-Gemma-2-2.6B-IT-v1.0 is **free to use** and distributed under the [Gemma Terms of Use](https://ai.google.dev/gemma/terms). |
|
This model was created by [`INSAIT`](https://insait.ai/), part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria. |
|
|
|
|
|
# Model description |
|
|
|
The model was built on top of Google’s Gemma 2 2B open models. |
|
It was continuously pre-trained on around 100 billion tokens (85 billion in Bulgarian) using the Branch-and-Merge strategy INSAIT presented at [EMNLP’24](https://aclanthology.org/2024.findings-emnlp.1000/), allowing the model to gain outstanding Bulgarian cultural and linguistic capabilities while retaining its English performance.

During the pre-training stage, we used various datasets, including Bulgarian web crawl data, freely available datasets such as Wikipedia, a range of specialized Bulgarian datasets sourced by the INSAIT Institute, and machine translations of popular English datasets.

The model was then instruction-fine-tuned on a newly constructed Bulgarian instruction dataset created using real-world conversations.

For more information, check our [blog post](https://models.bggpt.ai/blog/).
|
|
|
# Benchmarks and Results |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fefdc282708115868203aa/9pp8aD1yvoW-cJWzhbHXk.png) |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fefdc282708115868203aa/33CjjtmCeAcw5qq8DEtJj.png) |
|
|
|
We evaluate our models on a set of standard English benchmarks, on translated versions of them in Bulgarian, and on Bulgarian-specific benchmarks we collected:
|
|
|
- **Winogrande challenge**: testing world knowledge and understanding |
|
- **Hellaswag**: testing sentence completion |
|
- **ARC Easy/Challenge**: testing logical reasoning |
|
- **TriviaQA**: testing trivia knowledge |
|
- **GSM-8k**: solving grade-school mathematics word problems
|
- **Exams**: solving high school problems from natural and social sciences |
|
- **MON**: contains exams across various subjects for grades 4 to 12 |
|
|
|
These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at [insait-institute/lm-evaluation-harness-bg](https://github.com/insait-institute/lm-evaluation-harness-bg).
|
The graphs above show the performance of BgGPT 2.6B compared to other small open language models such as Microsoft's Phi 3.5 and Alibaba's Qwen 2.5 3B. |
|
The BgGPT model not only surpasses them, but also **retains English performance** inherited from the original Google Gemma 2 models upon which it is based. |
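
As an illustration, a single Bulgarian task from the linked evaluation harness could be run as follows. This is a hedged sketch that assumes the fork keeps the upstream `lm_eval.simple_evaluate` API; the task name `hellaswag_bg` is a placeholder to be checked against the repository:

```python
import lm_eval

# Hypothetical invocation of the Bulgarian evaluation harness fork.
# "hellaswag_bg" is a placeholder task name -- check the repository for real ones.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0,dtype=bfloat16",
    tasks=["hellaswag_bg"],
    batch_size=8,
)
print(results["results"])
```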
|
|
|
# Use in 🤗 Transformers |
|
First install the latest version of the transformers library: |
|
```bash
pip install -U 'transformers[torch]'
```
|
Then load the model in transformers: |
|
```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",  # Gemma 2 does not support flash attention
    device_map="auto",
)
```
|
|
|
# Recommended Parameters |
|
|
|
For optimal performance, we recommend the following parameters for text generation, as we have extensively tested our model with them: |
|
|
|
```python
from transformers import GenerationConfig

generation_params = GenerationConfig(
    max_new_tokens=2048,  # choose the maximum number of generated tokens
    temperature=0.1,
    top_k=25,
    top_p=1,
    repetition_penalty=1.1,
    eos_token_id=[1, 107],  # <eos> and <end_of_turn>
)
```
|
|
|
In principle, higher temperature values should also work adequately.
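
For example, a more exploratory configuration might look as follows; the values are illustrative, not tested recommendations:

```python
# Illustrative only: a higher-temperature variant for more varied output.
# These values are assumptions, not tested recommendations.
creative_params = GenerationConfig(
    max_new_tokens=2048,
    temperature=0.7,        # higher temperature gives more diverse completions
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.1,
    eos_token_id=[1, 107],  # <eos> and <end_of_turn>
)
```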
|
|
|
# Instruction format |
|
|
|
To leverage the instruction fine-tuning, your prompt should begin with the beginning-of-sequence token `<bos>` and follow the Gemma 2 chat template. `<bos>` must appear only once, as the first token of the chat sequence.
|
|
|
For example (the prompt asks "When was Sofia University founded?"):
|
``` |
|
<bos><start_of_turn>user |
|
Кога е основан Софийският университет?<end_of_turn> |
|
<start_of_turn>model |
|
|
|
``` |
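
If you construct this string manually, note that Gemma's tokenizer prepends `<bos>` automatically, so the token should not be written into the string itself. A minimal sketch, assuming the tokenizer's default add-BOS behaviour:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0")

# The tokenizer prepends <bos> automatically (add_bos_token=True by default),
# so the string below starts directly with the first turn.
prompt = (
    "<start_of_turn>user\n"
    "Кога е основан Софийският университет?<end_of_turn>\n"
    "<start_of_turn>model\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
```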
|
|
|
This format is also available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method: |
|
|
|
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0",
    use_default_system_prompt=False,
)

messages = [
    # "When was Sofia University founded?"
    {"role": "user", "content": "Кога е основан Софийският университет?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    return_dict=True,
)

outputs = model.generate(
    **input_ids,
    generation_config=generation_params,
)
print(tokenizer.decode(outputs[0]))
```
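
Optionally, the reply can be streamed token by token with transformers' built-in `TextStreamer`; a small sketch reusing the objects defined above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated instead of waiting for the full reply.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **input_ids,
    generation_config=generation_params,
    streamer=streamer,
)
```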
|
|
|
**Important Note:** Models based on Gemma 2, such as BgGPT-Gemma-2-2.6B-IT-v1.0, do not support flash attention; using it results in degraded performance. This is why the loading example above sets `attn_implementation="eager"`.
|
|
|
# Use with GGML / llama.cpp |
|
|
|
The model and instructions for usage in GGUF format are available at [INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF](https://huggingface.co/INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF). |
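
For instance, a GGUF build could be loaded through the `llama-cpp-python` bindings. This is a minimal sketch, and the quantization filename pattern is an assumption to be checked against the GGUF repository:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The filename pattern is an assumption -- check the GGUF repository for actual files.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF",
    filename="*Q4_K_M.gguf",  # hypothetical quantization choice
    n_ctx=2048,
)
reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Кога е основан Софийският университет?"}],
    temperature=0.1,
)
print(reply["choices"][0]["message"]["content"])
```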
|
|
|
# Community Feedback |
|
|
|
We welcome feedback from the community to help improve BgGPT. If you have suggestions, encounter any issues, or have ideas for improvements, please: |
|
- Share your experience using the model through Hugging Face's community discussion feature or |
|
- Contact us at [[email protected]](mailto:[email protected]) |
|
|
|
Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases. |
|
|
|
# Summary |
|
- **Finetuned from:** [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) and [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b)
|
- **Model type:** Causal decoder-only transformer language model |
|
- **Language:** Bulgarian and English |
|
- **Contact:** [[email protected]](mailto:[email protected]) |
|
- **License:** BgGPT is distributed under the [Gemma Terms of Use](https://huggingface.co/INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0/raw/main/LICENSE)
|
|