---
library_name: peft
base_model: mistralai/Mistral-7B-Instruct-v0.2
datasets:
- iamshnoo/alpaca-cleaned-bengali
language:
- bn
---

# Model Card for Rashik24/Mistral-Instruct-Bangla

Blog post: https://blog.rashik.sh/mistral-instruct-bangla-bridging-the-gap-in-bengali-ai

Rashik24/Mistral-Instruct-Bangla is a language model tailored for Bengali: a PEFT adapter fine-tuned from the mistralai/Mistral-7B-Instruct-v0.2 base model on the iamshnoo/alpaca-cleaned-bengali dataset. It is designed to understand and generate Bengali text, making it a valuable tool for a variety of natural language processing tasks in the Bengali language context.

## Uses

The Mistral-Instruct-Bangla model is intended for applications where understanding and generating Bengali text is crucial, including but not limited to machine translation, content creation, sentiment analysis, and language understanding tasks in Bengali. It is suited to both academic researchers and industry practitioners working on Bengali language processing.

### Direct Use

The model can be used directly for generating Bengali text, understanding Bengali context in conversations, and translating between Bengali and other languages. It requires minimal additional setup beyond the loading code shown in the next section; see the prompt-format note below for one practical consideration.
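
One practical point: the base Mistral-7B-Instruct-v0.2 model wraps user turns in `[INST]` tags via its chat template, while the fine-tuning dataset follows the Alpaca instruction format, and this card does not state which prompt template the adapter was trained with. The snippet below is therefore an illustrative sketch for comparing the two formats, not a documented recommendation.

```python
# Hypothetical comparison of prompt formats; which works better depends on how
# the adapter was actually trained. Compare outputs with generate_text() from
# the next section.
plain_prompt = "একটি গ্রামের বর্ণনা করুন।"  # "Describe a village." (raw prompt, as in the example below)
inst_prompt = f"[INST] {plain_prompt} [/INST]"  # base Mistral-Instruct chat format
```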

## How to Get Started with the Model

To start using the Rashik24/Mistral-Instruct-Bangla model, you can use the following code as a basic guide. This will help you integrate the model into your application or research project.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel


def load_model(adapter_name):
    # The tokenizer comes from the base model; pad with EOS on the right,
    # matching the fine-tuning setup.
    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    # Load the base model, then attach the fine-tuned PEFT adapter on top.
    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2").to("cuda")
    model = PeftModel.from_pretrained(model, adapter_name).to("cuda")
    model.eval()
    return model, tokenizer


def generate_text(prompt, model, tokenizer):
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=256, pad_token_id=tokenizer.eos_token_id
        )
    generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(generated_text)
    return generated_text


# Load the model and run a sample Bengali prompt.
model_name = "Rashik24/Mistral-Instruct-Bangla"
model, tokenizer = load_model(model_name)

prompt = "একটি গ্রামের বর্ণনা করুন।"  # "Describe a village."
generated_text = generate_text(prompt, model, tokenizer)
```
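
As written, the example above loads the 7B base model in float32, which takes roughly 28 GB of GPU memory; passing `torch_dtype=torch.float16` to `from_pretrained` halves that. If memory is still tight, the base model can be quantized at load time. The following is a minimal sketch assuming the optional bitsandbytes package is installed; it is not part of this card's documented workflow.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Assumption: bitsandbytes is installed (pip install bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s) automatically
)
model = PeftModel.from_pretrained(base, "Rashik24/Mistral-Instruct-Bangla")
model.eval()
```

For repeated full-precision inference, PEFT's `merge_and_unload()` can fold the adapter weights into the base model and remove the adapter overhead at generation time; merging into quantized weights has caveats, so treat it as a full-precision option.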

## Training Details

### Training Data

The model has been trained on the iamshnoo/alpaca-cleaned-bengali dataset.

For more details on the training data and methodology, refer to the dataset card: https://huggingface.co/datasets/iamshnoo/alpaca-cleaned-bengali
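
If you want to inspect the training data, it can be loaded with the `datasets` library. A minimal sketch; the split and column names below are assumptions, so check the dataset card rather than relying on them:

```python
from datasets import load_dataset

# Load the Bengali Alpaca dataset used for fine-tuning.
ds = load_dataset("iamshnoo/alpaca-cleaned-bengali")
print(ds)              # shows the available splits and columns
print(ds["train"][0])  # inspect one example (assumes a "train" split)
```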