---
license: cc
language:
- fa
- en
library_name: transformers
tags:
- text-generation-inference
inference: false
metrics:
- bleu
- comet
- accuracy
- perplexity
- spearmanr
pipeline_tag: text-generation
co2_eq_emissions:
  emissions: 232380
---

<img src="PersianMind.jpg" alt="PersianMind logo" width=300/>

# PersianMind

PersianMind is a cross-lingual Persian-English large language model.

### Model Description

- **Developed by:** Pedram Rostami, Ali Salemi, and Mohammad Javad Dousti
- **Model type:** Language model
- **Languages:** English and Persian
- **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)

## How to Get Started with the Model

Use the code below to get started with the model. Note that you need to install the `sentencepiece` and `accelerate` libraries to run this code.
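
Before loading the model, a quick check along these lines can confirm that the required packages are importable. This is a minimal sketch, not part of the original instructions: the `REQUIRED` list is an assumption that combines the two libraries named above with `transformers` and `torch`, which the example imports directly.

```python
import importlib.util

# Assumed package list: the two libraries named above plus the ones the example imports.
REQUIRED = ["transformers", "torch", "sentencepiece", "accelerate"]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```
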

```python
from transformers import LlamaTokenizer, LlamaForCausalLM
import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = LlamaForCausalLM.from_pretrained(
    "universitytehran/PersianMind-v1.0",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map={"": device},  # place the whole model on a single device
)
tokenizer = LlamaTokenizer.from_pretrained(
    "universitytehran/PersianMind-v1.0",
)

# Prompt format: a context description followed by one "You:" / "PersianMind:" turn.
TEMPLATE = "{context}\nYou: {prompt}\nPersianMind: "
CONTEXT = "This is a conversation with PersianMind. It is an artificial intelligence model designed by a team of " \
    "NLP experts at the University of Tehran to help you with various tasks such as answering questions, " \
    "providing recommendations, and helping with decision making. You can ask it anything you want and " \
    "it will do its best to give you accurate and relevant information."
# The prompt below is Persian for "Explain artificial intelligence."
PROMPT = "در مورد هوش مصنوعی توضیح بده."

model_input = TEMPLATE.format(context=CONTEXT, prompt=PROMPT)
input_tokens = tokenizer(model_input, return_tensors="pt")
input_tokens = input_tokens.to(device)
# Greedy decoding (no sampling) with a mild repetition penalty.
generate_ids = model.generate(**input_tokens, max_new_tokens=512, do_sample=False, repetition_penalty=1.1)
model_output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# Drop the echoed prompt so that only the model's reply remains.
model_output = model_output.replace(model_input, "")

print(model_output)
```
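
The prompt handling in the example can be factored into two small helpers. The following is a minimal sketch under stated assumptions: `build_prompt` and `extract_reply` are hypothetical names that do not appear in the model's repository, and the decoded string here is a hand-written stand-in, so no model download is needed to try it.

```python
# Same single-turn template as in the example above.
TEMPLATE = "{context}\nYou: {prompt}\nPersianMind: "


def build_prompt(context: str, prompt: str) -> str:
    """Fill the template with a context description and one user prompt."""
    return TEMPLATE.format(context=context, prompt=prompt)


def extract_reply(decoded: str, model_input: str) -> str:
    """Return only the generated text, dropping the echoed prompt if it leads the output."""
    if decoded.startswith(model_input):
        decoded = decoded[len(model_input):]
    return decoded.strip()


model_input = build_prompt("Short context.", "Hello!")
# Stand-in for the string that tokenizer.batch_decode would return.
fake_decoded = model_input + "Hi, how can I help?"
print(extract_reply(fake_decoded, model_input))
```

Removing the prompt only when it appears as a prefix leaves the reply intact even if the model happens to repeat part of the prompt later in its answer.
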

## License

PersianMind is subject to Meta's [Llama 2 Community License](https://raw.githubusercontent.com/facebookresearch/llama/main/LICENSE).
It is further licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/), which allows non-commercial use of the model.
Commercial use of this model requires a written agreement, which must be obtained from the copyright holders listed as developers on this page.
If you suspect any violations, please reach out to us.

## Citation

If you find this model helpful, please cite the following paper.

**BibTeX:**

```bibtex
@article{persianmind,
  title={{PersianMind: A Cross-Lingual Persian-English Large Language Model}},
  author={Rostami, Pedram and Salemi, Ali and Dousti, Mohammad Javad},
  journal={arXiv preprint},
  year={2024}
}
```