---
license: llama2
---

* indo-instruct-llama2-32k model card
* Model Details
  * Developed by: monuminu
  * Backbone Model: LLaMA-2
  * Language(s): English
  * Library: HuggingFace Transformers
  * License: Fine-tuned checkpoints are licensed under the Creative Commons Attribution-NonCommercial 4.0 license (CC BY-NC-4.0)
  * Where to send comments: To give feedback or comments on the model, open an issue in the model's Hugging Face community repository
  * Contact: For questions and comments about the model
* Dataset Details
  * Used Datasets
    * alpaca dataset

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and the model from the same repository so their vocabularies match.
tokenizer = AutoTokenizer.from_pretrained("monuminu/indo-instruct-llama2-32k")
model = AutoModelForCausalLM.from_pretrained(
    "monuminu/indo-instruct-llama2-32k",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # 8-bit loading requires the bitsandbytes package
)

prompt = "### User:\nThomas is healthy, but he has to go to the hospital. What could be the reasons?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# generate() rejects token_type_ids for LLaMA models, so drop the key if the tokenizer emitted it.
inputs.pop("token_type_ids", None)

# Print decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# max_new_tokens must be a finite integer; 512 is an arbitrary cap, adjust as needed.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
```
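
The usage snippet above hard-codes its prompt in a `### User:` / `### Assistant:` layout. The helper below is a minimal sketch for building such prompts consistently. It assumes the model expects exactly this single-turn layout, since this card does not document a system prompt or a multi-turn format, and `build_prompt` is a hypothetical name rather than part of any library.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the '### User:' / '### Assistant:' layout.

    Assumption: the single-turn layout is copied from the usage snippet in
    this card; no system prompt or multi-turn format is documented.
    """
    message = user_message.strip()
    if not message:
        raise ValueError("user_message must not be empty")
    # The trailing newline after '### Assistant:' leaves the model to
    # continue directly with its answer.
    return f"### User:\n{message}\n\n### Assistant:\n"


if __name__ == "__main__":
    print(build_prompt("Thomas is healthy, but he has to go to the hospital. What could be the reasons?"))
```

The returned string can be passed to the tokenizer in place of the hard-coded `prompt` value.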