nordenxgt committed · Commit 11a59de · verified · 1 Parent(s): 2df3cbc

Update README.md

Files changed (1): README.md (+53, -1)
README.md CHANGED
@@ -20,4 +20,56 @@ Directly quantized 4bit model with bitsandbytes. Built with Meta Llama 3. By Uns
  - **Developed by:** Norden Ghising Tamang under DarviLab Pvt. Ltd
  - **Model type:** Transformer-based language model
  - **Language(s) (NLP):** Nepali
- - **License:** A custom commercial license is available at: https://llama.meta.com/llama3/license
+ - **License:** A custom commercial license is available at: https://llama.meta.com/llama3/license
+
+ ## How To Use
+
+ ### Using Hugging Face's AutoPeftModelForCausalLM
+
+ ```python
+ from peft import AutoPeftModelForCausalLM
+ from transformers import AutoTokenizer
+
+ model = AutoPeftModelForCausalLM.from_pretrained(
+     "nordenxgt/nelm-chat-unsloth-llama3-v.0.0.1",
+     load_in_4bit=True,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("nordenxgt/nelm-chat-unsloth-llama3-v.0.0.1")
+ ```
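+
+ The loaded `model` and `tokenizer` can be used for generation right away. A minimal sketch (the prompt and generation settings below are illustrative, not part of the original card):
+
+ ```python
+ # Tokenize a Nepali prompt and move it to the GPU the 4-bit model lives on.
+ inputs = tokenizer("गौतम बुद्धको जन्म कुन देशमा भएको थियो?", return_tensors="pt").to("cuda")
+
+ # Generate up to 64 new tokens and decode, skipping special tokens.
+ outputs = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
+ ```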
+
+ ### Using UnslothAI [2x Faster Inference]
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="nordenxgt/nelm-chat-unsloth-llama3-v.0.0.1",
+     max_seq_length=2048,
+     dtype=None,  # auto-detect: float16 on older GPUs, bfloat16 on Ampere+
+     load_in_4bit=True,
+ )
+ FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode
+ ```
+
+ ```python
+ alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ {}
+
+ ### Input:
+ {}
+
+ ### Response:
+ {}"""
+
+ inputs = tokenizer(
+     [
+         alpaca_prompt.format(
+             "गौतम बुद्धको जन्म कुन देशमा भएको थियो?",  # instruction: "In which country was Gautam Buddha born?"
+             "",  # input
+             "",  # output - leave this blank for generation!
+         )
+     ], return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
+ tokenizer.batch_decode(outputs)
+ ```
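+
+ To print tokens as they are generated rather than decoding the whole batch at the end, the `TextStreamer` from `transformers` can be passed to `generate`. A sketch, assuming the same `model`, `tokenizer`, and `inputs` as above:
+
+ ```python
+ from transformers import TextStreamer
+
+ # Stream decoded tokens to stdout as they are produced, hiding the prompt.
+ streamer = TextStreamer(tokenizer, skip_prompt=True)
+ _ = model.generate(**inputs, streamer=streamer, max_new_tokens=64, use_cache=True)
+ ```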