Hemanth-thunder committed
Commit
6acf238
1 Parent(s): 8d97901

Update README.md

Files changed (1)
  1. README.md +16 -20
README.md CHANGED
@@ -38,16 +38,30 @@ Tamil LLM: A Breakthrough in Tamil Language Understanding In the realm of langua
 
 ## Instruction format
 
-In order to leverage instruction fine-tuning, your prompt should be surrounded by `[INST]` and `[/INST]` tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id.
-
+To harness the power of instruction fine-tuning, your prompt must be encapsulated within <s> and </s> tokens. This instructional format revolves around three key elements: Instruction, Input, and Response. The Tamil Mistral instruct model is adept at engaging in conversations based on this structured template.
 E.g.
+
+
 ```
+# without Input
+prompt_template =<s>"""சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, தேவையான தகவலை உள்ளிடவும்.
+
+### Instruction:
+{}
+
+### Response:"""
+
+# with Input
 prompt_template =<s>"""சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, வழங்கப்பட்ட வழிகாட்டுதல்களைப் பின்பற்றி, தேவையான தகவலை உள்ளிடவும்.
 
 ### Instruction:
 {}
 
+### Input:
+{}
+
 ### Response:"""
+
 ```
 
 This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:
@@ -55,25 +69,7 @@ This format is available as a [chat template](https://huggingface.co/docs/transf
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-device = "cuda" # the device to load the model onto
-
-model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
-tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
-
-messages = [
-    {"role": "user", "content": "What is your favourite condiment?"},
-    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
-    {"role": "user", "content": "Do you have mayonnaise recipes?"}
-]
-
-encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
-
-model_inputs = encodeds.to(device)
-model.to(device)
 
-generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
-decoded = tokenizer.batch_decode(generated_ids)
-print(decoded[0])
 ```
 ## Python function to format query
 ```python
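
The two added templates can be filled with Python's positional `str.format`, which is presumably what the "Python function to format query" section that follows this diff does. A minimal sketch — the template strings are taken verbatim from the diff (with the `<s>` token moved inside the string literal so the snippet runs), while the function name `format_query` and the example instruction are illustrative assumptions, not part of the commit:

```python
# Templates from the README diff; <s> is placed inside the string so it parses.
PROMPT_TEMPLATE = """<s>சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, தேவையான தகவலை உள்ளிடவும்.

### Instruction:
{}

### Response:"""

PROMPT_TEMPLATE_WITH_INPUT = """<s>சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, வழங்கப்பட்ட வழிகாட்டுதல்களைப் பின்பற்றி, தேவையான தகவலை உள்ளிடவும்.

### Instruction:
{}

### Input:
{}

### Response:"""


def format_query(instruction, input_text=None):
    """Fill the instruct template, picking the with- or without-Input variant."""
    if input_text is None:
        return PROMPT_TEMPLATE.format(instruction)
    return PROMPT_TEMPLATE_WITH_INPUT.format(instruction, input_text)


# Example instruction ("What is the capital of Tamil Nadu?") is illustrative.
print(format_query("தமிழ்நாட்டின் தலைநகரம் எது?"))
```

The resulting string ends at `### Response:`, so the model's generation continues directly from there.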