dreamerdeo committed
Commit b3af2ba · verified · 1 Parent(s): 1294477

Update README.md

Files changed (1):
  1. README.md +49 -21

README.md CHANGED
@@ -61,37 +61,65 @@ Through systematic experiments to determine the weights of different languages,
 The approach boosts their performance on SEA languages while maintaining proficiency in English and Chinese without significant compromise.
 Finally, we continually pre-train the Qwen1.5-0.5B model with 400 Billion tokens, and other models with 200 Billion tokens to obtain the Sailor models.
 
-## Requirements
-The code of Sailor has been in the latest Hugging face transformers and we advise you to install `transformers>=4.37.0`.
-
-## Quickstart
-
-Here provides a code snippet to show you how to load the tokenizer and model and how to generate contents.
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-device = "cuda" # the device to load the model
-
-model = AutoModelForCausalLM.from_pretrained("sail/Sailor-7B", device_map="auto")
-tokenizer = AutoTokenizer.from_pretrained("sail/Sailor-7B")
-
-input_message = "Model bahasa adalah model probabilistik"
-### The given Indonesian input translates to 'A language model is a probabilistic model of.'
-
-model_inputs = tokenizer([input_message], return_tensors="pt").to(device)
-
-generated_ids = model.generate(
-    model_inputs.input_ids,
-    max_new_tokens=64
-)
-
-generated_ids = [
-    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
-]
-
-response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-print(response)
-```
+### How to run with `llama.cpp`
+
+```shell
+# install and build llama.cpp
+git clone https://github.com/ggerganov/llama.cpp.git
+cd llama.cpp
+make
+pip install -r requirements.txt
+
+# generate with llama.cpp
+# the prompt "Cara memanggang ikan?" is Indonesian for "How to grill fish?"
+./main -ngl 40 -m ggml-model-Q4_K_M.gguf -p "<|im_start|>question\nCara memanggang ikan?\n<|im_start|>answer\n" --temp 0.7 --repeat_penalty 1.1 -n 400 -e
+```
+
+> Change `-ngl 40` to the number of layers to offload to the GPU. Remove it if you don't have GPU acceleration.
 
78
 
79
+ ### How to run with `llama-cpp-python`
 
80
 
81
+ ```shell
82
+ pip install llama-cpp-python
83
+ ```
84
 
85
+ ```python
86
+ import llama_cpp
87
+ import llama_cpp.llama_tokenizer
88
+
89
+ # load model
90
+ llama = llama_cpp.Llama.from_pretrained(
91
+ repo_id="sail/Sailor-4B-Chat-gguf",
92
+ filename="ggml-model-Q4_K_M.gguf",
93
+ tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained("sail/Sailor-4B-Chat"),
94
+ n_gpu_layers=40,
95
+ n_threads=8,
96
+ verbose=False,
97
  )
98
 
99
+ system_role= 'system'
100
+ user_role = 'question'
101
+ assistant_role = "answer"
102
+
103
+ system_prompt= \
104
+ 'You are an AI assistant named Sailor created by Sea AI Lab. \
105
+ Your answer should be friendly, unbiased, faithful, informative and detailed.'
106
+ system_prompt = f"<|im_start|>{system_role}\n{system_prompt}<|im_end|>"
107
+
108
+ # inference example
109
+ output = llama(
110
+ system_prompt + '\n' + f"<|im_start|>{user_role}\nCara memanggang ikan?\n<|im_start|>{assistant_role}\n",
111
+ max_tokens=256,
112
+ temperature=0.7,
113
+ top_p=0.75,
114
+ top_k=60,
115
+ stop=["<|im_end|>", "<|endoftext|>"]
116
+ )
117
 
118
+ print(output['choices'][0]['text'])
 
119
  ```
+
+### How to build demo
+
+Install `llama-cpp-python` and `gradio`, then run the [demo script](https://github.com/sail-sg/sailor-llm/blob/main/demo/llamacpp_demo.py).
 
 # License
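
Both of the added snippets build the same chat prompt by hand. For readers who want to check the template, the string construction can be factored into a small standalone helper; `build_prompt` below is a hypothetical name, not part of the Sailor repo or the llama-cpp-python API — just a minimal sketch of the format the commands in this diff assume.

```python
# Minimal sketch of the Sailor chat prompt format used in the examples above.
# `build_prompt` is a hypothetical helper, not a released API.

SYSTEM_ROLE = "system"
USER_ROLE = "question"
ASSISTANT_ROLE = "answer"

SYSTEM_PROMPT = (
    "You are an AI assistant named Sailor created by Sea AI Lab. "
    "Your answer should be friendly, unbiased, faithful, informative and detailed."
)

def build_prompt(question: str) -> str:
    """Assemble the plain-text prompt expected by the Sailor chat models."""
    system = f"<|im_start|>{SYSTEM_ROLE}\n{SYSTEM_PROMPT}<|im_end|>"
    # the generation stops at "<|im_end|>" / "<|endoftext|>", as in the snippet above
    return system + "\n" + f"<|im_start|>{USER_ROLE}\n{question}\n<|im_start|>{ASSISTANT_ROLE}\n"

# "Cara memanggang ikan?" is Indonesian for "How to grill fish?"
print(build_prompt("Cara memanggang ikan?"))
```

The resulting string can be passed directly as the `-p` argument to `./main` or as the first argument of the `llama(...)` call above.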