arunb74 committed · verified
Commit 91ba7e3 · 1 Parent(s): dd47d85

Update README.md

Files changed (1)
  1. README.md +85 -10
README.md CHANGED
@@ -15,21 +15,96 @@ Luxeai-anu-1-bit-70M
 
 ## Model Description
 The Luxeai-anu-1-bit-70M Large Language Model (LLM) is my first attempt at implementing a one-bit LLM, based on the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". I used the pre-trained Mistral-7B-v0.3 and the abideen/Cosmopedia-100k-pretrain dataset.
+ I trained this initial model on a Microsoft Azure Standard_NC6s_v3 instance (6 cores, 112 GB RAM, 736 GB storage, 1x NVIDIA Tesla V100). I will train on a much larger dataset once I get sponsorship for an 8x DGX system. I have tested on a subset of the same dataset.
 
 ## Intended Use
- - **Task**: Describe the specific tasks (e.g., sentiment analysis, text generation) the model is designed for.
- - **Industries**: Mention any particular industries or applications where the model could be applied.
- - **Users**: Identify the intended users (e.g., researchers, developers).
+ - **Task**: text generation
+
 
 ## How to Use
- Provide code examples for loading and using the model:
+ Please follow the code below to run and test the model in a Python Jupyter notebook.
 
 ```python
- from transformers import AutoModel, AutoTokenizer
-
- tokenizer = AutoTokenizer.from_pretrained("username/model_name")
- model = AutoModel.from_pretrained("username/model_name")
-
- # Example usage
- inputs = tokenizer("Hello, world!", return_tensors="pt")
- outputs = model(**inputs)
+ # Run this once in a separate notebook cell before the rest of the script:
+ #   !pip install transformers sentencepiece
+
+ import torch
+ from torch import nn
+ import torch.nn.functional as F
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+ from transformers.models.llama.modeling_llama import (
+     LlamaDecoderLayer, LlamaMLP, LlamaRMSNorm, LlamaSdpaAttention,
+ )
+
+ # Load the pretrained BitNet model
+ model_id = "arunb74/Luxeai-anu-1-bit-70M"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id)
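+
+ # Optional check (my addition, not in the original card): the checkpoint
+ # should hold roughly 70M parameters, as the model name suggests.
+ print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")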
+
+ # Absmax 8-bit activation quantization: scale each row so its largest
+ # magnitude maps to 127, round onto the int8 grid, then rescale back to
+ # floats. The input is already RMS-normalized inside forward() below.
+ def activation_quant(x):
+     scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp_(min=1e-5)
+     y = (x * scale).round().clamp_(-128, 127)
+     y = y / scale
+     return y
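+
+ # Illustration (my own toy example, not from the original card): for the row
+ # [0.1, -0.2, 0.4] the absmax is 0.4, so every entry is snapped to a multiple
+ # of 0.4/127. Try it with:
+ #   activation_quant(torch.tensor([[0.1, -0.2, 0.4]]))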
+
+ # BitNet b1.58 weight quantization: scale by the mean absolute weight, then
+ # round and clamp to the ternary set {-1, 0, +1} before rescaling.
+ def weight_quant(w):
+     scale = 1.0 / w.abs().mean().clamp_(min=1e-5)
+     u = (w * scale).round().clamp_(-1, 1)
+     u = u / scale
+     return u
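+
+ # Illustration (my own toy example): with w = torch.tensor([0.3, -0.8, 0.05])
+ # the mean absolute weight is about 0.383, so weight_quant(w) returns
+ # [0.383, -0.383, 0.0]; every entry ends up in {-m, 0, +m}, which is the
+ # ternary "1.58-bit" representation.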
+
+ # Drop-in replacement for nn.Linear that runs the BitNet quantized matmul.
+ class BitNetInference(nn.Linear):
+     def forward(self, x):
+         w = self.weight  # weight tensor with shape [out_features, in_features]
+         x = x.to(w.device)
+         rms_norm = LlamaRMSNorm(x.shape[-1]).to(w.device)
+         x_norm = rms_norm(x)
+         # A trick for implementing the Straight-Through Estimator (STE) using
+         # detach(): the forward pass sees the quantized values, while gradients
+         # flow through the full-precision tensors.
+         x_quant = x_norm + (activation_quant(x_norm) - x_norm).detach()
+         w_quant = w + (weight_quant(w) - w).detach()
+         y = F.linear(x_quant, w_quant)
+         return y
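+
+ # Quick shape check (my own example; CPU is fine for a single layer):
+ #   layer = BitNetInference(4, 2, bias=False)
+ #   layer(torch.randn(1, 4)).shape  # torch.Size([1, 2])
+ # Note that building a fresh LlamaRMSNorm on every call keeps this sketch
+ # short; a production version would create it once in __init__ and reuse it.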
+
+
+ # Walk the model and swap every nn.Linear inside the attention and MLP blocks
+ # for a BitNetInference layer, optionally copying over the pretrained weights.
+ # (The original card assumes a CUDA GPU at cuda:0.)
+ def convert_to_bitnet(model, copy_weights):
+     for name, module in model.named_modules():
+         # Replace linear layers with BitNet layers
+         if isinstance(module, (LlamaSdpaAttention, LlamaMLP)):
+             for child_name, child_module in module.named_children():
+                 if isinstance(child_module, nn.Linear):
+                     bitlinear = BitNetInference(
+                         child_module.in_features,
+                         child_module.out_features,
+                         child_module.bias is not None,
+                     ).to(device="cuda:0")
+                     if copy_weights:
+                         bitlinear.weight = child_module.weight
+                         if child_module.bias is not None:
+                             bitlinear.bias = child_module.bias
+                     setattr(module, child_name, bitlinear)
+         # Remove the now-redundant input_layernorms (BitNetInference applies
+         # its own RMSNorm)
+         elif isinstance(module, LlamaDecoderLayer):
+             for child_name, child_module in module.named_children():
+                 if isinstance(child_module, LlamaRMSNorm) and child_name == "input_layernorm":
+                     setattr(module, child_name, nn.Identity().to(device="cuda:0"))
+
+
+ convert_to_bitnet(model, copy_weights=True)
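+
+ # Optional sanity check (my addition, not part of the original card): confirm
+ # the linear layers were actually swapped.
+ n_bit = sum(isinstance(m, BitNetInference) for m in model.modules())
+ print(f"Converted {n_bit} linear layers to BitNetInference")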
+
+ # Create a text generation pipeline
+ pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     device_map="auto"
+ )
+
+ prompt = "The LISA Pathfinder scientific collaboration will meet in Trento"
+
+ # Sample a completion; the [INST] tags follow the Mistral instruction format
+ sequences = pipe(
+     f"<s>[INST] {prompt} [/INST]",
+     do_sample=True,
+     max_new_tokens=100,
+     temperature=0.7,
+     top_k=50,
+     top_p=0.95,
+     num_return_sequences=1,
+ )
+
+ print(sequences[0]['generated_text'])
+ ```
+
+ The output will be as follows:
+
+ ```
+ <s>[INST] The LISA Pathfinder scientific collaboration will meet in Trento [/INST]
+ The LISA Pathfinder Biology, a leading provider of biochemistry and molecular biology, provides a comprehensive understanding of the mechanisms and mechanisms of the LISA pathways. The LISA Pathfinder Biology, a researcher specializing in molecular biology, is a clinical trial of the disease, and its pathophysiology, and a combination of the most commonly used and widely used treatments. It is a relatively simple procedure that involves two steps.
+ ```
+
+ I need community members to help me with feedback, a suitable dataset for further training, and testing and evaluation.
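+
+ If you would like to help with evaluation, below is a minimal sketch of how the model could be scored on a slice of the pretraining data. It is only an outline, not the method used for this card: it assumes the `model` and `tokenizer` objects from the code above, and it assumes the abideen/Cosmopedia-100k-pretrain dataset exposes a `text` column.
+
+ ```python
+ # Rough perplexity probe over a few samples (sketch under the assumptions above)
+ from datasets import load_dataset
+
+ eval_rows = load_dataset("abideen/Cosmopedia-100k-pretrain", split="train[:16]")
+ losses = []
+ for row in eval_rows:
+     enc = tokenizer(row["text"], return_tensors="pt", truncation=True, max_length=512)
+     enc = {k: v.to(model.device) for k, v in enc.items()}
+     out = model(**enc, labels=enc["input_ids"])  # causal-LM cross-entropy
+     losses.append(out.loss.item())
+ mean_loss = sum(losses) / len(losses)
+ print("mean loss:", mean_loss)
+ print("perplexity:", torch.exp(torch.tensor(mean_loss)).item())
+ ```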