Xenova (HF Staff) committed
Commit 5dfe498 · verified · 1 Parent(s): da2decb

Update README.md

Files changed (1):
  1. README.md (+52, −0)
README.md CHANGED
@@ -8,6 +8,58 @@ tags: []
  <!-- Provide a quick summary of what the model is/does. -->


+ ## Conversion code
+
+ ```py
+ from transformers import (
+     AutoProcessor,
+     Llama4ForConditionalGeneration,
+     Llama4VisionConfig,
+     Llama4TextConfig,
+     Llama4Config,
+ )
+ import torch
+
+ torch.manual_seed(0)
+
+ model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"
+ torch_dtype = torch.bfloat16  # or torch.float32
+
+ # Shrink every dimension of the real Scout config so the randomly
+ # initialized model stays tiny (~6.6M parameters).
+ intermediate_size_mlp = 64
+ config = Llama4Config.from_pretrained(
+     model_id,
+     text_config=Llama4TextConfig.from_pretrained(
+         model_id,
+         head_dim=8,
+         hidden_size=16,
+         intermediate_size=32,
+         intermediate_size_mlp=intermediate_size_mlp,
+         moe_layers=[0, 1, 2, 3, 4],
+         no_rope_layers=[1, 1, 1, 0, 1],
+         num_attention_heads=10,
+         num_experts_per_tok=1,
+         num_hidden_layers=5,
+         num_key_value_heads=2,
+         num_local_experts=4,
+     ),
+     vision_config=Llama4VisionConfig.from_pretrained(
+         model_id,
+         hidden_size=16,
+         intermediate_size=intermediate_size_mlp,
+         num_attention_heads=4,
+         num_hidden_layers=2,
+         projector_input_dim=128,
+         projector_output_dim=128,
+         vision_output_dim=128,
+     ),
+ )
+
+ # Instantiate from the config only, so the weights are randomly initialized.
+ model = Llama4ForConditionalGeneration(config).to(torch_dtype)
+ print(model.num_parameters())  # 6571696
+
+ # Reuse the original model's processor/tokenizer unchanged.
+ processor = AutoProcessor.from_pretrained(model_id)
+ ```
+

  ## Model Details
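
The script above builds a randomly initialized, ~6.6M-parameter stand-in for Llama-4 Scout, useful for smoke-testing loading and generation code without downloading the real weights. As a minimal sketch of exercising it in the same Python session (the prompt and generation settings here are illustrative assumptions, not part of the commit):

```py
# Smoke test: reuses `model` and `processor` from the conversion script.
# The weights are untrained, so the generated text is meaningless noise.
messages = [
    {"role": "user", "content": [{"type": "text", "text": "Hello!"}]},
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(processor.batch_decode(output_ids[:, inputs["input_ids"].shape[-1]:]))
```

From there, `model.save_pretrained(...)` and `processor.save_pretrained(...)` would persist the tiny checkpoint locally before uploading it.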