Branch global_step95000_universal: are there no weight files for layer 0 and layer 1?
There seem to be no weights for layer 0 and layer 1.
I couldn't find the following model files:
1.input_layernorm.bias
1.input_layernorm.weight
1.mlp.dense_4h_to_h.bias
1.mlp.dense_4h_to_h.weight
1.mlp.dense_h_to_4h.bias
1.mlp.dense_h_to_4h.weight
1.post_attention_layernorm.bias
1.post_attention_layernorm.weight
1.self_attention.dense.bias
1.self_attention.dense.weight
1.self_attention.query_key_value.bias
1.self_attention.query_key_value.weight
@Muennighoff
@stas
Not 100% sure, but I think it's because 1 is the tied embeddings, which are not stored under a number but under tied_modules.
The model file for layer.10 is about 27 GB (the same as layer.11), but tied_modules is only 42 GB.
layer.0 + layer.1 = 27 GB x 2 = 54 GB.
So I don't think tied_modules includes layer 0 and layer 1. Is that correct?
@Muennighoff
I don't think there is a layer.0 (see https://huggingface.co/bigscience/bloom-optimizer-states/tree/main/global_step95000). The numbering is a bit weird because it includes layers that have no parameters, hence some numbers are missing. Just try loading it and you will see if something is missing; I think it should work.
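If it helps before attempting a full load, here is a rough sketch of how one might check which numeric prefixes are absent from a list of parameter names. It assumes names look like the ones quoted above (`<index>.<parameter name>`). The helper names `group_by_layer_index` and `missing_indices` and the sample names are hypothetical, made up for illustration. Note that a gap in the numbering does not by itself mean weights are missing, since some numbered layers may have no parameters.

```python
import re
from collections import defaultdict


def group_by_layer_index(names):
    """Group parameter names by their leading integer prefix.

    '3.input_layernorm.weight' is stored under key 3.
    Names without a numeric prefix (e.g. tied modules) are skipped.
    """
    groups = defaultdict(list)
    for name in names:
        match = re.match(r"(\d+)\.(.+)", name)
        if match:
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


def missing_indices(groups):
    """Return the indices between 0 and the largest index seen that have no entries."""
    if not groups:
        return []
    present = set(groups)
    return [i for i in range(max(present) + 1) if i not in present]


# Hypothetical sample names, not an actual listing of the checkpoint.
names = [
    "3.input_layernorm.weight",
    "3.mlp.dense_h_to_4h.weight",
    "4.input_layernorm.weight",
    "tied_modules.embed.word_embeddings.weight",
]

groups = group_by_layer_index(names)
print(missing_indices(groups))  # [0, 1, 2]
```

In practice `names` could be filled from a directory or file listing of the checkpoint; how the names map to files on disk depends on the checkpoint layout, so that part is left out here.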