Commit
·
8936db8
1
Parent(s):
4248343
layer renames
Browse files
__pycache__/modeling_minimamba.cpython-312.pyc
CHANGED
Binary files a/__pycache__/modeling_minimamba.cpython-312.pyc and b/__pycache__/modeling_minimamba.cpython-312.pyc differ
|
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:395592f9d4d414560ff89dfc7ce95dd4bf258d66daac079b1672923e27d76270
|
3 |
+
size 3065241488
|
modeling_minimamba.py
CHANGED
@@ -52,7 +52,7 @@ class MiniMamba(PreTrainedModel):
|
|
52 |
# But Mamba2 does that internally if config.weight_tying == True.
|
53 |
|
54 |
# This is optional: store any device or dtype you might want
|
55 |
-
self.device_ = torch.
|
56 |
if isinstance(config.torch_dtype, str):
|
57 |
self.dtype_ = getattr(torch, config.torch_dtype)
|
58 |
else:
|
|
|
52 |
# But Mamba2 does that internally if config.weight_tying == True.
|
53 |
|
54 |
# This is optional: store any device or dtype you might want
|
55 |
+
self.device_ = 'cuda' if torch.cuda.is_available() else 'cpu'
|
56 |
if isinstance(config.torch_dtype, str):
|
57 |
self.dtype_ = getattr(torch, config.torch_dtype)
|
58 |
else:
|