This model uses the Mamba architecture and was trained on a dataset of research abstracts.

  • Optimizer: AdamW
  • Learning Rate: 0.001
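
The settings above could be reproduced in PyTorch roughly as follows. This is a minimal sketch, not the original training script: the `nn.Linear` module is only a stand-in for the Mamba model, and every hyperparameter other than the learning rate is assumed to be the PyTorch default.

```python
import torch
from torch import nn

# Stand-in module for illustration only; the real model is the Mamba class
# imported from the code folder.
model = nn.Linear(16, 16)

# AdamW with the learning rate listed above. Betas, eps, and weight decay
# are left at the PyTorch defaults, which is an assumption.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
```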

Import the model classes from the scripts in the code folder

from model import Mamba, ModelArgs

Loading Model

mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")

Loading Tokenizer

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')

The mamba_reserach file contains the state dicts of the optimizer and the model.
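
A checkpoint that bundles both state dicts is usually saved and restored along these lines. This is a hedged sketch of the general PyTorch pattern: the key names `model_state_dict` and `optimizer_state_dict`, the file name `checkpoint.pt`, and the `nn.Linear` stand-in are all hypothetical, so inspect the keys of the real file before relying on them.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-ins for illustration; in practice these would be the Mamba model
# and the AdamW optimizer used for training.
model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Hypothetical checkpoint layout: these key names are assumptions,
# not read from the real file.
checkpoint = {
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pt")  # hypothetical file name
    torch.save(checkpoint, path)

    # map_location="cpu" lets the file load on machines without a GPU.
    restored = torch.load(path, map_location="cpu")

# Restore both state dicts into freshly built objects.
new_model = nn.Linear(8, 8)
new_optimizer = torch.optim.AdamW(new_model.parameters(), lr=1e-3)
new_model.load_state_dict(restored["model_state_dict"])
new_optimizer.load_state_dict(restored["optimizer_state_dict"])
```

Restoring the optimizer state alongside the weights matters mainly when resuming training; for inference, the model state dict alone is enough.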
