LLamaStory-70M is a Llama model pre-trained on a story-generation dataset.

About Training:

  • Trained with the EasyDel platform
  • Hardware: TPU-v4
  • Batch size: 2048
  • Max position embeddings: 512
  • 12 epochs (so far)
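
As a rough illustration of the settings above, the sketch below computes an upper bound on the number of tokens processed per optimizer step. The `TrainingSetup` class and its field names are hypothetical and are not part of EasyDel; the bound assumes every sequence in the batch is filled to the maximum position length, which may not hold in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container for the values listed in the model card."""
    batch_size: int = 2048
    max_position_embeddings: int = 512
    epochs: int = 12  # epochs completed so far, per the model card

    def max_tokens_per_step(self) -> int:
        # Upper bound: assumes every sequence is padded or packed to full length.
        return self.batch_size * self.max_position_embeddings


setup = TrainingSetup()
print(setup.max_tokens_per_step())  # 1048576
```

With these values, a single step covers at most about one million tokens.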

This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
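
To make the idea of 8-bit inference concrete, here is a minimal sketch of symmetric absmax quantization in plain Python. It is a generic textbook technique, not EasyDel's implementation; the function names `quantize_int8` and `dequantize_int8` and the sample weights are made up for illustration. A round trip like this is a common way to sanity-check quantization error on a small model.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (absmax)."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 127.0 if absmax > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Illustrative weights only; these are not taken from the model.
weights = [0.02, -0.5, 0.31, 0.0, 0.127]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error <= scale / 2 + 1e-12)
```

A 4-bit variant would follow the same pattern with a narrower integer range, usually with one scale per small group of weights rather than one per tensor.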

Model size: 70.5M params (Safetensors)
Tensor type: FP16
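
The parameter count and tensor type give a quick estimate of the weight footprint. The sketch below is back-of-the-envelope arithmetic only: it treats 70.5M as an exact count and ignores quantization scales, file headers, and any other metadata, so real file sizes will differ somewhat. The helper name `weight_megabytes` is hypothetical.

```python
PARAMS = 70_500_000  # approximate count taken from the model card


def weight_megabytes(num_params, bits_per_param):
    """Raw weight storage in megabytes (10**6 bytes), with no overhead."""
    return num_params * bits_per_param / 8 / 1_000_000


for label, bits in (("FP16", 16), ("8-bit", 8), ("4-bit", 4)):
    print(f"{label}: {weight_megabytes(PARAMS, bits):.2f} MB")
```

Under these assumptions the weights take about 141 MB in FP16, about 70.5 MB at 8 bits, and about 35.25 MB at 4 bits.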

Dataset used to train erfanzar/LLamaStory-70M