LLamaStory-70M is a LLaMA model pre-trained on a story-generation dataset.

About Training:

Platform: EasyDel
Hardware: TPU-v4
Batch size: 2048
Max position embeddings: 512
Epochs: 12 (so far)
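
To give a sense of scale for these settings, here is a rough sketch of the per-step token budget. It assumes the batch size counts sequences and that every sequence is filled to the 512-position limit, neither of which is stated above, so treat the result as an upper bound rather than a measured figure. The function name is made up for illustration.

```python
# Values taken from the training notes above.
BATCH_SIZE = 2048
MAX_POSITION_EMBEDDINGS = 512


def max_tokens_per_step(batch_size: int, seq_len: int) -> int:
    """Upper bound on tokens in one batch, assuming every sequence is full length."""
    return batch_size * seq_len


tokens_per_step = max_tokens_per_step(BATCH_SIZE, MAX_POSITION_EMBEDDINGS)
print(tokens_per_step)  # 1048576, i.e. 2**20 tokens
```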

This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
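
For readers unfamiliar with low-bit weights, the sketch below shows one common scheme, symmetric absmax quantization, in plain Python. It is only an illustration of the general idea and is not EasyDel's implementation; the function names and the sample weights are hypothetical.

```python
def quantize_absmax(values, bits):
    """Map floats to signed `bits`-bit integers using a single absmax scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Round to the nearest integer level and clamp to the symmetric range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer levels."""
    return [q * scale for q in quantized]


def max_abs_error(values, bits):
    """Largest round-trip error after quantizing and dequantizing."""
    quantized, scale = quantize_absmax(values, bits)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


weights = [0.12, -0.5, 0.33, 0.9, -0.07, 0.0]  # hypothetical sample weights
err8 = max_abs_error(weights, 8)
err4 = max_abs_error(weights, 4)
print(err8 < err4)  # 8-bit keeps more levels, so its round-trip error is smaller here
```

The round-trip error is bounded by half the scale, which is why a small model like this one is a convenient place to check that a 4-bit or 8-bit code path behaves as expected.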

Format: Safetensors
Model size: 70.5M params
Tensor type: FP16
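
As a back-of-the-envelope check on the figures above, the sketch below estimates the storage needed for the weights alone at different precisions. It assumes exactly 70.5M parameters (the listed value is rounded) and ignores quantization scales, file metadata, and activation memory, so the numbers are approximate. The helper name is made up for illustration.

```python
PARAM_COUNT = 70_500_000  # 70.5M params, as listed above (rounded)


def weight_megabytes(param_count: int, bits_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return param_count * bits_per_param / 8 / 1_000_000


fp16_mb = weight_megabytes(PARAM_COUNT, 16)
int8_mb = weight_megabytes(PARAM_COUNT, 8)
int4_mb = weight_megabytes(PARAM_COUNT, 4)
print(fp16_mb, int8_mb, int4_mb)  # 141.0 70.5 35.25
```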

Model: erfanzar/LLamaStory-70M