LLamaStory-70M is a LLaMA model pre-trained on a story-generation dataset.
About training:
- Platform: EasyDel
- Hardware: TPU-v4
- Batch size: 2048
- Max position embeddings: 512
- 12 epochs so far
This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
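As a rough illustration of the 8-bit path this model is meant to exercise, here is a minimal NumPy sketch of absmax int8 weight quantization and dequantization. This is not EasyDel's implementation, just the general technique: scale each tensor so its largest absolute value maps to 127, round to int8, and divide by the scale to approximately recover the floats.

```python
import numpy as np

def quantize_absmax_int8(w):
    """Absmax int8 quantization: scale weights so max(|w|) maps to 127."""
    scale = 127.0 / np.max(np.abs(w))
    q = np.round(w * scale).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) / scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, scale = quantize_absmax_int8(w)
w_hat = dequantize_int8(q, scale)

# Rounding error is bounded by half a quantization step (0.5 / scale).
max_err = np.max(np.abs(w - w_hat))
```

The same idea carries over to JAX by swapping `np` for `jax.numpy`; per-channel scales (one per row or column) usually give noticeably better accuracy than the single per-tensor scale used here.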